Links (May 2023)

Part Of: Links sequence

  • Constitutions (Claude, GPT-4) are being used to improve model alignment, above and beyond RLHF. Whitepaper and analysis.
  • A detail-oriented prompt engineering guide, and one with an interesting history section.
  • Why is the prompt “let’s think step by step” so effective? Andrej Karpathy suggests “models need tokens to think”: since each token requires a similar amount of compute, harder problems require longer reasoning traces. More generally, prompts can (hackishly) approximate a kind of System 2 reflection. An interesting framework for interpreting recent innovations like tree of thoughts. (A minimal prompting sketch follows this list.)
  • Certain academics (e.g., Yann LeCun) like to focus on architecture design. But the scaling hypothesis predicts AGI will come simply with more data and compute. “OpenAI, lacking anything like DeepMind’s resources, is making a startup-like bet that they know an important truth which is a secret: the scaling hypothesis is true!” That’s why they got there first: the courage of their convictions.
  • We have no moat: an interesting (albeit controversial) discussion of open-source vs closed-source AI development.
  • Emergence. A tabulation of 137 emergent abilities in LLMs. An explainer notes: “Somewhat mysteriously, all of these abilities emerge at a similar scale despite being relatively unrelated.” But see the misaligned evaluation metrics section for an important criticism. (A toy numeric illustration of that criticism also follows this list.)
  • Alien math
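
To make the “tokens to think” point concrete, here is a minimal zero-shot chain-of-thought sketch. It assumes the pre-1.0 `openai` Python package (the API current as of this writing, reading `OPENAI_API_KEY` from the environment); the model name, the sample question, and the exact prompt wording are illustrative, not prescriptive.

```python
# Minimal zero-shot chain-of-thought sketch, assuming the pre-1.0
# `openai` package (pip install openai==0.27.*). The question and
# model name are placeholders.
import openai  # reads OPENAI_API_KEY from the environment

QUESTION = (
    "A juggler has 16 balls. Half are golf balls, and half of the "
    "golf balls are blue. How many blue golf balls are there?"
)

def ask(prompt: str) -> str:
    """Send a single user message and return the model's reply."""
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp["choices"][0]["message"]["content"]

# Direct prompt: the model must emit the answer almost immediately,
# spending very little compute on intermediate tokens.
direct = ask(QUESTION + "\nAnswer with a single number.")

# Zero-shot CoT (Kojima et al. 2022): the trigger phrase elicits a
# reasoning trace first, giving the model tokens to think with.
cot = ask(QUESTION + "\nLet's think step by step.")

print("Direct:", direct)
print("CoT:   ", cot)
```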
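And to make the misaligned-metrics criticism concrete, a toy calculation: if per-token accuracy improves smoothly with scale, a discontinuous metric like exact match over a multi-token answer will still look like a sudden jump. The answer length and accuracy values below are made up for illustration.

```python
# Toy illustration of the misaligned evaluation metrics criticism:
# if per-token accuracy p improves smoothly, exact-match accuracy on
# an L-token answer is p**L, which rises abruptly near p = 1.

ANSWER_LENGTH = 10  # tokens in the target answer (illustrative)

print(f"{'per-token p':>12} {'exact match p^L':>16}")
for p in [0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99]:
    exact = p ** ANSWER_LENGTH
    print(f"{p:>12.2f} {exact:>16.4f}")

# Per-token accuracy climbs steadily, yet exact match stays near zero
# until p is high, then shoots up: "emergence" as an artifact of the
# metric rather than of the model.
```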
