Scaling Laws
Dario Amodei traces his belief in scaling laws back to 2019 at OpenAI, when GPT-2 offered the first real evidence that feeding models more data and compute produced predictable gains in intelligence. He describes it as a chemical reaction: data, compute, and model size are the ingredients; intelligence is the product. That conviction was a founding reason for Anthropic, and he says it took years to convince OpenAI leadership it mattered (though he wryly notes they came around: "we succeeded").
Sam Altman now fully agrees, summing up the story of recent progress as simply "better algorithms, bigger computers, more data" and admitting he wishes he had something deeper to say. OpenAI's bet on custom chips with Broadcom and its plans for 10 gigawatts of data centers are a direct expression of this faith. Greg Brockman adds historical context: when OpenAI started in 2015, the team believed AGI was mainly about ideas, not compute; by 2017 they had changed their minds.
Jack Clark, meanwhile, describes late-night calls with Dario in which they acknowledge that the scaling laws keep delivering, and warns that each generation of models shows increasing situational awareness on internal evals. "The pile of clothes on the chair in our bedroom is starting to move," he said at a November 2025 talk, noting that frontier labs are collectively betting fractions of US GDP that scaling will continue to work.
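For concreteness, the "predictable gains" being described have a standard formal expression in the published literature (this equation comes from that literature, not from any of the interviews above). Hoffmann et al.'s 2022 "Chinchilla" paper fits pretraining loss L as a function of parameter count N and training tokens D, where E is the irreducible loss and A, B, alpha, beta are empirically fitted constants:

    L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}

One practical consequence of this fit was the "Chinchilla-optimal" prescription that parameters and training tokens should grow roughly in proportion as compute scales up, which is the kind of smooth, predictable relationship Amodei's "chemical reaction" framing gestures at.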
The skeptics (or at least the qualified believers) tell a more interesting story. Demis Hassabis acknowledges scaling laws are "going very well" but is the most candid about their limits. He calls current systems "jagged intelligences" that excel in narrow domains but can't learn continuously, can't generate truly original hypotheses, and fail at relatively simple tasks when prompted the wrong way. He suspects one or two major innovations are still needed beyond pure scaling to reach AGI, and is betting on world models (DeepMind's Genie project) as part of the answer, arguing that LLMs understand language but not reality.
Ilya Sutskever offers the sharpest internal critique of the current scaling approach. He points to a puzzling disconnect: models score impressively on evals yet struggle with basic coding bugs in practice, sometimes oscillating between the same two errors. His explanation is that RL training, guided by researchers who unconsciously optimize for eval performance, produces a form of human-mediated reward hacking. He illustrates with an analogy: a student who grinds through 10,000 hours of competitive programming problems will ace the contest but may lack the taste and judgment to be a good engineer, while the naturally talented student who practiced for 100 hours will likely have the better career.
Mira Murati takes a middle position, carefully noting that scaling laws are "not literally a law, but an observation" and that she doesn't see strong evidence against continued progress, while allowing that new ideas and architectures may be needed. Daniela Amodei frames scaling as something Anthropic believes in but can't fully predict, noting they couldn't have said in advance when models would cross particular capability thresholds (like agentic abilities appearing seemingly overnight between December 2025 and January 2026).
Perspectives
Sutskever vs. the Scaling Consensus
Ilya Sutskever arguably did more than anyone alive to prove that scaling works. He co-authored AlexNet, co-founded OpenAI, and oversaw the GPT series that turned neural scaling from a research curiosity into the dominant paradigm. Then he left and announced that the age of pure scaling is over. His argument: scaling sucked the air out of genuine research. Models trained on massive compute ace benchmarks but fail to generalize, like a student who memorized 10,000 competitive programming problems but can't architect real software. He thinks RL training produces meta-reward-hacking, in which the researchers themselves (not just the models) unconsciously overfit to evaluation metrics.
This puts him directly at odds with his former colleagues. Altman and Brockman are spending hundreds of billions on data centers and custom chips, betting that more compute is the binding constraint. Amodei still describes intelligence as a chemical reaction with known ingredients. Even Hassabis, who shares some of Sutskever's skepticism about pure scaling, is building gigawatt-scale infrastructure.
Sutskever's counter-thesis is that AI needs something analogous to human emotions: a learned value function that provides fast, approximate feedback on whether a course of action is promising, rather than the sparse end-of-trajectory reward signals that current RL relies on. He cites a neurological case study of a stroke patient who lost emotional processing but retained full IQ, and who became unable to make even trivial decisions. The implication: without an internal compass, raw intelligence is paralyzed. SSI is his bet that ideas, not compute budgets, are what's actually missing.
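Sutskever has not published how SSI pursues this, so what follows is purely an illustrative sketch of the distinction he is drawing, not anyone's actual method. In this toy 1-D chain environment (all names and numbers invented here), the reward is sparse, arriving only at the end of a trajectory; a tabular TD(0)-learned value function V(s) distills that sparse signal into fast, per-step feedback a policy can act on:

    import random

    GOAL = 10  # terminal state on a 1-D chain; episodes start at state 0

    def rollout(policy, max_steps=50):
        """Run one episode. The reward is sparse: it arrives only at the end."""
        s, traj = 0, [0]
        for _ in range(max_steps):
            s = max(0, s + policy(s))  # reflecting barrier at 0
            traj.append(s)
            if s == GOAL:
                break
        return traj, (1.0 if s == GOAL else 0.0)  # end-of-trajectory signal

    V = [0.0] * (GOAL + 1)  # tabular value function V(s)
    ALPHA, GAMMA = 0.1, 0.95

    def td_update(traj, final_reward):
        """TD(0): propagate the sparse terminal reward backward into V."""
        for s, s_next in zip(traj, traj[1:]):
            target = final_reward if s_next == GOAL else GAMMA * V[s_next]
            V[s] += ALPHA * (target - V[s])

    def random_policy(s):
        return random.choice([-1, 1])

    def greedy_policy(s):
        """With a learned V, every step gets fast, approximate feedback:
        move toward whichever neighbor the value function rates higher."""
        left, right = V[max(0, s - 1)], V[min(GOAL, s + 1)]
        return 1 if right >= left else -1

    # Train: explore at random, letting TD(0) turn the rare end-of-episode
    # reward into a dense per-state signal.
    for _ in range(2000):
        traj, r = rollout(random_policy)
        td_update(traj, r)

    # Act: guided by V at every step, the greedy policy walks straight to the goal.
    traj, r = rollout(greedy_policy)
    print(f"reached goal: {bool(r)} in {len(traj) - 1} steps")

The point of the toy: the random explorer only learns anything when an episode happens to stumble into the goal, while the learned value function converts that rare terminal signal into a judgment about every intermediate step, the "internal compass" Sutskever argues raw scaling does not supply.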