Scaling Laws
The bet that built these companies: pour in data and compute, get intelligence out. In 2025-26 the founders started quietly arguing about whether it still holds.
Every founder in this archive owes their company to one observation: feed a neural network more data and more compute, and its intelligence climbs at a predictable rate. Dario Amodei calls it "the product of a chemical reaction" ... put in the ingredients of data and model size, get intelligence out (People by WTF, Feb 2026). It is the closest thing the field has to a physical law, and it is also the thing they have started, very carefully, to disagree about. The fault line in 2025-26 is not whether scaling works. It is whether scaling alone gets you to the finish line, and why the models that ace every benchmark still can't reliably hold down a job.
The founding bet, and who saw it first
Amodei's account is the origin myth. He says he caught the first "glimmers of the scaling laws" with GPT-2 in 2019, and that conviction was one of two things he was trying to sell OpenAI's leadership on before he left to found Anthropic. The pitch: "if you scale up models ... again, there are a few modifications like RL, but not really very much, it's pretty close to pure scaling ... you find incredible increases in performance" (People by WTF, Feb 2026). Note the hedge ... "a few modifications like RL." That phrase is doing more work in 2026 than it did in 2019.
Jack Clark, who was in the room, tells the same story with more dread. He remembers walking around OpenAI's Mission office with Amodei feeling "like financial traders before the financial crisis. We knew something was about to happen" (The Curve, Nov 2025). His verdict on the years since is flat: "the scaling laws have delivered." And his metaphor is the sharpest in the whole archive ... the technology is "something we grow rather than make." You set initial conditions, stick a scaffold in the ground, and out comes complexity you couldn't have designed. The labs are now "betting fractions of the US GDP that it will get better."
That is the consensus floor. More compute, better algorithms, more data, more capability. Where they split is everything above the floor.
The believers are getting hedgier
Listen closely and even the true believers are revising. Sam Altman still narrates pure momentum ... "we keep discovering better algorithms. So we keep finding steeper and steeper scaling laws" ... and predicts the next eighteen months take capability "from like 10 to 100" (Khosla Ventures, Sep 2025). But he concedes the inputs "have not changed very much" and that he wishes he "had like a deeper and more insightful thing to say" than better algorithms, bigger computers, more data.
Demis Hassabis is the most precise about the slowdown. "Scaling laws are going very well ... may not be not as fast as it was a couple of years ago. So there's some talk of diminishing returns" ... but he plants himself "somewhere in the middle" between no returns and exponential, where "there's very good returns and that's worth doing" (AI Nutshell, Feb 2026). Crucially, he doesn't think scaling finishes the job: getting to AGI "may be that there's one or two big innovation still needed," because today's models are "jagged intelligences" ... brilliant at the bar exam, incapable of continual learning or generating a genuinely new scientific hypothesis. His bet is convergence: keep scaling Gemini-style foundation models "as big and as powerful as we can," but bolt on world models (DeepMind's Genie, Veo) that learn physics and causality rather than the next word.
Mira Murati, back in late 2024, had already filed the careful version: the scaling law is "not literally a law, but an observation" that predictably turns data, compute and model size into capability (WIRED, Dec 2024). She waved off the plateau crowd ... "current evidence shows that the progress will likely continue" ... while flagging the data wall (answer: synthetic data) and the compute ramp she described as a billion dollars this year, "a factor of 10 to 10 billion" next, "a hundred billion" after. Greg Brockman and Altman are now living that ramp: a 10-gigawatt custom-chip deal with Broadcom, described as "the biggest joint industrial project in human history," built on the conviction that you "melt sand, run energy through it and get intelligence out the other end" (OpenAI x Broadcom, Oct 2025).
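To make the shape of that observation concrete, here is a minimal sketch of a power-law curve walked up a tenfold ramp. Everything in it is an assumption for illustration: the constants A and ALPHA are invented, the functions loss and ramp are hypothetical names, and "compute" is an abstract unit rather than dollars or any lab's real budget. It shows only the general behavior the founders are arguing over: under a power law, each 10x of compute buys the same proportional improvement, so the absolute gain per step shrinks without ever reaching zero.

```python
# Illustrative only: a power-law curve of the general shape that
# scaling-law discussions refer to. A and ALPHA are made-up constants,
# not measurements from any lab or paper.
A = 10.0       # hypothetical loss at 1 unit of compute
ALPHA = 0.05   # hypothetical power-law exponent


def loss(compute: float) -> float:
    """Hypothetical loss as a power law in training compute."""
    return A * compute ** -ALPHA


def ramp(start: float, steps: int, factor: float = 10.0) -> list[tuple[float, float]]:
    """Return (compute, loss) pairs, multiplying compute by `factor` each step."""
    return [(start * factor ** i, loss(start * factor ** i)) for i in range(steps)]


if __name__ == "__main__":
    # Three steps of a tenfold ramp, starting from an arbitrary 1e9 units.
    for compute, value in ramp(1e9, 3):
        print(f"compute={compute:.0e}  loss={value:.3f}")
```

On this toy curve the ratio between consecutive losses is constant (10 raised to the power of minus ALPHA), which is one way to read Hassabis's "somewhere in the middle": returns that diminish in absolute terms but stay predictable enough to keep paying for.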
Ilya Sutskever breaks ranks
The genuine fault line runs through the man who arguably invented the scaling era. Ilya Sutskever, now at SSI, is openly puzzled: "the models seem smarter than their economic impact would imply." They crush hard evals, then reintroduce a bug they just fixed and chirp "you're absolutely right" twice in a row. "It's almost like ... how is that possible? I'm not sure" (Dec 2025).
His explanation is a quiet indictment of the whole post-2023 paradigm. The reinforcement-learning era, he argues, has the labs reverse-engineering training from benchmarks: "people take inspiration from the evals," building RL environments designed to make the scores "look great." "The real reward hacking is human researchers who are too focused on the evals." His analogy is the competitive programmer who grinds 10,000 hours and the one who needs only 100 ... the second has the "it factor," generalizes, wins the career. "The models are much more like the first student, but even more." Translation: scaling has been buying memorization dressed up as intelligence, and nobody quite understands why pre-training generalizes while RL mostly doesn't.
So the spectrum runs from Clark and Amodei ("the proof keeps coming, there's very little time") through Hassabis and Murati (scaling works, but one or two missing pieces remain) to Sutskever (the curves are real but we've been measuring the wrong thing). Same data, three readings.
The economics: smart models, stubborn world
The 2025-26 reframing is economic, and it is where the optimism gets uneven. Daniela Amodei's Economic Index report found AI following a "mostly predictable adoption path ... unfortunately the folks that have the most are going to be the first to adopt it" ... rich countries, rich users, MIT computer scientists running Claude Code first (Sixth Street, Feb 2026). Faster than dial-up or Google, but the same rich-first shape, with a real risk of "big regions of the world that are left behind."
Clark surfaces the counter-trend: the economist's case that even total automation leaves a boom in "the human touch" ... a normal good whose demand rises with income (Import AI 445, Feb 2026). Altman lands in the same place from the other side, insisting "there are a lot of jobs that I don't think you want an AI to do," that a mediocre human teacher may beat a great AI one because "biological programming is just very difficult to overcome" (Khosla Ventures, Sep 2025). Hassabis is blunter on scale: like the industrial revolution "but maybe 10 times bigger, 10 times faster," demanding "new economic models probably."
And Sutskever's disconnect is the skunk at the party. If models are this smart on paper, the impact should already be here. It isn't ... yet. The bet for the next decade isn't whether the curves keep climbing. It's whether the intelligence they produce can finally do the dishes, not just the exam.
The hammer came off the line and said "I am a hammer." Everyone's still arguing about what it's good for.
Perspectives
Sutskever vs. the Scaling Consensus
Ilya Sutskever has arguably done more than anyone alive to prove that scaling laws work. He co-authored AlexNet, co-founded OpenAI, and oversaw the GPT series that turned neural scaling from a research curiosity into the dominant paradigm. Then he left and announced that the age of pure scaling is over. His argument: scaling sucked the air out of genuine research. Models trained on massive compute ace benchmarks but fail at generalization, like a student who ground through 10,000 hours of competitive programming but can't architect real software. He thinks RL training produces meta-reward-hacking, where the researchers themselves (not just the models) are unconsciously overfitting to evaluation metrics.
This puts him directly at odds with his former colleagues. Altman and Brockman are spending hundreds of billions on data centers and custom chips, betting that compute is the binding constraint. Amodei still describes intelligence as a chemical reaction with known ingredients. Even Hassabis, who shares some of Sutskever's skepticism about pure scaling, is building gigawatt-scale infrastructure.
Sutskever's counter-thesis is that AI needs something analogous to human emotions: a learned value function that provides fast, approximate feedback about whether a course of action is promising, rather than the sparse end-of-trajectory reward signals that current RL relies on. He cites a neurological case study of a stroke patient who lost emotional processing but retained full IQ, and became unable to make even trivial decisions. The implication: without an internal compass, raw intelligence is paralyzed. SSI is his bet that ideas, not compute budgets, are what's actually missing.