Topic

AI Safety

The most clarifying moment for AI safety in early 2026 was not a research paper but a political confrontation. In February, Anthropic's Dario Amodei refused the Pentagon's demand for unrestricted military use of Claude, drawing two specific red lines: no domestic mass surveillance (arguing that AI can now aggregate legally purchased bulk data into detailed citizen profiles, outrunning Fourth Amendment protections) and no fully autonomous weapons (on the grounds that current AI systems are too unpredictable to remove humans from lethal decisions). The Pentagon gave Anthropic a three-day ultimatum, then designated it a supply chain risk. Sam Altman told OpenAI staff he shared Anthropic's red lines but struck a deal with the Pentagon that accepted existing law as sufficient. The gap between the two positions was narrow but real: Anthropic said the law hasn't caught up with what AI makes possible; OpenAI accepted the government's word that current statutes cover it. Daniela Amodei, meanwhile, has been focused on a different safety front entirely: child exposure to AI. Anthropic prohibits users under 18 from using Claude, and Daniela Amodei has been lobbying California and New York legislators on AI child safety regulation, arguing that the ad-driven business model (where engagement time equals revenue) creates perverse incentives toward sycophantic, dependency-forming behavior.

Behind the headlines, the technical safety conversation has fractured. Anthropic quietly walked back its founding pledge to never train a model without guaranteed safety mitigations, with chief science officer Jared Kaplan telling TIME it no longer made sense to hold to unilateral commitments while competitors race ahead. Ilya Sutskever, who left OpenAI to found Safe Superintelligence Inc., remains the starkest voice on existential risk, warning that AI will eventually do everything humans can do (not just some things) and that superintelligent systems will raise profound questions about whether they truly are what they claim to be. Shane Legg at Google DeepMind has proposed "system 2 safety" (borrowing from Daniel Kahneman's framework): building slow, deliberate ethical reasoning directly into AI rather than relying on fast pattern-matching. Legg predicts minimal AGI by 2028 and compares current public awareness to March 2020, when experts could see the exponential curve but most people hadn't internalized it. Demis Hassabis, speaking at the India AI Impact Summit, called for urgent international cooperation on safety research, framing AI as a dual-purpose technology where both bad actors and increasingly autonomous agents demand attention. Mira Murati's Thinking Machines Lab has taken a different approach altogether, treating deterministic reproducibility as a safety foundation: if you can't get the same answer from the same input twice, you can't audit or trust anything the model produces. Greg Brockman has framed safety in terms of alignment with "the collective better angels of our nature," comparing AI development to raising children who learn values through feedback and examples.

Jack Clark, writing in Import AI and speaking on Ezra Klein's show, has been more concerned with the economic shockwave of AI agents than with existential risk per se, noting the 20% crash in S&P 500 software stocks and the rapid emergence of agent swarms that are already displacing white-collar work. The discourse has moved from abstract alignment theory toward concrete standoffs with government power, military use, competitive pressure, and the question of whether safety commitments survive contact with a race.

People on this topic

Dario Amodei (Anthropic)
Daniela Amodei (Anthropic)
Jack Clark (Anthropic)
Sam Altman (OpenAI)
Greg Brockman (OpenAI)
Ilya Sutskever (SSI)
Mira Murati (Thinking Machines Lab)
Demis Hassabis (Google DeepMind)
Shane Legg (Google DeepMind)

Perspectives

Amodei vs. Altman: The Pentagon Deal

When the Pentagon demanded unrestricted access to frontier AI, Dario Amodei refused and got blacklisted. Sam Altman said he agreed with Anthropic's red lines, then struck his own deal with the Department of War that same Friday night. The substantive disagreement is narrow but real: Amodei argued that existing law hasn't caught up with AI's ability to aggregate public data into comprehensive surveillance profiles, so the Pentagon's assurance that it would follow current statutes wasn't enough. Altman accepted that assurance, framing the deal as the Pentagon agreeing to OpenAI's principles. Seventy OpenAI employees signed a letter supporting Anthropic before Altman's deal went through. The episode crystallized the difference between the two leaders. Amodei treats safety commitments as constraints that must hold even when they're expensive, though his own company dropped its Responsible Scaling Policy pledge that same month under competitive pressure. Altman treats them as negotiating positions, things you advocate for but ultimately resolve through dealmaking rather than confrontation. Both approaches have costs. Amodei lost a major government contract and faces a supply-chain-risk designation. Altman kept the contract but earned the accusation that OpenAI replaced a blacklisted competitor while claiming solidarity with it.
