Anthropic Warns AI Could Soon Improve Itself Without Human Input

US-based AI firm Anthropic warns that AI development is accelerating to a point where agents could soon build, train, and improve themselves without human input, and it is recommending a development slowdown.

In a blog post Thursday, Marina Favaro and Anthropic co-founder Jack Clark said agents can already run code and delegate tasks to other agents, and may be on the verge of full autonomy.

“For most of AI’s history, humans drove every step. But at Anthropic, we are delegating a growing share of AI development to AI systems themselves, speeding up our work,” they wrote. “Taken far enough, that trend points to an AI system capable of fully autonomously designing its own successor.”

AI model improvement has roughly doubled every four months, and Anthropic’s Claude model now authors about 80% of the code merged into the company’s codebase. The role of humans is narrowing at each step.

“Once human- and AI-authored code quality reach parity, humans will stop writing code entirely and shift to only reviewing it. But if they can’t review code as quickly as Claude can generate it, human review will become the bottleneck,” they said.

Favaro and Clark added that slowing development to address the “immense” implications would be ideal, but warned that without global coordination, such a pause could leave everyone less safe if less cautious actors catch up.

In April, Anthropic ruled out releasing its Claude Mythos model to the public over cybersecurity concerns. Separately, tech leaders from Anthropic and OpenAI urged lawmakers to enact stronger guardrails, citing risks that AI could help bad actors overcome barriers to creating biological weapons.