Anthropic releases improved Claude models that can control your computer

Leading artificial intelligence firm Anthropic PBC today introduced new Claude 3.5 Sonnet and Claude 3.5 Haiku generative AI models with significantly upgraded capabilities over their predecessors.

The upgraded Sonnet arrives four months after the original model launched in June and brings substantial gains in coding, an area where it was already designed to excel. Haiku is Anthropic’s fastest model, and the company said the enhanced version improves on every skill and now surpasses Claude 3 Opus, the largest model in the previous generation.

In addition to the models, Anthropic introduced a new capability in public beta: computer use. By viewing the screen, Claude 3.5 Sonnet can operate a computer’s user interface by moving the mouse, clicking buttons and typing text.

Anthropic touted Sonnet’s software engineering skills as part of what has become an arms race between rival frontier model developers to produce the best AI models for software developers. According to the company, the new model showed wide-ranging improvements across industry benchmarks, with strong gains in agentic coding and tool use.

“Early customer feedback suggests the upgraded Claude 3.5 Sonnet represents a significant leap for AI-powered coding,” said Anthropic.

According to the company, Sonnet improved its score on the SWE-bench Verified leaderboard from 33.4% to 49%, higher than all publicly available models, including OpenAI o1-preview and specialized systems designed for agentic coding. GitLab tested the model for DevSecOps tasks, which require multistep reasoning across domains such as development, testing, security and operations, and found it delivered up to 10% better performance with no added latency.

Claude 3.5 Haiku is designed to be fast and affordable, with extremely low latency. The company said it is well suited to customer-facing tasks that involve a high volume of interactions and where speed is paramount.

The upgraded Sonnet is available to all users starting today, and the new Claude 3.5 Haiku will be released later this month.

Claude can now use computers

Large language models ordinarily reason over text and images, and with the addition of application programming interfaces they have also been able to use software tools to access data, update databases, send emails and more. Being able to “see” computer interfaces via screenshots gives them another capability: perceiving and interacting with user interface elements such as buttons and text fields.

Anthropic said it has given Claude 3.5 Sonnet the ability, via an API, to perceive and interact with UIs. Now developers can give Claude instructions such as “use data from my computer to fill out this form,” and it will take a screenshot, scan the page, and then enter text into the relevant parts of the visible page according to the data it has access to.
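To make that screenshot-then-act cycle concrete, here is a minimal Python sketch of an observe, decide and act loop. It is an illustration only and does not use Anthropic’s API: FakeScreen, choose_action and run_agent are hypothetical names, the “screenshot” is a plain dictionary rather than pixels, and a simple rule stands in for the model’s decision.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeScreen:
    """Hypothetical stand-in for a desktop: one form with named text fields."""
    fields: dict = field(default_factory=lambda: {"name": "", "email": ""})
    focused: Optional[str] = None

    def screenshot(self) -> dict:
        # A real system would capture pixels; this returns a copy of the form state.
        return dict(self.fields)

    def click(self, field_name: str) -> None:
        # Clicking a field gives it keyboard focus.
        self.focused = field_name

    def type_text(self, text: str) -> None:
        # Typed text lands in whichever field currently has focus.
        if self.focused is not None:
            self.fields[self.focused] = text


def choose_action(observation: dict, data: dict) -> tuple:
    """Stand-in for the model: pick the first empty field we have data for."""
    for name, value in observation.items():
        if value == "" and name in data:
            return ("fill", name, data[name])
    return ("done",)


def run_agent(screen: FakeScreen, data: dict, max_steps: int = 10) -> list:
    """Repeat observe -> decide -> act until the form is filled or steps run out."""
    log = []
    for _ in range(max_steps):
        observation = screen.screenshot()           # observe
        action = choose_action(observation, data)   # decide
        if action[0] == "done":
            break
        _, name, text = action
        screen.click(name)                          # act: focus the field
        screen.type_text(text)                      # act: enter the text
        log.append(action)
    return log


if __name__ == "__main__":
    screen = FakeScreen()
    steps = run_agent(screen, {"name": "Ada", "email": "ada@example.com"})
    print(steps)
    print(screen.fields)
```

The step cap in run_agent is a design choice for the sketch: an agent that can misread the screen should not be allowed to loop indefinitely.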

“We were surprised by how rapidly Claude generalized from the computer-use training we gave it on just a few pieces of simple software, such as a calculator and a text editor,” Anthropic said. “In combination with Claude’s other skills, this training granted it the remarkable ability to turn a user’s written prompt into a sequence of logical steps and then take actions on the computer.”

Anthropic stressed that the new computer use capability is experimental and can make mistakes, so users should approach it with caution. For example, Claude still has trouble with actions that humans perform effortlessly, such as moving around a screen, scrolling, zooming, and clicking and dragging.

In the researchers’ own tests, Claude made some amusing blunders. In one, it accidentally clicked to stop a long-running screen recording, causing all footage to be lost. In another, the model took a break from a coding demo to browse photos of Yellowstone National Park. It sounds like Claude is not too different from a normal developer.

Even in light of these bloopers, the technology represents a powerful leap forward in the sort of work that AI agents could do for users on their computers.

There’s also the concern that the technology could become a tool for bad actors to produce spam, spread misinformation or commit fraud, which opens up a new set of ethical issues. Anthropic said the company has developed new classifiers and safeguards that can identify when “computer use” is being used and whether harm has occurred.
