pwshub.com

Anthropic releases improved Claude models that can control your computer

Leading artificial intelligence firm Anthropic PBC today introduced new Claude 3.5 Sonnet and Claude 3.5 Haiku generative AI models with significantly upgraded capabilities over their predecessors.

The upgraded Sonnet arrives four months after the original model launched in June and delivers substantial gains in coding, a task it was already designed to excel at. Haiku is Anthropic’s fastest model, and the company said the enhanced version improves on every skill and now surpasses Claude 3 Opus, the largest model in the previous generation.

In addition to the models, Anthropic also introduced a new capability in public beta that lets models operate computers: computer use. By viewing the screen, Claude Sonnet can work with the user interface by moving the mouse, typing text and clicking buttons.

Anthropic touted Sonnet’s software engineering skills as part of what has become an arms race between rival frontier model developers to produce the best AI models for software developers. According to the company, the new model showed wide-ranging improvements across industry benchmarks, with strong gains in agentic coding and tool use.

“Early customer feedback suggests the upgraded Claude 3.5 Sonnet represents a significant leap for AI-powered coding,” said Anthropic.

According to the company, Sonnet improved its performance on the SWE-bench Verified leaderboard from 33.4% to 49% and scored higher than all publicly available models, including OpenAI o1-preview and specialized systems designed for agentic coding. GitLab, which tested the model on DevSecOps tasks that require multistep reasoning across domains such as development, testing, security and operations, found that it delivered up to 10% better performance with no added latency.

Claude 3.5 Haiku is designed to be fast and affordable while providing extremely low latency. The company said it is well suited for customer-facing tasks that involve a high volume of interactions and where speed is paramount.

The upgraded Sonnet is available to all users starting today, and the new Claude 3.5 Haiku will be released later this month.

Claude can now use computers

Large language models ordinarily reason over text and images, and with the addition of application programming interfaces they have also been able to use software tools to access data, update databases, send emails and more. Being able to “see” computer interfaces via screenshots gives them another capability: the ability to perceive and interact with user interface elements such as buttons and text fields.

Anthropic said it has given Claude 3.5 Sonnet the ability, via an API, to perceive and interact with UIs. Developers can now give Claude instructions such as “use data from my computer to fill out this form,” and it will take a screenshot, scan the page and then enter text into the relevant parts of the visible page according to the data it has access to.
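The screenshot-then-act cycle described above can be pictured as a simple loop. The following is only an illustrative sketch, not Anthropic's API or implementation: `FakeScreen`, `Action`, `choose_action` and `run_agent` are hypothetical names invented for this example, the "screenshot" is a plain dictionary rather than pixels, and a hard-coded rule stands in for the model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Action:
    kind: str          # "click", "type" or "done"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeScreen:
    """Hypothetical stand-in for a desktop: a single form with two fields."""
    fields: dict = field(default_factory=lambda: {"name": "", "email": ""})
    positions: dict = field(
        default_factory=lambda: {"name": (100, 50), "email": (100, 90)}
    )
    focused: Optional[str] = None

    def screenshot(self) -> dict:
        # A real system would capture pixels; here the form state is returned.
        return {
            "fields": dict(self.fields),
            "positions": dict(self.positions),
            "focused": self.focused,
        }

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            # Focus whichever field sits at the clicked coordinates.
            for name, pos in self.positions.items():
                if pos == (action.x, action.y):
                    self.focused = name
        elif action.kind == "type" and self.focused is not None:
            self.fields[self.focused] = action.text


def choose_action(shot: dict, data: dict) -> Action:
    """Hypothetical stand-in for the model: decide the next step from the screen."""
    for name, value in shot["fields"].items():
        if value == "" and name in data:
            if shot["focused"] != name:
                x, y = shot["positions"][name]
                return Action("click", x=x, y=y)
            return Action("type", text=data[name])
    return Action("done")


def run_agent(screen: FakeScreen, data: dict, max_steps: int = 20) -> list:
    """Observe, decide, act, repeat, with a step cap as a basic safeguard."""
    log = []
    for _ in range(max_steps):
        action = choose_action(screen.screenshot(), data)
        if action.kind == "done":
            break
        screen.apply(action)
        log.append(action)
    return log


if __name__ == "__main__":
    screen = FakeScreen()
    steps = run_agent(screen, {"name": "Ada", "email": "ada@example.com"})
    print([s.kind for s in steps], screen.fields)
```

The step cap reflects the article's point that the capability is experimental: a loop that acts on a live machine should have a bounded number of actions rather than run until the model decides it is finished.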

“We were surprised by how rapidly Claude generalized from the computer-use training we gave it on just a few pieces of simple software, such as a calculator and a text editor,” Anthropic said. “In combination with Claude’s other skills, this training granted it the remarkable ability to turn a user’s written prompt into a sequence of logical steps and then take actions on the computer.”

Anthropic stressed that the new computer use capability is experimental and can make mistakes, so users should approach it with caution. For example, Claude has trouble with actions that humans perform effortlessly, such as moving around a screen, scrolling, zooming and clicking and dragging.

In the researchers’ own tests, Claude has made some amusing blunders. In one, it accidentally clicked to stop a long-running screen recording, causing all footage to be lost. In another, the model took a break from a coding demo to browse photos of Yellowstone National Park. It sounds like Claude is not too different from a normal developer.

Even in light of these bloopers, the technology represents a powerful leap forward in the sort of work that AI agents could do for users on their computers.

There’s also the concern that any new technology could become a tool for bad actors to produce spam, spread misinformation or commit fraud. This sort of technology could be used irresponsibly and that opens up a whole new set of ethical issues. Anthropic said the company has developed new classifiers and safeguards that can identify when “computer use” is being used and whether harm has occurred.

Source: siliconangle.com
