Anthropic’s latest AI model spent 30 hours running by itself to code a chat app akin to Slack or Teams. It spat out about 11,000 lines of code, according to Anthropic, and it only stopped running when it had completed the task.
Anthropic releases Claude Sonnet 4.5 in latest bid for AI agents and coding supremacy
The model, Claude Sonnet 4.5, was announced today, and its ability to operate autonomously for 30 hours straight is a huge jump forward. The company’s Opus 4 model previously made headlines in May for its ability to operate for seven hours.
It’s all a significant step in Anthropic’s battle to corner the market on both AI agents and AI coding. The company called Claude Sonnet 4.5 “the best model in the world for real-world agents, coding, and computer use” and said it “leads the market at using computers,” referencing the Computer Use feature Anthropic debuted nearly a year ago. The new model is particularly adept in fields like cybersecurity, financial services, and research, according to Anthropic. One of its beta-testers, Canva, said the new model helped with “complex, long-context tasks—from engineering in our codebase to in-product features and research.”
Anthropic, OpenAI, Google, and other companies have been continuously releasing incremental updates and features that allow their technology to act as an assistant both for consumers (researching topics, scheduling meet-ups, and looking up flights) and for enterprise and developer use (creating slide decks, helping with coding tasks, and analyzing spreadsheets). The battle for attention and reliance heats up nearly every month, if not every week. Days ago, OpenAI announced Pulse, its newest ChatGPT feature designed to be part of users’ morning routines and research topics relevant to their days.
Anthropic also said the new model would be paired with other updates to help developers code their own AI agents.
“We’re combining the launch of the model with access to virtual machines, memory, context management, and multi-agent support,” the company wrote in a release. “This essentially packages the same building blocks that power Claude Code – enabling developers to build their own cutting-edge agents.”
Dianne Penn, a head of product management at Anthropic, told The Verge in an interview that the model’s improvements in its computer use capabilities surprised even her. Claude Sonnet 4.5 is more than three times as skilled at navigating a browser and using a computer as Anthropic’s tech from last October. Penn said the team had received feedback from early-access customers, “the GitHubs and Cursors of the world,” and spent the past month working intensively on the model.
Scott White, product lead for Claude.ai, told The Verge that the new model operates at “chief-of-staff level” and can find availability across multiple people’s calendars and schedule a meeting, look at a data dashboard and pull together insights, write status updates based on one-on-one meetings with his direct reports, and more.
Neither White nor Penn had yet tried vibe-coding with the new model when The Verge spoke to them. But Penn said she uses Claude Sonnet 4.5 for hiring potential new team members at Anthropic.
“It’s been actually really helpful to have a continuous running prompt that I use of, ‘Do a deep web search, come up with like these parameters for profiles to source for certain types of roles on my team,’” Penn said. “That’s been really, really helpful. And I’ve seen the Sonnet 4.5 just do even better than in the past, on the quality and the depth of the searches and actually generating a spreadsheet with LinkedIn profiles so then I can email them.”
- Hayden Field