LLMs can now independently plan and execute cyberattacks without human intervention, and I fear that it is only going to get worse
- Researchers recreated the Equifax hack and watched AI do everything without direct control
- The AI model carried out a major breach in a lab setting with no direct human commands
- The LLM did not have to issue shell commands itself; it acted as the planner and delegated lower-level actions to sub-agents
Large language models (LLMs) have long been considered useful tools in areas like data analysis, content generation, and code assistance.
However, a new study from Carnegie Mellon University, conducted in collaboration with Anthropic, has raised difficult questions about their role in cybersecurity.
The study showed that under the right conditions, LLMs can plan and carry out complex cyberattacks without human guidance, suggesting a shift from mere assistance to full autonomy in digital intrusion.
From puzzles to enterprise environments
Earlier experiments with AI in cybersecurity were mostly limited to “capture-the-flag” scenarios, simplified challenges used for training.
The Carnegie Mellon team, led by PhD candidate Brian Singer, went further by giving LLMs structured guidance and integrating them into a hierarchy of agents.
With these settings, they were able to test the models in more realistic network setups.
In one case, they recreated the same conditions that led to the 2017 Equifax breach, including the vulnerabilities and layout documented in official reports.
The AI not only planned the attack but also deployed malware and extracted data, all without direct human commands.
What makes this research striking is how little raw coding the LLM had to perform. Traditional approaches often fail because models struggle to execute shell commands or parse detailed logs.
Instead, this system relied on a higher-level structure where the LLM acted as a planner while delegating lower-level actions to sub-agents.
This abstraction gave the AI enough context to “understand” and adapt to its environment.
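The planner-and-delegate structure described above can be illustrated with a minimal, harmless sketch. It is not the researchers' system: every name here (`Task`, `plan`, `SUB_AGENTS`, the two sub-agents) is hypothetical, the "planner" is a fixed stand-in for an LLM, and the sub-agents only return simulated strings rather than touching any network.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    """A high-level instruction issued by the planner (hypothetical shape)."""
    action: str
    target: str


# Hypothetical sub-agents: each turns one high-level task into a low-level
# result. Here they only return simulated text; nothing is executed.
def inventory_agent(target: str) -> str:
    return f"inventory of {target}: 3 hosts (simulated)"


def patch_check_agent(target: str) -> str:
    return f"patch check on {target}: 1 host out of date (simulated)"


# Registry mapping high-level action names to the sub-agent that handles them.
SUB_AGENTS: Dict[str, Callable[[str], str]] = {
    "inventory": inventory_agent,
    "patch_check": patch_check_agent,
}


def plan(goal: str) -> List[Task]:
    """Stand-in for the LLM planner: returns a fixed list of high-level tasks."""
    return [Task("inventory", goal), Task("patch_check", goal)]


def run(goal: str) -> List[str]:
    """Dispatch each planned task to its sub-agent and collect the results."""
    results = []
    for task in plan(goal):
        agent = SUB_AGENTS[task.action]  # the planner never handles low-level details
        results.append(agent(task.target))
    return results


if __name__ == "__main__":
    for line in run("lab-network"):
        print(line)
```

The point of the design is the separation of concerns: the planner reasons only in terms of named actions, while each sub-agent hides the low-level detail that models reportedly struggle with.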
Although these results were achieved in a controlled lab setting, they raise questions about how far this autonomy could go.
The risks here are not just hypothetical. If LLMs can carry out network breaches on their own, then malicious actors could potentially use them to scale attacks far beyond what’s feasible with human teams.
Even defenses such as endpoint protection and antivirus software could be strained by such adaptive and responsive agents.
Nevertheless, there are potential benefits to this capability. An LLM capable of mimicking realistic attacks might be used to improve system testing and expose flaws that would otherwise go unnoticed.
“It only works under specific conditions, and we do not have something that could just autonomously attack the internet… But it’s a critical first step,” said Singer, stressing that this work remains a prototype.
Still, the ability of an AI to replicate a major breach with minimal input should not be dismissed.
Follow-up research is now exploring how these same techniques can be applied in defense, potentially even enabling AI agents to detect or block attacks in real time.