Anthropic's latest AI model has found more than 500 previously unknown high-severity security flaws in open-source libraries with little to no prompting, the company shared first with Axios.
Why it matters: The advancement signals an inflection point for how AI tools can help cyber defenders, even as AI is also making attacks more dangerous.
Driving the news: Anthropic debuted Claude Opus 4.6, the latest version of its largest AI model, on Thursday.
- Before its debut, Anthropic's frontier red team tested Opus 4.6 in a sandboxed environment to see how well it could find bugs in open-source code.
- The team gave the Claude model everything it needed to do the job — access to Python and vulnerability analysis tools, including classic debuggers and fuzzers — but no specific instructions or specialized knowledge.
- Claude found more than 500 zero-day vulnerabilities (flaws previously unknown to the projects' maintainers) in open-source code using just its "out-of-the-box" capabilities, and each one was validated by either a member of Anthropic's team or an outside security researcher.
What they're saying: "It's a race between defenders and attackers, and we want to put the tools in the hands of defenders as fast as possible," Logan Graham, head of Anthropic's frontier red team, told Axios.
- "The models are extremely good at this, and we expect them to get much better still."
Zoom in: The previously unknown vulnerabilities that Claude Opus 4.6 found ranged from flaws that attackers could exploit to crash a system to ones that could corrupt memory.
- According to a blog post, Claude uncovered a flaw in Ghostscript, a widely used interpreter for PostScript and PDF files, that could cause it to crash.
- Claude also found buffer overflow flaws in OpenSC, a utility that processes smart card data, and CGIF, a tool that processes GIF files. (A sketch of this bug class appears below.)
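Buffer overflows of this kind usually follow one pattern: a size field read from attacker-controlled input drives a copy into a fixed-size buffer with no bounds check. The C sketch below is purely illustrative; the function and field names are hypothetical, not the actual OpenSC or CGIF code.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative sketch of the bug class described above: a hypothetical
 * palette parser, NOT the real OpenSC or CGIF code. Assumes len >= 2. */
void load_palette(const uint8_t *data, size_t len)
{
    uint8_t palette[768];  /* fixed-size buffer: 256 RGB entries */

    /* Entry count read straight from the attacker-controlled file. */
    uint16_t count = (uint16_t)((data[0] << 8) | data[1]);

    /* BUG: nothing checks that count * 3 fits in `palette` or in the
     * input, so a crafted header writes past the end of the buffer.
     * Fix: reject unless count <= 256 && (size_t)count * 3 <= len - 2. */
    memcpy(palette, data + 2, (size_t)count * 3);
}
```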
The big picture: Anthropic believes Opus 4.6's capabilities will be a huge win for the security world, which has long struggled to secure the open-source code that underpins everything from enterprise software to critical infrastructure.
- "I wouldn't be surprised if this was one of — or the main way — in which open-source software moving forward was secured," Graham said.
The intrigue: In many cases, Claude used its advanced reasoning skills to devise new ways to find bugs even after traditional security tools had turned up nothing.
- For the Ghostscript flaw, Claude turned to the project's Git commit history after both fuzzing (a common security practice that feeds a program random and malformed inputs, sketched after this list) and manual analysis failed to turn up any bugs.
- Once it discovered the flaw, the model then proactively checked whether similar bugs existed elsewhere in the codebase.
- In the CGIF case, Claude went a step further and wrote its own proof-of-concept to demonstrate that the vulnerability was real.
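In its simplest form, fuzzing is just a loop that hammers a parser with random bytes and watches for crashes. The sketch below shows that "dumb" form against the hypothetical load_palette function from the earlier example; real-world fuzzing of projects like Ghostscript or CGIF would instead rely on coverage-guided tools such as AFL++ or libFuzzer.

```c
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* The hypothetical parser from the earlier sketch. */
void load_palette(const uint8_t *data, size_t len);

int main(void)
{
    uint8_t buf[1024];
    srand((unsigned)time(NULL));

    /* "Dumb" fuzzing loop: random, mostly invalid inputs. A crash
     * (easiest to catch under AddressSanitizer) flags a bug like
     * the overflow in the earlier sketch. */
    for (int iter = 0; iter < 1000000; iter++) {
        size_t len = 2 + (size_t)rand() % (sizeof(buf) - 2);
        for (size_t i = 0; i < len; i++)
            buf[i] = (uint8_t)rand();
        load_palette(buf, len);
    }
    return 0;
}
```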
Reality check: Anthropic added new security controls to the latest Claude Opus model to quickly identify and respond to adversaries who might abuse the new cyber capabilities.
- That may include implementing real-time detection tools to block traffic that Anthropic suspects is malicious, according to the blog post.
- "This will create friction for legitimate research and some defensive work, and we want to work with the security research community to find ways to address it as it arises," the company warned in the post.
What's next: Graham said the company is now eyeing ways to bring these vulnerability-detection capabilities to the broader cybersecurity community, including through potential new tools.
Go deeper: Anthropic pits Claude AI model against human hackers