Meet Claude Mythos: Anthropic accidentally leaks unreleased AI model with severe cybersecurity risks

Anthropic’s latest AI model has been leaked even before launch and has been making waves on social media. The new model, codenamed “Claude Mythos”, was accidentally revealed after descriptions of the model were stored in a publicly accessible data cache that was first reported by Fortune.

Anthropic confirms its ‘most capable’ model:

After the news of the AI model leak came to light, an Anthropic spokesperson confirmed its existence and noted that the model represented “a step change” in AI performance and was “the most capable we’ve built to date.”

Details about the new model were reportedly contained in a draft blog post that sat in an unsecured, publicly searchable data store. Notably, Anthropic itself believes the upcoming model poses unprecedented cybersecurity risks.

The data store reportedly also contained nearly 3,000 assets linked to the blog post that had not previously been published on the company’s news or research sites.

The AI startup had reportedly left the draft blog post announcing Mythos in an unsecured, public data lake, which was found by senior AI security researcher Roy Paz. After being informed of the breach, Anthropic immediately removed the public’s ability to search and retrieve documents from the data store.

Anthropic attributed the leak to “human error” in the configuration of its content management system (CMS), which left the draft blog post publicly accessible. The company described the unpublished material in the data store as “early drafts of content considered for publication.”

Claude Mythos and the new ‘Capybara’ tier

The most significant revelation from the leak is a draft blog post detailing Anthropic’s next-generation model. The draft blog post introduced a brand new tier of AI models called “Capybara”.

Notably, Anthropic has so far divided its models into three tiers: Haiku (fastest), Sonnet (mid-tier), and Opus (largest/most capable). The company now appears set to introduce Capybara as a new top tier, which could be larger, more capable, and more expensive than Opus.

According to the leaked document, Capybara achieves dramatically higher scores in software coding, academic reasoning, and cybersecurity-related tasks when compared to Anthropic’s previous best model, Claude Opus 4.6.

The draft blog also noted that Anthropic has completed training Claude Mythos, which is described as “by far the most powerful AI model we’ve ever developed.”

An Anthropic spokesperson, while speaking to Fortune on the new model, said, “We’re developing a general-purpose model with meaningful advances in reasoning, coding, and cybersecurity.”

“Given the strength of its capabilities, we’re being deliberate about how we release it. As is standard practice across the industry, we’re working with a small group of early access customers to test the model. We consider this model a step change and the most capable we’ve built to date,” the spokesperson added.

Anthropic cautious about releasing new model:

The draft blog post also reportedly discussed the cybersecurity risks associated with the new model.

“[W]e want to understand the model’s potential near-term risks in the realm of cybersecurity—and share the results to help cyber defenders prepare,” the document reads.

Anthropic also reportedly warned that the model is “far ahead of any other AI model in cyber capabilities” and could spark a “wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.”
