Anthropic Offers Mythos Upgrade for Cyber Partners and a âSafeâ Version for the Rest of You
Anthropic released two new AI models called Claude Fable 5 and Claude Mythos 5 on Tuesday, which the company says have greater capabilities than the
Anthropic released two new AI models called Claude Fable 5 and Claude Mythos 5 on Tuesday, which the company says have greater capabilities than the Mythos Preview model it released in April to a limited set of tech industry partners. Anthropic has said the initial, limited release stemmed from concerns that the modelâs capabilities could be exploited by bad actors to develop hacking tools that could catch defenders off guard. Anthropic is currently only releasing Claude Mythos 5 to a limited set of industry partners, many of which received access to Mythos Preview, and the company says it is collaborating with the US government on the rollout. Claude Fable 5, which is being publicly released, uses the same underlying model as Mythos 5, but will have âguardrailsâ in place at launch, the company said Tuesday, that will block the model from answering many user questions related to cybersecurity, biology, and chemistry. These requests will instead be rerouted to an older AI model, Claude Opus 4.8.
If Anthropic suspects a user is trying to conduct distillationâtraining a smaller AI model off a larger AI modelâs responsesâon Claude Fable 5, those requests will also be rerouted to Claude Opus 4.8, the company says. In an interview with WIRED, Anthropicâs head of product management, Diane Penn, says that the company has been grappling with the question of how to handle Mythosâ software vulnerability-discovery abilities and other advanced capabilities since before its April release, but that testing and user input since then helped to hone the strategy. âWe're trying to make improvements in a way that's beneficial, even if we don't have the perfect [solution] for every use case to start,â Penn says. âOut of all the different approaches, this emerged as the most viable and the best one. We just ended up feeling like this was the best product choice for users to get the maximum value out of Fable 5.â For now, Penn says that the protective mechanism is built to err on the side of caution, meaning some user queries may be routed to the less capable AI model even if theyâre benign.
Over time, Anthropic hopes to make its classifiers more precise, but Penn says this was the only safe way the company could release the model broadly at this time. The company said on Tuesday that in addition to offering Claude Mythos 5 to Project Glasswing partners, it is also giving access to âselect biology researchers.â Additionally, Anthropic noted in its blog post about Tuesdayâs launch that it is providing unrestricted versions to these small groups of customers âuntil our trusted access program is available,â hinting at future plans to expand access even more. Since the Mythos launch in April, Anthropic has repeatedly emphasized that eventually its competitors in both the private and even open weight spaces will inevitably also offer models with Mythos-level capabilities. The ability for Claude Mythos and other new AI models to design hacking tools that can find and exploit vulnerabilities in both new and legacy software has forced tech companies and governments around the world to secure their software defenses before AI models of this level are made broadly available to attackers.
