Anthropic launched two new AI fashions known as Claude Fable 5 and Claude Mythos 5 on Tuesday, which the corporate says have better capabilities than the Mythos Preview mannequin it released in April to a restricted set of tech business companions. Anthropic has mentioned the preliminary, restricted launch stemmed from considerations that the mannequin’s capabilities may very well be exploited by dangerous actors to develop hacking instruments that would catch defenders off guard.
Anthropic is at the moment solely releasing Claude Mythos 5 to a restricted set of business companions, lots of which obtained entry to Mythos Preview, and the corporate says it’s collaborating with the US authorities on the rollout.
Claude Fable 5, which is being publicly launched, makes use of the identical underlying mannequin as Mythos 5, however could have “guardrails” in place at launch, the corporate mentioned Tuesday, that can block the mannequin from answering many consumer questions associated to cybersecurity, biology, and chemistry. These requests will as a substitute be rerouted to an older AI mannequin, Claude Opus 4.8. If Anthropic suspects a consumer is attempting to conduct distillation—coaching a smaller AI mannequin off a bigger AI mannequin’s responses—on Claude Fable 5, these requests may even be rerouted to Claude Opus 4.8, the corporate says.
In an interview with WIRED, Anthropic’s head of product administration, Diane Penn, says that the corporate has been grappling with the query of how you can deal with Mythos’ software program vulnerability-discovery talents and different superior capabilities since earlier than its April launch, however that testing and consumer enter since then helped to hone the technique.
“We’re attempting to make enhancements in a approach that is useful, even when we do not have the right [solution] for each use case to start out,” Penn says. “Out of all of the totally different approaches, this emerged as essentially the most viable and the most effective one. We simply ended up feeling like this was the most effective product selection for customers to get the utmost worth out of Fable 5.”
For now, Penn says that the protecting mechanism is constructed to err on the facet of warning, which means some consumer queries could also be routed to the much less succesful AI mannequin even when they’re benign. Over time, Anthropic hopes to make its classifiers extra exact, however Penn says this was the one protected approach the corporate may launch the mannequin broadly at the moment.
The corporate mentioned on Tuesday that along with providing Claude Mythos 5 to Undertaking Glasswing companions, it is usually giving entry to “choose biology researchers.” Moreover, Anthropic famous in its blog post about Tuesday’s launch that it’s offering unrestricted variations to those small teams of shoppers “till our trusted entry program is accessible,” hinting at future plans to broaden entry much more. Because the Mythos launch in April, Anthropic has repeatedly emphasised that finally its opponents in each the non-public and even open weight areas will inevitably additionally supply fashions with Mythos-level capabilities.
The power for Claude Mythos and different new AI fashions to design hacking instruments that may discover and exploit vulnerabilities in each new and legacy software program has pressured tech firms and governments around the globe to safe their software program defenses earlier than AI fashions of this degree are made broadly accessible to attackers. Anthropic first launched Mythos to business companions below a consortium known as Undertaking Glasswing, with the concept this might give members a head begin in making ready their very own techniques and weighing world options to the menace earlier than a broader launch.
Anthropic wrote in an update about Undertaking Glasswing final week: “We’re working as rapidly as we are able to to soundly launch Mythos-level capabilities typically entry. To take action, we’ll want extremely strong safeguards that forestall the mannequin’s cyber capabilities from being misused—safeguards that we (and, to our information, all different AI builders) have but to develop.”

