Anthropic released its latest model, Fable, on Tuesday, billing it as a public and restricted version of its highly effective and much-hyped cybersecurity model Mythos.
But not everyone is happy with the restrictions, and several cybersecurity researchers and professionals have aired complaints online.
“[Fable] rejects any request that could be tangentially cyber related. Even innocuous tasks like reading a blog post,” said Valentina “Chompie” Palmiotti, a well-known security researcher who works at IBM X-Force.
When a prompt triggers its guardrails, Fable pauses the chat and says that its “security measures flagged this message for cybersecurity or biology topics.”
The guardrails were put in place to limit the risk that Fable could be used to develop malware or compromise software, a long-standing concern inside Anthropic. The restrictions on biology come from a similar concern around the development of biological weapons.
When the AI giant released Mythos in April, it restricted the model to a limited number of companies and organizations in what it called Project Glasswing, an effort to deploy the model to secure critical software and infrastructure. Last week, Anthropic expanded access to Mythos to hundreds of organizations in 15 countries.
But despite the good intentions, many cybersecurity experts are still put off by the haphazard nature of the restrictions. Matt Suiche, a cybersecurity veteran, told TechCrunch that “if you ask it to write secure code, it assumes it’s cybersecurity related work instead of software engineering best practices, and you get downgraded.” Fable is programmed to fall back to Claude Opus 4.8 if it hits a guardrail. “It seems to be keyword based, so anything in the lexical field of ‘cybersecurity’ triggers the guardrails.”
“But it’s understandable as we’re still in the early days and they are still adapting their guardrails. I’m sure they’re going to evolve over time as Anthropic and other frontier model companies will collaborate more with the current new generation of cybersecurity companies,” said Suiche, who is a member of the technical staff at Tolmo, an AI cybersecurity startup. “It’s better to catch more people than not enough when you do such a launch and to relax the guardrails over time.”
Another researcher griped on X that “even asking for a code review” triggers Fable’s guardrails.
Anthropic did not immediately respond to a request for comment.
Apart from guardrails inside its models, Anthropic requires cybersecurity professionals to apply to the Cyber Verification Program. If they are approved, the applicants face fewer limitations on using Claude for cybersecurity work. OpenAI has a similar program called Trusted Access for Cyber.