After backlash, Anthropic says its AI will now tell users when their request is being rejected or downgraded for national security concerns

ByMarco Quiroz-Gutierrez

Jun 11, 2026

Anthropic is changing course after facing criticism for quietly downgrading certain requests to its most capable AI model.

On Tuesday, the $965 billion company released a version of its most capable model, Mythos. Anthropic revealed Mythos in April, but held back any Mythos-class models from the public partly because the company said it was extremely adept at skirting cybersecurity defenses and was too dangerous to release.

This week, though, it opted to release the Mythos-class model, Fable 5, even as its capabilities “exceed those of every model we’ve previously made generally available,” according to Anthropic. Dianne Na Penn, Anthropic’s head of product management, research, and labs, previously told Fortune the company felt comfortable releasing Fable 5 because it feels “more confident with our safety guardrails in place.”

Yet, it was one of those guardrails, buried in a 319-page safety document, that earlier this week prompted a wave of backlash from AI researchers and other users online, and has now prompted the company to improve its transparency.

Information found in the Fable 5’s system card, a long document of safety disclosures, revealed the model would silently downgrade some requests related to advanced AI development. If, for example, an AI researcher is using Fable 5 to build their own AI, the program would default to a less capable model.

Some AI researchers complained Anthropic’s move would slow down AI development, including Jeremy Howard, the cofounder of nonprofit research group Fast.ai.

“Easy solution to slow down recursive AI self improvement: The lab with the top-ranked model must agree THEY must not use it for working on frontier AI. But everyone else should have access to it. By definition, this means the frontier doesn’t advance,” he wrote in a post on X.

On Wednesday, Anthropic’s critics got at least part of what they were asking for: visibility.

“We’re changing Fable 5’s safeguards for frontier LLM development to make them visible,” an Anthropic spokesperson said in a statement to Fortune. “Starting this week, flagged requests will visibly fall back to Opus 4.8. On the API, any flagged requests will return a reason for their refusal. You will see this every time it happens.”

The company will continue to downgrade some requests, partly because its terms of service prohibit its model from being used to create competing AI systems, a restriction the company said is standard across the industry.

Yet, it also cited national security as part of the reason why its large language model downgrades or rejects some requests. The company said it doesn’t want foreign adversaries to improve their AI capabilities to the detriment of the U.S.

“The U.S. and its allies hold an edge in frontier chips and the highly optimized software that runs them at full potential. These safeguards ensure Claude isn’t used to erode that advantage—by optimizing chips developed by those adversaries, for example,” the spokesperson said.

The company also emphasized its restrictions “do not affect the vast majority of coding and ML work.”

Anthropic’s change of course highlights how quickly AI safety measures are becoming a part of the national security conversation. Earlier this year, Anthropic faced a standoff with the Department of War after it refused to give it full access to Claude models. The company took issue with language that said the Pentagon could use its models for mass surveillance and autonomous weapons.

In the end, the Department of War labeled Anthropic a “supply chain risk” to national security, limiting defense contractors and military agencies from using its products. Earlier this month, Secretary of War Pete Hegseth rejected Anthropic’s petition to change this designation, setting the stage for a federal court battle that remains unresolved.

Anthropic’s move with Fable 5 also comes after the company filed confidentially for an IPO earlier this month. While the company has staked much of its public identity on being an AI lab that puts safety first, its initial decision to obscure when safeguards were being applied touched a nerve in the AI research community. In a statement, the company acknowledged it had gotten the issue wrong.

“We made the wrong tradeoff and we apologize for not getting the balance right,” an Anthropic spokesperson said.

This story was originally featured on Fortune.com

By Marco Quiroz-Gutierrez

Business

American taxpayers have spent $33 billion on sports stadiums. They got fewer seats—and higher prices

Jun 11, 2026 Catherina Gioino

Business

Abridge wants to be the operating system for medicine—and NVIDIA and Eli Lilly are helping build it

Jun 11, 2026 Lily Mae Lazarus

Business

As SpaceX goes public, a $100 billion shadow market faces a reckoning

Jun 11, 2026 Allie Garfinkle

World News

News

El Niño under way and threatens weather extremes, scientists say

June 11, 2026 SA_BWALK

An El Niño event has officially started, say US scientists, raising fears of extreme weather and higher temperatures. Read More

News

Iran says it struck ships in Strait of Hormuz after US launches new strikes

June 10, 2026 SA_BWALK

US Central Command said its latest airstrikes were responding to “unwarranted and continued aggression” from Iran. Read More

News

Trump and Iran trade new threats after strikes exchanged

June 10, 2026 SA_BWALK

The US president warns Iran “will have to pay the price” for taking too long to agree a deal, after Tehran vows retaliation to any attacks. Read More

News

Inside Myanmar, rebels are losing ground as military forces men into army

June 9, 2026 SA_BWALK

The BBC travels with rebels to frontline positions in Myanmar to see how the war is unfolding. Read More

News

Trump says Iran shot down US helicopter and vows to respond

June 9, 2026 SA_BWALK

Two crew members of the Apache helicopter that crashed following the attack were rescued by an American sea drone. Read More

After backlash, Anthropic says its AI will now tell users when their request is being rejected or downgraded for national security concerns

ByMarco Quiroz-Gutierrez

By Marco Quiroz-Gutierrez

Related Post

American taxpayers have spent $33 billion on sports stadiums. They got fewer seats—and higher prices

Abridge wants to be the operating system for medicine—and NVIDIA and Eli Lilly are helping build it

As SpaceX goes public, a $100 billion shadow market faces a reckoning

El Niño under way and threatens weather extremes, scientists say

Iran says it struck ships in Strait of Hormuz after US launches new strikes

Trump and Iran trade new threats after strikes exchanged

Inside Myanmar, rebels are losing ground as military forces men into army

Trump says Iran shot down US helicopter and vows to respond

You missed

Meet the Mid: Jaden Brock

Nicole Kidman Reacts to Taylor Swift & Haim Sisters’ Knicks T-Shirts – E! Online (US) – Top Stories

Tim Allen Details Challenges Preventing Home Improvement Revival – E! Online (US) – Top Stories

American taxpayers have spent $33 billion on sports stadiums. They got fewer seats—and higher prices