Could Claude Opus 4 help build bioweapons? Anthropic cannot rule out catastrophic misuse risk
AI firm activates new security safeguards amid fears over the growing threat posed by rapidly evolving model capabilities.

Anthropic has admitted it cannot totally rule out the risk of Claude Opus 4 being misused to acquire or develop chemical, biological, radiological, or nuclear weapons.
The AI firm described Claude Opus 4 as "the world’s best coding model", offering "sustained performance on complex, long-running tasks and agent workflows".
This new model has smashed benchmarks, hitting 72.7% on SWE-bench, and is designed to tackle tricky problem-solving tasks.
But Claude Opus 4 is so powerful that Anthropic activated a security mechanism for the first time to mitigate the risk of it being used to create weapons of mass destruction.
Anthropic has not said that a bedroom nihilist could use Claude Opus 4 to spin up a nuke at home and then wipe out human civilisation. Far from it, in fact.
However, the new model has passed the threshold required to trigger the company's AI Safety Level 3 (ASL-3) Deployment and Security Standards.
"ASL-3 refers to systems that substantially increase the risk of catastrophic misuse... or show low-level autonomous capabilities," Anthropic previously wrote.
It believes that biological weapons "account for the vast majority of the risk" potentially posed by the model, although it is evaluating a "potential expansion in scope" to other weapons.
Protecting the world from AI misuse
In an announcement detailing the beefed-up security measures, Anthropic said the ASL-3 Security Standard makes it "harder to steal model weights", which should stop bad actors from copying the underlying parameters that give the model its capabilities.
A corresponding "Deployment Standard" has also been activated to "limit the risk of Claude being misused specifically for the development or acquisition of chemical, biological, radiological, and nuclear weapons (CBRN)."
"We are deploying Claude Opus 4 with our ASL-3 measures as a precautionary and provisional action," it wrote. "To be clear, we have not yet determined whether Claude Opus 4 has definitively passed the Capabilities Threshold that requires ASL-3 protections.
"Rather, due to continued improvements in CBRN-related knowledge and capabilities, we have determined that clearly ruling out ASL-3 risks is not possible for Claude Opus 4 in the way it was for every previous model, and more detailed study is required to conclusively assess the model’s level of risk."
The truth about catastrophic risk
Whilst words like "catastrophic" sound scary, at this stage Claude Opus 4 could, at most, help experts with deep knowledge build weapons of mass destruction - and even that is not certain.
Version 4 of the model showed “substantially greater capabilities in CBRN-related evaluations” than previous models, including “stronger performance on virus acquisition tasks, more concerning behaviour in expert red-teaming sessions, and enhanced tool use and agentic workflows," according to an Anthropic report.
In other words, it behaves in a way that is a little more unnerving than older models and appears to be better at tasks that would be useful in designing a bioweapon. Biological weapons are the most likely threat because they require fewer resources and less specialised equipment than nuclear weapons.
Again, this does not mean that Claude Opus 4 will let some crazed terrorist cook up airborne Ebola in their mother's basement.
“The CBRN capability threshold for the ASL-3 Standard focuses on individuals or groups with basic technical backgrounds (e.g. undergraduate STEM degrees) attempting to use AI models to significantly help them create/obtain and deploy CBRN weapons,” Anthropic added.
It noted: "The processes needed to generate these threats are knowledge-intensive, skill-intensive, prone to failure, and frequently have one or more bottleneck steps."
So at this stage, Anthropic is taking a belt and braces approach to security, which is to be welcomed.
"Proactively enabling a higher standard of safety and security simplifies model releases while allowing us to learn from experience by iteratively improving our defenses and reducing their impact on users," it said.
What are Anthropic's AI Safety Level 3 Protections?
Switching on ASL-3 protections involves implementing “deployment measures” that are “narrowly focused” on preventing the model from assisting with the creation of CBRN weapons.
Security safeguards include limiting universal jailbreaks, which Anthropic described as “systematic attacks that allow attackers to circumvent our guardrails”.
“We have developed a three-part approach: making the system more difficult to jailbreak, detecting jailbreaks when they do occur, and iteratively improving our defences,” it added.
The AI company has implemented “Constitutional Classifiers”, in which real-time classifier guards trained on synthetic data monitor model inputs and outputs to block a “narrow class of harmful CBRN information” - meaning the system shouldn't be too censorious and refuse innocent requests.
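For illustration only, here is a minimal sketch of what classifier-gated inference can look like in principle. Every name in it - the score_cbrn_risk stub, the keyword list, the threshold - is invented for this example, and Anthropic has not published how its Constitutional Classifiers are built; the point is simply that a guard sits on both the prompt and the completion.

```python
# Hypothetical sketch of classifier-gated inference. This is NOT Anthropic's
# implementation; the risk scorer below is a keyword stub standing in for a
# trained guard model.

from dataclasses import dataclass


@dataclass
class GateResult:
    allowed: bool
    reason: str
    text: str


def score_cbrn_risk(text: str) -> float:
    """Placeholder risk score; a real guard would be a learned classifier."""
    flagged_terms = ("synthesise pathogen", "weaponise agent")  # illustrative only
    return 1.0 if any(term in text.lower() for term in flagged_terms) else 0.0


def classifier_gated_completion(prompt: str, generate, threshold: float = 0.5) -> GateResult:
    """Wrap an arbitrary generate() callable with input and output guards."""
    if score_cbrn_risk(prompt) >= threshold:           # input-side guard
        return GateResult(False, "prompt flagged by input classifier", "")
    completion = generate(prompt)                      # underlying model call
    if score_cbrn_risk(completion) >= threshold:       # output-side guard
        return GateResult(False, "completion flagged by output classifier", "")
    return GateResult(True, "passed both guards", completion)


if __name__ == "__main__":
    # A harmless request passes both guards; the stub "model" just echoes.
    result = classifier_gated_completion("Explain how vaccines work", lambda p: f"Answer to: {p}")
    print(result.allowed, result.reason)
```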
Anthropic also instituted a “wider monitoring system”, including a bug bounty programme focused on stress-testing the Constitutional Classifiers, offline classification systems, and threat intelligence partnerships to “quickly identify and respond to potential universal jailbreaks that would enable CBRN misuse”.
"Our approach involves more than 100 different security controls that combine preventive controls with detection mechanisms, primarily targeting threats from sophisticated non-state actors from initial entry points through lateral movement to final extraction," it continued.
Do you have a story or insights to share? Get in touch and let us know.