OpenAI admits its models may soon be able to help build bioweapons

Future versions of ChatGPT could let "people with minimal expertise" spin up deadly agents with potentially devastating consequences.

OpenAI has revealed that its models are about to cross a new risk threshold which could allow them to help terrorists or enemy states build lethal bioweapons.

It has announced that upcoming versions of ChatGPT will reach "High" levels of capability in biology, as measured by its Preparedness Framework.

The AI firm warned: "The same underlying capabilities driving progress, such as reasoning over biological data, predicting chemical reactions, or guiding lab experiments, could also potentially be misused to help people with minimal expertise to recreate biological threats or assist highly skilled actors in creating bioweapons.

"Physical access to labs and sensitive materials remains a barrier—however those barriers are not absolute."

In its Preparedness Framework, OpenAI has set "capability thresholds" that "lead to a meaningful increase in risk of severe harm" when models cross them.

The biological abilities of its AIs are now at the penultimate threshold, which is one step below "critical".

Defining the highest level of risk, OpenAI wrote: "The model can provide meaningful counterfactual assistance (relative to unlimited access to baseline of tools available in 2021) to “novice” actors (anyone with a basic relevant technical background) that enables them to create known biological or chemical threats."

It went on to describe the potential dangers of "significantly increased likelihood and frequency of biological or chemical terror events by non-state actors using known reference-class threats."

You can see the announcement from OpenAI Safety Systems lead Johannes Heidecke below.

Will AI models be able to build nukes?

Thankfully, ChatGPT is not about to start building nuclear bombs, which have a significantly higher barrier to entry than bioweapons.

It's more or less impossible for someone to build a nuke in their mum's basement because they would require access to expensive equipment such as high-quality centrifuges, not to mention weapons-grade uranium or plutonium.

Unfortunately, the same cannot be said about bioweapons. Although we are hopefully some way away from the genesis of an autonomous superintelligence capable of summoning biological agents into being, it would be relatively easy for an AI to exploit publicly available genomic data and commission online DNA synthesis services, automating the entire process from design to deployment.

READ MORE: "An AI obedience problem": World's first LLM Scope Violation attack tricks Microsoft Copilot into handing over data

Right now, that's unlikely. What OpenAI's "high" risk threshold means is that models could help a human build the agents, which could then be unleashed with catastrophic results.

The next stage in the danger rankings is "critical", which is where things start to get really scary.

This is defined as: "The model can enable an expert to develop a highly dangerous novel threat vector (e.g., comparable to a novel CDC Class A biological agent) OR the model can be connected to tools and equipment to complete the full engineering and/or synthesis cycle of a regulated or novel biological threat without human intervention."

"Proliferating the ability to create a novel threat vector of the severity of a CDC Class A biological agent (i.e., high mortality, ease of transmission) could cause millions of deaths and significantly disrupt public life, with few available societal safeguards," OpenAI wrote.

What is OpenAI doing to reduce existential and catastrophic risk?

Thankfully, OpenAI is not just sitting on its thumbs and waiting until an AGI smart and cruel enough to wipe us all out comes along. Or so it claims.

Here are some of the actions it's taking to make sure AI doesn't become the last machine humanity ever invents:

  • Training the model to refuse harmful requests: OpenAI's models are specifically instructed to refuse dangerous requests. For dual-use prompts, such as those touching on virology, immunology or genetic engineering, they follow principles such as avoiding responses that provide actionable steps, balancing the need to give experts useful insights against a requirement to withhold potentially dangerous information.
  • Detection systems: "We’ve deployed robust system-wide monitors across all product surfaces with frontier models to detect risky or suspicious bio-related activity," OpenAI wrote. If a prompt looks unsafe, it is blocked and automated review systems are triggered, which can call in human reviewers if necessary (a rough sketch of this kind of tiered screening follows the list below).
  • Monitoring and enforcement checks: OpenAI's products are prohibited from being used to cause harm, and the models' reasoning capabilities are used to detect biological misuse, once again combining automated review and response with human-in-the-loop assessment when required. Users who misuse the models may be suspended, and police may even be informed in "egregious cases".
  • Red teaming: OpenAI employs people to "break our safety mitigations" by "working end-to-end, just like a determined and well-resourced adversary might". However, it acknowledges that most expert red teamers "lack biorisk domain expertise and may not be able to judge the harmfulness of model output".

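OpenAI has not published how these monitors work internally, but the general shape of the flow it describes (automated block, then automated review, then human escalation) is easy to sketch. The example below is purely illustrative: it leans on the publicly documented Moderation endpoint in OpenAI's Python SDK as a stand-in for the company's proprietary bio-specific classifiers, and the screen_prompt and escalate_to_human_review helpers are hypothetical names, not anything OpenAI has described.

```python
# Purely illustrative sketch of a tiered prompt-screening pipeline, loosely
# modelled on the blocked-prompt -> automated review -> human review flow
# described above. The public Moderation endpoint stands in for OpenAI's
# undocumented bio-specific monitors; the escalation helper is hypothetical.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def escalate_to_human_review(prompt: str, scores: dict) -> None:
    """Hypothetical hook: a real system would enqueue the request for a
    trained reviewer rather than just printing it."""
    print(f"Escalating for human review: {prompt!r} (scores: {scores})")


def screen_prompt(prompt: str, escalation_threshold: float = 0.5) -> bool:
    """Return True if the prompt may proceed, False if it is blocked."""
    result = client.moderations.create(
        model="omni-moderation-latest",
        input=prompt,
    ).results[0]

    if result.flagged:
        # Tier 1: automated block.
        scores = result.category_scores.model_dump()
        # Tier 2: pull in a human if any category score is high enough.
        if any(
            score is not None and score >= escalation_threshold
            for score in scores.values()
        ):
            escalate_to_human_review(prompt, scores)
        return False

    return True


if __name__ == "__main__":
    allowed = screen_prompt("Explain how mRNA vaccines trigger an immune response.")
    print("allowed" if allowed else "blocked")
```

In practice the thresholds and escalation paths would be tuned per category, and OpenAI's own bio monitors are presumably far more specialised than a general-purpose moderation model.
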
READ MORE: Altman Shrugged: OpenAI boss updates his ever-changing countdown to superintelligence

The AI firm is also hosting a biodefense summit this July, where NGOs and government researchers will discuss dual-use risks.

"Our goal is to deepen our partnerships with the U.S. and aligned governments, and to better understand how advanced AI can support cutting edge biodefense work, from countermeasures to novel therapies, and strengthen collaboration across the ecosystem," OpenAI wrote.

"While our safety work aims to limit broad misuse, we’re also developing policy and content-level protocols to grant vetted-institutions access to maximally helpful models so they can advance biological sciences. That includes partnerships to develop diagnostics, countermeasures, and novel testing methods."

p(doom)-as-a-service

OpenAI is not the only AI company to worry about existential risk and p(doom).

Last month, Anthropic admitted it cannot totally rule out the risk of Claude Opus 4 being misused to acquire or develop chemical, biological, radiological, or nuclear weapons. 

The AI firm described Claude Opus 4 as "the world’s best coding model", offering "sustained performance on complex, long-running tasks and agent workflows".

READ MORE: Meta invents LLM system that lets dead people continue posting from beyond the grave

It is (allegedly) so powerful that Anthropic activated a security mechanism for the first time to mitigate the risk of it being used to create weapons of mass destruction.

OpenAI recently granted its own coding agent access to the internet, but took specific steps to stop it from hacking, slacking off and selling drugs.

At some point in the future, there is little doubt that a similar web-connected agent will be able to spin up bioweapons by itself and pose an existential risk to our species.

Let's hope those security mechanisms keep it under control.
