OpenAI delays open-weight model release: What are the potential catastrophic and existential risks of unclosed AI?

Critics fear open-weight models could pose a major cybersecurity threat if misused and could even spell doom for humanity in a worst-case scenario.

Photo by Ilja Nedilko on Unsplash

OpenAI has once again pushed back the release of an open-weight model, warning of "high-risk areas" and announcing a new round of safety tests.

When a model is open-weight, its learned parameters are publicly available, in contrast to closed models such as those behind recent versions of ChatGPT, whose weights are hidden or proprietary.

Notable examples include Meta’s LLaMA 3, Mistral’s Mixtral 8x7B, MosaicML’s MPT, and Microsoft’s Phi-3, all offering powerful, transparent alternatives to closed models like GPT-4.
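In practical terms, "open weight" means the trained checkpoint itself can be downloaded and run on hardware you control. The snippet below is a minimal, illustrative sketch of that workflow, assuming the Hugging Face transformers library (with accelerate installed) and using the Mixtral checkpoint mentioned above purely as an example; it is not drawn from OpenAI's forthcoming release.

```python
# Minimal sketch: running an open-weight model on your own hardware.
# Assumes the Hugging Face transformers and accelerate libraries; the model ID
# is illustrative (Mixtral is a very large download - smaller open-weight
# checkpoints work identically).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # example open-weight checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The weights now live locally: no API key, no vendor gateway, no remote logging.
prompt = "Explain what an open-weight model is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```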

OpenAI had planned to release an open-weight model this week, but has abruptly halted those plans.

Sam Altman, CEO, tweeted: "We are delaying it; we need time to run additional safety tests and review high-risk areas. We are not yet sure how long it will take us.

"Sorry to be the bearer of bad news; we are working super hard!"

What are the benefits of open-weight models?

Open-weight models offer transparency, customisation, and accessibility, enabling anyone to inspect, fine-tune, or deploy powerful AI locally or privately without relying on locked-down platforms.
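The fine-tuning point is where open weights differ most sharply from locked-down APIs. As a rough illustration (the base model ID, target modules, and hyperparameters here are assumptions for the sketch, not anything specified by OpenAI), one common approach is attaching lightweight LoRA adapters with the Hugging Face peft library:

```python
# Rough sketch: customising an open-weight model locally with LoRA adapters.
# Assumes the transformers and peft libraries; the base checkpoint is a
# placeholder whose licence must permit fine-tuning.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "meta-llama/Meta-Llama-3-8B"  # placeholder open-weight base model

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# LoRA trains small adapter matrices instead of updating every weight,
# which keeps local fine-tuning feasible on modest hardware.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections in Llama-style models
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
# ...train on your own data with Trainer or TRL, then save the adapter locally.
```

This flexibility cuts both ways: the same machinery that lets a hospital or startup adapt a model privately is what later sections describe being used to strip away safety guardrails.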

They can democratise access to cutting-edge AI, fostering innovation across academia, startups, and underserved communities while enabling full transparency for auditing bias, safety, and performance.

Additionally, they can chip away at tech monopolies whilst supporting data sovereignty and privacy compliance, empowering developers to fine-tune models for niche tasks or local deployment.

In an ideal world, open-weight models would accelerate progress in medicine, education, science, and many other areas, without forcing operators to be locked into proprietary APIs or limited by opaque decision-making.

But, as Sam Altman has warned, once they are out there, there's no rewind button - so safety must be a priority.

He tweeted: "While we trust the community will build great things with this model, once weights are out, they can’t be pulled back. This is new for us, and we want to get it right."

Bioweapons, p(doom), and the unbearable fragility of human existence

All foundation models can be misused, and all can potentially be jailbroken into performing malicious tasks well outside their safety guardrails.

The catastrophic or even existential risk du jour is the danger of models being used to build bioweapons, which are comparatively straightforward to produce and don't require the nation-state-level technology or access to materials, such as highly enriched uranium, needed to make nukes.

It’s relatively easy for foundation models to aid in building biological weapons of mass destruction because they can rapidly synthesize, rephrase, or clarify publicly available biological information, making obscure or technical content more accessible and actionable to non-experts.

AI agents could even potentially start building these doomsday weapons themselves by commissioning deadly viruses (or their constituent parts, at least) via commercial gene editing or synthesis services available online.

Although this danger is not terrifyingly imminent, it's also not reassuringly distant.

READ MORE: IBM "Shepherd Test" assesses risk of superintelligence becoming a digital tyrant

Anthropic recently announced that it cannot rule out the risk of "catastrophic" misuse involving the "development or acquisition of chemical, biological, radiological, and nuclear (CBRN) weapons".

It said that biological weapons "account for the vast majority of the risk".

Meanwhile, OpenAI itself recently launched a new bid to mitigate "catastrophic" chemical, biological, and nuclear risk, after admitting its models may soon be able to help build bioweapons.

In an article for the AI Alignment Forum, Ryan Greenblatt, chief scientist at Redwood Research, argued that the release of open-weight models with capabilities comparable to current closed models could "cause a large number of fatalities", "perhaps 100,000" per year.

Which sounds bad. But he counterbalanced that prediction by arguing that open models could be beneficial because they "reduce larger risks" such as loss of control over AI and other scenarios that could lead to the extinction of our species.

Greenblatt suggested that opening up models could deliver benefits that are "bigger than the costs", although he stopped short of explicitly supporting the release of open-weight models to mitigate existential risks.

We've edited the following quote a little to fit the Machine house style, but it summarises Greenblatt's nuanced argument: "Open-weight models reduce loss-of-control (AI takeover) risks by helping with alignment and safety research performed outside of AI companies.

"They allow for arbitrary fine-tuning, helpful-only model access, and weights/activations access for model-internals research, as well as via increasing societal awareness of AI capabilities and risks. Increased awareness also helps mitigate some other large risks, such as the risk of humans carrying out a coup using AI."

Greenblatt added: "Overall, releasing open-weight models would be paying a large tax in blood to achieve a pretty uncertain reduction in a future risk, thus I'm not going to advocate for this."

The security risks of open-weight AI models

Open-weight models also have serious cybersecurity implications. Earlier this year, the MITRE Corporation - a US non-profit known for developing and maintaining some of the world's most widely used threat intelligence and defence frameworks - set out a new evaluation framework called OCCULT.

During testing, it found that DeepSeek-R1, an open-weight, open-source model, correctly answered more than 90% of "challenging" offensive cyber knowledge tests in its Threat Actor Competency Test for LLMs - demonstrating serious potential for misuse.

"We find that there has been significant recent advancement in the risks of AI being used to scale realistic cyber threats," MITRE researchers wrote.

In a paper responding to the findings, security researcher Alfonso De Gregorio, an advisor to the European Commission, wrote: "Open-weight general-purpose AI (GPAI) models offer significant benefits but also introduce substantial cybersecurity risks, as demonstrated by the offensive capabilities of models like DeepSeek-R1 in evaluations such as MITRE’s OCCULT.

"These publicly available models empower a wider range of actors to automate and scale cyberattacks, challenging traditional defence paradigms and regulatory approaches."

Waluigi, evil twins, and data poisoning

Open-weight models are also intrinsically vulnerable to malicious fine-tuning, allowing them to be relatively easily pushed to carry out dangerous instructions even when standard safeguards are in place.

A non-profit AI research institute called FAR.AI found that the guardrails of open-weight models can be "stripped while preserving response quality". Closed models that can be fine-tuned are also vulnerable to similar jailbreak attacks.

"A bad actor could disable safeguards and create the “evil twin” of a model: equally capable, but with no ethical or legal bounds," it wrote in a paper about "illusory safety".

"Such an evil twin model could then help with harmful tasks of any type, from localized crime to mass-scale attacks like building and deploying bioweapons. Alternatively, it could be instructed to act as an agent and advance malicious aims – such as manipulating and radicalizing people to promote terrorism, directly carrying out cyberattacks, and perpetrating many other serious harms."

The evil twin warning reminds us of the Waluigi effect, in which LLMs go rogue, break their conditioning, and engage in all sorts of unbidden mayhem.

READ MORE: Is AI scheming against humanity? Not so fast, says UK government as it slams "lurid" claims

"Since security can be asymmetric, there is a growing risk that AI’s ability to cause harm will outpace our ability to prevent it," FAR.AI added. "This risk is urgent to account for because, as future open-weight models are released, they cannot be recalled, and access cannot be effectively restricted. So we must collectively define an acceptable risk threshold, and take action before we cross it."

Models trained on large amounts of data also seem to be more vulnerable to data poisoning attacks. That doesn't pose an existential risk, but it is certainly a catastrophic one when, for instance, companies use LLMs in mission-critical functions.

In October last year, academics from Berkeley and the University of Cambridge found that scaling laws apply to the risk of data poisoning and that large LLMs "learn harmful behaviors from even minimal exposure to harmful data more quickly than smaller models".

In a statement which seems strongly applicable to OpenAI right now, the researchers warned: "Today’s most capable models are highly susceptible to data poisoning, even when guarded by moderation systems, and this vulnerability will likely increase as models scale.

"This highlights the need for leading AI companies to thoroughly red team fine-tuning APIs before public release and to develop more robust safeguards against data poisoning, particularly as models continue to scale in size and capability."

How open is open?

Additionally, some critics of open weights argue that they're not open enough.

The Open Source Initiative has also argued that open-weight models "reveal only a fraction of the information required for full accountability" and "stop short of delivering the level of transparency many researchers and regulators deem essential".

It wrote: "Open Weights might seem revolutionary at first glance, but they’re merely a starting point. While they do move the needle closer to transparency than strictly closed, proprietary models, they lack the detailed insights found in Open Source AI. For AI to be both accountable and scalable, every part of the pipeline- from the initial dataset to the final set of parameters - needs to be open to scrutiny, validation, and collective improvement."

Those risks are just a snapshot of the many dangers that could accompany the release of an open-weight model. Stay tuned to Machine for full coverage of the ongoing catastrophic and existential risks created by AI, as well as how industry leaders like OpenAI are mitigating them.

Do you have a story or insights to share? Get in touch and let us know. 

Follow Machine on X, Bluesky, and LinkedIn