OpenAI Wants You to Know It Does Security Now. The Numbers Are Actually Impressive.

OpenAI has a security problem. Not the technical kind. The perception kind.

For months, the narrative has been that Anthropic has the safety credentials, Google has the infrastructure, and OpenAI has… ChatGPT with a growing list of security headaches. When you are competing against a company that has built its brand on AI safety, you need to change the story.

So on June 23, OpenAI did exactly that, with a coordinated set of announcements covering GPT-5.5-Cyber, a new partner program, an open source bug hunting initiative and an updated Codex security plugin. The Register covered it with the headline “Yoo-hoo, look over here, we do that security stuff too!” which captures the tone perfectly. But the numbers behind the announcements are worth taking seriously.

The Model That Finds Bugs Better Than Humans

The updated GPT-5.5-Cyber is now OpenAI’s strongest vulnerability-finding model. The improvement over the preview version released in May is measurable and significant:

  • CyberGym (reproducing known vulnerabilities): up from 81.8% to 85.6%
  • ExploitGym (turning vulnerabilities into working exploits): up from 25.95% to 39.5%
  • SEC-bench Pro (long-horizon discovery and proof of concept generation): up from 63.1% to 69.8%
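To put the three gains on a common footing, here is a minimal Python sketch that computes the percentage-point change and the relative change for each benchmark. It uses only the scores quoted above; the names `SCORES`, `improvement` and `GAINS` are made up for illustration.

```python
# Scores quoted in the article, in per cent: (May preview, updated release).
SCORES = {
    "CyberGym": (81.8, 85.6),
    "ExploitGym": (25.95, 39.5),
    "SEC-bench Pro": (63.1, 69.8),
}


def improvement(before, after):
    """Return (percentage-point gain, relative gain in per cent)."""
    points = after - before
    relative = points / before * 100
    return round(points, 2), round(relative, 1)


# Illustrative summary table keyed by benchmark name.
GAINS = {name: improvement(*pair) for name, pair in SCORES.items()}

for name, (points, relative) in GAINS.items():
    print(f"{name}: +{points} points ({relative}% relative)")
```

On these figures, ExploitGym moves the most on both measures: 13.55 percentage points, or roughly a 52 per cent relative gain over the preview.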

The 39.5 per cent on ExploitGym is the one that should make security teams sit up. That is not a theoretical model that can identify vulnerabilities in a lab. That is a model that can convert those vulnerabilities into working exploits at a nearly 40 per cent success rate. In a field where professional penetration testers charge $1,000 per test, that capability at machine speed changes the economics of security testing.

Patch the Planet: Not Just Marketing

The “Patch the Planet” initiative, co-founded with Trail of Bits and launched with HackerOne, had first-week results that are hard to dismiss: hundreds of bugs uncovered across 19 open source projects, 64 pull requests generated and 51 issues filed.

The projects include cURL, NATS, aiohttp, the Go project, Python itself, PyPI, Valkey, Sigstore and RustCrypto. These are not obscure libraries. These are the foundations of the modern internet.

One example stands out. One team used GPT-5.5-Cyber to build a full-scale fuzzing lab in under a day, a task that would take human experts two to three weeks. Another used Codex to build a CVE variant analysis pipeline in a single day. That is not incremental improvement. That is a step change in capability.

Codex at Scale

The Codex Security Plugin, which started as a research preview in March, has now scanned over 30 million commits across 30,000 codebases. It has produced approximately 70,000 human-verified fixes and over 500,000 AI-determined fixes.
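For a rough sense of scale, the sketch below turns those totals into per-codebase and per-commit rates. The inputs are the rounded figures quoted above ("over" and "approximately"), so treat the results as ballpark numbers. The variable names are illustrative.

```python
# Rounded totals quoted in the article.
COMMITS_SCANNED = 30_000_000
CODEBASES = 30_000
HUMAN_VERIFIED_FIXES = 70_000
AI_DETERMINED_FIXES = 500_000


def per_thousand(count, total):
    """Rate of `count` per 1,000 units of `total`, rounded to one decimal."""
    return round(count / total * 1000, 1)


commits_per_codebase = COMMITS_SCANNED // CODEBASES
human_per_1k = per_thousand(HUMAN_VERIFIED_FIXES, COMMITS_SCANNED)
ai_per_1k = per_thousand(AI_DETERMINED_FIXES, COMMITS_SCANNED)
ai_to_human_ratio = round(AI_DETERMINED_FIXES / HUMAN_VERIFIED_FIXES, 1)

print(f"Average commits per codebase: {commits_per_codebase}")
print(f"Human-verified fixes per 1,000 commits: {human_per_1k}")
print(f"AI-determined fixes per 1,000 commits: {ai_per_1k}")
print(f"AI-determined fixes per human-verified fix: {ai_to_human_ratio}")
```

That works out to about 1,000 commits per codebase on average, and roughly seven AI-determined fixes for every human-verified one.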

The updated plugin can now triage and validate findings from existing scanners, bug-bounty reports and ticketing systems. It can generate patches at scale to close vulnerability backlogs. It can export to vulnerability management systems via SARIF files and CodeQL queries.
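For readers who have not worked with SARIF, it is a JSON-based interchange format for static analysis results. The sketch below builds a minimal SARIF 2.1.0 log in Python to show the general shape such an export takes. It is not the Codex plugin's actual output: the tool name, rule ID, file path and finding are all hypothetical, and `make_sarif` is an illustrative helper rather than part of any real library.

```python
import json


def make_sarif(tool_name, findings):
    """Wrap a list of finding dicts in a minimal SARIF 2.1.0 log."""
    results = []
    for finding in findings:
        results.append({
            "ruleId": finding["rule_id"],
            # SARIF levels are "none", "note", "warning" or "error".
            "level": finding["level"],
            "message": {"text": finding["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": finding["path"]},
                    "region": {"startLine": finding["line"]},
                }
            }],
        })
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name}},
            "results": results,
        }],
    }


# Hypothetical finding, invented purely for illustration.
EXAMPLE_FINDINGS = [{
    "rule_id": "EXAMPLE-001",
    "level": "error",
    "message": "Possible buffer overflow in parse_header",
    "path": "src/parser.c",
    "line": 42,
}]

SARIF_LOG = make_sarif("example-scanner", EXAMPLE_FINDINGS)

if __name__ == "__main__":
    print(json.dumps(SARIF_LOG, indent=2))
```

Because the format is standardised, a file like this can be loaded by any vulnerability management system that accepts SARIF, which is the point of exporting in it.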

For organisations drowning in vulnerability backlogs, this is exactly the kind of tool that shifts the conversation from “how do we fix everything?” to “how do we prioritise the fixes Codex just generated for us?” That is a much better problem to have.

The Velvet Rope Comes Down

OpenAI previously locked GPT-5.5-Cyber behind what it called a “velvet rope” – available only to trusted defenders. That rope is loosening: the Daybreak Cyber Partner Program now has around 30 security vendors and service providers, with more to be added.

Context matters here. These announcements come at a time when Anthropic is dealing with the fallout from its Mythos model, which raised national security concerns. The UK’s biggest banks were offered GPT-5.5 but excluded from Anthropic’s Glasswing program. The timing of OpenAI’s push is not accidental.

OpenAI has also been in ongoing dialogue with the US government about the model and upcoming releases to avoid export control surprises. That is a mature approach, and one that signals the company is thinking about the geopolitical implications of what it is building.

“The question is no longer whether AI can find and fix vulnerabilities faster than humans. It clearly can. The question is whether the security industry is ready to trust it at scale. OpenAI just made that question harder to avoid.”

What This Means

For Australian security teams, the implications are direct. The tools that were being tested six months ago are now in production. The GPT-5.5-Cyber model is finding real vulnerabilities in real codebases. Codex is fixing them at scale. The “Patch the Planet” initiative is securing the open source supply chain that every Australian organisation depends on.

OpenAI’s security push is genuine, the numbers are real, and the timing is strategic. Whether you trust the messenger or not, the capability is now on the table.
