OpenAI Wants You to Know It Does Security Now. The Numbers Are Actually Impressive.

OpenAI has a security problem. Not the technical kind. The perception kind.

For months, the narrative has been that Anthropic has the safety credentials, Google has the infrastructure, and OpenAI has… ChatGPT with a growing list of security headaches. When you are competing against a company whose CEO built his reputation on AI safety, you need to change the story.

So on June 23, OpenAI did exactly that with a coordinated set of announcements covering GPT-5.5-Cyber, a new partner program, an open source bug hunting initiative and an updated Codex security plugin. The Register covered it with the headline “Yoo-hoo, look over here, we do that security stuff too!”, which captures the tone perfectly. But the numbers behind the announcements are worth taking seriously.

The Model That Finds Bugs Better Than Humans

The updated GPT-5.5-Cyber is now OpenAI’s strongest vulnerability-finding model. The improvement over the preview version released in May is measurable and significant:

CyberGym (reproducing known vulnerabilities): up from 81.8% to 85.6%
ExploitGym (turning vulnerabilities into working exploits): up from 25.95% to 39.5%
SEC-bench Pro (long-horizon discovery and proof of concept generation): up from 63.1% to 69.8%
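
To put those three jumps on a common footing, the arithmetic can be sketched in a few lines of Python. This is only an illustration using the scores quoted above; the helper name `gains` is an assumption made for this example and is not part of any OpenAI tooling.

```python
# Scores quoted above: (preview released in May, updated model), in per cent.
scores = {
    "CyberGym": (81.8, 85.6),
    "ExploitGym": (25.95, 39.5),
    "SEC-bench Pro": (63.1, 69.8),
}


def gains(before, after):
    """Return (percentage-point gain, relative gain in per cent)."""
    points = after - before
    relative = points / before * 100
    return points, relative


for name, (before, after) in scores.items():
    points, relative = gains(before, after)
    print(f"{name}: +{points:.2f} points, {relative:.1f}% relative improvement")
```

On these figures, ExploitGym moves by 13.55 percentage points, which is roughly a 52 per cent relative improvement and by far the largest of the three.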

The 39.5 per cent on ExploitGym is the one that should make security teams sit up. That is not a theoretical model that can identify vulnerabilities in a lab. That is a model that can convert those vulnerabilities into working exploits at a success rate of nearly 40 per cent. In a field where professional penetration testers charge $1,000 per test, that capability at machine speed changes the economics of security testing.

Patch the Planet: Not Just Marketing

The “Patch the Planet” initiative, co-founded with Trail of Bits and launched with HackerOne, had first-week results that are hard to dismiss: hundreds of bugs uncovered across 19 open source projects, 64 pull requests generated and 51 issues filed.

The projects include cURL, NATS, aiohttp, the Go project, Python itself, PyPI, Valkey, Sigstore and RustCrypto. These are not obscure libraries. These are the foundations of the modern internet.

Two examples stand out. One team used GPT-5.5-Cyber to build a full-scale fuzzing lab in under a day, a task that would take human experts two to three weeks. Another used Codex to build a CVE variant analysis pipeline in a single day. That is not incremental improvement. That is a step change in capability.

Codex at Scale

The Codex Security Plugin, which started as a research preview in March, has now scanned over 30 million commits across 30,000 codebases. It has produced approximately 70,000 human-verified fixes and over 500,000 AI-determined fixes.

The updated plugin can now triage and validate findings from existing scanners, bug-bounty reports and ticketing systems. It can generate patches at scale to close vulnerability backlogs. It can export to vulnerability management systems via SARIF files and CodeQL queries.
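
For readers who have not met SARIF before, the sketch below builds a minimal SARIF 2.1.0 log in Python. It is not the Codex plugin's API or output, which the announcement does not describe in detail; it only follows the public SARIF structure. The function name `make_sarif`, the tool name and the finding itself are all hypothetical, invented for illustration.

```python
import json


def make_sarif(tool_name, findings):
    """Build a minimal SARIF 2.1.0 log from a list of finding dicts."""
    results = []
    for f in findings:
        results.append({
            "ruleId": f["rule_id"],
            # SARIF result levels are "none", "note", "warning" or "error".
            "level": f["level"],
            "message": {"text": f["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": f["path"]},
                    "region": {"startLine": f["line"]},
                }
            }],
        })
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name}},
            "results": results,
        }],
    }


# Hypothetical finding, invented purely for illustration.
example = [{
    "rule_id": "EXAMPLE-001",
    "level": "error",
    "message": "Possible out-of-bounds read (illustrative only).",
    "path": "src/parser.c",
    "line": 42,
}]

sarif_log = make_sarif("example-scanner", example)
print(json.dumps(sarif_log, indent=2))
```

Because the result is plain JSON in a standard shape, a file like this can be consumed by tools that accept SARIF uploads, which is the practical point of exporting in that format.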

For organisations drowning in vulnerability backlogs, this is exactly the kind of tool that shifts the conversation from “how do we fix everything?” to “how do we prioritise the fixes Codex just generated for us?” That is a much better problem to have.

The Velvet Rope Comes Down

OpenAI previously locked GPT-5.5-Cyber behind what it called a “velvet rope” – available only to trusted defenders. That circle is widening: the Daybreak Cyber Partner Program now has around 30 security vendors and service providers, with more to be added.

Context matters here. These announcements come at a time when Anthropic is dealing with the fallout from its Mythos model, which raised national security concerns. The UK’s biggest banks were offered GPT-5.5 but excluded from Anthropic’s Glasswing program. The timing of OpenAI’s push is not accidental.

OpenAI has also been in ongoing dialogue with the US government about the model and upcoming releases to avoid export control surprises. That is a mature approach, and one that signals the company is thinking about the geopolitical implications of what it is building.

“The question is no longer whether AI can find and fix vulnerabilities faster than humans. It clearly can. The question is whether the security industry is ready to trust it at scale. OpenAI just made that question harder to avoid.”

What This Means

For Australian security teams, the implications are direct. The tools that were being tested six months ago are now in production. The GPT-5.5-Cyber model is finding real vulnerabilities in real codebases. Codex is fixing them at scale. The “Patch the Planet” initiative is securing the open source supply chain that every Australian organisation depends on.

OpenAI’s security push is genuine, the numbers are real, and the timing is strategic. Whether you trust the messenger or not, the capability is now on the table.

PhilipHall.com