In the span of a few weeks, three very different AI models hit the market. Each one represents a fundamentally different bet on how artificial intelligence should work. If you are trying to figure out which one matters for your work, you are not alone. Let me break it down.

The All-Star: Anthropic Claude Fable 5
Anthropic’s Fable 5 is the closest thing we have to a single, superhuman brain. It is a massive standalone model that outperforms every generally available AI on nearly every benchmark. Stripe used it to migrate a 50-million-line codebase in a day. It plays Pokemon from raw screenshots. It designs 3D-printable objects. It scores highest on senior-level finance reasoning.
Fable 5 costs $10 per million input tokens and $50 per million output tokens. It also comes with safety classifiers that automatically fall back to Opus 4.8 on sensitive topics like cybersecurity, biology and chemistry. Anthropic says more than 95% of sessions run without a fallback, but the guardrails are real.
The catch? Just days after launch, the US government suspended access to Fable 5 (and its unrestricted sibling Mythos 5) under an export control directive. The model that was supposed to be generally available is now restricted while Anthropic works with Washington to resolve the situation.
The Team Captain: Sakana Fugu Ultra
Sakana AI, a Japanese startup founded by the co-authors of the original “Attention Is All You Need” paper, took a completely different approach. Instead of building one giant model, they built a system that manages a team of other models.
Fugu is itself a language model trained to call other LLMs. When you give it a task, it decides which experts from its pool are best suited and coordinates their work. It uses Thinker, Worker and Verifier roles to break down problems. Everything runs through a single OpenAI-compatible API. You never see the complexity underneath.
The numbers are impressive. Fugu Ultra tops most coding and reasoning benchmarks, often beating the individual models it orchestrates. One beta user reported it caught more than twenty bugs in a code review where other tools flagged about three.
But it comes at a cost. Early testers report it is slow. A single coding task took 30 minutes in one review. The $200 Max plan yields less than three hours of heavy use per week. However, while benchmarks show parity with Fable 5, real-world hands-on tests suggest it still falls short on quality for creative and frontend tasks.
What Fugu does offer that no single model can is vendor independence. If one AI provider goes dark due to regulation or policy, Fugu reroutes to another model in its pool. For organisations that cannot afford to be locked into a single vendor, that is a powerful argument.
The Deep Sea Diver: Zhipu GLM 5.2
Zhipu’s GLM 5.2 (from Chinese AI lab Zhipu, now called Z.ai) is the wild card. It is fully open source under an MIT license with no regional restrictions. You can download the weights and run it on your own hardware. It supports a solid 1 million token context window, enough to hold several novels at once.
Its architecture uses something called IndexShare, which reduces computation by 2.9 times at long context lengths. It competes with Opus 4.8 on long-horizon coding benchmarks like FrontierSWE and PostTrainBench. It is the highest-ranked open-source model on every long-context engineering evaluation.
But it is not a standalone marvel like Fable. It trails closed models on reasoning, complex agentic tasks, and knowledge work. Its strength is in long-running coding projects where you need 1M token context and want to keep your data off someone else’s servers.
Zhipu went public in Hong Kong in January 2026 and has a market cap that has been volatile. The company is Chinese, which introduces its own set of geopolitical considerations for Australian enterprises.
Which one should you pick?
Honestly, it depends on what you are trying to do.
If you need the smartest single model for research, complex reasoning, or vision tasks, and access is not an issue, Fable 5 is still the benchmark. But “access is not an issue” is doing a lot of work right now.
If you are building critical infrastructure and cannot afford to depend on a single AI provider, Fugu gives you a hedge. The trade-off is speed and polish today for flexibility tomorrow.
If you are a developer shipping long-running coding projects and want full control over your data and costs, GLM 5.2 is the strongest open-source option available. No restrictions, no vendor lock-in, and a 1M token context that actually works.
The fact that these three approaches all launched in the same window tells you something. There is no single winning strategy in AI anymore. The frontier is fragmenting. However, for security professionals and enterprise buyers, that fragmentation is both a risk and an opportunity.
The age of one model to rule them all is over. We now have three different philosophies competing at once: one super-brain, one team captain, and one open deep diver. Smart organisations will understand all three.
What this means if you use Hermes (or any AI agent framework)
If you run an AI agent framework like Hermes, this comparison is not theoretical. It affects your config file today.
Model routing matters more than ever. The days of picking one model and sticking with it are ending. The smartest setup is routing by task difficulty: use Fable 5 (or Opus 4.8 if Fable is restricted) for complex multi-step reasoning, coding and security work, and a faster cheaper model for everyday queries. Hermes already supports this through configurable fallback chains.
Provider redundancy is not optional. The Fable 5 export ban proved that access to frontier models can vanish overnight. If your agent framework depends on a single provider, you are exposed. Fugu offers one answer to this by abstracting multiple providers behind one API. But you can also build the same resilience yourself: configure multiple providers in your agent’s model config and let it failover automatically. Any serious AI agent setup should support at least two independent providers for critical workflows.
Self-hosting is becoming viable. GLM 5.2 being open source under MIT license means you can run a capable coding model on your own hardware with full data control. For security professionals working with sensitive codebases or classified environments, that changes the conversation entirely. The gap between open and closed models is narrowing fast.
Export controls are a live risk. If your organisation relies on US-based AI providers, you need a contingency plan. The ban on Fable and Mythos was sudden and total. Australian enterprises that embedded those models into critical workflows had to scramble. Diversifying across providers and architectures is not nice to have. It is a risk management requirement.
The fragmentation of the AI frontier is accelerating. Smart operators are not betting on one model. They are building systems that can route, fall back and adapt as the landscape shifts.
What this means for coding agents: Claude Code, ZCode and OpenCode
If you use a coding agent like Claude Code, ZCode or OpenCode, these three models each bring something different to the terminal.
Claude Code with Fable 5 is the most powerful coding combo available, when you can get it. Fable 5 scores highest on Cognition’s FrontierCode evaluation for long-horizon production code. Stripe demonstrated a codebase-wide migration in a day that would have taken a team over two months. The catch is that Fable 5 access is currently restricted under US export controls, and even when available it comes with safety classifiers that can fall back to Opus 4.8 on certain queries.
GLM 5.2 and Claude Code is a surprisingly strong pairing. GLM 5.2 supports Claude Code natively through the GLM-5.2[1m] model flag, which activates the full 1M token context window. This makes it ideal for long-running coding sessions where you need the agent to hold an entire codebase in memory. On Terminal-Bench 2.1, GLM 5.2 scored 81.0 in its own harness and 82.7 when plugged into Claude Code, within striking distance of Opus 4.8. It also works with OpenCode and ZCode. The MIT license means no usage limits, no data leaving your infrastructure, and no export control surprises.
Fugu and Fugu Ultra present through a standard OpenAI-compatible API, which means any coding agent that supports that interface can use it. Sakana’s benchmarks show Fugu Ultra leading on SWE Bench Pro (73.7), TerminalBench 2.1 (82.1) and LiveCodeBench Pro (90.8). But the real-world experience is mixed. Early users report it is slow a single coding task can take 30 minutes and it burns through quota quickly. Where it shines is code review: one developer reported Fugu Ultra found more than twenty issues where other tools flagged about three.
The practical takeaway for coding agent users is to match the model to the task. Use Fable 5 (or Opus 4.8 as fallback) for complex architecture and refactoring work. Use GLM 5.2 for long-context sessions where you need the agent to understand the full codebase. Use Fugu Ultra for deep code review and security analysis where thoroughness matters more than speed. However, always configure a fallback chain because provider access can disappear without warning.
Privacy, security and compliance: what you need to know
Each of these three models has a fundamentally different privacy and security profile. Here is what you need to understand before committing to any of them.
Anthropic Fable 5 data handling. Anthropic’s API policy states it does not train on API traffic. Your prompts and outputs stay yours. But Fable 5’s safety classifiers are an extra layer of data processing. When a classifier triggers, your query is routed through Opus 4.8 instead, and the classification itself involves separate AI systems analysing your request. For sensitive or proprietary work, understand that classifiers are not transparent about what they flag or why. The export control situation is also fluid: access was suspended globally in June 2026, not just in restricted jurisdictions. If you build a workflow around Fable 5, have a fallback ready.
Sakana Fugu data routing. Fugu’s core value proposition is that it routes your queries across multiple third-party models. That means your data goes to whichever providers Fugu decides are best for each task. You can opt out of specific agents in the base Fugu tier, but Fugu Ultra’s pool is fixed. If you work with regulated data, health information, or classified code, you need to know exactly which models will process your data before you send it. Fugu’s routing is proprietary and not transparent at the query level. The company is working toward GDPR compliance but is not yet available in the EU or EEA.
Zhipu GLM 5.2 self-hosting. This is the only model of the three that you can fully control. The weights are available under MIT license on HuggingFace and ModelScope. You can run GLM 5.2 locally using vLLM, SGLang, ktransformers or other inference frameworks. No data leaves your infrastructure. No third party sees your prompts. No export controls can cut off your access. The trade-off is that you need the hardware: GLM 5.2 is a 744-billion-parameter Mixture-of-Experts model and requires significant GPU resources. If you can run it, it is the strongest privacy-preserving option available.
Geopolitical risk considerations. The US export control environment is hardening. Australia’s cybersecurity and intelligence agencies are increasingly wary of data flows to both US and Chinese providers. Anthropic is US-based but its export control entanglement shows that “US AI” does not guarantee stable access. Zhipu is Chinese and publicly listed in Hong Kong, which introduces obligations under Chinese data laws. Sakana is Japanese and has the clearest geopolitical position for Australian users, but its multi-provider routing complicates data sovereignty. If your organisation operates in a regulated industry, map the data flow for each model before you adopt it.
Distillation and model safety. All three models face the risk that their capabilities will be extracted by adversaries through distillation attacks, as we saw with the Alibaba campaign against Claude. If you build on top of any of these models, understand that the safety guardrails on the original model can be stripped out in a distilled copy. Verify the provenance of any model you deploy, especially if it comes through an intermediary or aggregator.
The bottom line on security. For low-risk everyday coding and research tasks, any of these three models is fine. For sensitive, regulated or classified work, the only model that gives you full control is GLM 5.2 running on your own infrastructure. Everything else is a trust decision with the provider and the jurisdictions they operate in.
Related Reading
- Anthropic Accuses Alibaba of the Largest Known Attack on Claude AI
- Security Experts Warn Washington: Banning Anthropic AI Models Could Open the Door to Chinese Cyberattacks
- Amazon’s Anthropic Research Triggered White House Export Ban on AI Models
- AI Agents, Copilot and the New Security Risk: When Helpful Becomes Dangerous
