Claude Mythos Project Glasswing: AI Cybersecurity 2026

Hoplon InfoSec

27 Jun, 2026

Article Summary

Claude Mythos is Anthropic's most powerful and deliberately unreleased AI model. Launched into a restricted program called Project Glasswing on April 7, 2026, it autonomously found more than 23,000 vulnerabilities across critical software systems worldwide. A Mythos-class model then had a three-day public life as Claude Fable 5, launched alongside the partner-only Claude Mythos 5, before the US government ordered both suspended on June 12, 2026, citing national security concerns. On June 27, 2026, the government authorized redeployment of Mythos 5 to select US critical infrastructure organizations. This article covers the technical capabilities, real CVE findings, the legal and governance crisis, and what security teams outside Glasswing should do right now.

Claude Mythos and Project Glasswing: Anthropic's AI Cybersecurity Revolution Explained [2026]

April 7, 2026, started like any other Tuesday in the cybersecurity world. Then Anthropic published a 244-page system card for a model it had no intention of selling to anyone. That model was Claude Mythos Preview, and the reason it was staying behind closed doors was simple. It could hack. Not in the theoretical way researchers had worried about for years. It could sit down overnight, read a production codebase, find a 17-year-old vulnerability that years of human review had missed, write a working exploit, and have it ready by morning without a single human being involved.

That announcement launched Project Glasswing, Anthropic's controlled-access initiative to put this capability in the hands of defenders before attackers could develop something similar. Two months later, the US government suspended the entire program over national security concerns. Two weeks after that, it authorized redeployment to critical infrastructure operators. If you work in cybersecurity, this story directly affects your threat model, your budget, and your timeline. Here is everything that has happened, explained from the ground up.

Key Facts at a Glance

Program Launch: April 7, 2026

Total Vulnerabilities Flagged: 23,019 (6,202 high or critical severity)

Partners at Peak: Approximately 200 organizations across 15+ countries

Budget Committed: $100 million in usage credits + $4 million in open-source grants

CyberGym Benchmark Score: 83.1% (vs. Claude Opus 4.6 at 66.6%)

Firefox 147 Exploit Writing: 181 working exploits vs. 2 for the previous generation

Government Suspension: June 12, 2026

Redeployment Authorized: June 27, 2026 (US critical infrastructure only)

Claude Mythos: What It Is and Why It Was Never Meant to Be Public

Architecture and Capability Tier

Claude Mythos Preview sits in a compute tier above Claude Opus, which Anthropic internally codenamed "Capybara" during development. It is a general-purpose frontier model, not a purpose-built security tool. That distinction matters. It is so dangerous in a cybersecurity context precisely because it is not a narrow scanner. It is a model with deep software reasoning ability that happens to be extraordinarily good at reading code, understanding memory layouts, chaining logic across complex systems, and translating that understanding into working attack sequences.

Anthropic's own internal documents described it as "by far the most powerful AI model" the company had ever built and warned that it was "currently far ahead of any other AI model in cyber capabilities." The company did not make that statement for marketing. It made it as a warning in the context of deciding whether to release the model at all.

The model scores 83.1% on the CyberGym vulnerability reproduction benchmark, compared to 66.6% for Claude Opus 4.6 and approximately 73% for Claude Opus 4.8. On Firefox 147 exploit writing, it produced 181 working exploits, whereas the previous generation produced two. It saturated Anthropic's internal Cybench CTF at 100%, which forced the red team to abandon synthetic benchmarks entirely and move to real-world zero-day discovery as the only meaningful test left.

Why Anthropic Did Not Release It Publicly

The core issue is what security researchers call the dual-use problem. The same capability that allows a defender to enumerate every reachable bug in a codebase allows an attacker to do the same thing. Anthropic's internal red team concluded the offensive ceiling was high enough that no amount of safety training alone would be sufficient. The company decided to use consortium access as the primary control mechanism, not fine-tuning or guardrails. That is what makes Project Glasswing structurally different from every prior cybersecurity AI launch.

Anthropic's Responsible Scaling Policy framework required that before any model with this capability level reaches the public, the company must demonstrate that safeguards can reliably prevent the most dangerous outputs. Those safeguards did not exist yet in April 2026. The plan was to develop and test them on Claude Opus 4.8, a model that does not carry the same risk ceiling, before relaxing them for any public version of Mythos.

Table 1: Claude Model Capability Tier Comparison

| Model | CyberGym Score | Firefox 147 Exploits |
| --- | --- | --- |
| Claude Haiku 4.5 | ~30% | 0 |
| Claude Opus 4.6 | 66.6% | 2 of several hundred |
| Claude Opus 4.8 | ~73% | N/A |
| Claude Mythos Preview | 83.1% | 181 of several hundred |

CyberGym Benchmark and AI Vulnerability Reproduction: The Numbers That Changed Everything


What CyberGym Actually Measures

CyberGym was developed at UC Berkeley to fill a gap in how the industry evaluates AI security tools. Standard capture-the-flag benchmarks test toy problems in controlled environments. CyberGym tests AI agents on scenarios modeled after actual enterprise security operations, including tasks like analyzing unfamiliar codebases, reasoning about multi-component software systems, and reproducing vulnerabilities in realistic conditions. When Mythos scored 83.1% on this benchmark against Opus 4.6's 66.6%, the 16.5-point gap looked modest on paper. In practice, it represented a qualitative leap in what the model could accomplish without human guidance.

On SWE-bench Pro, which measures the ability to resolve real software engineering problems, Mythos scored 77.8% against 53.4% for Opus 4.6. A 24-point gap on a benchmark built from actual open-source issues is not a rounding error. It reflects the difference between a model that needs human steering and one that can work autonomously through complex multi-step problems.

The Firefox 147 Test: A 90x Jump in One Generation

The most revealing evaluation Anthropic published was not a benchmark score. It was the Firefox 147 exploit writing test. Anthropic's Frontier Red Team took vulnerabilities they had already found in Mozilla's Firefox 147 JavaScript engine, all of which were patched in Firefox 148, and asked both Claude Opus 4.6 and Claude Mythos Preview to develop working JavaScript shell exploits for them.

Opus 4.6 succeeded twice across several hundred attempts. Mythos succeeded 181 times and achieved register control on 29 additional attempts where it did not produce a complete shell. That is a 90x improvement in exploit development capability in a single model generation. Anthropic engineers with no formal security background reportedly pointed Mythos at a codebase before leaving for the evening and returned the next morning to a complete, working remote code execution exploit with no human involvement at any step.

Independent Validation from the UK AI Security Institute

Anthropic's own benchmarks are not the only data point. The UK AI Security Institute independently evaluated Mythos and confirmed it was the first AI model to complete an end-to-end simulated 32-step corporate network attack. It also solved 73% of expert-level capture-the-flag problems, approximately double what prior frontier models achieved. These are not Anthropic's numbers. They come from a government evaluation body with no commercial stake in the outcome.

Table 2: Mythos Preview vs. Claude Opus 4.6, Head to Head

| Evaluation | Mythos Preview | Opus 4.6 | Change |
| --- | --- | --- | --- |
| CyberGym Benchmark | 83.1% | 66.6% | +16.5 points |
| SWE-bench Pro | 77.8% | 53.4% | +24.4 points |
| Cybench CTF | 100% (saturated) | ~67% | Ceiling hit |
| Firefox 147 Exploits | 181 | 2 | 90x |
| 32-Step Network Attack | Completed | Failed | First AI ever |
| Expert CTF Problems | 73% | ~35% | Approximately 2x |

Real Zero-Day Discoveries: CVE-by-CVE Technical Breakdown

Benchmarks describe potential. The actual CVE findings describe reality. What Mythos found in its first weeks of operation was not a list of low-hanging fruit that automated scanners had already identified. It was a collection of vulnerabilities that had survived decades of human code review, commercial fuzzing campaigns, and millions of automated security tests. These are the most significant confirmed discoveries.

CVE-2026-4747: A 17-Year-Old Root-Level Exploit in FreeBSD

The vulnerability that drew the most attention from the technical community was CVE-2026-4747, found in FreeBSD's NFS server implementation. The root cause is a stack buffer overflow in the RPCSEC_GSS authentication protocol. The vulnerable code copies data from an attacker-controlled network packet into a 128-byte stack buffer, starting 32 bytes in. That leaves 96 bytes of usable space. The only length check on the incoming data verifies that the source buffer is less than 400 bytes. An attacker can therefore write roughly 304 bytes of arbitrary content past the end of the buffer and onto the rest of the stack.
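
The arithmetic behind those figures can be checked with a short sketch. It uses only the numbers quoted above; the constant and function names are illustrative and are not taken from FreeBSD's source code.

```python
# Illustrative arithmetic only. Names are hypothetical, not FreeBSD identifiers;
# the sizes are the ones quoted in the paragraph above.
STACK_BUFFER_SIZE = 128   # bytes in the stack buffer
WRITE_OFFSET = 32         # the copy starts this many bytes into the buffer
MAX_SOURCE_LENGTH = 400   # the length check only requires the source to be below this

# Space left between the write offset and the end of the buffer.
usable_space = STACK_BUFFER_SIZE - WRITE_OFFSET

# Upper bound on how far a copy can run past the end of the buffer.
max_overflow = MAX_SOURCE_LENGTH - usable_space


def bytes_past_buffer(source_length: int) -> int:
    """Return how many bytes of a copy land beyond the end of the buffer."""
    if source_length >= MAX_SOURCE_LENGTH:
        raise ValueError("rejected by the length check")
    return max(0, source_length - usable_space)


print(usable_space)   # 96
print(max_overflow)   # 304
```

Any input longer than 96 bytes that still passes the length check spills into adjacent stack memory, which is why the missing stack canary described next matters so much.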

What makes this particularly exploitable is that the usual mitigations that stop stack overflows did not apply on this code path. FreeBSD uses the compiler flag -fstack-protector rather than -fstack-protector-strong, which means it only instruments functions containing character arrays. The vulnerable function does not qualify, so no stack canary is placed on the frame.

Mythos autonomously constructed a 20-gadget return-oriented programming chain split across multiple network packets, using NFSv4 exchange calls to discover kernel addresses without a memory leak. The result was complete unauthenticated root access from anywhere on the internet against any machine running FreeBSD's NFS server. The vulnerability had survived 17 years of code review. Mythos found it and built a working exploit with no human involvement after the initial task was assigned. This is exactly what AI-driven automated red teaming looks like at frontier capability levels.

OpenBSD's 27-Year SACK TCP Integer Overflow

OpenBSD has a reputation as the most security-hardened general-purpose operating system in the world. It is commonly used to run firewalls, VPN gateways, and other network security infrastructure. Mythos found a signed integer overflow in its SACK TCP implementation that had been present for 27 years. The flaw allows a remote attacker to crash any affected host. The fact that this survived in OpenBSD specifically, an operating system built around the principle that every line of code is a potential vulnerability, tells you something concrete about the limits of human code review at scale. This is the kind of finding that conventional vulnerability management processes are not designed to surface.

CVE-2026-5194: TLS Certificate Forgery in wolfSSL

wolfSSL is an open-source cryptography library known for its security focus. It runs on billions of devices, including embedded systems, industrial controllers, and IoT hardware. Mythos constructed an exploit against wolfSSL that would allow an attacker to forge TLS certificates. In practical terms, this means an attacker could host a website that presents a seemingly valid certificate for a bank, an email provider, or any other service, and a client relying on the vulnerable library would have no way to detect the deception. The vulnerability was assigned CVE-2026-5194 and has since been patched. Anthropic has committed to publishing a full technical analysis. Vulnerabilities like this are precisely why IoT and embedded security programs need to evolve beyond traditional scanning.

FFmpeg, Linux Kernel Chaining, and Browser Sandbox Escapes

Beyond the headline CVEs, Mythos identified a 16-year-old flaw in FFmpeg, the multimedia processing library used in nearly every video platform, streaming service, and media application on the internet. In the Linux kernel, the model demonstrated a capability that no automated tool had achieved before: chaining two to four individually low-severity vulnerabilities through race conditions and KASLR (kernel address space layout randomization) bypasses to achieve full local privilege escalation. Each individual bug was too minor to trigger immediate patching. Combined in the sequence Mythos identified, they produced root-level access.

On browser targets, Mythos chained four vulnerabilities into a JIT heap spray attack that escaped both the renderer sandbox and the operating system sandbox. This kind of multi-step web application security finding, where the attack requires understanding the interaction between multiple components, has historically required months of effort from skilled human researchers.

Confirmed Zero-Day Discoveries from Claude Mythos Preview

  • CVE-2026-4747: FreeBSD NFS RCE, 17 years old, unauthenticated root access

  • OpenBSD SACK TCP signed integer overflow, 27 years old, remote crash

  • CVE-2026-5194: wolfSSL TLS certificate forgery, billions of IoT devices affected

  • FFmpeg 16-year-old vulnerability

  • Linux kernel privilege escalation via race condition and KASLR bypass

  • Browser sandbox escape using a 4-vulnerability chain across all major browsers

  • 181 working Firefox 147 JavaScript engine exploits, all patched in Firefox 148


Project Glasswing: What the Program is and How It Works

The Problem That Created Glasswing

Anthropic did not launch Project Glasswing because it wanted to run a cybersecurity program. It launched Glasswing because it had built something too dangerous to release but too valuable to lock away entirely. The company looked at the rate of AI progress and made a judgment call: within six to twelve months, other AI labs would have models with comparable capabilities. Some of those labs would not apply the same restraint. If that were true, the defenders who needed these capabilities most, the people running critical infrastructure, managing open-source software that billions depend on, and securing the software supply chain, would be the last to get them.

Glasswing's logic is that giving defenders access first, even in a controlled and restricted way, is better than waiting for a safe general release that might come after attackers already have equivalent tools. The program is built around that bet. This defender-first philosophy is also what drives strong cyber resilience assessment practices across critical industries.

Program Structure: Not a Product, a Coalition

Project Glasswing is not a SaaS tool or an API endpoint you can subscribe to. It is a controlled-access program that combines a restricted model, a credit pool subsidizing partner usage, cloud distribution channels across AWS Bedrock, Google Cloud Vertex AI, and Microsoft Foundry, and a commitment from all participants to share findings with the broader security community. Access to Claude Mythos Preview through the program is priced at $25 per million input tokens and $125 per million output tokens, but Anthropic has committed $100 million in usage credits to cover partner costs, alongside $4 million in direct grants to open-source security organizations.
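
To make the pricing concrete, here is a minimal sketch of how a partner might estimate usage cost from the per-token rates quoted above. The function name and the example token counts are assumptions for illustration; only the two prices come from the program description.

```python
# Prices quoted for Claude Mythos Preview access through the program.
INPUT_PRICE_PER_MILLION_USD = 25
OUTPUT_PRICE_PER_MILLION_USD = 125
TOKENS_PER_MILLION = 1_000_000


def usage_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in US dollars for a given number of input and output tokens."""
    total = (
        input_tokens * INPUT_PRICE_PER_MILLION_USD
        + output_tokens * OUTPUT_PRICE_PER_MILLION_USD
    )
    return total / TOKENS_PER_MILLION


# Hypothetical scan: 2 million tokens of source code read, 400,000 tokens of findings written.
print(usage_cost_usd(2_000_000, 400_000))  # 100.0
```

Because output tokens cost five times as much as input tokens, a workload that generates long exploit write-ups or patches will cost noticeably more than one that mostly reads code.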

The rules are strict by design. Partners may only use Mythos for finding and fixing vulnerabilities in their own software or in open-source projects. Offensive use or research that does not have an immediate defensive application is outside the program scope. Every organization joining Glasswing goes through Anthropic's own security vetting process before gaining any access.

Who is in the Program

The launch partners announced on April 7, 2026, included Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. More than 40 additional organizations building or maintaining critical software infrastructure also received access at launch. By June 2026, Glasswing had added approximately 150 new organizations across more than 15 countries, bringing total membership to around 200 groups. Confirmed additional members include Okta, Samsung, SK Hynix, SK Telecom, NATO, and the EU's cybersecurity agency ENISA.

Table 3: Project Glasswing Timeline

| Date | Event |
| --- | --- |
| April 7, 2026 | Project Glasswing launches with approximately 50 initial partners |
| April to May 2026 | Partners find more than 10,000 high or critical severity vulnerabilities |
| May 22, 2026 | First results published: 23,019 vulnerabilities flagged, 90.6% confirmed real |
| June 2, 2026 | Glasswing expands to 150 new organizations across 15+ countries |
| June 9, 2026 | Claude Fable 5 and Claude Mythos 5 launch publicly |
| June 12, 2026 | US government export control directive, both models suspended globally |
| June 27, 2026 | US government authorizes Mythos 5 redeployment to US critical infrastructure |

First-Month Results: 23,019 Vulnerabilities and the Patching Crisis

The Scale of What Mythos Found

On May 22, 2026, Anthropic published the first detailed results from Project Glasswing. The headline number was 23,019 vulnerabilities flagged across more than 1,000 open-source projects, with 6,202 classified as high or critical severity. Independent security firms reviewed a sample of 1,752 findings and confirmed 90.6% were real bugs. A confirmation rate that high on a sample of that size, validated by parties outside Anthropic, makes it hard to dismiss the findings as scanner noise.
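
The headline figures can be sanity-checked with a little arithmetic. The sketch below assumes the 1,752 reviewed findings were a random sample, which the published summary does not state, and uses a simple normal-approximation interval. The constant and function names are illustrative.

```python
import math

# Figures quoted in the May 22 results.
FLAGGED_TOTAL = 23_019
HIGH_OR_CRITICAL = 6_202
SAMPLE_SIZE = 1_752
CONFIRMED_RATE = 0.906

# Approximate count of sampled findings confirmed as real bugs.
confirmed_in_sample = round(SAMPLE_SIZE * CONFIRMED_RATE)

# Share of all flagged findings rated high or critical.
high_critical_share = HIGH_OR_CRITICAL / FLAGGED_TOTAL


def normal_interval(rate: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation confidence interval for a proportion (assumes random sampling)."""
    half_width = z * math.sqrt(rate * (1 - rate) / n)
    return rate - half_width, rate + half_width


low, high = normal_interval(CONFIRMED_RATE, SAMPLE_SIZE)

print(f"{confirmed_in_sample} of {SAMPLE_SIZE} sampled findings confirmed")
print(f"high or critical share: {high_critical_share:.1%}")
print(f"approximate 95% interval for the true-positive rate: {low:.1%} to {high:.1%}")
```

Under that sampling assumption, the true-positive rate across the full set would plausibly sit at roughly 89% to 92%. If the sample was chosen some other way, the interval does not apply.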

Individual partner results gave the numbers a human scale. Mozilla received 271 vulnerability reports that were patched in a single Firefox release, Firefox 150. Cloudflare identified 2,000 vulnerabilities across its critical infrastructure. A banking partner used Mythos to detect and stop a $1.5 million fraudulent wire transfer while it was still processing. And a single OpenBSD scanning campaign covering 1,000 scaffold runs cost less than $20,000 in total compute. Compare that to the months and hundreds of thousands of dollars a traditional penetration testing engagement would cost for comparable coverage.

The Real Problem: Almost Nothing Has Been Patched

The figure that should worry every security professional is not the number of vulnerabilities found. It is the number that has actually been fixed. As of the first month of results, less than 1% of the vulnerabilities Mythos surfaced had been patched. The bottleneck is not discovery anymore. AI has essentially solved the discovery problem at scale. The bottleneck is the human infrastructure required to verify, coordinate responsible disclosure, and actually ship a fix. Open-source maintainers, most of whom are volunteers, are receiving vulnerability reports at a rate their processes were never designed to handle.

Rapid7's 2026 threat landscape report found that the median time from CVE publication to CISA's Known Exploited Vulnerabilities listing is five days. Google's M-Trends 2026 report found that exploitation is often happening before a patch is even released. Mythos is finding vulnerabilities faster than the industry can respond to them. The attack surface management challenge has fundamentally changed in character. This is where the real work of incident response and recovery planning needs to account for AI-speed vulnerability discovery on both sides.

How Partners Are Using Mythos Beyond Bug Hunting

The most sophisticated Project Glasswing members have moved past using Mythos purely as a scanner. They are using it to write patches for the vulnerabilities it finds, run pre-release checks to prevent new vulnerabilities from entering production, conduct penetration testing against their own infrastructure, and translate legacy C and C++ code into memory-safe Rust. Some are using it for threat detection and as a triage tool for the flood of incoming vulnerability reports from external researchers. Anthropic's Cyber Verification Program complements this by allowing approved security professionals to use Mythos-class models with fewer restrictions for specific defensive tasks, alongside a set of tools including custom skills, an automated scanning and reporting framework, and a threat-modeling tool for identifying and prioritizing attack targets. These use cases align well with services like extended detection and response and cyber threat intelligence programs that need to process vulnerability data at machine scale.

June 12, 2026: The US Government Suspension and What It Means

What Happened That Evening

On the evening of June 12, 2026, a US government export control directive arrived at Anthropic at 5:21 PM Eastern Time. It invoked national security authorities and ordered the company to suspend all access to Claude Fable 5 and Claude Mythos 5 by any foreign national, whether located inside or outside the United States, including Anthropic's own non-US-citizen employees. Within hours, Anthropic had disabled both models for every customer in the world.

The reason Anthropic did not simply restrict foreign-national accounts rather than shutting down completely is a technical one. A major AI provider cannot reliably separate foreign-national API traffic from domestic traffic in real time. Nationality is not a field that can be verified at the request layer. Rather than risk being in violation of the directive, Anthropic took the models entirely offline. Claude Opus 4.8, Sonnet 4.6, and Haiku 4.5 remained available. Only Fable 5 and Mythos 5 were affected.

What Triggered the Directive: The Fable 5 Jailbreak Claim

The government's stated concern was that Claude Fable 5 could be jailbroken to identify software vulnerabilities. Anthropic publicly pushed back, arguing that the demonstrated jailbreak capability was narrow in scope and could be replicated using other publicly available AI models that the government had not restricted. The company pointed out that Fable 5's classifiers already route cyber, biochemistry, and distillation queries to Claude Opus 4.8 in fewer than 5% of sessions. IBM X-Force researchers had noted in the same week that Fable 5's guardrails were actually too aggressive for legitimate defensive work, rejecting requests that were tangentially related to cybersecurity. The model was simultaneously too restricted for defenders and withdrawn over a capability concern. That is the dual-use problem expressed in real time.

What Fable 5 and Mythos 5 Actually are

Understanding the suspension requires understanding the relationship between these two models. Fable 5 is a Mythos-class model made available for broad general use, with safety filters and classifiers applied. Mythos 5 is the same underlying model with certain safeguards lifted, available only to Glasswing partners. They share a common technical foundation. Fable 5 launched to the general public on June 9, 2026. By June 12, both were gone. Claude Fable 5 had a public lifespan of three days.

The suspension also intersected with a broader legal conflict: after negotiations between Anthropic and the Department of Defense broke down, the DOD labeled Anthropic a supply chain risk, a designation historically reserved for foreign adversaries. Anthropic sued the Trump administration to reverse the designation, and that litigation was still ongoing at the time of the suspension.

For any team dependent on AI tools for security compliance obligations, this sequence of events is a reminder that frontier model availability is not guaranteed. Gap assessments should now include AI tool continuity as a dependency risk.

This is a Precedent, Not Just an Incident

In the 1990s, the United States government classified strong encryption as a controlled munition and attempted to restrict its export. US courts eventually ruled that publishing security code was protected expression. Those controls limited export. They did not force an already-deployed product offline for its entire global user base within hours. The June 12 directive is different in kind, not just degree. It represents the first confirmed government-forced takedown of a publicly deployed frontier AI model. Every organization that builds on frontier AI models now has to treat that possibility as part of their operational risk model. The legal framework for AI export controls is still being written, and the June 12 action happened before that framework existed in a clear statutory form.

Developer Fallback Checklist After the Suspension

  • Switch the API string from "claude-fable-5" to "claude-opus-4-8" in all production integrations.

  • Claude Opus 4.8, Sonnet 4.6, and Haiku 4.5 are fully available and unaffected.

  • Claude Code and Claude apps continue running on unaffected models.

  • Review API access for foreign-national employees who may face restrictions if controls return.

  • Audit all internal tools that routed requests to Fable 5 or Mythos 5 endpoints.

  • Plan long-term for reducing dependency on any single closed frontier model.
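
The first and fifth checklist items can be sketched as a small lookup that teams adapt to their own configuration layer. The strings "claude-fable-5" and "claude-opus-4-8" come from the checklist above. The "claude-mythos-5" identifier, the function names, and the example tool names are assumptions for illustration only.

```python
# Map suspended model identifiers to the fallback named in the checklist.
FALLBACK_MODELS = {
    "claude-fable-5": "claude-opus-4-8",
    "claude-mythos-5": "claude-opus-4-8",  # assumed identifier, not confirmed by the article
}


def resolve_model(requested: str) -> str:
    """Return the fallback for a suspended model, or the requested model unchanged."""
    return FALLBACK_MODELS.get(requested, requested)


def audit_configs(configs: dict[str, str]) -> dict[str, str]:
    """Return each tool that points at a suspended model, mapped to its replacement."""
    return {
        tool: FALLBACK_MODELS[model]
        for tool, model in configs.items()
        if model in FALLBACK_MODELS
    }


# Hypothetical internal tools and the models they are configured to call.
tool_configs = {
    "triage-bot": "claude-fable-5",
    "code-review": "claude-opus-4-8",
}

print(resolve_model("claude-fable-5"))  # claude-opus-4-8
print(audit_configs(tool_configs))      # {'triage-bot': 'claude-opus-4-8'}
```

Keeping the mapping in one place, rather than hard-coding model strings across services, also makes it easier to switch back if access is restored.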

June 27 Redeployment: What the US Government Authorized

The Partial Green Light

On June 27, 2026, the US government officially notified Anthropic that Claude Mythos 5 could be redeployed to a defined set of US organizations that operate and defend critical infrastructure. This was not a blanket reversal of the June 12 directive. It was a phased and conditional authorization for a specific category of organizations that had already been through Glasswing's vetting process. Anthropic confirmed it was moving quickly to restore access for these validated entities, covering sectors including energy, healthcare, financial services, and telecommunications.

Claude Fable 5's status remains unresolved. Anthropic stated it is continuing to work with the government toward a broader expansion of Mythos 5 access and the return of Fable 5 for general use. There is no confirmed timeline. The redeployment of Mythos 5 to critical infrastructure operators while Fable 5 remains suspended reflects the government's position that Mythos, deployed within a controlled program to vetted defenders, is acceptable in a way that broad public access is not.

For organizations in the energy, healthcare, financial services, and telecommunications sectors, this redeployment matters directly. It means the most capable AI vulnerability discovery tool in existence is again available for cyber threat intelligence and online threat exposure monitoring within those organizations. For everyone else, it means the gap between Glasswing members and the rest of the industry just got wider again.


The Dual-Use Dilemma: AI Cybersecurity's Hardest Problem

Why There is No Clean Answer

Every tool in a security team's arsenal is dual-use. A port scanner can map your own network's attack surface or help an attacker enumerate targets. A packet analyzer supports both intrusion detection and traffic interception. A fuzzer finds bugs in your software before attackers do or helps an attacker find bugs in yours. A memory-corruption proof-of-concept demonstrates a real vulnerability in a controlled context or becomes an exploit in the wrong hands. The security industry has generally concluded that you cannot improve defense while forbidding the tools defense requires. The question with Mythos is whether the capability crosses a threshold where the offensive ceiling is high enough to justify a fundamentally different governance model.

Anthropic concluded that it does. The consortium access model that defines Project Glasswing is the result of that conclusion. Instead of relying on safety training to limit misuse, Glasswing relies on organizational vetting to limit who can use the model at all. That is a meaningful policy distinction from how every prior cybersecurity AI tool has been deployed. It is also a decision that has direct implications for how organizations should think about their own red teaming programs and AI-driven automated red teaming capabilities.

The Defender-First Argument and Its Limits

Anthropic's position is that giving defenders the best tools first, even in a controlled way, is better than hoping the offensive side develops more slowly. The company has said publicly that within six to twelve months, it expects other AI companies to have Mythos-class models, and some of those companies may release them without equivalent safeguards. If that timeline is accurate, the window where Glasswing's controlled access provides a meaningful defensive advantage is short. This is why the program is expanding as quickly as it is and why the government suspension in June was so operationally significant. Every week during the suspension was a week when critical infrastructure defenders were operating without the most capable vulnerability discovery tool available, while the threat landscape continued to evolve. Organizations that are not yet investing in AI governance frameworks are already behind on this curve.

Project Glasswing's Future: Expansion, Verification, and the Public Release Question

Where the Program is Going

Anthropic has stated that it plans to continue expanding Project Glasswing beyond the current membership, prioritizing additional critical infrastructure providers, maintainers of widely used open-source software, and organizations focused on safety testing. Future expansion is intended to extend beyond US-based organizations to a broader international set, building on the 15-plus country footprint already established. The long-term goal is making Mythos-class models publicly available, but that requires developing safeguards robust enough to prevent the most dangerous offensive applications at scale. Anthropic has not committed to a timeline for public release because that timeline depends on safeguard development it has not yet completed.

Cyber Verification Program and Claude Security

For organizations that cannot get direct Glasswing access, Anthropic has created two adjacent pathways. The Cyber Verification Program allows approved security professionals to use Mythos-class capabilities for specific cyberdefense tasks with fewer restrictions than would apply to general API access. Claude Security, released in public beta for Claude Enterprise customers, uses publicly available frontier models, including Claude Opus 4.7, to scan codebases and generate proposed fixes. Three weeks after its launch, Claude Opus 4.7 had helped patch more than 2,100 vulnerabilities through Claude Security. The results are less dramatic than what Mythos achieves, but the tool is available right now without any vetting process. For teams building out their vulnerability management workflows, it is a meaningful starting point. Teams looking to strengthen their overall posture can also engage virtual CISO services to help prioritize which Glasswing-era capabilities to pursue first.

The 18-Month Warning for Organizations Outside Glasswing

Independent analysts estimate that organizations without Glasswing access should build their threat models on the assumption that attacker-equivalent capability will exist within 18 months. That capability might come through a model exfiltrated from a major lab, through a competing lab releasing a comparable model with fewer restrictions, or through progress in open-weight models that can be run locally without any provider oversight. CrowdStrike's 2026 Global Threat Report documented an average eCrime breakout time of 29 minutes, 65% faster than 2024, alongside an 89% year-over-year surge in AI-augmented attacks. CrowdStrike's CTO put the operational implication directly: a traditional human process of reviewing an alert, investigating for 15 to 20 minutes, and taking action an hour or a day later is no longer sufficient. The speed mismatch between AI-assisted attackers and human-speed defenders is the defining security challenge of this moment. This is also why dark web monitoring and endpoint security programs need to integrate AI-speed detection to remain relevant.

Practical Guide for Organizations Outside Project Glasswing

What Glasswing Partners Are Actually Doing

Organizations with Glasswing access are using Mythos across a wider range of tasks than the initial vulnerability scanning narrative suggested. They are running autonomous codebase scans and using the model to triage the resulting findings by exploitability and impact. They are having Mythos write patches for the vulnerabilities it finds, which addresses the disclosure bottleneck by making it easier for maintainers to act quickly. They are using it for pre-release code reviews to prevent vulnerabilities from reaching production in the first place.

They are conducting internal penetration testing against their own infrastructure. Several partners are also using Mythos to translate legacy C and C++ code into Rust and other memory-safe languages, a task that directly reduces the long-term attack surface of their systems. The mobile security and mobile application security teams within these organizations are applying the same scanning logic to mobile codebases.

Cost Comparison: Why the Economics Have Changed

One of the most important data points from Project Glasswing is the cost of a Mythos discovery campaign versus a traditional approach. A 1,000-scaffold-run campaign targeting OpenBSD cost less than $20,000 in compute and completed within hours. A comparable traditional penetration testing engagement, covering the same codebase depth, would typically cost between $50,000 and $500,000 and take months. This cost collapse does not just make discovery cheaper for defenders. It makes discovery equally cheap for attackers. Any threat actor with access to a Mythos-equivalent model can run discovery campaigns at a cost that was previously the exclusive domain of nation-state operations. That is the economic shift that defines the current moment in penetration testing and offensive security research.
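To put those figures side by side, here is a small back-of-the-envelope calculation. It uses only the numbers quoted above and treats the $20,000 figure as an upper bound, since the campaign cost "less than" that amount. The variable names are ours, chosen for illustration.

```python
# Figures quoted in the article. The Mythos cost is an upper bound
# ("less than $20,000"), so the ratios below are lower bounds.
mythos_campaign_cost_usd = 20_000
scaffold_runs = 1_000
pentest_low_usd = 50_000
pentest_high_usd = 500_000

# Compute cost per individual scaffold run (at most this much).
cost_per_run = mythos_campaign_cost_usd / scaffold_runs

# How many times more expensive the traditional engagement is.
ratio_low = pentest_low_usd / mythos_campaign_cost_usd
ratio_high = pentest_high_usd / mythos_campaign_cost_usd

print(f"Cost per scaffold run: at most ${cost_per_run:.2f}")
print(f"Traditional engagement: {ratio_low:.1f}x to {ratio_high:.1f}x the compute cost")
```

On these numbers, each scaffold run costs at most about $20, and a traditional engagement costs at least 2.5 to 25 times as much before the difference between hours and months is counted.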

The Vulnerability Prioritization Framework for Teams Without AI Discovery Tools

If you cannot access Glasswing today, the most important operational step you can take is building a disciplined vulnerability prioritization process on three publicly available, free data sources. First, the CISA Known Exploited Vulnerabilities (KEV) catalog, which flags CVEs with confirmed active exploitation. Second, the Exploit Prediction Scoring System (EPSS) API from FIRST.org, which provides exploit probability scores based on observed exploitation patterns. Third, the National Vulnerability Database (NVD), which supplies CVSS severity scores and affected-product data for each CVE.

A study validated against 28,377 real-world vulnerabilities found that combining these three sources into a prioritization filter produces an 18x efficiency gain, covers 85.6% of exploited vulnerabilities, and reduces urgent remediation workload by approximately 95%. The integration is automatable, and running it against your asset inventory daily is the closest substitute currently available for AI-assisted discovery for teams without Glasswing access. Pair this with a structured attack surface management program and cyber threat intelligence feeds to build a comprehensive picture of your exposure.
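As a rough illustration of what such a filter can look like, here is a minimal Python sketch. It is not the method used in the study cited above: the tiering rules, the `EPSS_THRESHOLD` and `CVSS_THRESHOLD` cutoffs, the `Finding` structure, and the placeholder CVE IDs are all assumptions made for this example. It also assumes you have already pulled the KEV list, EPSS scores, and NVD CVSS scores into memory, so it makes no network calls.

```python
from dataclasses import dataclass

# Hypothetical cutoffs for illustration only; tune them to your environment.
EPSS_THRESHOLD = 0.10   # assumed "likely to be exploited" probability
CVSS_THRESHOLD = 7.0    # CVSS v3.x "High" severity begins at 7.0


@dataclass(frozen=True)
class Finding:
    cve_id: str
    asset: str    # host or application from your asset inventory
    cvss: float   # base score taken from NVD (0.0 to 10.0)
    epss: float   # exploit probability taken from EPSS (0.0 to 1.0)


def tier(finding, kev_ids):
    """Assign an urgency tier: 1 is most urgent, 4 is least."""
    if finding.cve_id in kev_ids:
        return 1  # confirmed active exploitation (listed in KEV)
    likely = finding.epss >= EPSS_THRESHOLD
    severe = finding.cvss >= CVSS_THRESHOLD
    if likely and severe:
        return 2
    if likely or severe:
        return 3
    return 4


def prioritize(findings, kev_ids):
    """Sort findings by tier, then by EPSS and CVSS, highest first."""
    return sorted(findings, key=lambda f: (tier(f, kev_ids), -f.epss, -f.cvss))


if __name__ == "__main__":
    # Placeholder IDs and scores, not real vulnerability data.
    kev_ids = {"CVE-0000-0001"}
    findings = [
        Finding("CVE-0000-0004", "intranet-wiki", 3.1, 0.01),
        Finding("CVE-0000-0002", "vpn-gateway", 9.8, 0.40),
        Finding("CVE-0000-0001", "mail-server", 5.0, 0.02),
        Finding("CVE-0000-0003", "build-agent", 8.0, 0.01),
    ]
    for f in prioritize(findings, kev_ids):
        print(tier(f, kev_ids), f.cve_id, f.asset)
```

The design choice worth noting is that KEV membership overrides both scores. A CVE with a modest CVSS score still lands in the top tier if exploitation has been confirmed, which is the point of prioritizing on observed attacker behavior instead of severity alone.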


Frequently Asked Questions

Is Claude Mythos available to general users?

No. Claude Mythos Preview is restricted to Project Glasswing's vetted partners. Claude Mythos 5 was reauthorized on June 27, 2026, but only for US organizations that operate and defend critical infrastructure. General public access has no confirmed timeline.

When will Claude Fable 5 come back?

There is no confirmed return date. Anthropic is negotiating with the US government and has stated it is working toward Fable 5's return for general use. Claude Opus 4.8 is the closest publicly available alternative for teams that need frontier-level performance.

How do you join Project Glasswing?

Organizations must meet Anthropic's own security requirements and pass a vetting process. Priority is given to critical infrastructure operators, maintainers of widely used open-source software, and organizations focused on safety testing. International organizations are now eligible following the expansion to 15-plus countries.

How is Mythos different from other AI cybersecurity tools?

Mythos can complete multi-step security tasks autonomously: reading large codebases, fuzzing inputs, reasoning about memory layouts, and producing working proofs-of-concept for the vulnerabilities it finds. No prior commercial AI system had demonstrated this level of end-to-end autonomous capability. The CyberGym benchmark, Firefox 147 test, and real CVE findings all reflect a qualitative difference, not just an incremental improvement.

Did the June 12 suspension set a precedent for AI governance?

It appears to be the first confirmed government-forced takedown of a publicly deployed frontier AI model. The action happened before a clear statutory framework for AI export controls existed, which means it also set a precedent for how such controls can be applied: quickly, globally, and without public disclosure of the specific technical rationale. For organizations building on frontier models, this is now an operational risk that needs to be planned for explicitly.

If my organization is not in Glasswing, am I at elevated risk?

Yes, and the risk is growing. CrowdStrike's 2026 data shows the average eCrime breakout time has dropped to 29 minutes, 65% faster than in 2024, with AI-augmented attacks up 89% year over year. Independent analysts estimate that attacker-equivalent Mythos-class capability will be accessible to threat actors within 18 months through open-weight models or competitor releases. The defensive window that Glasswing was designed to create is narrow and closing.

Key Takeaways

Claude Mythos and Project Glasswing: 10 Facts Every Security Team Needs

  • Mythos Preview is Anthropic's most powerful model and is deliberately not publicly available.

  • CyberGym AI vulnerability reproduction benchmark: 83.1% vs. Opus 4.6's 66.6%

  • Firefox 147 exploit writing: 181 working exploits vs. 2, a 90x jump in one generation

  • 23,019 vulnerabilities found across 1,000+ projects, 90.6% independently confirmed real

  • CVE-2026-4747: 17-year-old FreeBSD RCE, fully autonomous, root-level, unauthenticated

  • CVE-2026-5194: wolfSSL TLS certificate forgery, affecting billions of devices

  • Program backed by $100 million in usage credits and $4 million in open-source grants

  • June 12 suspension: US export control directive, national security grounds, global shutdown within hours

  • June 27 redeployment: authorized for US critical infrastructure organizations only

  • Claude Fable 5 return: no confirmed timeline as of June 28, 2026.
