AI Strategy | 5 min read | June 4, 2026

An Unreleased Anthropic Model Just Found 10,000 Severe Vulnerabilities, Including Zero-Days in Every Major OS. Here's the Operator Read.

Anthropic, Project Glasswing, Claude Mythos, AI Security, Agentic AI, AI Business Automation, Framework, AgentSkillVault, Operator Strategy, Zero-Day

Imagine handing a new hire a vague task, 'find security problems', and watching them return with more than 10,000 high- or critical-severity vulnerabilities, including zero-days in every major operating system and browser the modern world runs on. Vulnerabilities that human security teams, armed with decades of expertise and millions in tooling, missed entirely. That is not a hypothetical. That is what Anthropic's Claude Mythos Preview did for Project Glasswing partners before the model was ever released to the public. This week, Anthropic expanded that coalition to 150 organizations across more than 15 countries. The cybersecurity community called it a watershed moment. The operator community should be asking a completely different question: if AI agents can do that autonomously, at scale, producing expert-level output, what else can they do when you give them a structured task and a defined outcome?

What Anthropic Just Shipped with Project Glasswing

Project Glasswing is a coalition Anthropic formed after observing capabilities in Claude Mythos Preview — an unreleased frontier model not yet available through the API or Claude.ai — that they believed could reshape cybersecurity at a structural level. The core finding: Mythos Preview can surpass all but the most elite human security researchers at finding and exploiting software vulnerabilities. Not slightly better. Not marginally faster. Categorically superior in scope and throughput. The initial Glasswing launch focused on a small group of critical infrastructure partners. This week's expansion adds approximately 150 new organizations across power, water, healthcare, communications, and hardware sectors in more than 15 countries — including Okta, Samsung, SK Hynix, NATO, and the EU's cybersecurity agency ENISA. The aggregate result: Glasswing partners have now found more than 10,000 high- or critical-severity vulnerabilities. Zero-days in every major operating system. Flaws in every major browser. Decades-old vulnerabilities sitting in codebases the entire industry assumed had been audited. One model. Structured deployment. Ten thousand findings that human teams missed across a combined century of manual review.

The Part Nobody's Talking About

Every headline about Project Glasswing focuses on Claude Mythos Preview: how capable the model is, how it outperforms human researchers, how Anthropic hasn't released it yet. That framing misses the lesson. The model alone is not what produced more than 10,000 findings. The framework around it is. Anthropic and its partners did not sit down with Mythos and type 'find security bugs' into a chat window. They built a structured deployment: they defined the task scope, specified the output format, established escalation criteria for critical versus high-severity findings, created validation loops, and set up the infrastructure to act on results at scale. The model ran inside that framework and produced expert-level output. Without the framework, you have a powerful model producing unstructured, unactionable responses, the same result you get from any AI when you ask open-ended questions. The operators selected for Glasswing were not selected because they had access to Mythos first. They were selected because they had the organizational structure to deploy an AI agent with a specific, documented task and operationalize the output at scale. The framework was the selection criterion. The framework was the moat. Mythos was the engine that ran inside it.
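The deployment pattern described above (a fixed output format, escalation criteria by severity, and a validation step) can be sketched in a few lines of Python. This is an illustration only: the `Finding` fields, the checks in `validate`, and the queue names returned by `route` are hypothetical and are not taken from Anthropic's Glasswing tooling.

```python
from dataclasses import dataclass

# Hypothetical output format for one agent finding; field names are assumptions.
@dataclass(frozen=True)
class Finding:
    component: str
    severity: str       # "critical", "high", "medium", or "low"
    description: str
    reproduction: str   # steps or proof of concept supplied by the agent

ALLOWED_SEVERITIES = {"critical", "high", "medium", "low"}

def validate(finding: Finding) -> list[str]:
    """Validation step: return the reasons a finding is not actionable."""
    problems = []
    if finding.severity not in ALLOWED_SEVERITIES:
        problems.append(f"unknown severity: {finding.severity!r}")
    if not finding.component.strip():
        problems.append("missing component")
    if not finding.reproduction.strip():
        problems.append("missing reproduction steps")
    return problems

def route(finding: Finding) -> str:
    """Escalation criteria: pick the (hypothetical) queue for a finding."""
    if validate(finding):
        return "return-to-agent"   # fails validation, so the agent must redo it
    if finding.severity == "critical":
        return "page-on-call"
    if finding.severity == "high":
        return "security-backlog"
    return "weekly-review"

if __name__ == "__main__":
    example = Finding("auth-service", "critical", "Token check bypass", "1. Send empty token")
    print(route(example))
```

The point of the sketch is that the agent's output is never consumed as free text: a finding either fits the format and gets routed by severity, or it goes back for another pass.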

What This Means for Your AI Agent Workflow

You do not have access to Claude Mythos Preview. It does not matter. The operators who built the frameworks that made Glasswing possible were not waiting for Mythos. They built structured AI agent workflows long before Mythos existed, using whatever models were available, and those frameworks are what made them ready to deploy a frontier model the moment access opened. What separates operators who will be ready for the next unreleased model from those who will not is not a capability gap. It is a framework gap. Right now, today, you can take any high-value repeatable task in your business and convert it from an open-ended AI prompt into a structured agent framework: named role, defined input, specified output format, quality criteria, escalation rules. That framework runs on Claude 3.5, on GPT-5.5, on Gemini 3.5 Flash, on whatever model you have access to today. And when Anthropic releases Mythos Preview to the public, or when the next frontier model drops, your documented framework runs on that model too. The operators who do not build the framework now will spend the months after each major model release figuring out how to use the new model. The operators who build the framework now will spend those months capturing the improvement immediately, because a better model is running inside a system that was already structured to produce expert-level results. Security was just the first domain where AI agents proved this at scale. It will not be the last.
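One way to make the five elements concrete (named role, defined input, specified output format, quality criteria, escalation rules) is to keep them in a single data structure and pass the model in as a plain function. The sketch below is an illustration under stated assumptions, not a reference implementation: `AgentFramework`, `render_prompt`, `run`, and the stub `echo_model` are hypothetical names, and a real deployment would replace the stub with a call to whichever provider SDK is in use.

```python
from dataclasses import dataclass, field
from typing import Callable

# Assumption: a model is any function from prompt text to response text.
# Treating it this way is what keeps the framework provider-agnostic.
ModelFn = Callable[[str], str]

@dataclass
class AgentFramework:
    role: str
    input_description: str
    output_format: str
    quality_criteria: list[str] = field(default_factory=list)
    escalation_rules: list[str] = field(default_factory=list)

    def render_prompt(self, task_input: str) -> str:
        """Combine the documented framework and one input into a prompt."""
        lines = [
            f"Role: {self.role}",
            f"Input: {self.input_description}",
            f"Output format: {self.output_format}",
            "Quality criteria:",
            *[f"- {c}" for c in self.quality_criteria],
            "Escalate when:",
            *[f"- {r}" for r in self.escalation_rules],
            "",
            "Task input:",
            task_input,
        ]
        return "\n".join(lines)

def run(framework: AgentFramework, model: ModelFn, task_input: str) -> str:
    """Run the same framework on any model function."""
    return model(framework.render_prompt(task_input))

def echo_model(prompt: str) -> str:
    """Stand-in for a real model call; returns the first line of the prompt."""
    return prompt.splitlines()[0]

# Hypothetical example framework for a lead qualification task.
lead_qualifier = AgentFramework(
    role="Lead qualification analyst",
    input_description="One inbound lead form as plain text",
    output_format="JSON with keys: score, reasons, next_step",
    quality_criteria=["Every reason cites a field from the form"],
    escalation_rules=["Budget field is missing"],
)

if __name__ == "__main__":
    print(run(lead_qualifier, echo_model, "Name: Ada"))
```

Swapping models means passing a different function to `run`; the `lead_qualifier` object itself does not change, which is the portability the paragraph above argues for.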

Bottom Line

Anthropic expanded Project Glasswing to 150 organizations this week. Partners running the unreleased Claude Mythos Preview have found more than 10,000 high- or critical-severity vulnerabilities, including zero-days in every major OS and browser, through structured AI agent deployment, not open-ended prompting. You do not have access to Mythos. You do not need it yet. What you need is the framework that will make any frontier model, including Mythos when it releases, produce expert-level output on your highest-value business tasks. Build that framework now, while the window is open.

4 Moves to Make Right Now

1. Identify one high-value task in your business that could run as a structured AI agent workflow. Think about the work in your operation that requires expert judgment, produces a consistent output type, and currently takes significant human time. Content audits. Competitive research. Client intake analysis. Lead qualification. Proposal drafting. These are not tasks for open-ended AI chat; they are tasks for a documented agent framework with defined inputs and structured outputs. Pick one. That is your first Glasswing-style deployment. Not because the outcome will be 10,000 zero-days. Because the framework discipline transfers to every task you systematize after it.
2. Stop open-ended prompting and start writing task specifications. There is a structural difference between 'help me write a proposal' and a documented framework that specifies the client input format, the proposal sections required, the word count and tone parameters, the differentiation criteria, and the output review checklist. The first produces an AI-assisted draft. The second produces a repeatable, improvable system that runs on any capable model and gets better with every iteration. Write your next AI task as a specification instead of a request. That one change is the difference between using AI as a chat tool and deploying it as an agent.
3. Read Anthropic's Project Glasswing documentation and study the deployment pattern. Anthropic published the framework design behind Glasswing at anthropic.com/glasswing: how they scoped the task, structured the output, and operationalized the findings. Even if you are never a Glasswing partner, the architecture of how they turned an AI model into a structured security auditor is a template for how to turn an AI model into a structured operator in any domain. The specific domain is security. The framework pattern is universal.
4. Build your model-agnostic skill stack at https://agentskillvault.ai/catalog before the next frontier model drops. Every skill in the AgentSkillVault catalog is engineered as a structured agent task: named role, defined input, specified output, portable across providers. The operators who have a documented skill stack when Mythos Preview goes public, or when the next model after that drops, will deploy immediately at expert-level output. The operators still running open-ended prompts will spend months learning to use the new model. The framework is the only variable you control. Build it now.
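The difference between a request and a specification, described in the second move, can be shown with a small example. The sketch below assumes a hypothetical `ProposalSpec` whose section names and word limit are invented for illustration. It checks a draft against the output review checklist instead of accepting whatever the model returns.

```python
from dataclasses import dataclass

# Hypothetical specification for the proposal task; all values are assumptions.
@dataclass(frozen=True)
class ProposalSpec:
    required_sections: tuple[str, ...]
    max_words: int
    tone: str   # passed to the model as an instruction; not machine-checked here

    def review(self, draft: str) -> list[str]:
        """Output review checklist: return every check the draft fails."""
        failures = []
        lowered = draft.lower()
        for section in self.required_sections:
            if section.lower() not in lowered:
                failures.append(f"missing section: {section}")
        word_count = len(draft.split())
        if word_count > self.max_words:
            failures.append(f"too long: {word_count} words (limit {self.max_words})")
        return failures

spec = ProposalSpec(
    required_sections=("Summary", "Scope", "Pricing"),
    max_words=500,
    tone="plain and direct",
)

draft = "Summary: we will audit your content. Scope: 40 pages."
print(spec.review(draft))   # prints ['missing section: Pricing']
```

An empty list from `review` means the draft passed every documented check. Anything else is a concrete, repeatable reason to send the draft back, which is what makes the system improvable from one iteration to the next.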

Anthropic did not just expand a cybersecurity coalition this week. They published, in the form of more than 10,000 discovered vulnerabilities, the most concrete proof yet that AI agents produce expert-level output when given a structured task framework and produce noise when they are not. The model that produced those findings is not available to you. The insight is. The operators who internalize that insight now, who stop prompting and start building frameworks, will have the structural foundation to deploy every frontier model that releases over the next 18 months at full capability from day one. The operators who do not will keep playing catch-up, model by model, quarter by quarter. Build the framework that makes any model dangerous. Start at https://agentskillvault.ai/catalog.
