In recent weeks, we pointed Mythos and other security-focused LLMs at live code across critical parts of our infrastructure. We share what we observed, the models’ strengths and weaknesses, and what the work around them needs to look like before any of it can scale.
# Research: Project Glasswing: what Mythos showed us
## Metadata
- **Authors:** Grant Bourzikas (supported by Cloudflare’s Security AI Research team)
- **Institution:** Cloudflare
- **Publication:** The Cloudflare Blog
- **Date:** May 18, 2026
## Abstract
As part of Project Glasswing, Cloudflare evaluated Anthropic’s "Mythos Preview", a security-focused Large Language Model (LLM). The model was run against more than fifty of Cloudflare’s internal repositories to identify vulnerabilities and generate proofs-of-concept (PoCs), with general-purpose frontier models used as a comparison baseline. The research highlights a qualitative shift in AI capabilities: the transition from simple bug identification to complex exploit chain construction and iterative autonomous debugging.
## Research Objective
The study addresses whether a specialized security LLM can effectively scale vulnerability research by:
1. Identifying deep-seated vulnerabilities in production-grade code.
2. Automating the construction of multi-step exploit chains.
3. Proving exploitability through autonomous code execution and refinement.
## Methodology
### Approach
Cloudflare researchers integrated Mythos Preview into a specialized "harness" that allowed the model to interact with code environments. The research moved beyond simple static analysis to an agentic approach where the model could compile and run code to verify its hypotheses.
### Dataset/Environment
- **Scope:** More than 50 internal Cloudflare code repositories.
- **Context:** Live, critical infrastructure codebases.
- **Control:** Results were triaged and remediated via Cloudflare’s formal vulnerability management process.
### Tools & Technologies
- **Primary Model:** Anthropic Mythos Preview (via Project Glasswing).
- **Baseline Models:** General-purpose models (e.g., Opus 4.7, GPT-5.5) were used for comparative analysis.
- **Infrastructure:** A "scratch environment" where the model could compile code and execute test cases.
## Key Findings
### Primary Results
1. **Advanced Exploit Chaining:** Mythos Preview goes beyond identifying isolated bugs: it can link multiple low-severity primitives (e.g., a use-after-free combined with a control-flow hijack) into a single severe exploit.
2. **Autonomous Proof Generation:** The model successfully writes PoC code, executes it, analyzes failures, and adjusts its approach iteratively ("The Loop").
3. **Inconsistent Guardrails:** Despite lacking standard safety filters, the model exhibited "organic refusals," sometimes declining legitimate research tasks because of subtle changes in environment or framing rather than anything in the code itself.
### Supporting Evidence
- In comparative testing, general-purpose models found many of the same underlying bugs but failed to "stitch" them into working exploits, often stopping at the description phase.
- Mythos revised its own PoC code after initial compilation failures, which demonstrates automated debugging in practice.
### Novel Contributions
- **Agentic Vulnerability Research:** Shifting the focus from LLMs as "scanners" to LLMs as "autonomous researchers" that close the gap between speculation (finding a bug) and proof (running an exploit).
## Technical Details
The research emphasizes the **"Exploit Chain Construction"** capability. Mythos mimics senior security researchers by reasoning about how different memory corruption primitives can be sequenced. It does this through a four-step feedback loop:
1. **Hypothesize:** Identify a potential vulnerability.
2. **Implement:** Write a script to trigger the flaw.
3. **Validate:** Compile and execute in a sandbox.
4. **Refine:** Read error logs and stack traces to correct the exploit logic until the bug is confirmed.
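
The four steps above can be sketched as a small control loop. This is a minimal illustration only, not Cloudflare's harness: `run_candidate`, `refine_loop`, and `stub_propose` are hypothetical names, the "model" is a hard-coded stub, and the `CONFIRMED` marker is an assumed convention for signalling success. A temporary directory is not a security sandbox; a real harness would need proper isolation such as a container or VM.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, List, Optional, Tuple


def run_candidate(source: str, timeout: float = 10.0) -> subprocess.CompletedProcess:
    """Write a candidate script into a throwaway directory and execute it."""
    with tempfile.TemporaryDirectory() as scratch:
        path = Path(scratch) / "candidate.py"
        path.write_text(source)
        return subprocess.run(
            [sys.executable, str(path)],
            capture_output=True,
            text=True,
            timeout=timeout,
            cwd=scratch,
        )


def refine_loop(
    propose: Callable[[Optional[str]], str], max_rounds: int = 5
) -> Tuple[bool, List[Tuple[int, bool]]]:
    """Hypothesize -> implement -> validate -> refine, until confirmed or out of rounds."""
    feedback: Optional[str] = None
    history: List[Tuple[int, bool]] = []
    for round_no in range(1, max_rounds + 1):
        source = propose(feedback)      # steps 1-2: hypothesis turned into a script
        result = run_candidate(source)  # step 3: execute in the scratch directory
        confirmed = result.returncode == 0 and "CONFIRMED" in result.stdout
        history.append((round_no, confirmed))
        if confirmed:
            return True, history
        # Step 4: the error output becomes the input to the next attempt.
        feedback = result.stderr or result.stdout
    return False, history


def stub_propose(feedback: Optional[str]) -> str:
    """Stand-in for a model call (hypothetical): the first attempt fails, the second succeeds."""
    if feedback is None:
        return "print(undefined_name)\n"  # raises NameError when run
    return "print('CONFIRMED')\n"


confirmed, history = refine_loop(stub_propose)
print(confirmed, history)  # True [(1, False), (2, True)]
```

The point of the sketch is the data flow: the only thing carried between rounds is the failure output, which is what lets the loop correct itself rather than stop at a description of the bug.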
## Practical Implications
### For Security Practitioners
- **Shift in Severity Assessment:** Low-severity bugs can no longer be ignored in isolation; security teams must evaluate how AI can chain these "minor" issues into critical exploits.
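
One way to reason about chained severity is to treat each finding as a step that requires some capabilities and grants others, then search for a path to a high-impact capability. The sketch below is an assumption-laden illustration: the `Finding` type, the capability names, and the three example findings (including the information leak) are hypothetical and are not taken from Cloudflare's results.

```python
from collections import deque
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Finding:
    name: str
    severity: str               # severity when triaged in isolation
    requires: FrozenSet[str]    # capabilities needed to use the finding
    grants: FrozenSet[str]      # capabilities gained by using it


def find_chain(
    findings: List[Finding], start: Iterable[str], goal: str
) -> Optional[List[str]]:
    """Breadth-first search over capability sets; returns a shortest chain or None."""
    start_state = frozenset(start)
    queue = deque([(start_state, [])])
    seen = {start_state}
    while queue:
        caps, path = queue.popleft()
        if goal in caps:
            return path
        for f in findings:
            # A finding is usable if its prerequisites are met and it adds something new.
            if f.requires <= caps and not f.grants <= caps:
                nxt = caps | f.grants
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [f.name]))
    return None


# Hypothetical findings, each unremarkable on its own.
findings = [
    Finding("info-leak", "low",
            frozenset({"network-access"}), frozenset({"address-disclosure"})),
    Finding("use-after-free", "low",
            frozenset({"network-access"}), frozenset({"memory-corruption"})),
    Finding("control-flow-hijack", "medium",
            frozenset({"memory-corruption", "address-disclosure"}),
            frozenset({"code-execution"})),
]

chain = find_chain(findings, {"network-access"}, "code-execution")
print(chain)  # ['info-leak', 'use-after-free', 'control-flow-hijack']
```

Under this model, a finding's effective severity is the impact of the worst chain it participates in, not its individual rating.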
### For Defenders
- **Focus on Exploit Prevention:** Since AI accelerates bug discovery, defenders should focus on mitigations that make exploitation harder or shorten the window of exposure even if a bug is found (e.g., sandboxing, rapid patch deployment).
- **WAF/Front-end Protection:** Strong perimeter defenses are vital for keeping bugs discovered by automated attackers unreachable.
### For Researchers
- **Guardrail Reliability:** Model "refusals" are currently probabilistic and unreliable for safety; researchers must find ways to make safety boundaries consistent across semantically equivalent prompts.
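
Consistency across semantically equivalent prompts can be measured by sending each wording several times and comparing refusal rates. The following is a rough sketch under stated assumptions: `ask` is a hypothetical callable that returns `True` when the model refuses, and `stub_ask` is a deterministic stand-in whose behavior is invented for the example, not an observation about Mythos.

```python
from typing import Callable, Dict, List


def refusal_rates(
    ask: Callable[[str, int], bool], variants: List[str], runs: int = 20
) -> Dict[str, float]:
    """Fraction of runs in which each prompt variant was refused."""
    rates = {}
    for prompt in variants:
        refused = sum(1 for i in range(runs) if ask(prompt, i))
        rates[prompt] = refused / runs
    return rates


def consistency_gap(rates: Dict[str, float]) -> float:
    """Spread between the most- and least-refused variant; 0.0 means fully consistent."""
    return max(rates.values()) - min(rates.values())


def stub_ask(prompt: str, run_index: int) -> bool:
    """Hypothetical model: refusal depends on wording, not on the underlying task."""
    if "exploit" in prompt:
        return run_index % 2 == 0  # refuses on every other run
    return False


variants = [
    "Write a proof-of-concept for this finding",
    "Write an exploit for this finding",
]
rates = refusal_rates(stub_ask, variants)
print(rates[variants[0]], rates[variants[1]], consistency_gap(rates))  # 0.0 0.5 0.5
```

A large gap between variants that ask for the same thing is the signal that the boundary follows phrasing rather than intent.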
## Limitations
- **Probabilistic Nature:** Results are inconsistent; the same request may work or fail across different runs.
- **Organic Refusals:** The model sometimes hampers legitimate research by incorrectly identifying it as malicious activity.
- **Compute/Scale:** The process requires a resource-intensive "harness" and scratch environment, which is not yet fully scalable for real-time analysis.
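
The first and third limitations interact: if a single run only sometimes succeeds, the harness has to be run repeatedly, which multiplies compute cost. A back-of-the-envelope sketch follows, assuming runs are independent with a fixed per-run success probability `p`. That is a simplifying assumption, and the value 0.3 is purely illustrative, not a measured figure.

```python
import math


def p_at_least_one(p: float, n: int) -> float:
    """Probability that at least one of n independent runs succeeds."""
    return 1.0 - (1.0 - p) ** n


def runs_needed(p: float, target: float) -> int:
    """Smallest n such that p_at_least_one(p, n) >= target."""
    if not (0.0 < p < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p and target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


# Illustrative only: a 30% per-run success rate needs 9 runs for 95% confidence.
print(runs_needed(0.3, 0.95))  # 9
```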
## Comparison to Prior Work
Previous models acted primarily as sophisticated static analysis tools (identifying patterns). Mythos builds on this by adding a **reasoning layer** and the ability to **interact with a compiler/runtime**, which shifts the work from pattern matching to actively demonstrating exploitability.
## Real-world Applications
- **Automated Red Teaming:** Using specialized LLMs to constantly probe internal infrastructure for new exploit paths.
- **Pre-deployment Auditing:** Integrating Mythos-like models into CI/CD pipelines to ensure code is not just "bug-free" but "exploit-resistant."
## Future Work
- **Improving Consistency:** Addressing the "organic refusal" problem to ensure researchers can use the tools reliably.
- **Scaling the Architecture:** Developing ways to run these agentic loops across millions of lines of code efficiently.
- **Addressing the "Attacker Advantage":** Investigating how these tools might be used by adversaries and developing counter-AI defensive measures.
## References
- Anthropic: Project Glasswing [anthropic[.]com/glasswing]
- Cloudflare: Security-AI research [security-ai-research[@]cloudflare[.]com]