Anthropic’s Mythos Preview, tested in Project Glasswing and evaluated by Cloudflare across 50+ internal repositories, shows a major advance in security-focused LLMs: the model can stitch low-level findings into multi-step exploit chains and produce working proofs by iterating compile-and-run cycles. Cloudflare’s real-world trials produced useful, prioritized reports but exposed high false-positive rates, hallucination risks, and the need for human review, strict guardrails, logging, and least-privilege access. Testers also observed emergent refusals that complicate legitimate research. Together these reports indicate significant defensive benefits alongside serious operational, safety, and governance trade-offs for integrating powerful automated auditors at scale.
Powerful code-audit LLMs like Mythos can scale vulnerability discovery and produce actionable exploit proofs, changing how security teams prioritize fixes. Tech professionals must weigh efficiency gains against elevated false positives, safety risks, and governance demands when deploying automated auditors.
Dossier last updated: 2026-05-22 19:58:14
Anthropic reports that Project Glasswing partners using its new Mythos Preview model have discovered more than ten thousand high- or critical-severity vulnerabilities in essential open-source and infrastructure software within a month, dramatically accelerating bug-finding rates. Major partners such as Cloudflare found thousands of bugs (including hundreds rated high or critical), and external evaluations (by the UK’s AI Security Institute, Mozilla, and XBOW, and on the academic benchmarks ExploitBench and ExploitGym) all found that Mythos Preview outperformed prior models and conventional tooling. Anthropic says verification, coordinated disclosure, and patching are now the bottlenecks, and it will withhold full technical details until patches are widely deployed. The update signals a step change in AI-assisted offensive and defensive cybersecurity capabilities and raises operational and disclosure challenges for the industry.
Anthropic’s Project Glasswing reports that, after one month using its new Mythos Preview model, roughly 50 partners have found more than ten thousand high- or critical-severity vulnerabilities in widely used open-source and critical-infrastructure software. Partners including Cloudflare reported dramatic increases in bug-finding rates (Cloudflare: ~2,000 bugs, 400 of them high or critical), and external evaluations (by the UK’s AI Security Institute, Mozilla, and XBOW, and on the academic benchmarks ExploitBench and ExploitGym) rated Mythos Preview as significantly stronger than prior models at end-to-end exploit development and precision. Anthropic says the speed of disclosure and patching, not discovery, is now the bottleneck, and promises more detailed findings after coordinated disclosures are complete and patches are broadly deployed.
Cloudflare ran Anthropic's Mythos Preview (part of Project Glasswing access) against over 50 internal repositories and published a detailed post on findings, workflow, and risks. Using the security-focused model, their team identified numerous potential vulnerabilities and produced prioritized reports, but stressed high false-positive rates and the need for human review. Cloudflare highlighted integration paths into developer workflows, caution around model hallucinations and dangerous exploit generation, and the importance of safeguards, logging, and least-privilege access. The report matters because it offers one of the first real-world evaluations from a major internet infrastructure provider, showing practical benefits and clear operational, safety, and governance trade-offs for adopting powerful automated code-audit models.
Anthropic’s Mythos Preview markedly advances security-focused LLM capabilities, according to TechScan AI’s Project Glasswing tests. The model excels at exploit chain construction—linking multiple low-level bugs into full exploits—and at automatic proof generation, iterating by compiling and running exploit attempts to validate hypotheses. Compared to prior frontier models, Mythos closes the gap between finding issues and demonstrating exploitability. However, it sometimes refuses legitimate vulnerability-research requests due to emergent guardrails, and its integration at scale requires new architecture and processes. The piece highlights both defensive uses (automated discovery and proof) and offensive risks (automation of complex exploits), underscoring implications for security tooling, responsible disclosure, and model safety design.
Anthropic’s Mythos Preview impressed testers in Project Glasswing by advancing automated vulnerability discovery: it can construct multi-step exploit chains from small primitives and generate working proof-of-concept code by iterating compile-and-run cycles. Testers pointed the model at more than 50 repositories and found that it went beyond earlier frontier models by stitching separate findings into full exploits and autonomously validating them. The model still exhibits emergent refusals, pushing back on some legitimate security-research queries, even though the Project Glasswing instance lacked the broader commercial safeguards. The piece argues this capability changes how security teams should architect model integrations, workflows, and guardrails if these tools are to be used safely and at scale.