Leading AI labs and defenders are accelerating the use of powerful LLMs as security tools. OpenAI has begun offering a less-restricted GPT-5.5-Cyber to vetted cyber defenders for bug hunting, malware analysis and attack simulation, mirroring Anthropic’s limited Mythos program. Mozilla reports Mythos helped uncover 271 Firefox vulnerabilities with almost no false positives by pairing the model with a custom agent harness, build integration and a second LLM for verification. Separately, one platform’s rollout of full-context access and GPT-5.5 Instant suggests that longer context and persistent memory are shifting LLMs from episodic assistants to continuous agents, reshaping how organizations scale automated security discovery and incident response.
AI models are being repurposed as proactive security tools that can scale vulnerability discovery and incident response. Tech professionals must adapt toolchains, workflows, and access controls as models gain persistent context and higher privileges in security operations.
Dossier last updated: 2026-05-16 04:39:30
OpenAI is rolling out a less-restricted GPT-5.5 variant, dubbed GPT-5.5-Cyber or “Spud,” to vetted cyber defenders through its Trusted Access for Cyber program to help hunt bugs, analyze malware and simulate attacks. The model, reportedly close in capability to Anthropic’s Mythos, will allow approved defenders to generate proofs-of-concept, map attack surfaces and review patches while still blocking clearly malicious tasks like credential theft and malware authoring. The move comes amid tests showing GPT-5.5 can complete complex simulated attacks and a wider industry debate over safe rollouts; Anthropic has limited Mythos to about 40 organizations while OpenAI offers tiered access. The White House is considering policy responses.
Mozilla says Anthropic’s Mythos helped find 271 Firefox vulnerabilities in two months with “almost no false positives,” thanks largely to model improvements and a custom agent harness. Engineers built a harness that instructs the LLM, gives it tooling access (read/write files, run test builds), and loops until a deterministic success signal is reached — e.g., making Firefox’s sanitizer build crash to confirm memory-safety bugs. A second LLM grades results for added verification. Mozilla published detailed Bugzilla reports for a subset of the findings and provided test cases that meet its standards. The approach highlights how tightly integrated tooling and verification can make AI-assisted security discovery practical.
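The loop Mozilla describes, in which the model proposes test cases and iteration stops only on a deterministic success signal, can be sketched roughly as below. This is a minimal illustration, not Mozilla's actual harness: the model call, the sanitizer runner, and all names are hypothetical stand-ins.

```python
# Sketch of an agent harness loop: the LLM proposes a test case, the
# harness runs it against a sanitizer-instrumented build, and the loop
# ends only on a deterministic signal (a confirmed crash).
# All functions here are hypothetical stand-ins, not a real API.

from dataclasses import dataclass


@dataclass
class HarnessResult:
    crashed: bool
    log: str


def run_sanitizer_build(test_case: str) -> HarnessResult:
    # Stand-in for compiling/running the test against an ASan build;
    # here we simply flag an obviously bad input.
    if "overflow" in test_case:
        return HarnessResult(True, "AddressSanitizer: heap-buffer-overflow")
    return HarnessResult(False, "clean run")


def llm_propose(prompt: str, history: list) -> str:
    # Stand-in for a model call with file read/write tooling; we
    # simulate the model refining its guess after seeing feedback.
    return "payload-overflow" if history else "payload-benign"


def harness_loop(target: str, max_iters: int = 5):
    history = []
    for _ in range(max_iters):
        test_case = llm_propose(f"Find a memory-safety bug in {target}", history)
        result = run_sanitizer_build(test_case)
        if result.crashed:
            # Deterministic success: the sanitizer crash confirms the bug.
            return result
        history.append(f"{test_case} -> {result.log}")
    return None


result = harness_loop("nsHtml5Parser")
print(result.crashed if result else "no crash found")  # prints True
```

The key design point is that success is judged by the build system, not by the model's own claim, which is what keeps the false-positive rate low.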
The post reports that a platform (linked in the original: https://t.co/KDffBvZtGU) opened full-context access and integrated GPT-5.5 Instant within 48 hours, a change the author calls more than a model upgrade. They argue that extended context gives AI continuity, effectively a form of long-term memory, enabling persistent, system-level reasoning across long codebases, financial simulations, on-chain analytics and multi-step tasks that stateless models struggle with. The piece contends that while the industry frames parameter scaling as intelligence, context length and memory determine practical capability and workflow fit, suggesting this shift could transform AI from an episodic tool into an agent capable of sustained, coherent problem solving.
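The stateless-versus-persistent distinction the post draws can be illustrated with a toy agent that carries accumulated context across calls. The class and its memory format are assumptions for illustration only, not any vendor's API.

```python
# Toy contrast: a stateless model call forgets everything between
# requests, while a persistent agent carries prior steps forward as
# context. The PersistentAgent class is a hypothetical illustration.

class PersistentAgent:
    def __init__(self):
        # Long-lived memory; a real agent would feed this back into the
        # model's context window on every call.
        self.memory = []

    def run(self, task: str) -> str:
        prior = len(self.memory)
        self.memory.append(task)
        return f"handled '{task}' with {prior} prior steps"


agent = PersistentAgent()
agent.run("map repo layout")
agent.run("trace auth flow")
out = agent.run("draft fix for session bug")
print(out)  # prints: handled 'draft fix for session bug' with 2 prior steps
```

A stateless call would see zero prior steps every time; persistence is what lets multi-step tasks like codebase-wide reasoning stay coherent.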
Mozilla says it used Anthropic’s Mythos and a custom agent “harness” to find 271 Firefox security vulnerabilities in two months, claiming almost no false positives. The harness integrates Mythos with Mozilla’s build, fuzzing systems and test pipeline, letting the model generate and iterate test cases until a sanitizer build confirms a crash; a second LLM grades results for additional verification. Mozilla published full Bugzilla reports and test cases for a subset of the findings, noting the improvements stem from both better models and project-specific tooling integration. The work underscores how LLMs combined with deterministic verification and engineering can scale vulnerability discovery while reducing human triage.
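The "second LLM grades results" step amounts to accepting a finding only when an independent check agrees the crash evidence matches the claimed bug class. A minimal sketch follows, with a keyword match playing the grader's role; the function name and signature map are hypothetical, not Mozilla's implementation.

```python
# Sketch of second-model verification: a finding is accepted only if
# the sanitizer log matches the claimed bug class. A real system would
# ask a second LLM to judge; a signature lookup stands in here.

def grade_finding(claimed_class: str, sanitizer_log: str) -> bool:
    # Hypothetical mapping from bug class to expected sanitizer output.
    signatures = {
        "heap-buffer-overflow": "AddressSanitizer: heap-buffer-overflow",
        "use-after-free": "AddressSanitizer: heap-use-after-free",
    }
    sig = signatures.get(claimed_class)
    return sig is not None and sig in sanitizer_log


log = "==1234==ERROR: AddressSanitizer: heap-use-after-free on address 0x602000000010"
print(grade_finding("use-after-free", log))        # prints True: verified
print(grade_finding("heap-buffer-overflow", log))  # prints False: rejected
```

Layering this check over the deterministic crash signal gives two independent filters before a human ever triages the report, which is what lets the pipeline scale while keeping false positives near zero.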