Anthropic’s Claude Mythos is emerging as a flashpoint for both safety governance and public psychology. A new 244-page system card says the model is too capable for broad release, citing risks such as uncovering previously unknown cybersecurity vulnerabilities, and access is limited to select partners such as Microsoft and Apple. The document also revisits Anthropic’s controversial view that frontier models might merit moral consideration, without making definitive claims. In parallel, a widely shared personal account describes intense existential anxiety triggered by Mythos and rapid LLM progress, underscoring how frontier AI can drive mental-health strain alongside the technical risk debates.
Security researchers at Calif used Anthropic’s top AI model Claude Mythos to assist in developing a local privilege-escalation exploit against macOS 26.4.1 on Apple M5 hardware. Starting from an unprivileged local account, the team chained two vulnerabilities with standard syscalls and exploitation techniques to obtain a root shell and bypass Apple’s Memory Integrity Enforcement (MIE). Mythos helped identify vulnerability classes and accelerated parts of the research, while humans crafted the final exploit chain; the team reported the issue to Apple in person and withheld technical details pending Apple’s review. The work demonstrates that AI-assisted vulnerability discovery can speed up real-world exploit development even against modern platform defenses.
Anthropic’s new chatbot, Claude Mythos, reportedly exposes novel cybersecurity risks by enabling or simplifying complex hacking tasks, according to community discussion and early tests. Security researchers and practitioners warn that powerful, multimodal, instruction-following LLMs can be misused to generate exploit code, craft phishing content, and automate vulnerability discovery, expanding the attack surface and lowering the expertise needed to mount cyberattacks. Anthropic and other AI vendors face pressure to tighten guardrails, improve red-team evaluations, and implement safer deployment controls such as rate limits, monitoring, and capability-based access. The debate matters because widely deployed advanced LLMs integrated into developer tools, SaaS, and endpoint workflows could accelerate adversaries unless vendors, regulators, and defenders coordinate on mitigations.
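For readers unfamiliar with those terms, here is a minimal, generic C sketch of two of the named controls: a token-bucket rate limiter and a capability bitmask check. Nothing here reflects Anthropic’s actual deployment stack; all names (struct bucket, rate_limit_allow, CAP_SECURITY_TOOLING, and so on) are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* --- Rate limiting: a classic token bucket, one per API key. --- */
struct bucket {
    double tokens;        /* tokens currently available        */
    double rate;          /* refill rate, tokens per second    */
    double burst;         /* bucket capacity (max burst size)  */
    struct timespec last; /* time of the previous refill       */
};

static double elapsed_sec(const struct timespec *a, const struct timespec *b)
{
    return (double)(b->tv_sec - a->tv_sec) +
           (double)(b->tv_nsec - a->tv_nsec) / 1e9;
}

/* Returns true if one request may proceed, false if it should be throttled. */
bool rate_limit_allow(struct bucket *b)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);

    b->tokens += b->rate * elapsed_sec(&b->last, &now);
    if (b->tokens > b->burst)
        b->tokens = b->burst;   /* never accumulate beyond the burst cap */
    b->last = now;

    if (b->tokens < 1.0)
        return false;           /* over the limit: reject or queue */
    b->tokens -= 1.0;
    return true;
}

/* --- Capability-based access: gate sensitive features behind
 *     explicit per-caller grants rather than a single on/off key. --- */
enum capability {
    CAP_CHAT             = 1u << 0,  /* ordinary conversation   */
    CAP_CODEGEN          = 1u << 1,  /* code generation         */
    CAP_SECURITY_TOOLING = 1u << 2,  /* exploit/vuln analysis   */
};

/* A caller may use a feature only if every required bit was granted. */
bool has_capability(uint32_t granted, uint32_t required)
{
    return (granted & required) == required;
}
```

The token bucket permits short bursts up to `burst` while enforcing a long-run average of `rate` requests per second; the capability mask ensures a caller can reach sensitive model features only when they have been explicitly granted.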
Anthropic’s Claude Mythos was credited with discovering and exploiting CVE-2026-4747, a remote code execution bug in FreeBSD’s RPC/GSS NFS code, but Rival Research found the vulnerability is a classic, 20-year-old stack overflow rooted in ONC RPC/NFS history. The exploit worked by overflowing a 128-byte stack buffer when oa_length exceeded the 96 bytes actually available for the credential; the FreeBSD patch adds a simple bounds check that rejects oversized credentials. The story matters because it reframes the headline: an impressive engineering demo of AI-driven exploitation, but one built on a preexisting, well-known class of bug and existing CVE records. That raises questions about whether AI is finding truly novel flaws or regurgitating known vulnerabilities from training data, and what that implies for cyber defense and vulnerability disclosure processes.
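To make the bug class concrete, here is a minimal C sketch of the pattern described: a fixed 128-byte stack buffer that already holds 32 bytes of header, leaving 96 bytes for a credential whose attacker-controlled length (oa_length) is copied without validation. The struct layout and function names are illustrative assumptions, not the actual FreeBSD code; only oa_length and the 128/96-byte figures come from the reports.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-in for the XDR opaque_auth credential carried in an
 * RPC call; oa_length arrives on the wire and is attacker-controlled. */
struct opaque_auth {
    uint32_t       oa_length;
    const uint8_t *oa_base;
};

#define RPC_HDR_BYTES 32   /* header fields already occupying the buffer */

/* Vulnerable pattern: 128-byte stack buffer, 32 bytes of header, leaving
 * 96 bytes for the credential -- but oa_length is copied unchecked, so a
 * credential longer than 96 bytes smashes the stack. */
void validate_vulnerable(const struct opaque_auth *oa)
{
    uint8_t rpchdr[128];

    memset(rpchdr, 0, RPC_HDR_BYTES);              /* stand-in header fields */
    memcpy(rpchdr + RPC_HDR_BYTES, oa->oa_base,    /* no length check!       */
           oa->oa_length);
    /* ... compute checksum over rpchdr ... */
}

/* Patched pattern: reject credentials larger than the space left after the
 * header, mirroring the "simple bounds check" in the FreeBSD fix. */
bool validate_fixed(const struct opaque_auth *oa)
{
    uint8_t rpchdr[128];

    if (oa->oa_length > sizeof(rpchdr) - RPC_HDR_BYTES)
        return false;                              /* oversized credential */

    memset(rpchdr, 0, RPC_HDR_BYTES);
    memcpy(rpchdr + RPC_HDR_BYTES, oa->oa_base, oa->oa_length);
    /* ... compute checksum over rpchdr ... */
    return true;
}
```

The 96-byte limit is simply sizeof(rpchdr) minus the 32 header bytes; any wire credential longer than that overwrites adjacent stack memory, which is exactly what the one-line bounds check in the patch forecloses.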
Anthropic’s Claude Mythos was credited with the “first remote kernel exploit discovered and exploited by an AI” after it produced a working exploit for CVE-2026-4747 in FreeBSD’s RPCSEC_GSS network filesystem code. Rival Research inspected the report and found the vulnerability is a classic, long-standing stack overflow in svc_rpc_gss_validate that lacked bounds checks; a straightforward patch was released in FreeBSD 14.4-RELEASE-p1. The piece argues that the impressive agentic exploit engineering masks the fact that the bug is two decades old and likely present in training data or public writeups, raising concerns about dataset provenance and overhyped claims of AI novelty, with implications for cyber defense and vulnerability discovery workflows. It matters because AI can surface known but overlooked flaws at scale, changing risk and remediation priorities.