DeepMind’s AlphaProof Nexus pairs large language models with the Lean formal verifier to autonomously solve research-level math problems, including 9 open Erdős problems and 44 OEIS conjectures, at a compute cost of a few hundred dollars per problem. Using iterative agentic loops, in which the LLM drafts a proof, the verifier checks it, and the LLM refines any rejected attempt, the system extends competition-grade reasoning to dependable, machine-checked theorems at research level. Different agent configurations, from simple LLM+Lean setups to evolutionary proof drafting, showed that even basic configurations can succeed, highlighting stronger base-model capabilities and the value of verifier feedback. Beyond mathematics, Nexus signals practical advances for formal verification in smart contracts, zero-knowledge proofs, and cryptographic protocol assurance.
AlphaProof Nexus demonstrates that coupling LLMs with formal verifiers can turn high-level reasoning into machine-checked proofs, lowering barriers to reliable automated verification. This matters for tech professionals because the same techniques can improve assurance in smart contracts, zero-knowledge systems, and cryptographic protocols.
Dossier last updated: 2026-05-29 18:20:27
Researchers introduce ATLAS, a project to autoformalize large textbook corpora into machine-checkable proofs and formal libraries using AI. The paper “Formalizing Mathematics at Scale” proposes pipelines combining LLM-driven translation, interactive theorem provers, and verification tooling to convert informal mathematics into formal languages, aiming to scale formalization across domains and lower the manual burden on proof engineers. Key players include the paper’s authors and the broader proof-assistant and LLM ecosystem; the work ties into systems like Lean, Coq and modern large language models. This matters because scalable autoformalization could accelerate trustworthy mathematical knowledge, improve software verification, and create high-quality training data for reasoning-focused AI.
Google DeepMind’s new AI framework AlphaProof Nexus combined large language models with Lean formal verification to autonomously solve 9 open Erdős problems out of 353, including two that had been unresolved for 56 years. The system also proved 44 conjectures in OEIS, solved a 15-year-old Hilbert function problem, and improved bounds in convex optimization, with per-problem costs of a few hundred dollars. AlphaProof Nexus uses four agents of increasing complexity (Agent A using Gemini 3.1 Pro + Lean, Agent B integrating AlphaProof, Agent C adding evolutionary proof drafting, and Agent D combining all features). Researchers found even the simplest agent could solve the nine cases, underscoring stronger base-model capabilities and the anchoring effect of compiler feedback on LLM reasoning. This advances automated formal math and tool-assisted theorem proving.
Google DeepMind’s AlphaProof Nexus, which pairs large language models with the Lean formal proof assistant, has autonomously solved 9 of 353 open Erdős problems and proved 44 of 492 OEIS conjectures, with each problem costing only a few hundred dollars in compute. The system uses iterative “agentic loops”: the LLM drafts proofs and a formal verifier checks every step, rejecting failed attempts so that the LLM can refine them; successful proofs and selected natural-language explanations are published on arXiv and GitHub. The results mark a leap from previous LLM-driven Olympiad-level solvers to research-grade theorem proving, showing cross-domain capability in combinatorics and number theory. Beyond pure math, the approach has implications for AI-driven formal verification in areas like smart-contract auditing, zero-knowledge proofs, and cryptographic protocol validation.
Google DeepMind’s AlphaProof Nexus, which couples large language models with the Lean formal proof assistant, has autonomously solved 9 of 353 open Erdős problems and proved 44 of 492 OEIS conjectures, according to a May 21, 2026 arXiv preprint and accompanying GitHub release. The system uses iterative “agentic loops” where an AI proposes proofs and a separate verifier checks each logical step, rejecting invalid attempts; successful proofs cost only a few hundred dollars each. Nexus extends DeepMind’s prior AlphaProof work (noted for math-competition performance) into research-level theorem proving, showing cross-domain capability in combinatorics and number theory. The approach reduces AI hallucination risk and has clear implications for formal verification tasks relevant to smart-contract auditing, zero-knowledge proofs, and cryptographic protocol assurance.
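The “agentic loop” described in these reports can be illustrated with a minimal Python sketch. It is not DeepMind's implementation: `agentic_loop`, `CheckResult`, `toy_draft`, and `toy_verify` are hypothetical names invented for illustration, the drafter stands in for an LLM call, and the verifier stands in for a real proof checker such as Lean. The only idea taken from the reports is the control flow: draft, check, feed the rejection back, and retry.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CheckResult:
    """Outcome of one verifier run (hypothetical structure)."""
    ok: bool
    feedback: str = ""


def agentic_loop(
    statement: str,
    draft: Callable[[str, str], str],
    verify: Callable[[str, str], CheckResult],
    max_rounds: int = 5,
) -> Tuple[Optional[str], int]:
    """Draft a proof, check it, and retry with verifier feedback.

    Returns (accepted_proof, rounds_used), or (None, max_rounds) if no
    draft was accepted within the round budget.
    """
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        candidate = draft(statement, feedback)
        result = verify(statement, candidate)
        if result.ok:
            return candidate, round_no
        # A rejected draft is never returned; only its error message
        # is carried into the next drafting round.
        feedback = result.feedback
    return None, max_rounds


def toy_draft(statement: str, feedback: str) -> str:
    """Stand-in for an LLM: leaves a placeholder until it sees feedback."""
    return "sorry" if not feedback else "by rfl"


def toy_verify(statement: str, candidate: str) -> CheckResult:
    """Stand-in for a proof checker: rejects drafts with a placeholder."""
    if "sorry" in candidate:
        return CheckResult(ok=False, feedback="proof contains a placeholder")
    return CheckResult(ok=True)


proof, rounds = agentic_loop("1 + 1 = 2", toy_draft, toy_verify)
print(proof, rounds)
```

In this sketch the verifier is the only component allowed to accept a proof, which mirrors the point made in the summaries: the drafter may be wrong as often as it likes, because nothing unverified leaves the loop.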