Formal proof systems like Lean provide a unique training ground for LLMs. Unlike natural language reasoning, a proof's correctness can be programmatically verified. This creates a strong reward signal for training long-horizon planning and coherence, skills that can generalize to other tasks.
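As a concrete illustration of that verifiability, here is a minimal Lean 4 proof. The theorem name is invented for the example; the point is that if the file elaborates without error, the proof is correct, no human review needed.

```lean
-- Lean mechanically checks this proof: commutativity of natural-number
-- addition, discharged by the standard library lemma Nat.add_comm.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

A broken variant (say, claiming `a + b = b`) would fail to compile, which is exactly the binary signal a training loop can consume.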

Related Insights

Generative AI can produce the "miraculous" insights needed for formal proofs, like finding an inductive invariant, which traditionally required PhD-level expertise. It achieves this by training on vast libraries of existing mathematical proofs and generalizing their underlying patterns, effectively automating the creative leap needed for verification.
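In miniature, an inductive argument in Lean has exactly this shape: establish the property at the base case and show it is preserved by each step. This hypothetical example proves a fact that genuinely needs induction (Lean's `Nat` addition recurses on its second argument, so `0 + n = n` is not definitional):

```lean
-- The property "0 + n = n" plays the role of the invariant:
-- it holds at zero and is preserved by the successor step.
theorem zero_add_example (n : Nat) : 0 + n = n := by
  induction n with
  | zero => rfl
  | succ k ih => rw [Nat.add_succ, ih]
```

Choosing the right property to carry through the induction is the "creative leap" the insight describes; checking it, once stated, is mechanical.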

The structured, hierarchical nature of code (functions, libraries) provides a powerful training signal for AI models. This helps them infer structural cues applicable to broader reasoning and planning tasks, far beyond just code generation.

Languages like Lean allow mathematical proofs to be automatically verified. This provides a perfect, binary reward signal (correct/incorrect) for a reinforcement learning agent. It transforms the abstract art of mathematics into a well-defined environment, much like a game of Go, that an AI can be trained to master.
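A reinforcement-learning reward derived from proof checking can be sketched in a few lines of Python. This is a toy illustration, not a real pipeline: `check_proof` here is a hypothetical stand-in for invoking the Lean compiler on a candidate proof and reporting whether it elaborates without errors.

```python
def check_proof(proof_source: str) -> bool:
    """Hypothetical verifier. A real system would run the Lean
    compiler on proof_source and return True iff it type-checks.
    Toy stand-in: accept proofs that close with the `rfl` tactic."""
    return proof_source.strip().endswith("rfl")

def reward(proof_source: str) -> float:
    """Binary RL reward: 1.0 iff the proof checks, else 0.0.
    No partial credit -- correctness in Lean is all-or-nothing."""
    return 1.0 if check_proof(proof_source) else 0.0
```

The environment's "win condition" is unambiguous, which is what makes the Go analogy apt: the agent can self-play against the checker without human grading.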

To reliably translate a natural language policy into formal logic, Amazon's system generates multiple translations using an LLM. It then employs a theorem prover to verify these translations are logically equivalent. Mismatches trigger a clarification loop with the user, ensuring the final specification is correct before checking an agent's work.
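The sample-then-cross-check loop described above can be sketched as follows. Everything here is a hypothetical stand-in: a real system would call an LLM in `translate_candidates` and a theorem prover in `equivalent`; the toy version models a formalization as the set of permitted actions and equivalence as set equality.

```python
from itertools import combinations

def translate_candidates(policy: str) -> list[frozenset]:
    """Hypothetical LLM call: sample several independent
    formalizations of the same natural-language policy.
    Toy stand-in: each candidate is the policy's set of words."""
    base = frozenset(w for w in policy.split() if w.isalpha())
    return [base, base, base]  # three agreeing samples

def equivalent(a: frozenset, b: frozenset) -> bool:
    """Hypothetical prover call: check the two specifications
    are logically equivalent. Toy stand-in: set equality."""
    return a == b

def formalize(policy: str):
    """Accept the specification only if all candidate translations
    agree pairwise; otherwise return None, signalling that the
    system should ask the user to clarify the policy."""
    cands = translate_candidates(policy)
    if all(equivalent(a, b) for a, b in combinations(cands, 2)):
        return cands[0]
    return None  # mismatch -> clarification loop with the user
```

The design insight is that disagreement among independent translations is cheap evidence of ambiguity, caught before the specification is ever used to judge an agent's work.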

The purpose of creating a superhuman mathematician is not just to solve proofs, but to establish a system of verifiable reasoning. This formal verification capability will be essential to ensure the safety, reliability, and collaborative potential of all future AI code and superintelligence.

LLMs excel at coding because internet data (e.g., GitHub) provides complete source code, dependencies, and reasoning. In contrast, mathematical texts online are often just condensed summaries or final proofs, lacking the step-by-step process. This makes it harder for models to learn mathematical reasoning from pre-training alone.

Large Language Models are uniquely suited for complex strategy games like Civilization. Their strength lies not in calculation, where traditional AI excels, but in maintaining long-term narrative consistency and strategic coherence, which is the actual bottleneck for game mastery.

To improve LLM reasoning, researchers feed them data that inherently contains structured logic. Training on computer code was an early breakthrough, as it teaches patterns of reasoning far beyond coding itself. Textbooks are another key source for building smaller, effective models.

Simply generating a mathematical proof in natural language is useless because it could be thousands of pages long and contain subtle errors. The pivotal innovation was combining AI reasoning with formal verification. This ensures the output is provably correct and usable, solving the critical problems of trust and utility for complex, AI-generated work.

We have formal languages like Lean for deductive proofs, which AI can be trained on. The next frontier is developing a language to capture mathematical *strategy*—how to assess a conjecture's plausibility or choose a promising path. This would help automate the intuitive, creative part of mathematical discovery.