AI's Math Progress Is Spiky, Excelling at Geometry but Struggling with Creative Combinatorics

Related Insights

AI Isn't Getting Smarter Linearly; It Has a "Jagged" Intelligence Profile

AI intelligence shouldn't be measured with a single metric like IQ. AIs exhibit "jagged intelligence," being superhuman in specific domains (e.g., mastering 200 languages) while simultaneously lacking basic capabilities like long-term planning, making them fundamentally unlike human minds.

Creator of AI: We Have 2 Years Before Everything Changes! These Jobs Won't Exist in 24 Months!

The Diary Of A CEO with Steven Bartlett·6 months ago

Andrej Karpathy: AI Excels at Verifiable Tasks, Explaining its 'Jagged Frontier'

Andrej Karpathy's 'Software 2.0' framework posits that AI automates tasks that are easily *verifiable*. This explains the 'jagged frontier' of AI progress: fields like math and code, where correctness is verifiable, advance rapidly. In contrast, creative and strategic tasks, where success is subjective and hard to verify, lag significantly behind.

Bezos Launches AI Startup, GPT-4o Debate, LeCun’s LLM Revolt | Eric Glyman, Stacy Rasgon, Luca Ferrari, Healey Cypher, John Tenet, Reed Duchscher

TBPN·7 months ago

AI Intelligence is Not a Single Scalar; It's a "Spiky" Profile of Genius and Incompetence

Progress towards AGI is not a smooth climb. Models exhibit "spikiness": they can perform at a world-class level in one narrow domain, yet drop to the level of a "bad high school student" when the task is slightly perturbed. This unintuitive pattern of generalization makes their capabilities uneven and unpredictable.

AI for Atoms: How Periodic Labs is Revolutionizing Materials Engineering with Co-Founder Liam Fedus

No Priors: Artificial Intelligence | Technology | Startups·3 months ago

AI Progress Is Unpredictable, With Breakthroughs in Niche Areas Like Math While Practical Agents Stall

The advancement of AI is not linear. While the industry anticipated a "year of agents" for practical assistance, the most significant recent progress has been in specialized, academic fields like competitive mathematics. This highlights the unpredictable nature of AI development.

Jack Morris on Finding the Next Big AI Breakthrough

Odd Lots·9 months ago

OpenAI's General-Purpose AI Solves Erdős Problem, Signaling a Leap in Reasoning

An internal, general-purpose OpenAI model solved a famous combinatorial geometry problem without specialized training or scaffolding. Unlike task-specific AIs, this achievement demonstrates a significant advance in abstract reasoning, suggesting models are progressing towards more general intelligence faster than anticipated.

SpaceX S-1, Anthropic Revenue Booms, OpenAI Cracks Erdős Problem | Diet TBPN

TBPN·a month ago

AI's 'Jagged Frontier' Means Models Can Win Math Olympiads But Can't Reliably Tell Time

The Stanford AI Index reveals a "jagged frontier" where advanced models achieve superhuman performance on complex tasks like the International Mathematical Olympiad, yet fail at simple, common-sense activities like reading an analog clock. This highlights their lack of real-world grounding and the need for more holistic "world models."

Breaking down the 2026 Stanford AI Index Report

Practical AI·a month ago

The Next AI Benchmark in Math Isn't Solving Problems, It's Generating Important Conjectures

The true test of advanced AI in mathematics will not be solving existing problems such as the Millennium Prize problems, but generating novel, interesting conjectures and creating new, unifying definitions. This represents a higher tier of mathematical creativity, akin to the work of the greatest mathematicians, who frame the questions for others to solve.

Grant Sanderson – AI and the future of math

Dwarkesh Podcast·13 hours ago

Advancing AI for Math Requires a New Formal Language for Strategy and Plausibility

We have formal languages like Lean for deductive proofs, which AI can be trained on. The next frontier is developing a language to capture mathematical *strategy*—how to assess a conjecture's plausibility or choose a promising path. This would help automate the intuitive, creative part of mathematical discovery.

Terence Tao – Kepler, Newton, and the true nature of mathematical discovery

Dwarkesh Podcast·3 months ago

AI's 'Jagged' Performance Explains Public Disagreement on Its Usefulness

Frontier AI models exhibit 'jagged' capabilities, excelling at highly complex tasks like theoretical physics while failing at basic ones like counting objects. This inconsistent, non-human-like performance profile is a primary reason for polarized public and expert opinions on AI's actual utility.

Inside The Second International AI Safety Report with Writers Stephen Clare and Stephen Casper

The AI Policy Podcast·5 months ago

AI's 'Jagged Frontier': Superhuman at Coding, Childlike at Telling Jokes

AI models exhibit a "jaggedness" where capabilities are not uniform. They perform at expert levels on verifiable, RL-tuned tasks but remain basic on subjective, unoptimized ones (like humor). This suggests intelligence isn't generalizing smoothly across all domains.

Andrej Karpathy on Code Agents, AutoResearch, and the Loopy Era of AI

No Priors: Artificial Intelligence | Technology | Startups·3 months ago
