
AI models exhibit a "jaggedness": their capabilities are not uniform. They perform at expert level on verifiable, RL-tuned tasks but only at a basic level on subjective, unoptimized ones (such as humor). This suggests intelligence is not generalizing smoothly across all domains.

Related Insights

AI models are surprisingly strong at certain tasks yet bafflingly weak at others. This 'jagged frontier' of capability makes experience with AI inherently inconsistent; the only reliable way to navigate it is direct experimentation within one's own domain of expertise.

AI intelligence shouldn't be measured with a single metric like IQ. AIs exhibit "jagged intelligence," being superhuman in specific domains (e.g., mastering 200 languages) while simultaneously lacking basic capabilities like long-term planning, making them fundamentally unlike human minds.

Uneven ('jagged') AI capabilities offer only a weak safety guarantee. Geoffrey Irving notes that as models improve, even their weakest performance areas will likely exceed top human abilities, leaving the overall system superhumanly capable despite its internal inconsistencies.

AI's capabilities are highly uneven. Models are already superhuman in specific domains, such as speaking 150 languages or possessing encyclopedic knowledge. However, they still lack abilities typical humans take for granted, such as continual learning or nuanced visual reasoning (e.g., understanding perspective in a photo).

Current AI models resemble a student who grinds 10,000 hours on a narrow task. They achieve superhuman performance on benchmarks but lack the broad, adaptable intelligence of someone with less specific training but better general reasoning. This explains the gap between eval scores and real-world utility.

Demis Hassabis explains that current AI models have 'jagged intelligence': they perform at a PhD level on some tasks but fail at high-school-level logic on others. He identifies this lack of consistency as a primary obstacle to achieving true Artificial General Intelligence (AGI).

Today's AI systems exhibit "jagged intelligence"—strong performance on many tasks but inconsistent reliability on others. This prevents full job replacement because being 95% effective is insufficient when the remaining 5% involves crucial edge cases, judgment, and discretion that still require human oversight.

A practical definition of AGI is its capacity to function as a 'drop-in remote worker,' fully substituting for a human on long-horizon tasks. Today's AI, despite genius-level abilities in narrow domains, fails this test because it cannot reliably string together multiple tasks over extended periods, highlighting the 'jagged frontier' of its abilities.

Frontier AI models exhibit 'jagged' capabilities, excelling at highly complex tasks like theoretical physics while failing at basic ones like counting objects. This inconsistent, non-human-like performance profile is a primary reason for polarized public and expert opinions on AI's actual utility.