
LLMs excel at learning correlations from vast data (Shannon entropy), modeling the digits of pi as if they were a statistically random stream. However, they can't induce the simple, elegant program that actually generates pi (Kolmogorov complexity). This represents the critical leap from correlation to true causal understanding.
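The Kolmogorov-complexity side of this contrast can be made concrete: the digits of pi pass statistical randomness tests, yet a program of only a few lines produces them exactly. A minimal sketch using Gibbons' unbounded spigot algorithm (the function name is our own):

```python
def pi_digits(n):
    """Return the first n decimal digits of pi using Gibbons' unbounded
    spigot algorithm: a handful of exact integer operations generate a
    stream that looks statistically random."""
    q, r, t, j = 1, 180, 60, 2
    out = []
    while len(out) < n:
        u = 3 * (3 * j + 1) * (3 * j + 2)
        y = (q * (27 * j - 12) + 5 * r) // (5 * t)  # next certain digit
        out.append(y)
        # Advance the state; all arithmetic is exact (Python big ints).
        q, r, t, j = (10 * q * j * (2 * j - 1),
                      10 * u * (q * (5 * j - 2) + r - y * t),
                      t * u,
                      j + 1)
    return out

pi_digits(6)  # → [3, 1, 4, 1, 5, 9]
```

The whole generator fits in a dozen lines, which is exactly the point: the description length of the program is tiny even though the output is incompressible-looking.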

Related Insights

A useful mental model for an LLM is a giant matrix where each row is a possible prompt and columns represent next-token probabilities. This matrix is impossibly large but also extremely sparse, as most token combinations are gibberish. The LLM's job is to efficiently compress and approximate this matrix.
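The sparse-matrix picture can be sketched with a toy bigram model (a drastic simplification standing in for a real LLM, and the function name is our own): each context maps to a probability distribution over next tokens, and only pairs that actually occur are stored.

```python
from collections import defaultdict, Counter

def build_sparse_next_token_table(corpus):
    """Toy stand-in for the 'giant matrix': rows are contexts (here just the
    previous token), entries are next-token probabilities. Stored sparsely,
    since almost all (context, token) pairs never occur in real text."""
    counts = defaultdict(Counter)
    tokens = corpus.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    # Normalize counts into probabilities, keeping only observed entries.
    return {
        ctx: {tok: c / sum(ctr.values()) for tok, c in ctr.items()}
        for ctx, ctr in counts.items()
    }

table = build_sparse_next_token_table("the cat sat on the mat the cat ran")
# table["the"] holds probabilities only for tokens actually seen after "the"
```

A real LLM replaces this lookup table with a neural network that generalizes across contexts, i.e., a lossy compression of the impossibly large matrix.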

A core debate in AI is whether LLMs, which are text prediction engines, can achieve true intelligence. Critics argue they cannot because they lack a model of the real world. This prevents them from making meaningful, context-aware predictions about future events—a limitation that more data alone may not solve.

LLMs shine when acting as a 'knowledge extruder'—shaping well-documented, 'in-distribution' concepts into specific code. They fail when the core task is novel problem-solving where deep thinking, not code generation, is the bottleneck. In these cases, the code is the easy part.

MIT research reveals that large language models develop "spurious correlations" by associating sentence patterns with topics. This cognitive shortcut causes them to give domain-appropriate answers to nonsensical queries if the grammatical structure is familiar, bypassing logical analysis of the actual words.

Judea Pearl, a foundational figure in AI, argues that Large Language Models (LLMs) are not on a path to Artificial General Intelligence (AGI). He states they merely summarize human-generated world models rather than discovering causality from raw data. He believes scaling up current methods will not overcome this fundamental mathematical limitation.

Current AI can learn to predict complex patterns, like planetary orbits, from data. However, it struggles to abstract the underlying causal laws, such as Newtonian physics (F = ma). This leap to a higher level of abstraction remains a fundamental challenge beyond simple pattern recognition.
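The gap between fitting orbital data and abstracting the law can be illustrated with Kepler's third law, a close cousin of the example above: a plain least-squares fit on log-transformed planet data recovers the exponent 3/2 numerically, yet nothing in the procedure represents the law T² = a³ itself. A sketch using approximate published values for five planets:

```python
import math

# (semi-major axis in AU, orbital period in years), approximate values.
data = [(0.387, 0.241),   # Mercury
        (0.723, 0.615),   # Venus
        (1.000, 1.000),   # Earth
        (1.524, 1.881),   # Mars
        (5.203, 11.862)]  # Jupiter

# A pattern-learner can fit log T = k*log a + c and predict periods
# accurately without ever abstracting Kepler's third law (T^2 = a^3).
xs = [math.log(a) for a, _ in data]
ys = [math.log(T) for _, T in data]
n = len(data)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
# slope comes out close to 1.5, the exponent in Kepler's third law
```

The fit is pure curve-matching: it would work just as well on any power law, which is precisely why recovering the exponent is not the same as discovering the law.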

Simply making LLMs larger will not lead to AGI. True advancement requires solving two distinct problems: 1) plasticity, the ability to continually learn without "catastrophic forgetting," and 2) causality, moving from correlation-based pattern matching to building causal models of the world.

While both humans and LLMs perform Bayesian updating, humans possess a critical additional capability: causal simulation. When a pen is thrown, a human simulates its trajectory to dodge it—a causal intervention. LLMs are stuck at the level of correlation and cannot perform these essential simulations.
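The pen-dodging example amounts to forward-simulating a physical model. A minimal sketch of such a causal simulation (a simple projectile under gravity with semi-implicit Euler steps; the function name and parameter values are hypothetical):

```python
def simulate_trajectory(x0, y0, vx, vy, g=9.81, dt=0.01):
    """Forward-simulate a thrown object (the 'pen') under gravity and
    return the horizontal position where it falls back to launch height.
    This is a causal model: we can intervene on vx, vy, or g and re-run."""
    x, y = x0, y0
    while True:
        vy -= g * dt      # gravity changes velocity (semi-implicit Euler)
        x += vx * dt
        y += vy * dt
        if y <= y0:       # back at launch height: report landing point
            return x

landing = simulate_trajectory(0.0, 0.0, vx=2.0, vy=4.905)
# analytic answer: x0 + vx * (2 * vy / g) = 2.0 m; the simulation lands nearby
```

The key property is that the model supports intervention: changing `g` or the throw velocity and re-running answers a "what if" question, which no amount of correlation over past pen observations can do.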

AGI won't be achieved by pattern-matching existing knowledge. A real benchmark is whether a model can synthesize anomalous data (like Mercury's orbit) and create a fundamentally new representation of the universe, as Einstein did, moving beyond correlation to a new causal model.

A Harvard study showed LLMs can predict planetary orbits (pattern fitting) but generate nonsensical force vectors when probed. This reveals a critical gap: current models mimic data patterns but don't develop a true, generalizable understanding of underlying physical laws, separating them from human intelligence.

LLMs Master Correlation (Shannon Entropy) but Fail at Causal Leaps (Kolmogorov Complexity) | RiffOn