© 2026 RiffOn. All rights reserved.

After LLMs: Spatial Intelligence and World Models — Fei-Fei Li & Justin Johnson, World Labs

Latent Space: The AI Engineer Podcast · Nov 25, 2025

World Labs founders Fei-Fei Li & Justin Johnson introduce "spatial intelligence" as the next AI frontier, moving beyond LLMs with their 3D world model, Marble.

The Entire History of Deep Learning Is a Story of Scaling Compute

The progression from early neural networks to today's massive models has been driven fundamentally by exponential growth in available compute, from the initial move to GPUs to training runs that apply roughly a million times more compute to a single model.


Transformer Models Natively Operate on Sets, Not Sequences

A common misconception is that Transformers are sequential models like RNNs. Fundamentally, they are permutation-equivariant and operate on sets of tokens. Sequence information is artificially injected via positional embeddings, making the architecture inherently flexible for non-linear data like 3D scenes or graphs.
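This property can be checked directly. The sketch below is a toy single-head self-attention (no learned projections, no positional embeddings, written here for illustration, not the podcast's code): permuting the input tokens simply permutes the output tokens, confirming that the layer sees a set, not a sequence.

```python
import numpy as np

def self_attention(x):
    """Toy single-head self-attention with Q = K = V = x and no
    positional embeddings. x: (n_tokens, d) array of embeddings."""
    scores = x @ x.T / np.sqrt(x.shape[1])            # (n, n) similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)     # row-wise softmax
    return weights @ x                                # (n, d) mixed tokens

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))   # a *set* of 4 tokens
perm = rng.permutation(4)          # shuffle the "sequence" order

out = self_attention(tokens)
out_shuffled = self_attention(tokens[perm])

# Permutation equivariance: shuffling inputs just shuffles outputs.
assert np.allclose(out[perm], out_shuffled)
```

Because no term in the computation depends on token index, order must be injected explicitly (positional embeddings) when it matters, and can be omitted for unordered data such as 3D point sets or graph nodes.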


Stanford and Google Independently Invented Image Captioning at the Same Time

Fei-Fei Li's lab believed they were the first to combine ConvNets and LSTMs for image captioning, only to discover through a journalist that a team at Google had developed the same breakthrough concurrently. This highlights the phenomenon of parallel innovation in scientific research.


AI's Next Frontier Is Spatial Intelligence, A Capability Distinct from Language

Human intelligence is multifaceted. While LLMs excel at linguistic intelligence, they lack spatial intelligence: the ability to understand, reason about, and interact with the 3D world. This capability, crucial for tasks from robotics to scientific discovery, is the focus of the next wave of AI models.


Academia's Role in AI Is Shifting From Scaling Models to Exploring 'Wacky Ideas'

With industry dominating large-scale model training, academic labs can no longer compete on compute. Their new strategic advantage lies in pursuing unconventional, high-risk ideas, new algorithms, and theoretical underpinnings that large commercial labs might overlook.


GPU Performance-Per-Watt Is Plateauing, Demanding New Architectures

The performance gains from NVIDIA's Hopper to Blackwell GPUs come largely from increased die size and power draw rather than improved efficiency. This signals a potential scaling limit, creating an opportunity for radically new hardware primitives and neural network architectures beyond today's matrix-multiplication-centric models.


Humans Underappreciate Vision's Complexity Because It Feels Evolutionarily Effortless

Vision, a product of 540 million years of evolution, is a highly complex process. However, because it's an innate, effortless ability for humans, we undervalue its difficulty compared to language, which requires conscious effort to learn. This bias impacts how we approach building AI systems.


Large Models Can Predict Orbits But Fail to Grasp Causal Laws of Gravity

A Harvard study showed that large models can predict planetary orbits by fitting patterns in the data, yet produce nonsensical force vectors when probed about the underlying dynamics. This reveals a critical gap: current models mimic data patterns without developing a generalizable understanding of the underlying physical laws, which separates them from human intelligence.
