LLMs operate autoregressively, making one decision (token) at a time without seeing the full problem space. This can lead to hallucinations or dead ends. EBMs are non-autoregressive, allowing them to consider all possible routes simultaneously and select an optimal path, much like using a bird's-eye view of a map to spot a hole in the road before committing to a route.
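The difference can be illustrated with a toy route-finding sketch (the graph and function names are purely illustrative, not from either architecture): a greedy, one-step-at-a-time chooser can walk into a dead end that a global view over all routes avoids.

```python
# Toy route graph: edge weights are step costs. The "shortcut" looks
# cheapest locally but is a dead end (the "hole in the road").
graph = {
    "start": {"shortcut": 1, "detour": 3},
    "shortcut": {},                 # dead end
    "detour": {"goal": 2},
    "goal": {},
}

def greedy(node):
    """Autoregressive-style: commit to the cheapest next step each time."""
    path = [node]
    while graph[node]:
        node = min(graph[node], key=graph[node].get)
        path.append(node)
    return path

def reaches_goal(node, target="goal"):
    """Global view: consider every route before judging reachability."""
    if node == target:
        return True
    return any(reaches_goal(n, target) for n in graph[node])

print(greedy("start"))        # ['start', 'shortcut'] -- stuck at the hole
print(reaches_goal("start"))  # True -- the full view finds the detour
```

The greedy walker gets trapped precisely because it never looks past the next edge; the exhaustive check succeeds because it scores whole routes, which is the intuition behind the "bird's-eye view" claim above.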
A useful mental model for an LLM is a giant matrix where each row is a possible prompt and columns represent next-token probabilities. This matrix is impossibly large but also extremely sparse, as most token combinations are gibberish. The LLM's job is to efficiently compress and approximate this matrix.
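The sparsity of that mental-model matrix is easy to see in a toy version (all names here are illustrative): store only the context rows that actually occur, and estimate next-token probabilities from counts.

```python
from collections import Counter, defaultdict

# Tiny corpus; in the mental model above, each distinct context is a
# "row" and the next-token distribution fills its "columns".
corpus = "the cat sat on the mat the cat ran".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1          # only observed rows are ever stored

def next_token_probs(context):
    """Return the normalized next-token distribution for one context."""
    row = counts[context]
    total = sum(row.values())
    return {tok: n / total for tok, n in row.items()}

print(next_token_probs("the"))      # {'cat': 0.666..., 'mat': 0.333...}
```

Almost every conceivable context is absent from `counts`, which is the sparsity point: an LLM's job is to compress this astronomically large, mostly empty table into a model that can also generalize to rows it never saw.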
LLMs predict the next token in a sequence. The brain's cortex may function as a general prediction engine capable of "omnidirectional inference"—predicting any missing information from any available subset of inputs, not just what comes next. This offers a more flexible and powerful form of reasoning.
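A minimal sketch of the contrast (toy counts, hypothetical function names, not any brain or model API): instead of only predicting what comes next, score a masked slot by how well each candidate fits both its left and right neighbors.

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

fwd = defaultdict(Counter)          # fwd[a][b]: how often b follows a
for a, b in zip(corpus, corpus[1:]):
    fwd[a][b] += 1

def fill_mask(left, right, vocab):
    # "Omnidirectional" toy: the candidate must fit BOTH neighbors,
    # not just continue from the left context.
    return max(vocab, key=lambda t: fwd[left][t] * fwd[t][right])

vocab = set(corpus)
print(fill_mask("the", "sat", vocab))   # 'cat'
```

A pure next-token predictor conditioned only on "the" would weigh "cat" and "mat" against each other; using the right-hand context as well immediately rules "mat" out, a small instance of predicting missing information from any available subset.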
According to Demis Hassabis, LLMs feel uncreative because they only perform pattern matching. To achieve true, extrapolative creativity like AlphaGo's famous 'Move 37,' models must be paired with a search component that actively explores new parts of the knowledge space beyond the training data.
Billions have been invested in the LLM data center and hardware ecosystem, creating a powerful inertia. For an alternative architecture like EBMs to succeed, it cannot demand a full replacement. Instead, it must position itself as a compatible layer that makes existing LLM investments cheaper and more effective for specific tasks like spatial reasoning.
The argument that LLMs are just "stochastic parrots" is outdated. Current frontier models are trained via Reinforcement Learning, where the signal is not "did you predict the right token?" but "did you get the right answer?" This is based on complex, often qualitative criteria, pushing models beyond simple statistical correlation.
Unlike transformers, which use dense activations (most neurons fire on every input), Pathway's BDH architecture uses sparse positive activations, in which only ~5% of neurons fire at once. This approach is more biologically plausible, mimicking the human brain's energy efficiency and enabling complex reasoning without the massive computational overhead of dense models.
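A minimal sketch of sparse positive activations (this is a generic top-k/ReLU illustration, not Pathway's actual BDH code): clamp activations to be non-negative, then keep only the top ~5% of units.

```python
import numpy as np

def sparse_positive_activation(pre, frac=0.05):
    """Keep only the top ~frac of units, clamped positive; zero the rest."""
    k = max(1, int(len(pre) * frac))
    out = np.maximum(pre, 0.0)        # positive activations only (ReLU)
    losers = np.argsort(out)[:-k]     # indices of all but the top-k units
    out[losers] = 0.0
    return out

rng = np.random.default_rng(0)
pre = rng.normal(size=100)            # dense pre-activations
act = sparse_positive_activation(pre)
print(np.count_nonzero(act))          # at most 5 of 100 units fire
```

Because at most `k` entries survive, downstream layers only need to process a handful of active units per step, which is where the claimed efficiency gain over dense activations comes from.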
Unlike LLMs, which can hallucinate and behave unpredictably in novel situations, EBMs are built to operate under explicit constraints. A human can define a set of rules, and the EBM is forced to respect them, making it a more reliable choice for mission-critical systems like autonomous vehicles or financial trading.
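One simple way to picture hard constraints in an energy-minimization setting (a toy projected-gradient sketch under assumed numbers, not any production EBM): after every descent step, project the state back into the human-defined allowed region.

```python
# Toy energy E(x) = (x - 3)^2 whose unconstrained minimum is x = 3,
# plus a human-specified hard constraint: x must stay >= 5.
def grad_energy(x):
    return 2.0 * (x - 3.0)

x = 10.0
for _ in range(200):
    x -= 0.1 * grad_energy(x)   # descend the energy landscape
    x = max(x, 5.0)             # projection: the model cannot leave the safe set

print(x)   # 5.0 -- the lowest-energy point that satisfies the constraint
```

The model never outputs a state outside the allowed region, by construction; an LLM, by contrast, can only be nudged toward compliance through training or prompting.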
LLMs' intelligence is dependent on the language they are trained on, meaning their reasoning process differs between, for example, English and French. This is unnatural for tasks like spatial reasoning, which are language-agnostic. EBMs operate on an abstract, token-free level, mapping information directly without a language-based intermediary.
EBMs are based on a fundamental principle in physics where systems naturally seek their lowest energy state (e.g., sitting on a couch when tired). The model maps all possible outcomes onto an 'energy landscape,' where the lowest points represent the most probable solutions. This avoids the expensive, token-by-token guessing game played by LLMs.
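The lowest-energy-state principle can be shown with a toy one-dimensional landscape (assumed quadratic energy, illustrative only): rather than guessing token by token, descend the landscape directly toward its minimum.

```python
# Toy energy landscape: E(x) = (x - 3)^2 + 1, with its lowest point
# (the most probable solution) at x = 3.
def grad_energy(x):
    return 2.0 * (x - 3.0)

x = 10.0            # arbitrary starting configuration
lr = 0.1            # step size
for _ in range(200):
    x -= lr * grad_energy(x)   # roll downhill, like the system relaxing

print(round(x, 3))  # converges to the minimum near x = 3
```

Real EBMs minimize over high-dimensional outputs rather than a single scalar, but the mechanism is the same: inference is "relaxation" into a low-energy valley, not a sequence of irreversible token commitments.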
EBMs analyze data to understand its underlying rules, storing this knowledge as inspectable 'latent variables' that shape an energy landscape. This contrasts with LLMs, which are black boxes whose reasoning process is opaque. With EBMs, you can observe the model's internal state in real time to see what it has learned.