For board games like Go, ResNet architectures can outperform Transformers in lower-data regimes. ResNets have a built-in inductive bias for local spatial patterns via convolutions, which is highly relevant for Go. Transformers must learn these patterns from scratch, requiring more data to achieve similar performance.
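A minimal sketch of where that bias lives (illustrative PyTorch, not AlphaGo's or KataGo's actual network; channel counts and block layout are assumptions): every 3x3 convolution in a residual block only sees a local neighbourhood of the 19x19 board, so local shapes are cheap to represent.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """A standard residual conv block over a Go-board-shaped tensor."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        # Each 3x3 kernel only looks at a local patch of intersections,
        # so nearby-stone patterns are built into the architecture itself.
        h = torch.relu(self.bn1(self.conv1(x)))
        h = self.bn2(self.conv2(h))
        return torch.relu(x + h)  # residual skip connection

# 17 feature planes on a 19x19 board (plane count is an illustrative choice)
board = torch.randn(1, 17, 19, 19)
block = ResidualBlock(channels=17)
print(block(board).shape)  # torch.Size([1, 17, 19, 19])
```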
The success of neural networks on problems like Go and protein folding, long held up as examples of computational intractability (their generalized forms are provably NP-hard or worse), is profound. It suggests that our formal theory of hardness, which focuses on worst-case instances, may be an incomplete guide to finding useful, approximate solutions in practice.
Go's game tree is larger than the number of atoms in the observable universe, making exhaustive search impossible. AlphaGo's core breakthrough was using neural networks to intelligently guide its search, evaluating only the most promising moves and making an intractable problem tractable in practice.
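The back-of-envelope arithmetic behind that comparison, using the commonly cited averages of roughly 250 legal moves per position and roughly 150 moves per game:

```python
import math

branching_factor = 250   # average legal moves per Go position (rough figure)
game_length = 150        # typical number of moves in a game (rough figure)

log10_sequences = game_length * math.log10(branching_factor)
print(f"~10^{log10_sequences:.0f} possible move sequences")  # ~10^360
print("vs ~10^80 atoms in the observable universe")
```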
AlphaGo's architecture mimicked human cognition by pairing a 'fast thinking' neural network for intuition with a 'slow thinking' search algorithm for explicit planning. This hybrid model, combining pattern recognition with calculation, proved more powerful for tackling complex problems than either approach alone.
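A rough sketch of how the pairing works inside a search like AlphaGo's Monte Carlo tree search (simplified PUCT-style selection; the names and constants are illustrative, not the actual implementation): the policy network's prior, the "fast" intuition, biases which branches the "slow" explicit search spends its calculation on.

```python
import math
from dataclasses import dataclass

@dataclass
class Child:
    prior: float            # policy network's probability for this move ("intuition")
    visits: int = 0         # how often the search has explored it
    value_sum: float = 0.0  # accumulated evaluations from deeper calculation

def select(children, c_puct=1.5):
    """Pick the next branch to expand: calculated value plus an intuition bonus."""
    total = sum(c.visits for c in children)
    def puct(c):
        q = c.value_sum / c.visits if c.visits else 0.0                 # slow: calculation
        u = c_puct * c.prior * math.sqrt(total + 1) / (1 + c.visits)    # fast: prior-guided exploration
        return q + u
    return max(children, key=puct)

# Before any search has run, selection follows intuition alone (highest prior).
moves = [Child(prior=0.6), Child(prior=0.3), Child(prior=0.1)]
print(select(moves))  # Child(prior=0.6, visits=0, value_sum=0.0)
```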
The "Attention is All You Need" paper's key breakthrough was an architecture designed for massive scalability across GPUs. This focus on efficiency, anticipating the industry's shift to larger models, was more crucial to its dominance than the attention mechanism itself.
While acknowledging the power of scale, Moonlake argues that incorporating symbolic structure allows models to learn with orders of magnitude less data. This mirrors human cognition, which uses abstracted semantic descriptions rather than processing every pixel.
Contrary to trends in other AI fields, structural biology problems are not yet dominated by simple, scaled-up transformers. Specialized architectures that bake in physical priors, like equivariance, still yield vastly superior performance, as the domain's complexity requires strong inductive biases.
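A toy illustration of what "baking in" a physical prior like rotational equivariance means (the model here is a stand-in, not any real structure predictor): rotating the input coordinates rotates the output in the same way, so the network never has to spend data learning that symmetry.

```python
import numpy as np

def predict_forces(coords):
    # Toy equivariant map: each point is pulled toward the centroid.
    return coords.mean(axis=0) - coords

coords = np.random.randn(10, 3)   # ten 3D points
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])         # rotation about the z-axis

# Equivariance: rotate-then-predict equals predict-then-rotate.
rotated_then_predicted = predict_forces(coords @ R.T)
predicted_then_rotated = predict_forces(coords) @ R.T
print(np.allclose(rotated_then_predicted, predicted_then_rotated))  # True
```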
Humans stop analyzing a game when they intuit a winning or losing position. AlphaGo’s value function mimics this by predicting the eventual outcome from any board state. This allows the search to be drastically shortened, as it doesn't need to play out every possibility to the very end.
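A schematic of the idea (toy negamax with a stubbed value function; none of this is AlphaGo's code): the search stops at a fixed horizon and trusts the learned evaluation there, instead of rolling every line out to a final score.

```python
import random

def fake_value(state):
    # Stand-in for a learned value network: predicted eventual outcome in [-1, 1].
    random.seed(hash(state))
    return random.uniform(-1, 1)

def children(state):
    # Toy move generator: three successor "positions" per state.
    return [state + (m,) for m in range(3)]

def evaluate(state, depth):
    if depth == 0:
        return fake_value(state)   # truncate here: trust the evaluation, don't play on
    return max(-evaluate(s, depth - 1) for s in children(state))

print(evaluate(state=(), depth=3))
```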
To bridge the learning efficiency gap between humans and AI, researchers use meta-learning. This technique learns optimal initial weights for a neural network, giving it a "soft bias" that starts it closer to a good solution. This mimics the inherent inductive biases that allow humans to learn efficiently from limited data.
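A first-order sketch of the idea in the spirit of Reptile-style meta-learning (the toy task family and hyperparameters are illustrative): the outer loop moves the initialization toward a point from which a few gradient steps solve any task drawn from the family.

```python
import numpy as np

def sample_task():
    # Toy task family: fit y = a * x for a slope a drawn from [1, 3].
    a = np.random.uniform(1, 3)
    x = np.random.randn(20)
    return x, a * x

def inner_loss_grad(w, x, y):
    return np.mean(2 * (w * x - y) * x)   # d/dw of mean squared error

w_init, inner_lr, outer_lr = 0.0, 0.1, 0.05
for _ in range(1000):
    x, y = sample_task()
    w = w_init
    for _ in range(5):                     # inner loop: adapt to this one task
        w -= inner_lr * inner_loss_grad(w, x, y)
    w_init += outer_lr * (w - w_init)      # outer loop: move the initialization

print(w_init)  # drifts toward ~2, the centre of the task family: a learned "soft bias"
```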
A key insight from AlphaGo is that a relatively shallow neural network can approximate the result of an incredibly deep and complex search tree. This suggests neural nets can learn to compress sequential, recursive computation into a single, efficient forward pass.
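A schematic of that compression in the style of AlphaGo Zero's policy distillation (the search targets here are random stand-ins, not real MCTS output): the network is trained so that a single forward pass reproduces the move distribution an expensive search would have produced.

```python
import torch
import torch.nn as nn

# Small policy network over flattened 19x19 board encodings (361 inputs/outputs).
policy = nn.Sequential(nn.Linear(361, 256), nn.ReLU(), nn.Linear(256, 361))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def fake_search_targets(states):
    # Stand-in for MCTS visit-count distributions produced by a deep search.
    return torch.softmax(torch.randn(states.shape[0], 361), dim=-1)

for _ in range(100):
    states = torch.randn(32, 361)          # batch of encoded board states
    target = fake_search_targets(states)   # "expensive" search output
    log_probs = torch.log_softmax(policy(states), dim=-1)
    loss = -(target * log_probs).sum(dim=-1).mean()  # cross-entropy to the search policy
    opt.zero_grad()
    loss.backward()
    opt.step()
```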
Contrary to common perception shaped by their use in language, Transformers are not inherently sequential. Their core architecture operates on sets of tokens, with sequence information only injected via positional embeddings. This makes them powerful for non-sequential data like 3D objects or other unordered collections.
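A quick check of the claim with a toy single-head attention layer and no positional embeddings: permuting the input tokens simply permutes the output, because attention itself treats its input as an unordered set.

```python
import torch

def self_attention(x):
    # Scaled dot-product self-attention, no positional information anywhere.
    scores = (x @ x.T) / x.shape[-1] ** 0.5
    return torch.softmax(scores, dim=-1) @ x

tokens = torch.randn(6, 8)                 # 6 tokens, 8 dims each
perm = torch.randperm(6)

out_of_permuted_input = self_attention(tokens[perm])
permuted_output = self_attention(tokens)[perm]
print(torch.allclose(out_of_permuted_input, permuted_output))  # True
```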