Neural Networks Are Computationally Analogous to Threshold Circuits in Complexity Theory

Related Insights

Neural Networks Find Practical Solutions to NP-Hard Problems, Questioning Worst-Case Complexity Theory

The success of neural networks on problems like Go and protein folding, whose general forms are provably hard in the worst case (NP-hard or harder), is profound. It suggests that our formal understanding of computational hardness, which focuses on worst-case scenarios, may be an incomplete model of how useful, approximate solutions are found in practice.

Eric Jang – Building AlphaGo from scratch

Dwarkesh Podcast·a month ago

Cryptographic "Feistel Ciphers" Help Train Neural Networks by Reducing Memory Usage

A technique from cryptography, the Feistel network, builds an invertible mapping out of component functions that need not be invertible themselves. When applied to neural network layers ("RevNets"), it allows activations from the forward pass to be recomputed during the backward pass instead of stored. This trades extra compute for a large reduction in the memory needed for activations during training.

Reiner Pope – The math behind how LLMs are trained and served

Dwarkesh Podcast·2 months ago
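
The Feistel-style construction in the insight above can be sketched in a few lines. This is a minimal illustration using additive coupling, not the implementation discussed in the episode; the sub-layers `f` and `g` and all shapes are hypothetical stand-ins chosen for the example.

```python
import numpy as np

def f(x):
    # Hypothetical sub-layer; it does not need to be invertible.
    return np.tanh(1.5 * x + 0.1)

def g(x):
    # A second hypothetical sub-layer (a scaled ReLU, also non-invertible).
    return 0.5 * np.maximum(x, 0.0)

def forward(x1, x2):
    # Feistel-style coupling: each half is updated using only the other half.
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2

def inverse(y1, y2):
    # Undo the two updates in reverse order; f and g are only ever
    # evaluated forward, never inverted.
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2

rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=4), rng.normal(size=4)
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)
reconstruction_ok = bool(np.allclose(r1, x1) and np.allclose(r2, x2))
```

Because the inputs can be rebuilt from the outputs, a training loop built this way could discard `x1` and `x2` after the forward pass and recompute them when gradients are needed.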

Modern AI's Breakthroughs Originated from Psychologists Modeling the Human Mind

Today's AI, particularly neural networks, stems from a long tradition in cognitive science where psychologists used mathematical models to understand human thought. Key advances in neural nets were made by researchers trying to replicate how human minds work, not just build intelligent machines.

What AI Can Teach You About Your Brain

The Next Big Idea Daily·4 months ago

The Entire History of Deep Learning Is a Story of Scaling Compute

The progression from early neural networks to today's massive models is fundamentally driven by exponential growth in available computational power, from the initial move to GPUs to today's million-fold increases in the compute used to train a single model.

After LLMs: Spatial Intelligence and World Models — Fei-Fei Li & Justin Johnson, World Labs

Latent Space: The AI Engineer Podcast·7 months ago

To Understand a Neural Network, Focus on Its Training Process, Not Its Final Weights

Attempting to interpret every learned circuit in a complex neural network is a futile effort. True understanding comes from describing the system's foundational elements: its architecture, learning rule, loss functions, and the data it was trained on. The emergent complexity is a result of this process.

Adam Marblestone – AI is missing something fundamental about the brain

Dwarkesh Podcast·6 months ago

Pathway's BDH Model Uses Brain-Like 'Sparse Activations' for Efficient Reasoning

Unlike transformers, which use dense activations (most neurons fire), Pathway's BDH architecture uses sparse positive activations, where only ~5% of neurons fire at once. This approach is more biologically plausible, mimicking the human brain's energy efficiency and enabling complex reasoning without the massive computational overhead of dense models.

A Post-Transformer Architecture Crushes Sudoku (Transformers Solve ~0%)

Super Data Science: ML & AI Podcast with Jon Krohn·3 months ago
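
To make "sparse positive activations" in the insight above concrete, here is a rough sketch that keeps only the largest ~5% of pre-activations and zeroes the rest. This is an assumption made for illustration, not BDH's actual mechanism; the function `sparse_positive` and its `fraction` parameter are hypothetical.

```python
import numpy as np

def sparse_positive(pre, fraction=0.05):
    # Number of units allowed to fire (at least one).
    k = max(1, int(round(fraction * pre.size)))
    # Value of the k-th largest pre-activation, used as the firing threshold.
    threshold = np.partition(pre, -k)[-k]
    # Units at or above the threshold fire with a non-negative value;
    # every other unit outputs exactly zero.
    return np.where(pre >= threshold, np.maximum(pre, 0.0), 0.0)

rng = np.random.default_rng(0)
pre = rng.normal(size=1000)          # stand-in pre-activations
act = sparse_positive(pre)
active_fraction = np.count_nonzero(act) / act.size
```

A dense layer would pass most of the 1000 values downstream; here later computation only needs to touch the small set of nonzero entries.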

Mechanistic Interpretability Aims to Be for AI What Biology Is for Evolution

Just as biology deciphers the complex systems created by evolution, mechanistic interpretability seeks to understand the "how" inside neural networks. Instead of treating models as black boxes, it examines their internal parameters and activations to reverse-engineer how they work, moving beyond just measuring their external behavior.

2025 Highlight-o-thon: Oops! All Bests

80,000 Hours Podcast·6 months ago

AI Chips' Core Operation is Multiply-Accumulate, Directly Mirroring Matrix Math

The fundamental primitive for AI chips isn't arbitrary; it's the multiply-accumulate (MAC) operation. This is because it directly maps to the innermost computational loop of matrix multiplication (output += input1 * input2), which is the foundational computation for most neural networks.

Reiner Pope – Chip design from the bottom up

Dwarkesh Podcast·a month ago
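
The mapping from multiply-accumulate to matrix multiplication in the insight above can be seen by writing the loops out by hand. This is a plain-Python sketch for illustration only; real chips execute many such MACs in parallel, and the function name `matmul` is just a label chosen here.

```python
def matmul(a, b):
    # a is rows x inner, b is inner x cols, both as lists of lists.
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                # The multiply-accumulate: one multiply, one add into acc.
                acc += a[i][k] * b[k][j]
            out[i][j] = acc
    return out

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
product = matmul(a, b)  # [[19.0, 22.0], [43.0, 50.0]]
```

Every output element is produced by `inner` consecutive MACs, which is why hardware that makes this one operation cheap speeds up the whole computation.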

A Small Neural Network Can Amortize a Vast Search, Compressing Deep Simulation into One Glance

A key insight from AlphaGo is that a relatively shallow neural network can approximate the result of an incredibly deep and complex search tree. This suggests neural nets can learn to compress sequential, recursive computation into a single, efficient forward pass.

Eric Jang – Building AlphaGo from scratch

Dwarkesh Podcast·a month ago

Future Hardware May Demand Neural Networks Built on Primitives Beyond Matrix Multiplication

Today's transformers are optimized for matrix multiplication (MatMul) on GPUs. However, as compute scales to distributed clusters, MatMul may not be the most efficient primitive. Future AI architectures could be drastically different, built on new primitives better suited for large-scale, distributed hardware.

What Comes After ChatGPT? The Mother of ImageNet Predicts The Future

a16z Podcast·7 months ago
