We scan new podcasts and send you the top 5 insights daily.
Today's AI, particularly neural networks, stems from a long tradition in cognitive science where psychologists used mathematical models to understand human thought. Key advances in neural nets were made by researchers trying to replicate how human minds work, not just build intelligent machines.
DeepMind's core breakthrough was treating AI like a child, not a machine. Instead of programming complex strategies, they taught it to master tasks through simple games like Pong, giving it only one rule ('score go up is good') and allowing it to learn for itself through trial and error.
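The trial-and-error recipe can be sketched in miniature. This is not DeepMind's actual Atari setup (which used deep Q-networks on raw pixels); it is a minimal epsilon-greedy bandit, with all names and parameters invented for illustration, that is told nothing except a reward signal ("score go up is good") and discovers the best action on its own:

```python
import random

def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Learn the value of each action purely from reward feedback."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)   # learned value of each action
    counts = [0] * len(true_means)
    for _ in range(steps):
        if rng.random() < epsilon:        # occasionally explore at random
            a = rng.randrange(len(true_means))
        else:                             # otherwise exploit the best guess
            a = max(range(len(true_means)), key=lambda i: estimates[i])
        reward = true_means[a] + rng.gauss(0, 1)   # noisy reward signal
        counts[a] += 1
        estimates[a] += (reward - estimates[a]) / counts[a]  # running mean
    return estimates

# The agent is never told which action is best; it infers it from reward alone.
learned = epsilon_greedy_bandit([0.2, 1.0, 0.5])
best = max(range(3), key=lambda i: learned[i])
```

After a few thousand trials the agent's value estimates single out the high-reward action, with no strategy ever programmed in.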
The hypothesis for ImageNet—that computers could learn to "see" from vast visual data—was sparked by Dr. Li's reading of psychology research on how children learn. This demonstrates that radical innovation often emerges from the cross-pollination of ideas from seemingly unrelated fields.
Just as biology deciphers the complex systems created by evolution, mechanistic interpretability seeks to understand the "how" inside neural networks. Instead of treating models as black boxes, it examines their internal parameters and activations to reverse-engineer how they work, moving beyond just measuring their external behavior.
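To make the contrast with black-box evaluation concrete, here is a minimal sketch of inspecting internal activations rather than only the output. The network is a tiny hand-wired ReLU net computing XOR; the weights are invented for illustration, not taken from any real trained model:

```python
import numpy as np

# A tiny hand-wired ReLU network computing XOR (illustrative weights only).
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
W2 = np.array([1.0, -2.0])

def forward_with_activations(x):
    """Return the output AND the hidden activations, so the internals
    can be examined instead of treated as a black box."""
    h = np.maximum(0.0, x @ W1 + b1)   # hidden layer (ReLU)
    y = h @ W2                         # output layer
    return y, h

# Probing the internals reveals the circuit: hidden unit 0 fires when any
# input is on, hidden unit 1 fires only when both are on (an AND detector),
# and the output layer subtracts the AND from the OR -- yielding XOR.
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    y, h = forward_with_activations(np.array(x, float))
    print(x, "->", y, "hidden:", h)
```

Measuring only the outputs would tell you the model computes XOR; reading the activations tells you *how* it computes XOR, which is the move mechanistic interpretability makes.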
To make genuine scientific breakthroughs, an AI needs to learn the abstract reasoning strategies and mental models of expert scientists. This involves teaching it higher-level concepts, such as thinking in terms of symmetries, a core principle in physics that current models lack.
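As a toy illustration of what "thinking in symmetries" could mean computationally, here is a sketch (the helper and its name are hypothetical, invented for this example) that tests whether a function is invariant under permutation of its inputs:

```python
import itertools
import math

def is_permutation_invariant(f, sample, tol=1e-9):
    """Check a candidate symmetry: does f give the same answer for every
    ordering of its inputs? (Hypothetical helper, for illustration only.)"""
    base = f(list(sample))
    return all(math.isclose(f(list(p)), base, abs_tol=tol)
               for p in itertools.permutations(sample))

# sum respects the symmetry; subtracting the first two elements does not.
symmetric = is_permutation_invariant(sum, [1.0, 2.0, 3.0])
asymmetric = is_permutation_invariant(lambda xs: xs[0] - xs[1], [1.0, 2.0, 3.0])
```

Knowing in advance that an answer must be symmetric shrinks the space of hypotheses to search, which is the leverage expert physicists get from this habit of mind.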
The development of neural networks wasn't a linear path. It involved a cycle where computer scientists and psychologists alternately abandoned and revived the concept. When one discipline hit a wall or lost interest, researchers in the other field would pick it up, solve a key problem, and reignite progress.
The ultimate goal isn't just modeling specific systems (like protein folding), but automating the entire scientific method. This involves AI generating hypotheses, choosing experiments, analyzing results, and updating a 'world model' of a domain, creating a continuous loop of discovery.
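The hypothesize–experiment–analyze–update loop can be sketched in miniature. Here the "world" is just a hidden arithmetic rule and the candidate world models are invented for illustration; the point is the closed loop, not the domain:

```python
def discovery_loop(hidden_rule, candidates, probes):
    """Generate hypotheses, run experiments, and discard falsified models."""
    alive = dict(candidates)                     # current set of world models
    for x in probes:                             # choose an experiment
        observed = hidden_rule(x)                # run it and record the result
        alive = {name: f for name, f in alive.items()
                 if f(x) == observed}            # update beliefs: keep survivors
    return sorted(alive)

# Three invented candidate rules for a hidden number-in, number-out process.
candidates = {
    "double":   lambda x: 2 * x,
    "square":   lambda x: x * x,
    "plus_two": lambda x: x + 2,
}
surviving = discovery_loop(candidates["square"], candidates, probes=[3, 4])
```

A single well-chosen probe (x = 3, where all three rules disagree) falsifies two hypotheses at once; scaling this loop to rich scientific domains is the automation the insight describes.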
The current AI boom isn't a sudden, dangerous phenomenon. It's the culmination of 80 years of research since the first neural network paper in 1943. This long, steady progress counters the recent media-fueled hysteria about AI's immediate dangers.
The computer industry originally chose a "hyper-literal mathematical machine" path over a "human brain model" based on neural networks, a theory that had existed since the 1940s. The current AI wave represents the long-delayed success of that alternate, abandoned path.
AI models use simple, mathematically clean loss functions. The human brain's superior learning efficiency might stem from evolution hard-coding numerous, complex, and context-specific loss functions that activate at different developmental stages, creating a sophisticated learning curriculum.
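The contrast can be sketched in code. The first function is the kind of single, mathematically clean loss current models optimize; the second is a hypothetical sketch of the "developmental curriculum" idea, where different loss terms switch on at different stages. The stages, terms, and weights are all invented for illustration, not a real training recipe:

```python
import numpy as np

def mse(pred, target):
    """A single, mathematically clean loss: mean squared error."""
    pred, target = np.asarray(pred, float), np.asarray(target, float)
    return float(np.mean((pred - target) ** 2))

def curriculum_loss(pred, target, stage):
    """Invented sketch: stage-specific objectives forming a learning curriculum."""
    reconstruction = mse(pred, target)
    smoothness = float(np.mean(np.diff(np.asarray(pred, float)) ** 2))
    if stage == "infant":
        return smoothness                      # early stage: only coarse structure
    return reconstruction + 0.1 * smoothness   # later stage: also match targets
```

The speculation is that evolution baked in many such context-dependent objectives, whereas today's models train end-to-end against one fixed loss.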
A neuroscientist-led startup is growing live neurons on electrodes not just for compute efficiency, but as a platform to discover novel algorithms. By studying how biological networks process information, they identify neuroscience principles that can be used as software plugins to improve current AI models and find successors to the transformer architecture.