AI models are typically trained with a single, simple, mathematically clean loss function. The human brain's superior learning efficiency might stem from evolution hard-coding numerous, complex, and context-specific loss functions that activate at different developmental stages, creating a sophisticated learning curriculum.
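
A minimal sketch of what such a curriculum might look like in code. The stage length, the two objectives, and the linear blend are all invented for illustration, not taken from the source:

```python
import torch
import torch.nn.functional as F

def developmental_loss(step: int,
                       recon_pred: torch.Tensor, recon_target: torch.Tensor,
                       logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Blend two objectives with stage-dependent weights: reconstruction
    dominates "early development", categorization takes over later. The
    10k-step stage length and the two objectives are illustrative only."""
    mix = min(1.0, step / 10_000.0)               # 0.0 early -> 1.0 late
    early = F.mse_loss(recon_pred, recon_target)  # stage-one objective
    late = F.cross_entropy(logits, labels)        # stage-two objective
    return (1.0 - mix) * early + mix * late
```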

Related Insights

LLMs predict the next token in a sequence. The brain's cortex may function as a general prediction engine capable of "omnidirectional inference"—predicting any missing information from any available subset of inputs, not just what comes next. This offers a more flexible and powerful form of reasoning.
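
A rough sketch of the training objective this implies: something closer to masked modeling over arbitrary subsets than to next-token prediction. The model interface and the placeholder mask id are assumptions:

```python
import torch
import torch.nn.functional as F

def omnidirectional_loss(model, tokens: torch.Tensor,
                         mask_id: int = 0, mask_prob: float = 0.3) -> torch.Tensor:
    """Predict ANY hidden positions from whatever remains visible, rather
    than only the next token. `model` is assumed to map a (batch, seq)
    tensor of token ids to (batch, seq, vocab) logits."""
    mask = torch.rand(tokens.shape) < mask_prob    # random subset to hide
    corrupted = tokens.masked_fill(mask, mask_id)
    logits = model(corrupted)
    # Loss only on the hidden positions: infer the missing from the available.
    return F.cross_entropy(logits[mask], tokens[mask])
```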

The brain's hardware limitations, like slow and stochastic neurons, may actually be advantages. These properties seem perfectly suited for probabilistic inference algorithms that rely on sampling—a task that requires explicit, computationally intensive random number generation in digital systems. Hardware and algorithm are likely co-designed.
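
A toy illustration of the point: each Bernoulli draw below costs an explicit RNG call in software, whereas intrinsically noisy neurons would supply that randomness for free. The network and update rule are a generic Gibbs-style toy, not a brain model:

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_unit(drive: np.ndarray) -> np.ndarray:
    """A noisy binary 'neuron': fires with probability sigmoid(drive)."""
    p = 1.0 / (1.0 + np.exp(-drive))
    return (rng.random(p.shape) < p).astype(float)

# Repeated Gibbs-style sweeps over a toy symmetric network draw samples
# from the distribution the weights imply, which is exactly the kind of
# probabilistic inference this noisy "hardware" is naturally suited to.
W = rng.normal(0.0, 0.5, size=(8, 8))
W = (W + W.T) / 2.0          # symmetric coupling
np.fill_diagonal(W, 0.0)     # no self-connections
state = stochastic_unit(rng.normal(size=8))
for _ in range(100):
    state = stochastic_unit(W @ state)
print("final sample:", state)
```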

Even with vast training data, current AI models are far less sample-efficient than humans. This limits their ability to adapt and learn new skills on the fly. They resemble a perpetual new hire who can access information but lacks the deep, instinctual learning that comes from experience and weight updates.

"Amortized inference" bakes slow, deliberative reasoning into a fast, single-pass model. While the brain uses a mix, digital minds have a strong incentive to amortize more capabilities. This is because once a capability is baked in, the resulting model can be copied infinitely, unlike a biological brain.

The small size of the human genome is a puzzle. The solution may be that evolution doesn't store a large "pre-trained model." Instead, it uses the limited genomic space to encode a complex set of reward and loss functions, which is a far more compact way to guide a powerful learning algorithm.
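
A back-of-the-envelope check of the compactness argument, using standard order-of-magnitude estimates rather than figures from the source:

```python
# Rough order-of-magnitude check: the genome cannot plausibly store a
# brain's worth of "pre-trained weights", but it can easily store a set
# of objective functions. Estimates are standard textbook figures.
genome_bits  = 3.2e9 * 2   # ~3.2B base pairs at 2 bits each -> ~0.8 GB
synapse_bits = 1e14 * 8    # ~1e14 synapses at a byte each   -> ~100 TB
print(f"genome capacity:  {genome_bits / 8e9:.1f} GB")
print(f"weight storage:   {synapse_bits / 8e12:.0f} TB")
print(f"shortfall factor: {synapse_bits / genome_bits:,.0f}x")
```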

In humans, learning a new skill is a highly conscious process that becomes unconscious once mastered. This suggests a link between learning and consciousness. The error signals and reward functions in machine learning could be computational analogues to the valenced experiences (pain/pleasure) that drive biological learning.

Andrej Karpathy argues that comparing AI to animal learning is flawed because animal brains possess powerful initializations encoded in DNA via evolution. This lets animals perform complex behaviors almost immediately (e.g., a newborn zebra running shortly after birth), which contradicts the 'tabula rasa' or 'blank slate' approach of many AI models.
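
A toy rendering of the contrast, with a teacher network standing in for the behavior evolution selected for. Everything here is illustrative:

```python
import copy
import torch
import torch.nn as nn

# "Evolved" learner: starts near the target behavior (a slightly perturbed
# copy of the teacher), like a DNA-encoded prior. "Blank slate": random init.
teacher = nn.Linear(16, 4)
evolved_init = copy.deepcopy(teacher)
with torch.no_grad():
    for p in evolved_init.parameters():
        p.add_(0.01 * torch.randn_like(p))
blank_slate = nn.Linear(16, 4)

# Before any lifetime learning, the evolved init already behaves well.
x = torch.randn(256, 16)
with torch.no_grad():
    print("evolved init error:", ((evolved_init(x) - teacher(x)) ** 2).mean().item())
    print("blank slate error: ", ((blank_slate(x) - teacher(x)) ** 2).mean().item())
```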

Just as crawling is a vital developmental step for babies even though adults don't crawl, some learning processes that AI can automate might be essential for cognitive development. We shouldn't skip steps without understanding their underlying neurological purpose.

The Fetus GPT experiment reveals that while a GPT-style model trained on just 15 MB of text struggles, a human child learns language and complex concepts from a comparably small amount of data. This highlights the incredible data and energy efficiency of the human brain compared to large language models.

A novel training method involves adding an auxiliary task for AI models: predicting the neural activity of a human observing the same data. This "brain-augmented" learning could force the model to adopt more human-like internal representations, improving generalization and alignment beyond what simple labels can provide.
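
A hedged sketch of what that auxiliary objective could look like, assuming the neural recordings are reduced to a fixed-size feature vector per input. All dimensions and the 0.1 weighting are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BrainAugmentedModel(nn.Module):
    """Hypothetical sketch: a shared trunk with two heads, one for the
    task and one predicting recorded neural activity (e.g., an fMRI
    feature vector) for the same input."""

    def __init__(self, d_in=128, d_hidden=256, n_classes=10, d_brain=64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.task_head = nn.Linear(d_hidden, n_classes)
        self.brain_head = nn.Linear(d_hidden, d_brain)

    def loss(self, x, labels, brain_activity, aux_weight=0.1):
        h = self.trunk(x)
        task = F.cross_entropy(self.task_head(h), labels)
        # Auxiliary signal: match the human observer's neural response,
        # nudging internal representations toward human-like ones.
        brain = F.mse_loss(self.brain_head(h), brain_activity)
        return task + aux_weight * brain
```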