The small size of the human genome, relative to the behavioral complexity it produces, is a puzzle. The solution may be that evolution doesn't store a large "pre-trained model." Instead, it uses the limited genomic space to encode a complex set of reward and loss functions, which is a far more compact way to guide a powerful learning algorithm.
Even with vast training data, current AI models are far less sample-efficient than humans. This limits their ability to adapt and learn new skills on the fly. They resemble a perpetual new hire who can access information but lacks the deep, instinctual learning that comes from experience and weight updates.
Modern LLMs use a simple, outcome-based form of reinforcement learning that directly rewards successful final results. This contrasts with more sophisticated methods, like those in AlphaGo or the brain, which use "value functions" to estimate the long-term consequences of each intermediate step. It's a mystery why the simpler approach is so effective.
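The contrast can be made concrete with a toy sketch. Below, an outcome-based signal scores a whole episode by its final result, while a TD(0) value-function update assigns credit state by state. The three-state chain environment, constants, and function names are illustrative assumptions, not any system's actual implementation.

```python
# Toy contrast between outcome-based RL (one reward for the final result)
# and value-based RL (a learned estimate of long-term return per state).
# The 3-state chain and all parameters here are illustrative assumptions.

STATES = [0, 1, 2]   # a tiny chain; reaching state 2 counts as "success"
GAMMA = 0.9          # discount factor
ALPHA = 0.5          # learning rate

def outcome_reward(trajectory):
    """Outcome-style signal: 1.0 if the episode ended in success, else 0.0.
    The entire trajectory is scored by its endpoint alone."""
    return 1.0 if trajectory[-1] == 2 else 0.0

def td_update(values, s, s_next, reward):
    """Value-function style: bootstrap from the estimated value of the next
    state (TD(0)), spreading credit step by step instead of all at once."""
    target = reward + GAMMA * values[s_next]
    values[s] += ALPHA * (target - values[s])
    return values

values = {s: 0.0 for s in STATES}
# One successful episode through the chain: 0 -> 1 -> 2, reward on the last step.
episode = [0, 1, 2]
for s, s_next in zip(episode, episode[1:]):
    r = 1.0 if s_next == 2 else 0.0
    values = td_update(values, s, s_next, r)

print(outcome_reward(episode))  # 1.0: the episode is judged only by its end
print(values[1] > values[0])    # True: TD assigns more credit near the goal
```

After a single episode, the TD learner already distinguishes states by their proximity to success, while the outcome signal treats every step identically, which is part of why the effectiveness of the simpler approach is surprising.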
It is a profound mystery how evolution hardcodes abstract social desires (e.g., reputation) into our genome. Unlike simple sensory rewards, these require complex cognitive processing to even identify. Solving this could unlock powerful new methods for instilling robust, high-level values in AI systems.
Attempting to interpret every learned circuit in a complex neural network is a futile effort. True understanding comes from describing the system's foundational elements: its architecture, learning rule, loss functions, and the data it was trained on. The emergent complexity is a result of this process.
In humans, learning a new skill is a highly conscious process that becomes unconscious once mastered. This suggests a link between learning and consciousness. The error signals and reward functions in machine learning could be computational analogues to the valenced experiences (pain/pleasure) that drive biological learning.
Frances Arnold, an engineer by training, reframed biological evolution as a powerful optimization algorithm. Rather than treating it as a purely biological concept, she saw it as a process of iterative design that could be harnessed in the lab to build new enzymes far more effectively than traditional methods.
Andrej Karpathy argues that comparing AI to animal learning is flawed because animal brains possess powerful initializations encoded in DNA via evolution. This allows complex behaviors to appear almost instantly (e.g., a newborn zebra running), which contradicts the "tabula rasa" (blank slate) approach of many AI models.
Emotions act as a robust, evolutionarily-programmed value function guiding human decision-making. The absence of this function, as seen in brain damage cases, leads to a breakdown in practical agency. This suggests a similar mechanism may be crucial for creating effective and stable AI agents.
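One way to see why a missing value function breaks practical agency: an agent with a valence estimate over outcomes can rank its options, while one without it has no principled basis for choosing at all. The options and scores below are illustrative assumptions, not a model of the brain.

```python
# Toy sketch of "emotion as value function": decisions require some
# valence signal to rank outcomes. Everything here is illustrative.

def choose(options, value=None):
    """Pick the option with the highest estimated value. With no value
    function, every option is unrankable and no choice can be made."""
    if value is None:
        return None                 # practical agency breaks down
    return max(options, key=value)

# Hand-coded valences standing in for evolutionarily programmed emotion.
valence = {"touch stove": -10.0, "eat meal": 8.0, "idle": 0.0}
options = list(valence)

print(choose(options, value=valence.get))  # "eat meal"
print(choose(options))                     # None: no ranking without values
```

The degenerate `None` branch mirrors the clinical observation: the damage is not to reasoning about options but to the machinery that assigns them worth.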
The Fetus GPT experiment highlights the gap: a model trained on just 15MB of text struggles badly, while a human child learns language and complex concepts from a similarly small amount of data. This underscores the remarkable data and energy efficiency of the human brain compared to large language models.
AI models use simple, mathematically clean loss functions. The human brain's superior learning efficiency might stem from evolution hard-coding numerous, complex, and context-specific loss functions that activate at different developmental stages, creating a sophisticated learning curriculum.
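Such a curriculum of losses can be sketched as a schedule that switches objectives on at different developmental stages, rather than optimizing one global loss throughout. The stage boundaries, loss definitions, and names below are all illustrative assumptions.

```python
# Sketch of a developmental "curriculum of loss functions": distinct
# hand-coded losses activate at different stages instead of one global
# objective. Stages and losses are illustrative, not biological fact.

def sensorimotor_loss(pred, target):
    return (pred - target) ** 2            # early: raw prediction error

def imitation_loss(pred, target):
    return abs(pred - target)              # later: match a demonstrated value

def social_loss(pred, target):
    return max(0.0, 1.0 - pred * target)   # latest: hinge-like approval signal

# Map developmental stages (here, training steps) to the loss in force.
CURRICULUM = [
    (0,    sensorimotor_loss),
    (1000, imitation_loss),
    (5000, social_loss),
]

def active_loss(step):
    """Return the loss whose stage has most recently begun."""
    current = CURRICULUM[0][1]
    for start, loss_fn in CURRICULUM:
        if step >= start:
            current = loss_fn
    return current

print(active_loss(10) is sensorimotor_loss)   # True
print(active_loss(2500) is imitation_loss)    # True
print(active_loss(9000) is social_loss)       # True
```

The appeal of this scheme is compactness: the genome-analogue stores only a small table of objectives and switch points, while the heavy lifting is left to the learning algorithm each loss shapes.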