Contrary to the view that in-context learning is a distinct process from training, Karpathy speculates it might be an emergent form of gradient descent happening within the model's layers. He cites papers showing that transformers can learn to perform linear regression in-context, with internal mechanics that mimic an optimization loop.
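A minimal sketch of the result those papers formalize, under an assumed toy setup (dimensions and learning rate are illustrative): starting from zero weights, one step of gradient descent on in-context regression examples produces exactly the same prediction as a linear-attention readout over those examples.

```python
import numpy as np

# Toy setup (illustrative assumptions, not from the source): n in-context
# (x, y) pairs from a linear model, plus one query point to predict.
rng = np.random.default_rng(0)
d, n = 4, 32
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))              # in-context examples
y = X @ w_true
x_q = rng.normal(size=d)                 # query point
eta = 0.01                               # learning rate

# One gradient-descent step from w = 0 on L(w) = 0.5 * sum_i (w @ x_i - y_i)^2
w_gd = eta * (y @ X)                     # w = -eta * grad(L) at w = 0
pred_gd = w_gd @ x_q

# Linear attention over the context: keys x_i, values y_i, query x_q
pred_attn = eta * np.sum(y * (X @ x_q))

assert np.allclose(pred_gd, pred_attn)   # the two views coincide
```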

Related Insights

Reinforcement learning incentivizes AIs to find the right answer, not just mimic human text. As a result, they develop their own internal "dialect" for reasoning: a chain of thought that is effective but increasingly alien and incomprehensible to human observers.

The current limitation of LLMs is their stateless nature; they reset with each new chat. The next major advancement will be models that can learn from interactions and accumulate skills over time, evolving from a static tool into a continuously improving digital colleague.

LLMs learn two things from pre-training: factual knowledge and intelligent algorithms (the "cognitive core"). Karpathy argues the vast memorized knowledge is a hindrance, making models rely on memory instead of reasoning. The goal should be to strip away this knowledge to create a pure, problem-solving cognitive entity.

Karpathy identifies a key missing piece for continual learning in AI: an equivalent to sleep. Humans seem to use sleep to distill the day's experiences (their "context window") into the compressed weights of the brain. LLMs lack this distillation phase, forcing them to restart from a fixed state in every new session.
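One way to make the analogy concrete is distillation: fine-tune the model so that, without the context, it reproduces the behavior it exhibited with the context. The tiny models and names below are hypothetical stand-ins for illustration, not a real training recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
dim, vocab = 16, 50

# Hypothetical stand-ins: the "teacher" is the model run with the day's
# context concatenated to its input; the "student" is the same model run
# context-free, whose weights absorb the context during "sleep".
student = nn.Linear(dim, vocab)
teacher = nn.Linear(2 * dim, vocab)

prompt = torch.randn(64, dim)
context = torch.randn(1, dim).expand(64, dim)   # the session's context window
opt = torch.optim.Adam(student.parameters(), lr=1e-2)

for _ in range(200):                            # the "sleep" / distillation phase
    with torch.no_grad():
        target = F.softmax(teacher(torch.cat([prompt, context], dim=-1)), dim=-1)
    loss = F.kl_div(F.log_softmax(student(prompt), dim=-1),
                    target, reduction="batchmean")
    opt.zero_grad(); loss.backward(); opt.step()
```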

The current focus on pre-training AI for fluency with specific tools overlooks the crucial need for on-the-job, context-specific learning. Humans excel because they don't need to pre-rehearse every task. This gap suggests AGI is further away than some believe, as true intelligence requires self-directed, continuous learning in novel environments.

The distinction between imitation learning and reinforcement learning (RL) is not a rigid dichotomy. Next-token prediction in LLMs can be framed as a form of RL where the "episode" is just one token long and the reward is based on prediction accuracy. This conceptual model places both learning paradigms on a continuous spectrum rather than in separate categories.
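The claim can be checked directly. In the sketch below, the "episode" is one token, the action is fixed to the ground-truth token, and the reward is 1; REINFORCE's -reward * log pi(action) objective then yields exactly the cross-entropy gradient of imitation learning.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(10, requires_grad=True)   # next-token logits
target = torch.tensor(3)                       # ground-truth next token

# Imitation view: cross-entropy on the ground-truth token.
ce = F.cross_entropy(logits.unsqueeze(0), target.unsqueeze(0))
g_imitation, = torch.autograd.grad(ce, logits)

# RL view: a one-token episode whose action is the target token and whose
# reward is 1, scored with REINFORCE's -reward * log pi(action) objective.
reward = 1.0
pg = -(reward * F.log_softmax(logits, dim=-1)[target])
g_rl, = torch.autograd.grad(pg, logits)

assert torch.allclose(g_imitation, g_rl)       # identical gradients
```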

Overloading LLMs with excessive context degrades performance, a phenomenon known as "context rot". Claude Skills address this by loading context only when relevant to a specific task. This laser-focused approach improves accuracy and avoids the performance degradation seen in broader, project-level contexts.
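Concretely, a Skill is a folder containing a SKILL.md file. Roughly (the fields and content below are illustrative, not an official template), only the short frontmatter stays in context at all times; the full body is read in only when the model judges that the task matches the description:

```markdown
---
name: quarterly-report
description: Formats quarterly financial data into the team's report template.
---

## Instructions
Step-by-step guidance, templates, and scripts live here, and are loaded
into context only when this skill is actually invoked.
```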

The 2017 introduction of "transformers" revolutionized AI. Instead of relying on a fixed meaning for each word, models began learning the contextual relationships between all the words in a sequence through the attention mechanism. This allowed AI to predict the next word without needing a formal dictionary, leading to more generalist capabilities.
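A toy sketch of that mechanism, with illustrative dimensions: each token's new representation is a learned, context-dependent weighted mix of every token in the sequence.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d = 5, 8                         # 5 tokens, 8-dim embeddings
X = rng.normal(size=(seq_len, d))         # token embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))  # learned projections

Q, K, V = X @ Wq, X @ Wk, X @ Wv
scores = Q @ K.T / np.sqrt(d)             # pairwise relevance between tokens
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
out = weights @ V                         # context-mixed representations
```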

Unlike humans, whose poor memory forces them to generalize and find patterns, LLMs are incredibly good at memorization. Karpathy argues this is a flaw: recalling specific training documents distracts them from the underlying, generalizable algorithms of thought and hinders true understanding.

The key to continual learning is not just a longer context window, but a new architecture with a spectrum of memory types. "Nested learning" proposes a model with different layers that update at different frequencies—from transient working memory to persistent core knowledge—mimicking how humans learn without catastrophic forgetting.
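A hedged sketch of the multi-frequency idea (the grouping and schedule here are illustrative assumptions, not the proposal's actual architecture): fast parameters update every step like working memory, while slow parameters consolidate accumulated gradients only every K steps.

```python
import torch

torch.manual_seed(0)
fast = torch.nn.Linear(16, 16)   # transient "working memory" layer
slow = torch.nn.Linear(16, 16)   # persistent "core knowledge" layer

opt_fast = torch.optim.SGD(fast.parameters(), lr=1e-2)
opt_slow = torch.optim.SGD(slow.parameters(), lr=1e-4)
K = 100                          # slow layer consolidates every K steps

for step in range(1, 1001):
    x = torch.randn(8, 16)
    loss = slow(fast(x)).pow(2).mean()       # stand-in objective
    loss.backward()
    opt_fast.step(); opt_fast.zero_grad()    # fast timescale: every step
    if step % K == 0:                        # slow timescale: rare, consolidating
        opt_slow.step(); opt_slow.zero_grad()
```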