The success of LLMs, driven by the "bitter lesson" that scale is paramount, isn't unique to language. The same principles—pre-training, post-training, and reinforcement learning—can be applied to search models to achieve breakthrough performance in information retrieval.

Related Insights

The "Bitter Lesson" is not just about using more compute, but leveraging it scalably. Current LLMs are inefficient because they only learn during a discrete training phase, not during deployment where most computation occurs. This reliance on a special, data-intensive training period is not a scalable use of computational resources.

With model intelligence advancing, the next hurdles for perfect search are operational. The first is building infrastructure to handle a 1000x increase in agent-driven queries. The second is the "data bottleneck": capturing and indexing the vast amount of information that exists only offline.

Instead of using massive, expensive LLMs for every task, companies can address the "tokenpocalypse" (runaway token costs) by pairing smaller models with high-quality retrieval systems. Given the right retrieved context, cheap models can act like large ones, which cuts costs significantly.
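The pairing described above can be sketched in a few lines. This is a minimal illustration, not any company's actual pipeline: the function names (`retrieve`, `build_prompt`), the TF-IDF-style scoring, and the prompt wording are all assumptions made for the example. The call to the small model is deliberately left out, since that API would differ by provider.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, passages, k=2):
    """Return up to k passages ranked by a simple TF-IDF overlap with the query."""
    docs = [Counter(tokenize(p)) for p in passages]
    n = len(docs)

    def idf(term):
        # Smoothed inverse document frequency: rarer terms weigh more.
        df = sum(1 for d in docs if term in d)
        return math.log((n + 1) / (df + 1)) + 1.0

    q_terms = set(tokenize(query))
    scored = []
    for i, d in enumerate(docs):
        score = sum(d[t] * idf(t) for t in q_terms)
        scored.append((score, i))
    # Highest score first; ties keep the original passage order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [passages[i] for score, i in scored[:k] if score > 0]


def build_prompt(query, passages, k=2):
    """Assemble a prompt that grounds a (hypothetical) small model in retrieved text."""
    context = retrieve(query, passages, k)
    lines = ["Answer using only the context below.", ""]
    lines += [f"[{n}] {p}" for n, p in enumerate(context, start=1)]
    lines += ["", f"Question: {query}"]
    return "\n".join(lines)
```

The design point is that the retrieval step, not the model, supplies the knowledge: the prompt returned by `build_prompt` would be sent to whichever small model is in use, and only the few retrieved passages count toward its token budget.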

The original playbook of simply scaling parameters and data is now obsolete. Top AI labs have pivoted to heavily designed post-training pipelines, retrieval, tool use, and agent training, acknowledging that raw scaling is insufficient to solve real-world problems.

Reinforcement learning achieves superhuman results not by inventing alien concepts, but by surfacing and combining rare behaviors that are already possible within a model's vast pre-trained distribution. The goal of pre-training is to make this search for novel solutions more efficient and less random.

Google's moats (human click data, large re-ranking teams) are less relevant for AI agents. LLMs allow small, agile teams to build superior search products by training their own models without needing decades of user signal data.

In 2001, Google realized its combined server RAM could hold a full copy of its web index. Moving from disk-based to in-memory systems eliminated slow disk seeks, enabling complex queries with synonyms and semantic expansion. This fundamentally improved search quality long before LLMs became mainstream.
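To make the idea concrete, here is a toy sketch of an in-memory inverted index with synonym expansion. It is an assumption-laden illustration rather than a description of Google's system: the `build_index` and `search` functions and the synonym table are hypothetical. Because every posting list lives in a dictionary in RAM, checking several variants of each query term is just a few extra lookups, whereas on disk each variant could cost an additional seek.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to the set of document ids that contain it (all in memory)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query, synonyms):
    """AND across query terms, OR across each term's synonyms."""
    result = None
    for term in query.lower().split():
        # Expand the term with its synonyms (hypothetical table supplied by the caller).
        variants = {term, *synonyms.get(term, ())}
        matches = set()
        for variant in variants:
            matches |= index.get(variant, set())
        result = matches if result is None else result & matches
    return sorted(result or ())
```

For example, with documents `{1: "cheap car rental", 2: "inexpensive automobile hire", 3: "car wash"}` and synonyms mapping "cheap" to "inexpensive" and "car" to "automobile", the query "cheap car" matches documents 1 and 2, while the same query without synonyms matches only document 1.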

A key surprise in AI development was the non-linear impact of scale. Sebastian Thrun noted that while AI trained on millions of documents is 'fine,' training it on hundreds of billions creates an 'unbelievably smart' system, shocking even its creators and demonstrating data volume as a primary driver of breakthroughs.

Perplexity leverages its user-facing product to improve its core search technology. When the LLM reasons through search snippets and selects which ones to cite in an answer, that selection process acts as a powerful signal to refine and improve the underlying search ranking algorithm for future queries.
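One way such a feedback loop could work is sketched below. This is purely illustrative and not Perplexity's actual method: the `CitationFeedback` class, the smoothing priors, and the blending weight are assumptions. The sketch counts how often a document is cited when it is shown to the LLM, then blends that smoothed citation rate with the ranker's base score.

```python
from collections import defaultdict


class CitationFeedback:
    """Tracks how often shown documents end up cited in answers (hypothetical design)."""

    def __init__(self, prior_shown=2.0, prior_cited=1.0):
        self.shown = defaultdict(int)
        self.cited = defaultdict(int)
        # Priors smooth the rate so rarely seen documents are not over- or under-rated.
        self.prior_shown = prior_shown
        self.prior_cited = prior_cited

    def record(self, shown_ids, cited_ids):
        """Log one answer: which snippets the LLM saw and which ones it cited."""
        shown = set(shown_ids)
        for doc_id in shown:
            self.shown[doc_id] += 1
        for doc_id in set(cited_ids) & shown:
            self.cited[doc_id] += 1

    def citation_rate(self, doc_id):
        """Smoothed fraction of impressions in which the document was cited."""
        cited = self.cited.get(doc_id, 0) + self.prior_cited
        shown = self.shown.get(doc_id, 0) + self.prior_shown
        return cited / shown

    def rerank(self, base_scores, weight=0.5):
        """Order documents by a blend of the base score and the citation rate."""
        def blended(doc_id):
            return (1 - weight) * base_scores[doc_id] + weight * self.citation_rate(doc_id)
        return sorted(base_scores, key=blended, reverse=True)
```

In a production system the logged signal would more plausibly be used as training data for the ranking model rather than as a direct score adjustment; the blend here just keeps the example short.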

Yahoo built its AI search engine, Scout, not by training a massive model, but by using a smaller, affordable LLM (Anthropic's Haiku) as a processing layer. The real power comes from feeding this model Yahoo's 30 years of proprietary search data and knowledge graphs.