We scan new podcasts and send you the top 5 insights daily.
With model intelligence advancing, the next hurdles for perfect search are operational. First, building infrastructure to handle a 1000x increase in agent-driven queries. Second, the "data bottleneck" of capturing and indexing vast information that exists only offline.
The industry has already exhausted the public web data used to train foundational AI models, a point underscored by the phrase "we've already run out of data." The next leap in AI capability and business value will come from harnessing the vast, proprietary data currently locked behind corporate firewalls.
The effectiveness of AI agents is fundamentally limited by their data inputs. In the agent era, access to clean and structured web data is no longer a commodity but a critical piece of infrastructure, making tools that provide it immensely valuable. AI models have brains but are blind without this data.
The success of LLMs, driven by the "bitter lesson" that scale is paramount, isn't unique to language. The same principles—pre-training, post-training, and reinforcement learning—can be applied to search models to achieve breakthrough performance in information retrieval.
In 2001, Google realized its combined server RAM could hold a full copy of its web index. Moving from disk-based to in-memory systems eliminated slow disk seeks, enabling complex queries with synonyms and semantic expansion. This fundamentally improved search quality long before LLMs became mainstream.
Unlike humans who type 2-3 words, LLMs generate long, sentence-like queries (e.g., eight words or more) to gather comprehensive context. This shift in user behavior from human to AI requires search engines to be optimized for these detailed, descriptive inputs.
The focus in AI has evolved from rapid software capability gains to the physical constraints of its adoption. The demand for compute power is expected to significantly outstrip supply, making infrastructure—not algorithms—the defining bottleneck for future growth.
AI agents, unlike humans, need complete and exhaustive information (thousands of results) and use complex, controllable queries. A search engine built for human keyword simplicity and limited results will fail to serve them effectively.
For years, access to compute was the primary bottleneck in AI development. Now, as public web data is largely exhausted, the limiting factor is access to high-quality, proprietary data from enterprises and human experts. This shifts the focus from building massive infrastructure to forming data partnerships and expertise.
Previously, the biggest constraint in AI was compute for training next-gen models. Now, the critical bottleneck is providing enough compute for *inference*—the real-time processing of queries from a rapidly growing user base.
The nature of Retrieval-Augmented Generation (RAG) is evolving. Instead of a single search to populate an initial context window, AI agents are now performing numerous concurrent queries in a single turn. This allows them to explore diverse information paths simultaneously, driving new database requirements.