Achieving state-of-the-art AI performance requires a massive, bespoke data generation process. This involves thousands of human experts—from legal specialists to management consultants—creating specific examples, rubrics, and chain-of-thought explanations, forming a new and rapidly growing data industry that is the true engine of progress.

Related Insights

AI startup Mercor's valuation quintupled to $10B by connecting AI labs with domain experts to train models. This reveals that the most critical bottleneck for advanced AI is not just data or compute, but reinforcement learning from highly skilled human feedback, creating a new "RL economy."

Early AI training involved simple preference tasks. Now, training frontier models requires PhDs and top professionals to perform complex, hours-long tasks like building entire websites or explaining nuanced cancer topics. The demand is for deep, specialized expertise, not just generalist labor.

LLMs have hit a wall now that nearly all available public data has been scraped. The next phase of AI development and competitive differentiation will come from training models on high-quality, proprietary data generated by human experts. This creates a booming "data as a service" industry for companies like micro1 that recruit and manage these experts.

Previously, compute and data were the limiting factors in AI development. Now, the challenge is scaling the generation of high-quality, human-expert data needed to train frontier models for complex cognitive tasks that go beyond simply processing the public internet.

The era of simple data labeling is over. Frontier AI models now require complex, expert-generated data to break current capabilities and advance research. Data providers like Turing now act as strategic research partners to AI labs, not just data factories.

While AI has mastered verifiable tasks with clear right answers, its future growth depends on human experts training models in subjective fields where 'good' is not easily defined. Companies are now sourcing professionals to act as 'verifiers' that teach AI nuanced, domain-specific judgment.

With the public internet fully indexed, LLMs now require net-new, high-fidelity data to improve. This has created a booming market for domain experts in fields like law, finance, and medicine to work as freelance "AI trainers." This new job category involves creating complex, proprietary data sets, often for high compensation.

A massive opportunity for AI lies in unearthing and recording experts' tacit, unwritten knowledge—the "knack" for doing things that is lost when they die. This "dark data," once fed into models, will unlock immense, currently inaccessible value.

Mercor's $500M revenue in 17 months highlights a shift in AI training. The focus is moving from low-paid data labelers to a marketplace of elite experts like doctors and lawyers providing high-quality, nuanced data. This creates a new, lucrative gig economy for top-tier professionals.

AI models have absorbed the internet's general knowledge, so the new bottleneck is correcting complex, domain-specific reasoning. This creates a market for specialists (e.g., physicists, accountants) to provide 'post-training' human feedback on subtle errors.

Frontier AI Models Are Powered by a Hidden Decabillion-Dollar Industry of Human Experts | RiffOn