With the public internet fully indexed, LLMs now require net-new, high-fidelity data to improve. This has created a booming market for domain experts in fields like law, finance, and medicine to work as freelance "AI trainers." This new job category involves creating complex, proprietary data sets, often for high compensation.

Related Insights

AI startup Mercore's valuation quintupled to $10B by connecting AI labs with domain experts to train models. This reveals that the most critical bottleneck for advanced AI is not just data or compute, but reinforcement learning from highly skilled human feedback, creating a new "RL economy."

LLMs have hit a wall by scraping nearly all available public data. The next phase of AI development and competitive differentiation will come from training models on high-quality, proprietary data generated by human experts. This creates a booming "data as a service" industry for companies like Micro One that recruit and manage these experts.

To move beyond general knowledge, AI firms are creating a new role: the "AI Trainer." These are not contractors but full-time employees, typically PhDs with deep domain expertise and a computer science interest, tasked with systematically improving model competence in specific fields like physics or mathematics.

The era of simple data labeling is over. Frontier AI models now require complex, expert-generated data to break current capabilities and advance research. Data providers like Turing now act as strategic research partners to AI labs, not just data factories.

Instead of repeatedly performing tasks, knowledge workers will train AI agents by creating "evals"—data sets that teach the AI how to handle specific workflows. This fundamental shift means the economy will transition from paying for human execution to paying for human training data.

For years, access to compute was the primary bottleneck in AI development. Now, as public web data is largely exhausted, the limiting factor is access to high-quality, proprietary data from enterprises and human experts. This shifts the focus from building massive infrastructure to forming data partnerships and expertise.

Companies like OpenAI and Anthropic are spending billions creating simulated enterprise apps (RL gyms) where human experts train AI models on complex tasks. This has created a new, rapidly growing "AI trainer" job category, but its ultimate purpose is to automate those same expert roles.

Mercore's $500M revenue in 17 months highlights a shift in AI training. The focus is moving from low-paid data labelers to a marketplace of elite experts like doctors and lawyers providing high-quality, nuanced data. This creates a new, lucrative gig economy for top-tier professionals.

Data is becoming more expensive not from scarcity, but because the work has evolved. Simple labeling is over. Costs are now driven by the need for pricey domain experts for specialized data preparation and creative teams to build complex, synthetic environments for training agents.

By paying over 100 former Wall Street bankers to train its models on complex financial tasks, OpenAI is creating a template for vertical AI dominance. This 'expert-as-a-contractor' model will be replicated across law, accounting, and consulting to systematically automate lucrative knowledge work sectors.