Early AI training involved simple preference tasks. Now, training frontier models requires PhDs and top professionals to perform complex, hours-long tasks like building entire websites or explaining nuanced cancer topics. The demand is for deep, specialized expertise, not just generalist labor.

Related Insights

AI startup Mercor's valuation quintupled to $10B by connecting AI labs with domain experts to train models. This reveals that the most critical bottleneck for advanced AI is not just data or compute, but reinforcement learning from highly skilled human feedback, creating a new "RL economy."

LLMs have hit a wall after scraping nearly all available public data. The next phase of AI development and competitive differentiation will come from training models on high-quality, proprietary data generated by human experts. This creates a booming "data as a service" industry for companies like micro1 that recruit and manage these experts.

To move beyond general knowledge, AI firms are creating a new role: the "AI Trainer." These are not contractors but full-time employees, typically PhDs with deep domain expertise and an interest in computer science, tasked with systematically improving model competence in specific fields like physics or mathematics.

The era of advancing AI simply by scaling pre-training is ending due to data limits. The field is re-entering a research-heavy phase focused on novel, more efficient training paradigms beyond just adding more compute to existing recipes. The bottleneck is shifting from resources back to ideas.

Current AI models resemble a student who grinds 10,000 hours on a narrow task. They achieve superhuman performance on benchmarks but lack the broad, adaptable intelligence of someone with less specific training but better general reasoning. This explains the gap between eval scores and real-world utility.

In a group of 100 experts training an AI, the top 10 will often drive the majority of the model's improvement. This creates a power-law dynamic in which the ability to source and identify this elite talent becomes a key competitive moat for AI labs and data providers.

Companies like OpenAI and Anthropic are spending billions creating simulated enterprise apps (RL gyms) where human experts train AI models on complex tasks. This has created a new, rapidly growing "AI trainer" job category, but its ultimate purpose is to automate those same expert roles.

Mercor's $500M revenue in 17 months highlights a shift in AI training. The focus is moving from low-paid data labelers to a marketplace of elite experts like doctors and lawyers providing high-quality, nuanced data. This creates a new, lucrative gig economy for top-tier professionals.

As reinforcement learning (RL) techniques mature, the core challenge shifts from the algorithm to the problem definition. The competitive moat for AI companies will be their ability to create high-fidelity environments and benchmarks that accurately represent complex, real-world tasks, effectively teaching the AI what matters.

Data is becoming more expensive not because it is scarce, but because the work has evolved. The era of simple labeling is over. Costs are now driven by the need for expensive domain experts to do specialized data preparation and for creative teams to build complex, synthetic environments for training agents.