AI systems from companies like Meta and OpenAI rely on a vast, unseen workforce of data labelers in developing nations. These communities perform the crucial but low-paid labor that powers modern AI, yet they are often the most marginalized and least likely to benefit from the technology they help build.

Related Insights

Early AI training involved simple preference tasks. Now, training frontier models requires PhDs and top professionals to perform complex, hours-long tasks like building entire websites or explaining nuanced cancer topics. The demand is for deep, specialized expertise, not just generalist labor.

For the VCs funding AI, the economic incentive is replacing human labor, a $13 trillion market in the US alone. That dwarfs the $300 billion SaaS market, which suggests the ultimate goal is automating knowledge work, not just building software.
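To put the two market figures on one scale, here is a minimal sketch that computes their ratio. The dollar amounts are taken from the paragraph above; the variable and function names are illustrative only.

```python
# Figures quoted in the text (US dollars); names are hypothetical.
US_LABOR_MARKET_USD = 13e12   # $13 trillion
SAAS_MARKET_USD = 300e9       # $300 billion


def market_ratio(larger: float, smaller: float) -> float:
    """Return how many times larger the first market is than the second."""
    return larger / smaller


ratio = market_ratio(US_LABOR_MARKET_USD, SAAS_MARKET_USD)
print(f"US labor market is roughly {ratio:.0f}x the SaaS market")
```

On these figures the labor market is a little over 43 times the size of the SaaS market.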

Companies like OpenAI and Anthropic are spending billions creating simulated enterprise apps (RL gyms) where human experts train AI models on complex tasks. This has created a new, rapidly growing "AI trainer" job category, but its ultimate purpose is to automate those same expert roles.

Mercor's $500M revenue in 17 months highlights a shift in AI training. The focus is moving from low-paid data labelers to a marketplace of elite experts, such as doctors and lawyers, who provide high-quality, nuanced data. This creates a new, lucrative gig economy for top-tier professionals.

The market reality is that consumers and businesses prioritize the best-performing AI models, regardless of whether their training data was ethically sourced. This dynamic incentivizes labs to use all available data, including copyrighted works, and treat potential fines as a cost of doing business.

The enormous market caps of leading AI companies can only be justified by finding trillions of dollars in efficiencies. That translates into eliminating roughly 10 million jobs, or 12.5% of the vulnerable workforce, which suggests that either market turmoil or mass unemployment is inevitable.
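The two job figures above imply a size for the "vulnerable workforce" that the text does not state. A minimal sketch of that back-of-the-envelope arithmetic follows, assuming the 12.5% is a share of that workforce; the names are illustrative, and the result is an implication of the quoted numbers, not a sourced statistic.

```python
# Figures quoted in the text; names are hypothetical.
JOBS_ELIMINATED = 10_000_000     # "roughly 10 million jobs"
SHARE_OF_VULNERABLE = 0.125      # "12.5% of the vulnerable workforce"


def implied_workforce(jobs: int, share: float) -> float:
    """Back out the total workforce size from a job count and its share."""
    return jobs / share


vulnerable_workforce = implied_workforce(JOBS_ELIMINATED, SHARE_OF_VULNERABLE)
print(f"Implied vulnerable workforce: {vulnerable_workforce:,.0f} jobs")
```

Taken together, the quoted figures imply a vulnerable workforce of about 80 million jobs.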

Frame AI not as a tool, but as a wave of "digital immigrants" with superhuman cognitive abilities. Similar to how the NAFTA trade agreement outsourced manufacturing, AI will outsource knowledge work. This will create abundance for some but risks hollowing out the middle class and social fabric.

The concept of data colonialism—extracting value from a population's data—is no longer limited to the Global South. It now applies to creative professionals in Western countries whose writing, music, and art are scraped without consent to build generative AI systems, concentrating wealth and power in the hands of a few tech firms.

While data labeling companies show massive revenue growth, their customer base is often limited to a few frontier AI labs. This creates a lopsided market where providers have little leverage, compete on price, and are heavily dependent on a handful of clients, making the ecosystem potentially unstable.

Data is becoming more expensive not from scarcity, but because the work has evolved. Simple labeling is over. Costs are now driven by the need for pricey domain experts for specialized data preparation and creative teams to build complex, synthetic environments for training agents.