The primary bottleneck in improving AI is no longer data or compute, but the creation of evals: tests that measure a model's capabilities. These evals act as product requirement documents (PRDs) for researchers, defining what success looks like and guiding the training process.
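The idea of an eval as a machine-checkable spec can be sketched as a tiny harness. This is a minimal illustration, not any lab's actual tooling; all names and tasks are hypothetical.

```python
# Minimal sketch of an eval: a set of tasks with graders, scoring a model.
# "model" is any callable str -> str; tasks and graders are illustrative.

def run_eval(model, tasks):
    """tasks: list of (prompt, grader) pairs where grader(output) -> bool.
    Returns the pass rate: the capability metric this eval defines."""
    passed = sum(grader(model(prompt)) for prompt, grader in tasks)
    return passed / len(tasks)

# A toy eval with two tasks:
tasks = [
    ("2+2=", lambda out: "4" in out),
    ("Capital of France?", lambda out: "Paris" in out),
]

# A stand-in "model" that answers from a lookup table:
toy_model = lambda prompt: {"2+2=": "4", "Capital of France?": "Paris"}.get(prompt, "")

print(run_eval(toy_model, tasks))  # prints 1.0
```

The pass rate is what makes an eval act like a PRD: it turns "the model should be good at X" into a number researchers can optimize against.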
Traditional hiring assessments that ban modern tools are obsolete. A better approach is to give candidates access to AI tools and ask them to complete a complex task in an hour. This tests their ability to leverage technology for productivity, not their ability to memorize information.
For over a year, Mercor focused 100% of its resources on product and customer experience, forgoing a sales team. This deep focus on flagship customers in a tight-knit industry (AI labs) generated powerful word-of-mouth that fueled its historic growth.
The demand from AI labs for high-skilled professionals (engineers, lawyers, doctors) to create evals and training data created a historic business opportunity. Mercor capitalized on this by creating an expert labor marketplace, becoming the fastest-growing company in history.
Founders can waste time trying to force an initial idea. The key is to remain open-minded and identify where the market is surprisingly easy to sell into. Mercor found hypergrowth by pivoting from general hiring to serving the intense, specific needs of AI labs.
In a group of 100 experts training an AI, the top 10% will often drive the majority of the model's improvement. This creates a power law dynamic where the ability to source and identify this elite talent becomes a key competitive moat for AI labs and data providers.
Instead of fearing job loss, focus on skills in industries with elastic demand. When AI makes workers 10x more productive in these fields (e.g., software), the market will demand 100x more output, increasing the need for skilled humans who can leverage AI.
The frontier of AI training is moving beyond humans ranking model outputs (RLHF). Now, high-skilled experts create detailed success criteria (like rubrics or unit tests), which an AI then uses to provide feedback to the main model at scale, a process called RLAIF (reinforcement learning from AI feedback).
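The rubric-to-reward step can be sketched as follows. This is an illustrative toy, not a real RLAIF pipeline: a simple predicate stands in for the AI judge, and all names are hypothetical.

```python
# Sketch of rubric-based grading in an RLAIF-style loop.
# An expert authors weighted criteria; a judge scores each response
# against them, producing a scalar reward for the main model.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Criterion:
    name: str
    weight: float
    check: Callable[[str], bool]  # in practice an AI judge; here a predicate

def rubric_score(response: str, rubric: List[Criterion]) -> float:
    """Weighted fraction of rubric criteria the response satisfies.
    This scalar would serve as the feedback signal during training."""
    total = sum(c.weight for c in rubric)
    earned = sum(c.weight for c in rubric if c.check(response))
    return earned / total if total else 0.0

# Example rubric an expert might author for a coding answer:
rubric = [
    Criterion("states complexity", 1.0, lambda r: "O(" in r),
    Criterion("includes code", 2.0, lambda r: "def " in r),
    Criterion("is concise", 1.0, lambda r: len(r) < 500),
]

answer = "def solve(xs): return sorted(xs)  # O(n log n)"
print(rubric_score(answer, rubric))  # prints 1.0 (all criteria met)
```

The leverage comes from the split: the expert writes the rubric once, and the AI judge applies it to millions of responses, which is what lets the feedback scale beyond human ranking.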
