As models mature, their core differentiator will become their underlying personality and values, shaped by their creators' objective functions. One model might optimize for user productivity by being concise, while another optimizes for engagement by being verbose.
The best AI models are trained on data that reflects deep, subjective qualities, not just simple correctness criteria. This "taste" is a key differentiator, influencing everything from code generation to creative writing, and it is shaped by the values of the frontier lab that builds the model.
The term "data labeling" minimizes the complexity of AI training. A better analogy is "raising a child," as the process involves teaching values, creativity, and nuanced judgment. This reframe highlights the deep responsibility of shaping the "objective functions" for future AI.
Labs are incentivized to climb leaderboards like LM Arena, which reward flashy, engaging, but often inaccurate responses. This focus on "dopamine instead of truth" produces models that read like tabloids rather than tools for advancing humanity by solving hard problems.
Instead of chasing trends or pivoting every few weeks, founders should focus on a singular mission that stems from their unique expertise and conviction. This approach builds durable, meaningful companies rather than simply chasing valuations.
Surge AI intentionally avoided VC funding and the "Silicon Valley game" of hype and fundraising. This forced them to build a 10x better product that grew via word-of-mouth, attracting customers who genuinely valued data quality over marketing.
Beyond supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), reinforcement learning in simulated environments is the next evolution. These "playgrounds" teach models to handle messy, multi-step, real-world tasks where current models often fail catastrophically.
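To make the idea concrete, here is a minimal sketch of what such a training "playground" might look like: a gym-style environment with a reset/step loop, a multi-step toy task, and a sparse reward. The task, class name, and reward scheme are illustrative assumptions, not Surge AI's or any lab's actual setup.

```python
# Minimal sketch of a multi-step RL "playground" environment.
# The gym-style reset/step interface and the toy support-ticket task are
# illustrative assumptions, not any lab's actual training environment.
from dataclasses import dataclass, field


@dataclass
class TicketTriageEnv:
    """Toy multi-step task: read the ticket, look up the account, then refund."""
    required_steps: tuple = ("read_ticket", "lookup_account", "issue_refund")
    history: list = field(default_factory=list)

    def reset(self) -> str:
        self.history = []
        return "New ticket: customer requests a refund for a recent order."

    def step(self, action: str):
        """Apply one action; return (observation, reward, done)."""
        step_index = len(self.history)
        self.history.append(action)
        if step_index >= len(self.required_steps) or action != self.required_steps[step_index]:
            # A wrong or out-of-order action ends the episode with a penalty,
            # mimicking the catastrophic failures seen on messy multi-step tasks.
            return "error: task failed", -1.0, True
        done = len(self.history) == len(self.required_steps)
        reward = 1.0 if done else 0.0  # sparse reward: only a completed task pays off
        return f"completed: {action}", reward, done


# In training, an LLM agent would pick the actions and its policy would be
# updated from episode rewards; this scripted rollout just shows the loop.
env = TicketTriageEnv()
obs = env.reset()
for action in ("read_ticket", "lookup_account", "issue_refund"):
    obs, reward, done = env.step(action)
    print(obs, reward, done)
    if done:
        break
```

The point of the sketch is the shape of the problem, not the task itself: success requires several correct actions in sequence, and a single misstep ends the episode, which is exactly the kind of behavior single-turn SFT and RLHF data do not capture well.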
Drawing from experience at big tech, Surge AI's founder believes large organizations slow down top performers with distractions. By building a super-small, elite team, companies can achieve more with less overhead, a principle proven by Surge's own success.
Don't trust academic benchmarks. Labs often "hill climb" or game them for marketing purposes, which doesn't translate to real-world capability. Furthermore, many of these benchmarks contain incorrect answers and messy data, making them an unreliable measure of true AI advancement.
