
Companies like Intercom and Cursor are proving that fine-tuning open-weight models on specific, "last-mile" user interaction data creates cheaper, faster, and more accurate models for vertical tasks (like customer service or coding) than general-purpose frontier models from labs like OpenAI.
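To make the "last-mile" idea concrete, here is a minimal, illustrative sketch of how interaction logs might be turned into supervised fine-tuning pairs. The field names (`resolved`, `turns`, `role`, `text`) and the filtering rule are assumptions for illustration, not any company's actual pipeline.

```python
import json

def to_sft_examples(interactions):
    """Convert raw support-chat logs into (prompt, completion) pairs
    suitable for supervised fine-tuning. Field names are illustrative."""
    examples = []
    for chat in interactions:
        # Keep only resolved conversations -- the "last-mile" signal that
        # the agent's final reply actually worked for the user.
        if not chat.get("resolved"):
            continue
        prompt = "\n".join(
            f"{turn['role']}: {turn['text']}" for turn in chat["turns"][:-1]
        )
        completion = chat["turns"][-1]["text"]
        examples.append({"prompt": prompt, "completion": completion})
    return examples

logs = [
    {"resolved": True, "turns": [
        {"role": "user", "text": "How do I reset my API key?"},
        {"role": "agent", "text": "Go to Settings > API Keys and click Regenerate."},
    ]},
    {"resolved": False, "turns": [
        {"role": "user", "text": "My bill looks wrong."},
        {"role": "agent", "text": "Let me check."},
    ]},
]

sft = to_sft_examples(logs)
print(json.dumps(sft[0]))
```

The output JSONL is the common input format for open-weight fine-tuning toolchains; the value is in the curation step, not the format.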

Related Insights

While public benchmarks show general model improvement, they are almost orthogonal to enterprise adoption. Enterprises don't care about general capabilities; they need near-perfect precision on highly specific, internal workflows. This requires extensive fine-tuning and validation, not chasing leaderboard scores.

While horizontal chatbots handle general tasks well, they fail at the highly specific, high-stakes workflows of professionals like investment bankers. Startups can build defensible businesses by creating opinionated products that master the final 1-2% of a use case, which provides significant value and is too niche for large AI labs to pursue.

For specialized, high-stakes tasks like insurance underwriting, enterprises will favor smaller, on-prem models fine-tuned on proprietary data. These models can be faster, more accurate, and more secure than general-purpose frontier models, creating a lasting market for custom AI solutions.

The key for enterprises isn't integrating general AI like ChatGPT but creating "proprietary intelligence." This involves fine-tuning smaller, custom models on their unique internal data and workflows, creating a competitive moat that off-the-shelf solutions cannot replicate.

Specialized models like Cursor's Composer 2 can achieve short-term dominance over general frontier models by hyper-focusing on a specific domain like coding. This "hill climbing" strategy allows them to beat larger models on cost-performance, even if general models are predicted to win long-term.

For most enterprise tasks, massive frontier models are overkill—a "bazooka to kill a fly." Smaller, domain-specific models are often more accurate for targeted use cases, significantly cheaper to run, and more secure. They focus on being the "best-in-class employee" for a specific task, not a generalist.

Instead of relying solely on massive, expensive, general-purpose LLMs, the trend is toward creating smaller, focused models trained on specific business data. These "niche" models are more cost-effective to run, less likely to hallucinate, and far more effective at performing specific, defined tasks for the enterprise.

Successful vertical AI applications serve as a critical intermediary between powerful foundation models and specific industries like healthcare or legal. Their core value lies in being a "translation and transformation layer," adapting generic AI capabilities to solve nuanced, industry-specific problems for large enterprises.

Coding assistant startup Cursor exemplifies a new AI playbook: start with a powerful open-weight base model (like Moonshot AI's Kimi), then apply significant reinforcement learning compute (reportedly 3-4x the compute used to pre-train the base model) to achieve superior performance in a specific vertical. This strategy avoids the massive cost of pre-training a foundation model from scratch.

For specific, high-leverage tasks like conversation summarization and re-ranking search results, Intercom trains its own custom models. These smaller, fine-tuned models have proven to be cheaper, faster, and higher quality than using general-purpose frontier models from vendors.
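The re-ranking pattern can be sketched in a few lines: a retriever returns candidates, and a small task-specific scorer re-orders them. This is an illustrative harness only; `toy_overlap_score` is a hypothetical stand-in for a fine-tuned cross-encoder, not Intercom's actual model.

```python
def rerank(query, candidates, score_fn):
    """Re-order retrieval candidates by a task-specific scoring model.
    score_fn stands in for a small fine-tuned re-ranking model."""
    return sorted(candidates, key=lambda doc: score_fn(query, doc), reverse=True)

def toy_overlap_score(query, doc):
    # Illustrative stand-in: fraction of query terms found in the document.
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

docs = [
    "Billing FAQ: update your payment method",
    "Reset your password from the login page",
    "How to reset an expired API key",
]
top = rerank("reset api key", docs, toy_overlap_score)
print(top[0])  # -> "How to reset an expired API key"
```

Because the scorer only sees a handful of candidates per query, a small fine-tuned model here can be both cheaper and more accurate than routing every query through a frontier model.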