Code-hosting platform Base44 launched its own fine-tuned model, Base1, not just to compete on performance but to control costs, latency, and reliability. This strategy leverages proprietary user data to create a defensible advantage that general-purpose frontier models cannot easily replicate, offering a playbook for other vertical platforms.

Related Insights

Startups can compete with large AI labs by capturing unique user interaction data from specialized workflows. This proprietary "user signal" enables post-training of models for specific tasks, creating a defensible advantage that labs, lacking that specific context, cannot easily replicate.

Companies like Intercom and Cursor are proving that fine-tuning open-weight models on specific, "last-mile" user interaction data creates cheaper, faster, and more accurate models for vertical tasks (like customer service or coding) than general-purpose frontier models from labs like OpenAI.

The notion of building a business as a "thin wrapper" around a foundation model like GPT is flawed. Truly defensible AI products, like Cursor, build numerous specific, fine-tuned models to deeply understand a user's domain. This creates a data and performance moat that a generic model cannot easily replicate, much like Salesforce was more than just a "thin wrapper" on a database.

The key for enterprises isn't integrating general AI like ChatGPT but creating "proprietary intelligence." This involves fine-tuning smaller, custom models on their unique internal data and workflows, creating a competitive moat that off-the-shelf solutions cannot replicate.

As AI application layers become easier to clone, the sustainable competitive advantage is moving down the tech stack. Companies with unique, last-mile user interaction data can build proprietary models that are cheaper and better, creating a data flywheel and a moat that is difficult for competitors to replicate.

As AI makes building software features trivial, the sustainable competitive advantage shifts to data. A true data moat uses proprietary customer interaction data to train AI models, creating a feedback loop that continuously improves the product faster than competitors.

Relying solely on expensive frontier models is unsustainable. Vertical AI companies must build a portfolio of smaller, specialized models that match frontier performance on specific tasks but cost 100x less, effectively allocating intelligence where it's needed most.

By training a smaller, specialized model so that company data is encoded in its weights, firms avoid the high token costs of repeatedly feeding the same context to large frontier models. This makes complex, data-intensive workflows significantly cheaper and faster.
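
To make the token-cost argument concrete, here is a rough back-of-the-envelope sketch. Every number in it (request volume, context size, per-token prices, one-time training cost) is a hypothetical placeholder chosen for illustration, not a figure from the episode or from any vendor's price list, and the function names are invented for this example.

```python
def context_stuffing_cost(requests, context_tokens, prompt_tokens, price_per_million):
    """Input-token cost when the company context is resent with every request."""
    total_tokens = requests * (context_tokens + prompt_tokens)
    return total_tokens * price_per_million / 1_000_000


def fine_tuned_cost(requests, prompt_tokens, price_per_million, training_cost):
    """One-time training cost plus input-token cost for short prompts only."""
    total_tokens = requests * prompt_tokens
    return training_cost + total_tokens * price_per_million / 1_000_000


if __name__ == "__main__":
    # All values below are assumptions for illustration only.
    requests = 100_000          # requests per month
    context_tokens = 20_000     # company context pasted into each frontier call
    prompt_tokens = 500         # the actual question or task
    frontier_price = 3.00       # USD per million input tokens (hypothetical)
    small_model_price = 0.30    # USD per million input tokens (hypothetical)
    training_cost = 500.00      # USD, one-time fine-tuning run (hypothetical)

    frontier = context_stuffing_cost(
        requests, context_tokens, prompt_tokens, frontier_price
    )
    tuned = fine_tuned_cost(
        requests, prompt_tokens, small_model_price, training_cost
    )
    print(f"Frontier model with context resent: ${frontier:,.2f}")
    print(f"Fine-tuned small model:             ${tuned:,.2f}")
```

The sketch counts input tokens only and ignores output tokens, hosting, and retraining, so it illustrates the shape of the argument rather than a real budget: the resent context dominates the bill, and removing it matters more than the per-token price difference.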

Companies create defensibility by generating unique, non-public data through their operations (e.g., legal case outcomes). This proprietary data improves their own models, creating a feedback loop and a compounding advantage that large, generalist labs like OpenAI cannot replicate.

The vast majority of valuable data resides within private enterprises, unseen by foundation models. Companies can leverage this private data through continuous fine-tuning to create specialized, high-performing models, establishing a competitive advantage that API-based competitors cannot replicate.