The greatest value in AI won't be captured by frontier labs alone. Instead, companies in the "applied layer" are incentivized to build routing systems that use expensive frontier models for high-level orchestration while deploying cheaper open-source models for bulk tasks, creating a more efficient, barbell-shaped cost structure.
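The barbell cost structure can be made concrete with a little arithmetic. The sketch below computes a blended price per million tokens when a small share of traffic goes to a frontier model and the rest goes to a cheaper open-source model. The function name and every number are hypothetical, chosen only to illustrate the shape of the calculation; they are not taken from the podcast or from any provider's price list.

```python
def blended_cost_per_million(frontier_share: float,
                             frontier_price: float,
                             open_price: float) -> float:
    """Weighted average price per million tokens for a two-model mix.

    frontier_share is the fraction of tokens sent to the frontier model;
    the remainder is assumed to go to the cheaper open-source model.
    """
    if not 0.0 <= frontier_share <= 1.0:
        raise ValueError("frontier_share must be between 0 and 1")
    return frontier_share * frontier_price + (1.0 - frontier_share) * open_price


# Hypothetical prices in USD per million tokens (assumptions, not real quotes).
FRONTIER_PRICE = 30.0
OPEN_PRICE = 0.5

all_frontier = blended_cost_per_million(1.0, FRONTIER_PRICE, OPEN_PRICE)
barbell = blended_cost_per_million(0.1, FRONTIER_PRICE, OPEN_PRICE)
print(f"all frontier: ${all_frontier:.2f}, barbell mix: ${barbell:.2f}")
```

Under these assumed prices, sending 10% of tokens to the frontier model for orchestration and the other 90% to the cheaper model brings the blended price to roughly a ninth of the all-frontier price. The real ratio depends entirely on actual prices and on how much work can be delegated safely.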

Related Insights

Faced with rising costs from proprietary labs, sophisticated enterprise clients are building internal evaluation and routing systems. This allows them to use cheaper, open-source models for less complex tasks, optimizing for both cost and performance.

Enterprises are currently overspending on tokens by sending all queries to the most powerful LLMs. A new software category will emerge to intelligently route requests to smaller, cheaper models when possible, creating a critical efficiency and cost-saving layer between companies and foundational model providers.

Foundational AI models will commoditize into a utility layer where companies buy "intelligence on the fly." The real, sustainable profit will be captured by application companies that leverage various models to solve specific business problems, as most enterprises lack the expertise to use raw models effectively.

As customers increasingly adopt model orchestration—routing tasks to the most efficient model for the job—value shifts away from individual frontier models. This trend commoditizes the raw intelligence layer, posing a significant threat to companies focused solely on building the largest models.

Instead of relying on one powerful model for all tasks, the leading strategy is "smart routing": using a panel of models and directing each task to the most appropriate one. This compound architecture demonstrably beats single frontier models on both cost and performance.

Relying solely on expensive frontier models is unsustainable. Vertical AI companies must build a portfolio of smaller, specialized models that match frontier performance on specific tasks but cost 100x less, effectively allocating intelligence where it's needed most.

AI21 exemplifies a winning AI business model: give away the foundational model (Jamba) to drive adoption, then monetize a proprietary orchestration layer (Maestro) that helps enterprises manage multiple models for cost and performance, capturing value higher up the stack.

An intelligent AI orchestration layer can achieve a cost-to-accuracy balance superior to any single model. By routing queries to a portfolio of different models (large, small, specialized), it creates a new Pareto frontier, delivering higher success rates at a lower average cost than relying on one "best" model.
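One simple way to see how a portfolio can beat a single "best" model on both axes is a two-step cascade: try a cheap model first and escalate to an expensive one only when the cheap answer fails a check. The sketch below computes expected cost and success rate for such a cascade. It rests on strong assumptions that should be stated plainly: the failure check is perfect and free, the two models succeed independently, and all costs and success rates are made-up illustrative numbers. A cascade is only one possible orchestration strategy, not necessarily the one described in the episode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelStats:
    name: str
    cost: float          # hypothetical cost per query, arbitrary units
    success_rate: float  # hypothetical probability of a correct answer


def cascade(cheap: ModelStats, strong: ModelStats) -> ModelStats:
    """Expected stats when `strong` is called only after `cheap` fails.

    Assumes a perfect, zero-cost verifier and independent model errors.
    """
    p_escalate = 1.0 - cheap.success_rate
    expected_cost = cheap.cost + p_escalate * strong.cost
    success = cheap.success_rate + p_escalate * strong.success_rate
    return ModelStats(f"{cheap.name}->{strong.name}", expected_cost, success)


def dominates(a: ModelStats, b: ModelStats) -> bool:
    """True if `a` is no worse than `b` on both axes and better on one."""
    no_worse = a.cost <= b.cost and a.success_rate >= b.success_rate
    better = a.cost < b.cost or a.success_rate > b.success_rate
    return no_worse and better


# Illustrative numbers only (assumptions, not measurements).
small = ModelStats("small", cost=1.0, success_rate=0.70)
large = ModelStats("large", cost=20.0, success_rate=0.90)

combined = cascade(small, large)
print(combined)
print("cascade dominates large alone:", dominates(combined, large))
```

With these assumed inputs the cascade has a lower expected cost and a higher success rate than calling the large model alone, which is what a point beyond the single-model Pareto frontier means. In practice the verifier is imperfect and costs something, so the real gain is smaller and has to be measured.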

Companies are building intelligent systems that analyze a user's prompt and automatically route it to the most cost-effective model that can handle the task. This avoids using expensive frontier models for simple requests. Some companies, such as Coinbase, have kept costs flat despite exponential usage growth.
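The routing idea can be sketched in a few lines: estimate how demanding a prompt is, then pick the cheapest model whose capability tier covers it. Everything in this example is hypothetical, including the model names, the prices, the capability tiers, and the keyword heuristic in `estimate_difficulty`, which stands in for whatever classifier or evaluation system a real router would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    cost_per_1k_tokens: float  # hypothetical USD price
    capability: int            # hypothetical tier: 1 basic, 2 mid, 3 frontier


# Hypothetical catalog; names and prices are illustrative only.
CATALOG = [
    Model("small-open-model", 0.0002, 1),
    Model("mid-open-model", 0.002, 2),
    Model("frontier-model", 0.03, 3),
]


def estimate_difficulty(prompt: str) -> int:
    """Crude keyword heuristic standing in for a learned difficulty classifier."""
    text = prompt.lower()
    if any(k in text for k in ("prove", "plan", "architecture", "multi-step")):
        return 3
    if len(prompt.split()) > 50 or any(k in text for k in ("summarize", "explain")):
        return 2
    return 1


def route(prompt: str, catalog: list[Model] = CATALOG) -> Model:
    """Return the cheapest model whose tier meets the estimated difficulty."""
    needed = estimate_difficulty(prompt)
    capable = [m for m in catalog if m.capability >= needed]
    if not capable:
        raise ValueError("no model in the catalog can handle this prompt")
    return min(capable, key=lambda m: m.cost_per_1k_tokens)


for p in ("What is the capital of France?",
          "Summarize this report",
          "Plan a multi-step migration"):
    print(f"{p!r} -> {route(p).name}")
```

Filtering by capability before minimizing cost means a simple request never reaches the frontier model unless nothing cheaper qualifies. The quality of the difficulty estimate decides whether the savings come at the expense of answer quality.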

As foundational AI models become commoditized 'intelligence utilities,' the economic value moves up the stack. Orchestrators like OpenClaw, which can intelligently route tasks to the most efficient model based on cost or use case, are positioned to capture the margin that the underlying model providers cannot.