Faced with rising costs from proprietary labs, sophisticated enterprise clients are building internal evaluation and routing systems. This allows them to use cheaper, open-source models for less complex tasks, optimizing for both cost and performance.

Related Insights

Don't use your most powerful and expensive AI model for every task. A crucial skill is model triage: using cheaper models for simple, routine tasks like monitoring and scheduling, while saving premium models for complex reasoning, judgment, and creative work.
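A minimal sketch of what model triage could look like in code, assuming tasks arrive already labeled with a type. The tier names, prices, and the `ROUTINE_TASKS` set are hypothetical placeholders for illustration, not real model IDs or quotes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical label, not a real model ID
    cost_per_1k_tokens: float  # illustrative price, not a real quote


# Assumed tiers: one cheap model and one premium model.
CHEAP = ModelTier("small-open-model", 0.0002)
PREMIUM = ModelTier("frontier-model", 0.015)

# Assumed set of simple, routine task types (taken from the examples above).
ROUTINE_TASKS = {"monitoring", "scheduling"}


def triage(task_type: str) -> ModelTier:
    """Send routine tasks to the cheap tier and everything else to premium."""
    return CHEAP if task_type in ROUTINE_TASKS else PREMIUM


if __name__ == "__main__":
    for task in ("monitoring", "scheduling", "complex reasoning"):
        print(task, "->", triage(task).name)
```

In practice the routing rule would likely be richer than a set lookup (for example, a classifier or a confidence check), but the shape stays the same: decide the tier before spending on the call.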

While US firms lead in cutting-edge AI, the impressive quality of open-source models from China is compressing the market. As these free models improve, more tasks become "good enough" for open source, creating significant pricing pressure on premium, closed-source foundation models from companies like OpenAI and Google.

PMs often default to the most powerful, expensive models. However, comprehensive evaluations can prove that a significantly cheaper or smaller model can achieve the desired quality for a specific task, drastically reducing operational costs. The evals provide the confidence to make this trade-off.
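One way to picture this trade-off: score each candidate model on a task-specific eval, then pick the cheapest one that clears the quality bar. The sketch below assumes the eval scores already exist; the model names, costs, scores, and the 0.90 threshold are invented for illustration.

```python
from typing import Optional


def cheapest_passing_model(
    costs: dict[str, float],
    eval_scores: dict[str, float],
    quality_bar: float,
) -> Optional[str]:
    """Return the lowest-cost model whose eval score meets the bar, or None."""
    # Models without a recorded score are treated as failing the eval.
    passing = [name for name in costs if eval_scores.get(name, 0.0) >= quality_bar]
    if not passing:
        return None
    return min(passing, key=lambda name: costs[name])


# Hypothetical cost per million tokens and eval scores for one specific task.
costs = {"large": 15.0, "medium": 3.0, "small": 0.5}
scores = {"large": 0.95, "medium": 0.91, "small": 0.78}

choice = cheapest_passing_model(costs, scores, quality_bar=0.90)
print(choice)
```

With these made-up numbers the medium model is selected: it clears the bar at a fraction of the large model's cost, while the small model is rejected on quality.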

Relying solely on premium models like Claude Opus can lead to unsustainable API costs (a projected $1M/year). The solution is a hybrid approach: use powerful cloud models for complex tasks and cheaper, locally hosted open-source models for routine operations.

Enterprises will shift from relying on a single large language model to using orchestration platforms. These platforms will allow them to 'hot swap' various models—including smaller, specialized ones—for different tasks within a single system, optimizing for performance, cost, and use case without being locked into one provider.

To optimize costs, users configure a powerful model like Claude Opus as the 'brain' that strategizes and delegates execution tasks (e.g., coding) to cheaper, specialized models like OpenAI's Codex, which act as the 'muscles.'

As enterprises scale AI, the high inference costs of frontier models become prohibitive. The strategic trend is to use large models for novel tasks, then shift 90% of recurring, common workloads to specialized, cost-effective Small Language Models (SLMs). This architectural shift makes inference both faster and substantially cheaper.
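The cost side of that shift is simple arithmetic. The sketch below assumes a flat per-request price for each model class; the request volume and both prices are made up, and only the 90% share comes from the insight above.

```python
def blended_cost(
    total_requests: int,
    shifted_fraction: float,
    large_cost_per_request: float,
    small_cost_per_request: float,
) -> float:
    """Total cost when a fraction of requests moves from the large model to an SLM."""
    shifted = total_requests * shifted_fraction
    kept = total_requests - shifted
    return kept * large_cost_per_request + shifted * small_cost_per_request


# Hypothetical workload: 1M requests, large model at $0.01 and SLM at $0.001 each.
baseline = blended_cost(1_000_000, 0.0, 0.01, 0.001)
after_shift = blended_cost(1_000_000, 0.9, 0.01, 0.001)
savings = 1 - after_shift / baseline

print(f"baseline: ${baseline:,.0f}, after shift: ${after_shift:,.0f}, savings: {savings:.0%}")
```

Under these assumed prices, the bill falls from about $10,000 to about $1,900. The actual saving depends entirely on the real price gap between the two model classes.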

Open source AI models don't need to become the dominant platform to fundamentally alter the market. Their existence alone acts as a powerful price compressor. Proprietary model providers are forced to lower their prices to match the inference cost of open-source alternatives, squeezing profit margins and shifting value to other parts of the stack.

Misha Laskin, CEO of Reflection AI, states that large enterprises turn to open source models for two key reasons: to dramatically reduce the cost of high-volume tasks, or to fine-tune performance on niche data where closed models are weak.

As foundational AI models become commoditized 'intelligence utilities,' the economic value moves up the stack. Orchestrators like OpenClaw, which can intelligently route tasks to the most efficient model based on cost or use case, are positioned to capture the margin that the underlying model providers cannot.