Microsoft's Copilot platform doesn't rely on a single foundation model. It automatically routes each user task to the model that works best for the job: OpenAI models for interactive chat, and Claude for long-running, tool-using background tasks.
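As a rough illustration of this kind of routing, here is a minimal Python sketch. It is not Microsoft's implementation: the `Task` fields, the `route` function, and the model labels are all hypothetical names chosen for the example, and the rule simply mirrors the split described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A unit of user work. All fields are illustrative assumptions."""
    prompt: str
    interactive: bool         # True for live chat, False for background work
    uses_tools: bool = False  # True if the task needs tool calls


# Hypothetical labels, not real API model identifiers.
CHAT_MODEL = "openai-chat"
AGENT_MODEL = "claude-agent"


def route(task: Task) -> str:
    """Pick a model label for the task using one simple rule."""
    # Long-running, tool-using background work goes to the agent model.
    if not task.interactive and task.uses_tools:
        return AGENT_MODEL
    # Everything else, including live chat, goes to the chat model.
    return CHAT_MODEL


chat_choice = route(Task("Summarize this email", interactive=True))
agent_choice = route(Task("Refactor the repo", interactive=False, uses_tools=True))
```

A real router would likely weigh more signals (cost, latency, measured quality per workload), but the shape stays the same: one entry point that hides the model choice from the user.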

Related Insights

The true power of the AI application layer lies in orchestrating multiple, specialized foundation models. Users want a single interface (like Cursor for coding) that intelligently routes tasks to the best model (e.g., Gemini for front-end, Codex for back-end), creating value through aggregation and workflow integration.

Use a highly intelligent model like Opus for high-level planning and a more diligent, execution-focused model like a GPT-Codex variant for implementation. This 'best of both worlds' approach within a model-agnostic harness leads to superior results compared to relying on a single model for all tasks.

Instead of relying on a single AI, use different models (e.g., ChatGPT for internal context, Claude for an objective view) for the same problem. This multi-model approach generates diverse perspectives and higher-quality strategic outputs.

Microsoft is not solely reliant on its OpenAI partnership. It actively integrates competitor models, such as Anthropic's, into its Copilot products to handle specific workloads where they perform better, like complex Excel tasks. This pragmatic "best tool for the job" approach diversifies its AI capabilities.

Rather than committing to a single LLM provider like OpenAI or Gemini, Hux uses multiple commercial models. They've found that different models excel at different tasks within their app. This multi-model strategy allows them to optimize for quality and latency on a per-workflow basis, avoiding a one-size-fits-all compromise.

Enterprises will shift from relying on a single large language model to using orchestration platforms. These platforms will allow them to 'hot swap' various models—including smaller, specialized ones—for different tasks within a single system, optimizing for performance, cost, and use case without being locked into one provider.

The belief that a single, god-level foundation model would dominate has proven false. Horowitz points to successful AI applications like Cursor, which uses 13 different models. This shows that value lies in the complex orchestration and design at the application layer, not just in having the largest single model.

To optimize costs, users configure a powerful model like Claude Opus as the 'brain' that strategizes and delegates execution tasks (e.g., coding) to cheaper, specialized models like ChatGPT's Codex, which act as the 'muscles'.

A hybrid approach to AI agent architecture is emerging. Use the most powerful, expensive cloud models like Claude for high-level reasoning and planning (the "CEO"). Then, delegate repetitive, high-volume execution tasks to cheaper, locally-run models (the "line workers").
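The planner-and-executor split described in the insights above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_with_delegation`, `stub_planner`, and `stub_executor` are hypothetical names, and the two stubs stand in for real model calls (an expensive cloud model for planning, a cheaper or local model for execution).

```python
from typing import Callable, List

# A "model" here is just any function from prompt text to response text.
Model = Callable[[str], str]


def run_with_delegation(goal: str, planner: Model, executor: Model) -> List[str]:
    """Ask the planner once for a plan, then run each step on the executor."""
    plan = planner(goal)
    # Assumption: the planner returns one step per non-empty line.
    steps = [line.strip() for line in plan.splitlines() if line.strip()]
    return [executor(step) for step in steps]


def stub_planner(goal: str) -> str:
    """Stand-in for the expensive planning model (the "CEO")."""
    return f"outline {goal}\nimplement {goal}\ntest {goal}"


def stub_executor(step: str) -> str:
    """Stand-in for the cheap execution model (a "line worker")."""
    return f"done: {step}"


results = run_with_delegation("login page", stub_planner, stub_executor)
```

The cost argument rests on call counts: the expensive model is invoked once per goal, while the cheap model absorbs the high-volume per-step work.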

Powerful AI tools are becoming aggregators like Manus, which intelligently select the best underlying model for a specific task—research, data visualization, or coding. This multi-model approach enables a seamless workflow within a single thread, outperforming systems reliant on one general-purpose model.