We scan new podcasts and send you the top 5 insights daily.
To encourage creativity, Goldman uses a central 'Model Gateway' to route queries to the most cost-effective AI model. This insulates users from 'token anxiety' (the fear of consuming expensive resources) and lets a central team optimize costs without stifling innovation.
A single AI model is insufficient for running a complex company. An orchestration layer allows you to assign different models (e.g., a powerful frontier model for the CEO, cheaper models for routine tasks) based on their unique "personalities" and cost-effectiveness.
Recognizing there is no single "best" LLM, AlphaSense built a system to test and deploy various models for different tasks. This allows them to optimize for performance and even stylistic preferences, using different models for their buy-side finance clients versus their corporate users.
To ensure governance and avoid redundancy, Experian centralizes AI development. This approach treats AI as a core platform capability, allowing for the reuse of models and consistent application of standards across its global operations.
Don't use your most powerful and expensive AI model for every task. A crucial skill is model triage: using cheaper models for simple, routine tasks like monitoring and scheduling, while saving premium models for complex reasoning, judgment, and creative work.
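The triage idea above can be sketched as a simple routing table. The model names and task categories below are illustrative assumptions, not any vendor's actual tiers:

```python
# Hypothetical model triage: send routine task types to a cheap tier
# and reserve the premium tier for reasoning and creative work.
ROUTES = {
    "monitoring": "cheap-small-model",
    "scheduling": "cheap-small-model",
    "reasoning": "premium-frontier-model",
    "creative": "premium-frontier-model",
}

def triage(task_type: str) -> str:
    """Return the model tier for a task, defaulting to the cheap tier."""
    return ROUTES.get(task_type, "cheap-small-model")
```

Defaulting unknown tasks to the cheap tier keeps spend predictable; an escalation path to the premium tier can be added if the cheap model's answer fails a quality check.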
Goldman's CIO notes AI has dramatically reduced the cost and time to create internal applications. This is causing a strategic shift back toward building software in-house, especially for smaller tools, leading to the termination of some third-party vendor contracts.
Samsara built a central endpoint that abstracts away complexities of using different LLMs like OpenAI or Gemini. This gateway handles cost, security, and compliance, allowing any product engineer to quickly build and deploy AI features without specialized expertise.
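A gateway of this kind can be sketched as a single dispatch point that hides provider differences and applies shared policy before any call goes out. The provider names, cost cap, and redaction step below are invented for illustration, not Samsara's actual implementation:

```python
# Minimal sketch of a central LLM gateway: one endpoint, pluggable
# providers, and shared cost/compliance checks applied on every call.
from dataclasses import dataclass, field

@dataclass
class Gateway:
    providers: dict = field(default_factory=dict)  # name -> (handler, cost)
    max_cost_per_call: float = 0.05                # assumed cost cap (USD)

    def register(self, name, handler, cost):
        self.providers[name] = (handler, cost)

    def complete(self, prompt: str, provider: str) -> str:
        handler, cost = self.providers[provider]
        if cost > self.max_cost_per_call:
            raise ValueError(f"{provider} exceeds the per-call cost cap")
        # Shared compliance step: naive redaction of an assumed PII marker.
        return handler(prompt.replace("<SSN>", "[REDACTED]"))

gw = Gateway()
gw.register("openai", lambda p: f"openai:{p}", cost=0.01)
gw.register("gemini", lambda p: f"gemini:{p}", cost=0.02)
```

Because product engineers only ever call `gw.complete`, the central team can swap providers or tighten policy without touching feature code.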
To foster breakthrough ideas, companies should initially provide engineers with unrestricted access to the most powerful AI models, ignoring costs. Optimization should only happen after an idea proves its value at scale, as early cost-cutting stifles creativity.
Enterprises will shift from relying on a single large language model to using orchestration platforms. These platforms will allow them to 'hot swap' various models, including smaller, specialized ones, for different tasks within a single system, optimizing for performance, cost, and use case without being locked into one provider.

Pega's CTO advises using the powerful reasoning of LLMs to design processes and marketing offers. However, at runtime, switch to faster, cheaper, and more consistent predictive models. This avoids the unpredictability, cost, and risk of calling expensive LLMs for every live customer interaction.
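The design-time/runtime split described here can be sketched as follows: an expensive LLM is used offline to author a decision table, while live traffic is scored by a cheap deterministic lookup. The offer names and rule conditions are invented for illustration, not Pega's actual product:

```python
# Output of an assumed offline, LLM-assisted design step, stored as data.
OFFER_RULES = [
    ({"segment": "new", "churn_risk": "high"}, "retention_discount"),
    ({"segment": "new", "churn_risk": "low"}, "welcome_bundle"),
]
DEFAULT_OFFER = "standard_offer"

def next_best_offer(customer: dict) -> str:
    """Runtime path: fast, consistent rule lookup with no LLM call."""
    for conditions, offer in OFFER_RULES:
        if all(customer.get(k) == v for k, v in conditions.items()):
            return offer
    return DEFAULT_OFFER
```

The runtime path is deterministic and cheap per interaction; the LLM's cost and unpredictability are confined to the offline design loop.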
To optimize costs, users configure powerful models like Claude Opus as the 'brain' to strategize, delegating execution tasks (e.g., coding) to cheaper, specialized models like OpenAI's Codex, treating them as the 'muscles'.
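The brain-and-muscles pattern can be sketched as a planner that decomposes a goal and delegates each step to a cheaper executor. Both model calls below are stand-in stubs, not real Anthropic or OpenAI APIs:

```python
# Hypothetical planner/executor split: one expensive planning call,
# many cheap execution calls.
def plan_with_frontier_model(goal: str) -> list[str]:
    """Stand-in for an expensive 'brain' call that produces a step list."""
    return [f"write tests for {goal}", f"implement {goal}"]

def execute_with_cheap_model(step: str) -> str:
    """Stand-in for a cheap, specialized 'muscle' call per step."""
    return f"done: {step}"

def run(goal: str) -> list[str]:
    """One planning call, then delegate every step to the cheap model."""
    return [execute_with_cheap_model(s) for s in plan_with_frontier_model(goal)]
```

Cost scales with the number of execution steps, so routing those to the cheap model is where the savings come from.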