The initial approach to AI adoption was often "token maxing"—using as many tokens as possible under the assumption that more usage equals more value. A more sophisticated and sustainable strategy is "output maxing," which focuses on achieving the desired result while actively minimizing token consumption and cost.
The trend of "token maxing"—unrestrained spending on AI usage—is being corrected. Companies like Meta are realizing that, like any business expense, AI token consumption must be "min-maxed": optimizing for the highest leverage output at the lowest possible cost, not just maximizing usage.
The AI industry has shifted from a subsidized model to a "token shortage" era. This forces all companies, from AI providers to enterprise users like Uber, to prioritize cost-effective usage. Business models are now usage-based, making architectural and financial efficiency paramount.
According to Mike Cannon-Brookes, advanced enterprises are not tracking AI success by counting tokens. Instead, they are asking harder questions about overall output, such as engineering productivity and quality. Because high token usage does not always correlate with high productivity, they are shifting focus from raw usage to tangible business outcomes.
In response to budget blowouts from agentic AI, enterprises are moving beyond simple adoption to active cost management. A new "token efficiency" stack is emerging, featuring tactics like model routing to cheaper alternatives (e.g., DeepSeek) and custom post-trained models to reduce reliance on expensive foundation models.
Companies initially gamified AI use, leading to a "token maxing" culture. Now, facing enormous, unexpected bills, they are experiencing "sticker shock." This is forcing a strategic shift from encouraging maximum usage to demanding ROI calculations and finding the most cost-effective AI model for a given task.
Paralleling the cloud adoption curve, the current surge in AI spending will inevitably be followed by an "optimization point." Enterprises will shift from experimentation to efficiency, scrutinizing token usage and seeking to reduce costs, forcing AI providers to help them optimize.
Tech companies are shifting from a "token maxing" mindset (using AI tools indiscriminately) to "token min-maxing." This borrows from gaming strategy, focusing on achieving the highest output for the lowest resource cost. It marks a maturation from hype-driven consumption to a more structured, ROI-focused approach with budgets and controls.
Simple leaderboards tracking token usage lead to "token maxing": engineers burning tokens to look productive. A better approach is to use hack days and demos to reward and showcase high-impact output, which implicitly encourages effective AI use.
The metric for evaluating AI models is shifting. Early on, maximum quality was paramount for adoption. Now, sophisticated users are focusing on efficiency, evaluating models based on "quality per dollar spent," making cost-effectiveness a key competitive advantage.
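One way to make "quality per dollar spent" concrete is to divide an evaluation score by what the evaluation run cost. The sketch below is illustrative only: the `ModelEval` fields, the model names, the scores, and the optional quality floor are all assumptions, not figures from any of the podcasts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEval:
    name: str          # hypothetical model label
    quality: float     # assumed benchmark score in [0, 1]
    cost_usd: float    # assumed total spend for the evaluation run


def quality_per_dollar(e: ModelEval) -> float:
    """Quality score divided by dollars spent to obtain it."""
    if e.cost_usd <= 0:
        raise ValueError("cost_usd must be positive")
    return e.quality / e.cost_usd


def rank_by_efficiency(evals, min_quality=0.0):
    """Drop models below the quality floor, then sort best value first."""
    eligible = [e for e in evals if e.quality >= min_quality]
    return sorted(eligible, key=quality_per_dollar, reverse=True)


# Made-up numbers purely for illustration.
evals = [
    ModelEval("frontier", quality=0.90, cost_usd=30.0),
    ModelEval("mid", quality=0.80, cost_usd=8.0),
    ModelEval("small", quality=0.60, cost_usd=2.0),
]
best_value = rank_by_efficiency(evals, min_quality=0.75)
```

The quality floor matters because a pure ratio would always favor the cheapest model, even when its output is too weak for the task.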
To control inference costs, companies are implementing model routing systems. They differentiate between expensive tokens from frontier models for complex reasoning and cheaper tokens from fine-tuned open-source models for simpler workflow tasks. This tiered approach optimizes both performance and budget, avoiding "token maxing."
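A tiered router of this kind can be sketched in a few lines. Everything here is a hypothetical illustration: the tier names, the per-token prices, and the set of task types treated as "complex" are assumptions, and a production router would likely classify requests with something richer than a lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    usd_per_million_tokens: float


# Hypothetical tiers and prices, chosen only to show the cost gap.
FRONTIER = Tier("frontier-model", 15.0)
CHEAP = Tier("fine-tuned-open-model", 0.5)

# Assumed task labels that justify the expensive tier.
COMPLEX_TASKS = {"multi_step_reasoning", "architecture_review"}


def route(task_type: str) -> Tier:
    """Send complex reasoning to the frontier tier, everything else to the cheap tier."""
    return FRONTIER if task_type in COMPLEX_TASKS else CHEAP


def estimated_cost(requests) -> float:
    """Total USD for an iterable of (task_type, token_count) pairs under the routing rule."""
    return sum(
        route(task).usd_per_million_tokens * tokens / 1_000_000
        for task, tokens in requests
    )


workload = [("summarize", 1_000_000), ("multi_step_reasoning", 2_000_000)]
routed_cost = estimated_cost(workload)
# Baseline: what the same workload costs if every request hits the frontier tier.
all_frontier_cost = sum(FRONTIER.usd_per_million_tokens * n / 1_000_000 for _, n in workload)
```

Comparing `routed_cost` against `all_frontier_cost` gives a rough estimate of how much routing saves for a given mix of tasks.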