The widely circulated chart showing a drop in the "LLM Token Expenditure Index" doesn't reflect a decline in AI demand. It shows only a drop in the average price paid per token, and because its data comes exclusively from third-party routers designed to find cheaper options, the results are skewed toward lower prices.
While the cost per token is decreasing as models become more efficient, this efficiency gain drives a massive increase in new use cases and overall consumption. This economic principle, the Jevons paradox, explains why total enterprise spending on model inference is skyrocketing even as the unit cost falls.
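The mechanism can be made concrete with a toy demand curve. The sketch below assumes constant-elasticity demand, and every number in it (prices, baseline volume, elasticity values) is an illustrative assumption rather than a figure from the episode. It only shows the condition under which a price cut raises total spend: demand has to grow faster than price falls, i.e. elasticity above 1.

```python
def total_spend(price: float, base_price: float, base_tokens_m: float,
                elasticity: float) -> float:
    """Total spend in USD under a constant-elasticity demand curve.

    price, base_price: USD per million tokens (hypothetical values).
    base_tokens_m: millions of tokens consumed at base_price.
    elasticity: how strongly consumption responds to price changes.
    """
    tokens_m = base_tokens_m * (price / base_price) ** (-elasticity)
    return price * tokens_m


# Halve the price from $10 to $5 per million tokens.
spend_before = total_spend(10.0, 10.0, 100.0, elasticity=1.5)
spend_elastic = total_spend(5.0, 10.0, 100.0, elasticity=1.5)
spend_inelastic = total_spend(5.0, 10.0, 100.0, elasticity=0.5)

print(f"before cut:             ${spend_before:,.0f}")
print(f"after cut (elastic):    ${spend_elastic:,.0f}")
print(f"after cut (inelastic):  ${spend_inelastic:,.0f}")
```

With elasticity above 1 the halved price leads to higher total spend, which is the Jevons case. With elasticity below 1 the same price cut shrinks spend, which is the scenario the paradox rules out.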
Doug from SemiAnalysis argues that the primary deflationary threat isn't just cheaper tokens, but the emergence of low-end models that can commoditize entire AI-powered solutions, creating a race to the bottom that erodes pricing power for everyone.
Valuations of AI companies may be artificially low because they're based on the token demand for simple chatbots. The real, explosive growth comes from reasoning models, agents, and multimodal generation, creating a near-infinite demand for tokens that is not yet priced in.
The narrative of "off the charts" AI demand is misleading. Major AI providers like OpenAI are "burning tens of billions of dollars," indicating they are not charging the true cost for their services. A realistic picture of demand will only emerge once they are forced to price for profitability, which could significantly cool the market.
Despite enterprises hitting AI budget limits, the market is not collapsing. Competition is forcing AI providers to lower token prices, triggering the Jevons paradox: as a resource's cost falls, its consumption increases, sustaining demand for underlying infrastructure like NVIDIA chips.
The market maker Citadel Securities observes that the AI market is splitting. After initial enthusiasm, companies are now facing the reality of high token costs and compute constraints, causing a shift away from expensive frontier models toward simpler, more cost-effective AI that offers clearer ROI.
A model with a low per-token price can be more expensive if it's inefficient, verbose, or requires multiple attempts ('overthinking'). The actual invoice depends on the total tokens needed to complete a task, making token efficiency a hidden multiplier that savvy enterprises are now tracking to determine the true cost.
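The arithmetic behind this point is simple enough to sketch. The example below is a minimal illustration, not a real pricing tool: the `ModelProfile` type, both model profiles, and all prices, token counts, and success rates are hypothetical. It also assumes failed attempts are retried independently, so the expected number of attempts is `1 / success_rate`.

```python
from dataclasses import dataclass


@dataclass
class ModelProfile:
    name: str
    price_per_million_tokens: float  # USD, hypothetical
    tokens_per_attempt: int          # tokens consumed by one attempt
    success_rate: float              # chance that one attempt completes the task


def expected_cost_per_task(model: ModelProfile) -> float:
    """Expected USD cost to finish one task, counting retries."""
    expected_attempts = 1.0 / model.success_rate
    expected_tokens = model.tokens_per_attempt * expected_attempts
    return model.price_per_million_tokens * expected_tokens / 1_000_000


# A cheap but verbose model that needs two attempts on average,
# versus a pricier model that is concise and succeeds first time.
cheap_verbose = ModelProfile("cheap-verbose", 1.0, 20_000, 0.5)
pricey_concise = ModelProfile("pricey-concise", 5.0, 3_000, 1.0)

for m in (cheap_verbose, pricey_concise):
    print(f"{m.name}: ${expected_cost_per_task(m):.4f} per task")
```

In this made-up comparison the model with the 5x higher per-token price ends up cheaper per completed task, because it uses far fewer tokens and needs no retries.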
Every summer, a narrative emerges that AI progress is stalling or a bubble is bursting. Past panics focused on user drop-offs (2023) or training data limits (2024). This year's version is driven by the end of subsidized token usage, creating a predictable cycle of doubt that historically dissipates with new breakthroughs.
The current affordability of AI tokens is not sustainable; it's propped up by venture capital funding AI companies operating at a loss. Businesses should treat this as a temporary window for aggressive learning and experimentation before prices inevitably rise to reflect true operational costs.
The AI market has two opposing trends: a dramatic collapse in token prices for equivalent models (down 150x in 21 months) and unprecedented revenue growth. This indicates that the explosion in utilization and value creation is massively outpacing cost reductions, signaling a healthy, expanding market.
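To get a feel for the "150x in 21 months" figure, it can be converted into an implied monthly rate. The sketch below assumes a steady exponential decline, which the episode does not claim; real price drops arrive in steps. The variable names are illustrative.

```python
import math

price_drop_factor = 150.0  # from the claim: prices down 150x
months = 21                # over 21 months

# Constant monthly factor f such that f ** months == 150.
monthly_factor = price_drop_factor ** (1 / months)

# Fraction of the price lost each month under that assumption.
monthly_decline = 1 - 1 / monthly_factor

# Months needed for the price to halve at that pace.
halving_months = math.log(2) / math.log(monthly_factor)

print(f"implied monthly price decline: {monthly_decline:.1%}")
print(f"implied price halving time:    {halving_months:.1f} months")
```

Under that assumption, prices fall by roughly a fifth every month. For revenue on an equivalent model to hold steady over the same period, token volume would need to grow by the same 150x, so revenue growth implies usage growing even faster than that.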