A model with a low per-token price can end up being the more expensive choice if it is inefficient, verbose, or needs multiple attempts ('overthinking'). The actual invoice depends on the total tokens needed to complete a task, so token efficiency acts as a hidden multiplier that savvy enterprises now track to determine true cost.
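To make the multiplier concrete, here is a minimal sketch of the arithmetic in Python. The prices, token counts, and attempt counts are invented for illustration and do not describe any real model; the function name `cost_per_task` is likewise an assumption.

```python
def cost_per_task(price_per_million_tokens: float,
                  tokens_per_attempt: int,
                  attempts_per_task: int) -> float:
    """Return the dollar cost of finishing one task.

    Cost = per-token price * tokens used per attempt * attempts needed.
    """
    price_per_token = price_per_million_tokens / 1_000_000
    return price_per_token * tokens_per_attempt * attempts_per_task


# Hypothetical "cheap" model: low list price, but verbose and needs retries.
cheap_model_cost = cost_per_task(price_per_million_tokens=1.0,
                                 tokens_per_attempt=8_000,
                                 attempts_per_task=3)

# Hypothetical "premium" model: 3x the list price, but concise and right first time.
premium_model_cost = cost_per_task(price_per_million_tokens=3.0,
                                   tokens_per_attempt=2_000,
                                   attempts_per_task=1)

print(f"cheap model:   ${cheap_model_cost:.4f} per task")
print(f"premium model: ${premium_model_cost:.4f} per task")
```

With these made-up inputs the model with the lower list price costs about four times as much per completed task, which is why cost per task, not cost per token, is the figure to compare.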
The key metric for winning the AI race is shifting from pure benchmark scores to efficiency. Perplexity's CEO argues that the company providing the most "token value per watt per user"—balancing accuracy, latency, cost, and intelligence—will ultimately dominate the market, making efficient intelligence the new goal.
Legal AI firm Harvey showed that a hybrid system, which uses a smaller model as the primary worker and routes selectively to a frontier model as an "advisor", can beat a frontier-only approach on both quality and cost. This suggests that intelligent orchestration can be a more effective strategy than simply using the most powerful model for every task.
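The worker-plus-advisor pattern can be sketched roughly as follows in Python. This is not Harvey's implementation: the escalation rule (a confidence threshold), the `Draft` type, and the stub models are all assumptions made for illustration, since the source does not say how routing decisions are made.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Draft:
    text: str
    confidence: float  # hypothetical self-assessed score in [0, 1]


def route(task: str,
          worker: Callable[[str], Draft],
          advisor: Callable[[str, str], str],
          threshold: float = 0.7) -> Tuple[str, bool]:
    """Run the small worker first; escalate to the advisor only if needed.

    Returns the final answer and whether the advisor was consulted.
    """
    draft = worker(task)
    if draft.confidence >= threshold:
        return draft.text, False           # worker's answer is accepted as-is
    return advisor(task, draft.text), True  # advisor reviews the weak draft


# Stub models standing in for real API calls (assumptions, not real behavior).
def stub_worker(task: str) -> Draft:
    # Pretend that short tasks are easy and long ones are hard.
    confidence = 0.9 if len(task) < 40 else 0.4
    return Draft(text=f"draft: {task}", confidence=confidence)


def stub_advisor(task: str, draft_text: str) -> str:
    return f"revised: {task}"


easy_answer, easy_escalated = route("Define force majeure.",
                                    stub_worker, stub_advisor)
hard_task = "Compare indemnification clauses across these five agreements."
hard_answer, hard_escalated = route(hard_task, stub_worker, stub_advisor)

print(easy_answer, easy_escalated)
print(hard_answer, hard_escalated)
```

The cost saving in a design like this comes from the fraction of tasks that never reach the advisor, so the threshold is the main knob trading quality against spend.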
As enterprises become more cost-conscious about token spend, they are actively seeking cheaper alternatives to OpenAI and Anthropic. Data from Ramp shows China's DeepSeek is the top trending software vendor, indicating a new willingness to use foreign or open-source models despite potential data privacy concerns.
Despite headlines using "enterprise" language, Meta's new business agent is strategically aimed at the massive global market of small businesses (e.g., bakeries, local shops) already using WhatsApp. The value proposition is not complex integration but iPhone-like simplicity for business owners too busy to become AI experts.
Cloudflare data reveals that bots and AI agents now constitute 57.5% of web traffic, surpassing human traffic for the first time. This milestone, which CEO Matthew Prince predicted wouldn't happen until 2027, has significant implications for website ad revenue, infrastructure, and the rise of malicious automated activity online.
