For companies at the trillion-token scale, cost predictability is more important than the lowest per-token price. Superhuman favors providers offering fixed-capacity pricing, giving them better control over their cost structure, which is crucial for pre-IPO financial planning.
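
To make the predictability argument concrete, here is a small sketch with illustrative numbers (these are assumptions, not Superhuman's actual figures): at a trillion tokens per month, even a modest swing in per-token market rates moves spend by six figures, while a fixed-capacity contract holds spend flat.

```python
# Hypothetical figures showing why fixed-capacity pricing aids
# predictability at trillion-token scale. All numbers are assumptions.

monthly_tokens = 1_000_000_000_000  # 1 trillion tokens per month

# Per-token pricing: spend swings with market rates.
price_low = 0.50 / 1_000_000    # $0.50 per million tokens
price_high = 0.75 / 1_000_000   # a 50% market fluctuation
cost_low = monthly_tokens * price_low     # ~$500,000
cost_high = monthly_tokens * price_high   # ~$750,000
swing = cost_high - cost_low              # ~$250,000/month of variance

# Fixed-capacity pricing: one flat fee for reserved throughput.
fixed_monthly = 600_000.0  # flat, regardless of token volume

print(f"per-token range: ${cost_low:,.0f}-${cost_high:,.0f} (swing ${swing:,.0f})")
print(f"fixed-capacity:  ${fixed_monthly:,.0f} every month")
```

The fixed contract may cost more than the per-token floor in a good month, but it is the zero-variance line that a pre-IPO finance team can plan around.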

Related Insights

Many AI startups are "wrappers" whose service cost is tied to an upstream LLM. Since LLM prices fluctuate, these startups risk ending up with underwater unit economics. Stripe's token billing API allows them to track and price their service based on real-time inference costs, protecting their margins from volatility.
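
The margin-protection mechanic is simple cost-plus arithmetic: let the customer price float with the upstream rate so gross margin stays fixed. A minimal sketch (numbers illustrative; in practice the resulting usage would be reported to a metered billing API such as Stripe's):

```python
# Cost-plus token pricing: the customer price floats with the upstream
# provider's rate, so gross margin stays constant even when prices move.

def price_per_million(upstream_cost_per_million: float,
                      target_margin: float = 0.30) -> float:
    """Price charged per million tokens, preserving a fixed gross margin."""
    return upstream_cost_per_million / (1 - target_margin)

# If the upstream LLM charges $2.00/M tokens today...
today = price_per_million(2.00)      # ~$2.86/M
# ...and raises prices to $3.00/M tomorrow, the customer price floats up:
tomorrow = price_per_million(3.00)   # ~$4.29/M

# Either way, gross margin holds at 30%.
margin = (today - 2.00) / today
print(f"today=${today:.2f}/M tomorrow=${tomorrow:.2f}/M margin={margin:.0%}")
```

Without this repricing, the same upstream increase would come straight out of margin; with it, volatility passes through to the customer instead.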

Standard SaaS pricing fails for agentic products because high usage becomes a cost center. Avoid the trap of profiting from non-use. Instead, implement a hybrid model with a fixed base and usage-based overages, or, ideally, tie pricing directly to measurable outcomes generated by the AI.
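
The hybrid model reduces to one billing function. A minimal sketch with placeholder parameters (the base fee, included allowance, and overage rate below are illustrative assumptions):

```python
# Hybrid pricing: a fixed base fee covers an included token allowance;
# usage beyond the allowance is metered per million tokens.

def monthly_bill(tokens_used: int,
                 base_fee: float = 99.0,
                 included_tokens: int = 10_000_000,
                 overage_per_million: float = 3.0) -> float:
    """Fixed base plus usage-based overage above the included allowance."""
    overage_tokens = max(0, tokens_used - included_tokens)
    return base_fee + (overage_tokens / 1_000_000) * overage_per_million

print(monthly_bill(5_000_000))    # light user: base fee only -> 99.0
print(monthly_bill(40_000_000))   # heavy user: 99 + 30M * $3/M -> 189.0
```

Note the incentive shape: the light user still pays the base, so revenue does not depend on non-use, and the heavy user's bill scales with the cost they generate, so heavy usage stops being a pure cost center.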

AI companies operate under the assumption that LLM prices will trend towards zero. This strategic bet means they intentionally de-prioritize heavy investment in cost optimization today, focusing instead on capturing the market and building features, confident that future, cheaper models will solve their margin problems for them.

In a crowded market where startups offer free or heavily subsidized AI tokens to gain users, Vercel intentionally prices its tokens at cost. They reject undercutting the market, betting instead that a superior, higher-quality product will win customers willing to pay for value.

Pega's CTO advises using the powerful reasoning of LLMs to design processes and marketing offers. However, at runtime, switch to faster, cheaper, and more consistent predictive models. This avoids the unpredictability, cost, and risk of calling expensive LLMs for every live customer interaction.
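
The split described above can be sketched as two phases: an offline, LLM-assisted design step that produces a fixed catalog of offers, and a runtime step that scores each live interaction with a cheap deterministic model. Everything below (the offer names, model weights, and features) is a hypothetical illustration of the pattern, not Pega's actual system:

```python
# Design-time / runtime split: the LLM helps author the catalog offline;
# live traffic is scored by a tiny, fast, deterministic predictive model.
import math

# Design time (offline, LLM-assisted): a fixed catalog of offers.
OFFERS = ["loyalty_discount", "upgrade_bundle", "retention_call"]

# Runtime model: per-offer logistic weights over two customer features
# (e.g. tenure and recent spend). Placeholder values for illustration.
WEIGHTS = {
    "loyalty_discount": [0.8, -0.2],
    "upgrade_bundle":   [0.1, 0.9],
    "retention_call":   [-0.5, 0.4],
}

def score(offer: str, features: list[float]) -> float:
    """Propensity in (0, 1) from a simple logistic model."""
    z = sum(w * x for w, x in zip(WEIGHTS[offer], features))
    return 1 / (1 + math.exp(-z))

def best_offer(features: list[float]) -> str:
    # Microseconds of arithmetic per interaction; no LLM call in the loop.
    return max(OFFERS, key=lambda o: score(o, features))

print(best_offer([1.0, 0.0]))  # e.g. a tenure-heavy customer
```

The runtime path is consistent, auditable, and costs effectively nothing per call, which is exactly the trade the advice recommends over invoking an LLM for every live interaction.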