To profitably handle over 100 billion weekly LLM calls, Superhuman primarily uses its own fine-tuned versions of open-source models like Llama and Gemma. This in-house infrastructure allows it to operate at an 85%+ gross margin, a stark contrast to companies reliant on costly third-party APIs.
Faced with rising costs from proprietary labs, sophisticated enterprise clients are building internal evaluation and routing systems. This allows them to use cheaper, open-source models for less complex tasks, optimizing for both cost and performance.
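A minimal sketch of what such a cost-aware router might look like. The model names, prices, and the 1-10 complexity scale are hypothetical placeholders, not details from the source; a real system would derive the capability ceilings from its own internal evaluations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    cost_per_1k_tokens: float  # hypothetical USD price, for illustration only
    max_complexity: int        # highest task complexity (1-10) passed in internal evals


# Hypothetical catalog; names and numbers are assumptions, not real offerings.
CATALOG = [
    ModelOption("small-open-model", 0.0002, 4),
    ModelOption("mid-open-model", 0.001, 7),
    ModelOption("frontier-proprietary-model", 0.015, 10),
]


def route(task_complexity: int, catalog: list[ModelOption] = CATALOG) -> ModelOption:
    """Return the cheapest model whose evaluated ceiling covers the task."""
    capable = [m for m in catalog if m.max_complexity >= task_complexity]
    if not capable:
        raise ValueError(f"no model handles complexity {task_complexity}")
    return min(capable, key=lambda m: m.cost_per_1k_tokens)


if __name__ == "__main__":
    for complexity in (3, 6, 9):
        print(complexity, route(complexity).name)
```

The design choice here is to filter by capability first and only then minimize cost, so a cheaper model is never picked for a task it failed in evaluation.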
a16z isn't deterred by AI companies' gross margins of 0-50%, well below the usual 70% software benchmark. It accepts these margins if they stem from LLM costs, focusing instead on whether the company is building defensible value through unique data, workflows, and integrations.
The top 1% of AI companies making significant revenue don't rely on popular frameworks like LangChain. They gain more control and performance by using small, direct LLM calls for specific application parts. This avoids the black-box abstractions of frameworks, which are more common among the other 99% of builders.
Though leading closed-source models are marginally superior, open-source alternatives provide a much better price-to-performance ratio. Users pay a steep premium for the last few percentage points of intelligence offered by proprietary models, making open source a highly cost-effective choice for many applications.
Reversing an earlier trend, the most advanced AI startups are increasingly adopting and fine-tuning open-source models. This shift is driven by the need for cost-effective speed and deep customization as their workloads mature and scale.
Parser's AI costs are lower than its server costs. The company achieves this by intentionally avoiding the most powerful, expensive LLMs, which are often slow and rate-limited. Instead, it balances speed and cost-effectiveness to process high volumes affordably.
By training a smaller, specialized model where company data is in the weights, firms avoid the high token costs of repeatedly feeding context to large frontier models. This makes complex, data-intensive workflows significantly cheaper and faster.
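The token arithmetic behind this claim can be sketched as follows. Every price and token count below is a hypothetical assumption chosen for illustration, and the comparison deliberately ignores the one-time cost of training and hosting the smaller model.

```python
def cost_per_call(prompt_tokens: int, output_tokens: int,
                  price_in_per_m: float, price_out_per_m: float) -> float:
    """USD cost of one call given per-million-token input and output prices."""
    return (prompt_tokens * price_in_per_m + output_tokens * price_out_per_m) / 1_000_000


# Hypothetical workload: a 200-token question and a 300-token answer.
QUESTION_TOKENS = 200
ANSWER_TOKENS = 300
CONTEXT_TOKENS = 20_000  # company data re-sent on every frontier-model call (assumed)

# Hypothetical prices in USD per million tokens.
frontier_cost = cost_per_call(CONTEXT_TOKENS + QUESTION_TOKENS, ANSWER_TOKENS, 3.0, 15.0)
# The specialized model carries the data in its weights, so no context is re-sent.
small_cost = cost_per_call(QUESTION_TOKENS, ANSWER_TOKENS, 0.2, 0.6)

if __name__ == "__main__":
    print(f"frontier model with context: ${frontier_cost:.5f} per call")
    print(f"specialized small model:     ${small_cost:.5f} per call")
    print(f"ratio: {frontier_cost / small_cost:.0f}x")
```

Under these assumed numbers most of the frontier-model cost comes from the repeated context rather than from the answer itself, which is the effect the insight describes.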
At scale, companies rarely deploy open-source models "off the shelf." Instead, virtually all production workloads involve custom modifications. This can be post-training with proprietary data to improve quality or compiling and quantizing the model to enhance performance and reduce cost.
To escape platform risk and high API costs, startups are building their own AI models. The strategy involves taking powerful, state-subsidized open-source models from China and fine-tuning them for specific use cases, creating a competitive alternative to relying on APIs from OpenAI or Anthropic.
Traditional SaaS metrics like 80%+ gross margins are misleading for AI companies. High inference costs lower margins, but if the absolute gross profit per customer is multiples higher than a SaaS equivalent, it's a superior business. The focus should shift from margin percentages to absolute gross profit dollars and multiples.
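A short worked comparison of margin percentage versus absolute gross profit. The revenue and cost figures are invented for illustration and do not describe any company mentioned in the source.

```python
def gross_profit(revenue: float, cogs: float) -> float:
    """Absolute gross profit in dollars."""
    return revenue - cogs


def gross_margin(revenue: float, cogs: float) -> float:
    """Gross profit as a fraction of revenue."""
    return (revenue - cogs) / revenue


# Hypothetical per-customer annual figures (USD).
saas = {"revenue": 10_000, "cogs": 2_000}    # classic SaaS profile
ai = {"revenue": 100_000, "cogs": 60_000}    # inference-heavy AI product

saas_profit = gross_profit(**saas)
ai_profit = gross_profit(**ai)

if __name__ == "__main__":
    print(f"SaaS: margin {gross_margin(**saas):.0%}, gross profit ${saas_profit:,.0f}")
    print(f"AI:   margin {gross_margin(**ai):.0%}, gross profit ${ai_profit:,.0f}")
    print(f"AI gross profit multiple: {ai_profit / saas_profit:.1f}x")
```

With these assumed figures the AI product has half the margin percentage yet several times the gross profit dollars per customer, which is the trade-off the insight asks investors to weigh.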