To combat the GPU shortage, top VC firms are bundling their portfolio companies' compute needs. By negotiating with cloud providers as a single large customer, they secure better pricing and access for their startups, a novel role for investors.
Once a haven for startups struggling to get GPUs, NeoClouds like CoreWeave have shifted their strategy. They now prioritize serving the largest customers, mirroring the behavior of AWS and Azure and leaving startups with fewer alternative compute options than in 2023.
There's an inverse relationship between an AI lab's recent model performance and its marketing focus. When a lab is in a "downswing" between model releases or lagging on benchmarks, its PR shifts to product capabilities and vertical applications instead of raw model performance.
Microsoft Azure imposes a harsh "use-it-or-lose-it" policy on GPU clusters for smaller customers. Even a few hours of underutilization can result in being kicked off and placed at the back of a months-long waiting list, creating major instability for startups.
The metric for evaluating AI models is shifting. Early on, maximum quality was paramount for adoption. Now, sophisticated users are focusing on efficiency, evaluating models based on "quality per dollar spent," making cost-effectiveness a key competitive advantage.
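The shift from raw quality to "quality per dollar" can flip a model ranking entirely. A minimal sketch, with invented model names, scores, and prices (none of these figures come from real benchmarks):

```python
# Hypothetical illustration: ranking models by raw quality vs. quality per dollar.
# All names, scores, and prices below are made up for the sketch.
models = {
    "model_a": {"quality": 92.0, "cost_per_1m_tokens": 15.00},
    "model_b": {"quality": 88.0, "cost_per_1m_tokens": 3.00},
    "model_c": {"quality": 80.0, "cost_per_1m_tokens": 0.50},
}

# Ranking by raw benchmark score alone.
by_quality = sorted(models, key=lambda m: models[m]["quality"], reverse=True)

# Ranking by quality per dollar: score divided by price.
by_value = sorted(
    models,
    key=lambda m: models[m]["quality"] / models[m]["cost_per_1m_tokens"],
    reverse=True,
)

print(by_quality)  # ['model_a', 'model_b', 'model_c']
print(by_value)    # ['model_c', 'model_b', 'model_a']
```

The frontier model wins on raw quality, but the cheap model delivers far more quality per dollar, which is exactly the trade-off sophisticated buyers are now optimizing.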
Contrary to expectations of easing supply, the GPU shortage has intensified since 2023. With clearer AI business models, mega-customers like OpenAI and Anthropic are spending even more aggressively, creating a fierce bidding war that pushes startups out.
Contrary to public messaging about cost-cutting, past tech layoffs were often a headcount shuffle. Companies like Google quickly rehired and ended up with larger workforces, replacing generalists with specialized, expensive AI talent.
