Modal Labs provides an infrastructure layer that sits above hyperscalers and specialized AI clouds. Its value lies not in owning hardware but in abstracting away the complexity of managing raw GPU capacity. By offering a superior developer experience and a flexible, usage-based pricing model, it addresses the variable demand inherent in AI applications.

Related Insights

A new category of "NeoCloud" or "AI-native cloud" is rising, focused specifically on AI training and inference. Unlike general-purpose clouds such as AWS, these platforms are GPU-first: they cater to massive AI workloads and address the GPU scarcity and mismatched workload patterns that customers encounter on hyperscalers.

AI applications often have long waiting periods for model responses or user input, but traditional cloud platforms charge for this idle time. Vercel's "Fluid Compute" is designed so customers only pay when the application is actively processing, making it fundamentally more cost-effective for AI workloads.
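
The billing difference described above can be made concrete with a little arithmetic. The following is a minimal sketch, not Vercel's actual pricing: the rate, the request durations, and the function names are all hypothetical, chosen only to show how charging for active time differs from charging for wall-clock time.

```python
# Hypothetical comparison of two billing models for one AI request.
# All numbers are illustrative assumptions, not real provider prices.

def wall_clock_cost(total_ms: int, rate_per_ms: int) -> int:
    """Cost when every millisecond the request is open is billed."""
    return total_ms * rate_per_ms


def active_time_cost(active_ms: int, rate_per_ms: int) -> int:
    """Cost when only milliseconds of actual processing are billed."""
    return active_ms * rate_per_ms


def savings_percent(wall_cost: int, active_cost: int) -> int:
    """Whole-number percentage saved by billing active time only."""
    return 100 * (wall_cost - active_cost) // wall_cost


# Assumed request: open for 10 s, of which 9 s is spent waiting
# on a model response and 1 s is spent actually computing.
RATE_PER_MS = 1        # hypothetical price units per millisecond
TOTAL_MS = 10_000
ACTIVE_MS = 1_000

wall = wall_clock_cost(TOTAL_MS, RATE_PER_MS)
active = active_time_cost(ACTIVE_MS, RATE_PER_MS)
print(wall, active, savings_percent(wall, active))
```

Under these assumed numbers the gap is driven entirely by the share of time spent waiting, so the more a workload idles on model responses or user input, the larger the difference between the two models.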

AI Infrastructure (AI Infra) solves problems unique to AI/ML, such as managing compute-heavy, GPU-dependent workloads. This marks a shift from traditional infrastructure, which was often focused more on data input/output than on intensive computation.

CoreWeave argues that large tech companies aren't using it merely to de-risk massive capital outlays. Instead, they are buying a superior, purpose-built product. CoreWeave’s infrastructure is optimized from the ground up for parallelized AI workloads, a fundamental shift from traditional cloud architecture.

Hardware vendors like NVIDIA (CUDA) and AMD create fragmented, proprietary software stacks that lock developers in. Modular builds a replacement layer that enables AI models to run consistently across different hardware, giving enterprises choice and flexibility without rewriting code.

Providing GPUs-as-a-Service is not a durable business because customers can easily switch providers. The key to customer retention and high net dollar retention (NDR) is the software layer built on top of the hardware. This software, which handles the complexities of inference, creates the actual stickiness.

In the crowded GPU reseller market, startups like Modal justify high valuations by offering more than just compute. A key driver of Modal's growth is its 'Sandboxes' product, a specialized software layer for safely running AI agents, demonstrating that value is moving from raw infrastructure to agent-specific tooling.

A new category of cloud providers, "NeoClouds," is built specifically for high-performance GPU workloads. Unlike traditional clouds such as AWS, which were retrofitted from a CPU-centric architecture, NeoClouds offer superior performance for AI tasks by design and through direct collaboration with hardware vendors like NVIDIA.

Big tech companies are offering their most advanced AI models via a "tokens by the drink" pricing model. This is a major advantage for startups: it gives them access to frontier technology on a usage basis, so they can get started and scale without massive upfront capital investment.

By renting its excess GPU capacity to startup Cursor, xAI is pioneering a new business model. This turns companies with massive, proprietary AI infrastructure into de facto cloud providers for others that have high demand but lack hardware, offsetting huge infrastructure costs and fostering strategic data partnerships.