In a world where semiconductor manufacturing is the ultimate bottleneck, the value of a GPU is highest the moment it's produced. The six-plus-month delay required to test, launch, and assemble a data center in space represents an immense opportunity cost, making space-based compute an impractical strategy for now.
The long-term ability to scale AI compute is not constrained by power or data centers, but by the production of advanced semiconductors. The ultimate chokepoint is ASML, the world's only manufacturer of EUV lithography tools, which is projected to produce just over 100 units annually even by 2030.
The AI industry's massive demand for HBM is creating a severe shortage of consumer DRAM and tripling its price. This will make devices like iPhones hundreds of dollars more expensive and is projected to cut the low- and mid-range smartphone market in half, as manufacturers cannot absorb the costs.
Power for AI data centers is not limited to the traditional grid or a few turbine suppliers. Operators are turning to a diverse portfolio of 'behind-the-meter' power sources, including repurposed jet engines (aeroderivatives), large reciprocating engines from ships and trucks, and fuel cells to rapidly scale capacity.
Before the 2019 US sanctions, Huawei was on a trajectory to dominate the AI hardware space. It had superior talent across the entire stack—from software and networking to AI research—and had already become TSMC's largest customer, suggesting it would likely have outcompeted NVIDIA.
The manufacturing requirements for AI compute are staggering. Producing the advanced logic and memory wafers for just one gigawatt of data center capacity requires the output of approximately three and a half EUV lithography machines from ASML, representing over $1.2 billion in capital equipment.
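A quick back-of-the-envelope check of the figures above. The 3.5-machines-per-gigawatt and $1.2 billion numbers come from the text, as does ASML's roughly 100-unit annual output; everything else is simple arithmetic, sketched here for illustration:

```python
# Tooling figures from the text (assumptions only insofar as they are
# taken at face value; the arithmetic itself is trivial).
EUV_MACHINES_PER_GW = 3.5      # EUV tools consumed per GW of data center capacity
CAPEX_PER_GW_USD = 1.2e9       # capital equipment cost per GW
ASML_ANNUAL_OUTPUT = 100       # approximate EUV tools shipped per year

# Implied cost per EUV tool, given the per-GW figures.
implied_cost_per_machine = CAPEX_PER_GW_USD / EUV_MACHINES_PER_GW

# Upper bound on new AI capacity ASML's output can tool each year,
# all else (fab space, memory, power) held equal.
max_new_gw_per_year = ASML_ANNUAL_OUTPUT / EUV_MACHINES_PER_GW

print(f"Implied cost per EUV tool: ${implied_cost_per_machine / 1e6:.0f}M")
print(f"Max new capacity tooled per year: {max_new_gw_per_year:.1f} GW")
```

At roughly $343M per tool implied, and under 30 GW of new capacity toolable per year, the arithmetic makes clear why EUV output, not power or land, is the binding constraint.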
By 2030, China is expected to have a fully indigenous DUV lithography supply chain, enabling mass production of chips on older process nodes. However, for the most advanced EUV technology, they will likely only have working prototypes and will still be struggling with the 'production hell' required for high-volume manufacturing.
For the next few years, the primary constraint on memory production is not a shortage of manufacturing equipment. Rather, it's the physical lack of clean-room space. Memory companies, burned by years of low margins, stopped building new fabs, which have a roughly two-year construction lead time.
A significant portion of hyperscalers' massive capital expenditures is allocated to long-lead-time items like data center construction and power agreements for capacity that will only come online in the next 3-5 years. This spending is a forward-looking indicator of their multi-year scaling plans.
In a significant strategic misstep, Google sold a large volume of its custom TPU accelerators to rival Anthropic. Immediately after, demand for Google's own Gemini model surged, leaving Google compute-constrained and trying to secure more capacity from a sold-out TSMC.
AI companies with the foresight to sign long-term, multi-year compute contracts gain a significant margin advantage. They lock in prices based on past valuations, while competitors are forced to buy capacity at much higher current market rates driven up by the increasing value of new AI models.
Contrary to typical hardware depreciation, GPUs like NVIDIA's H100 are becoming more valuable over time. This is because newer, more efficient AI models can generate significantly more output and value on the same hardware, tying the GPU's worth to its utility rather than its age.
Rapid revenue growth at AI labs like Anthropic creates an urgent need for massive amounts of inference compute. For instance, Anthropic's projected $60 billion revenue increase implies a need for an additional 4 gigawatts of inference capacity within 10 months, separate from R&D training fleets.
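The scaling math implied above can be made explicit. The $60 billion, 4 gigawatt, and 10-month figures come from the text; the derived ratios below are illustrative arithmetic, not Anthropic's actual planning model:

```python
# Figures from the text.
revenue_increase_usd = 60e9   # projected revenue increase
inference_gw_needed = 4.0     # additional inference capacity implied
months = 10                   # timeframe

# Revenue served per GW of inference capacity, and the buildout
# rate required to hit the timeline.
revenue_per_gw = revenue_increase_usd / inference_gw_needed
buildout_mw_per_month = inference_gw_needed * 1000 / months

print(f"Implied revenue per GW of inference: ${revenue_per_gw / 1e9:.0f}B")
print(f"Required buildout rate: {buildout_mw_per_month:.0f} MW/month")
```

An implied 400 MW per month is on the order of a large power plant's worth of new capacity every few weeks, which is why this demand cannot be met from the spot market.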
AI labs like Anthropic that were conservative in securing long-term compute now face a 'quality tax.' They must resort to lower-quality providers or pay significant markups and revenue-sharing deals for last-minute capacity, a cost their more aggressive competitors like OpenAI avoided by signing deals early.
AI workloads are limited by memory bandwidth, not capacity. While commodity DRAM offers more bits per wafer, its bandwidth is over an order of magnitude lower than specialized HBM's; that gap would starve the GPU's compute cores, leaving the extra capacity unusable.
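The bandwidth argument above can be sketched with roofline-style arithmetic. The specific numbers are assumptions for the sketch: ~3.35 TB/s is the published HBM3 bandwidth of an NVIDIA H100 SXM, ~38.4 GB/s is a single DDR5-4800 channel of commodity DRAM, and the 80 GB working set is hypothetical:

```python
# Assumed hardware figures (not from the text above).
HBM_BW = 3.35e12   # bytes/s, H100-class HBM3
DDR5_BW = 38.4e9   # bytes/s, one DDR5-4800 channel
WEIGHTS = 80e9     # bytes, model weights resident in memory (hypothetical)

# In memory-bound autoregressive decoding, each generated token must
# stream the weights once, so time per token ~ bytes moved / bandwidth.
t_hbm = WEIGHTS / HBM_BW
t_ddr5 = WEIGHTS / DDR5_BW

print(f"HBM:  {t_hbm * 1e3:.1f} ms/token (~{1 / t_hbm:.0f} tokens/s)")
print(f"DDR5: {t_ddr5 * 1e3:.0f} ms/token (~{1 / t_ddr5:.2f} tokens/s)")
print(f"Bandwidth ratio: {HBM_BW / DDR5_BW:.0f}x")
</g```

Under these assumptions the gap is nearly two orders of magnitude: tens of tokens per second on HBM versus well under one on a commodity DRAM channel, regardless of how many extra bits the cheaper memory holds.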
