The current model of paying per AI token is a temporary phase. Computing history offers a parallel: any constrained resource that users must pay for eventually moves to the user's local device and becomes free. On-device AI processing will follow this pattern, ultimately eliminating token costs.
The initial strategy for the Surface was to force a "platform discontinuity" by moving to ARM chips and a mobile form factor, leaving behind legacy issues. The Intel x86 version was merely an "objection handler" for compatibility concerns. Microsoft later abandoned this forward-looking vision for a backward-compatible model.
Former Windows President Steven Sinofsky argues consumers unknowingly want PCs without legacy baggage like editable registries and system vulnerabilities. Microsoft's focus on running old apps on new ARM chips preserves problems that Apple solved, hindering the PC's evolution into a modern, sealed device.
Steven Sinofsky, having lived through a half-dozen component shortages, advises against making long-term strategic decisions based on temporary supply constraints. These "local max or min" situations inevitably correct themselves, and concerns over memory for AI devices will be resolved by both increased supply and software optimization.
For decades, NVIDIA was an "add-on" to the PC ecosystem, requiring separate drivers and coexisting with official OS graphics APIs like Microsoft's DirectX. Its new position at the core of AI PCs, built on the CUDA stack, represents a fundamental shift that challenges the traditional OS-centric control held by Microsoft and Apple.
