Over two-thirds of reasoning models' performance gains came from massively increasing their 'thinking time' (inference scaling). This was a one-time jump from a zero baseline. Further gains are prohibitively expensive due to compute limitations, meaning this is not a repeatable source of progress.
Hopes that AI's new reasoning skills in verifiable domains like math and code would carry over to ambiguous, real-world tasks like booking a flight did not materialize. This failure of 'reasoning generalization' proved a major technical roadblock and forced experts to lengthen their AGI timelines.
The AI industry's exponential growth in its consumption of compute, electricity, and talent is unsustainable: by 2032 it will have absorbed most of the available slack from other industries. Further progress would then require trillion-dollar training runs that may be impossible to fund, creating a critical period for AGI development.
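For intuition on how quickly exponential cost growth runs into that wall, here is a rough sketch; the starting cost and the 3x annual growth multiplier are illustrative assumptions, not figures from the text.

```python
# Illustrative only: assumes a frontier training run costs ~$1B today and that
# costs grow ~3x per year. Both numbers are assumptions for this sketch, not
# claims from the text; the point is how fast an exponential reaches $1T.
cost = 1e9          # assumed current frontier training-run cost, USD
growth = 3.0        # assumed year-over-year cost multiplier
year = 2025         # assumed starting year

while cost < 1e12:  # stop once a single run would cost $1 trillion
    cost *= growth
    year += 1

print(f"Under these assumptions, a single run crosses $1T around {year}.")
```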
A growing gap separates AI's performance in demos from its actual impact on productivity. As podcaster Dwarkesh Patel noted, AI models improve at the rapid rate the short-term optimists predict, yet only become useful at the slower rate the long-term skeptics predict, which explains the widespread disillusionment.
Even if AI perfects software engineering, automating AI R&D will still be bottlenecked by non-coding work, because AI companies consist of far more than software engineers. Moreover, as the field's 'low-hanging fruit' disappears, AI assistance might only be enough to sustain the current rate of progress rather than accelerate it.
While cutting-edge AI is extremely expensive, its cost falls remarkably quickly: a reasoning benchmark that cost OpenAI $4,500 per question in late 2024 cost only $11 a year later. This steep deflation curve means even the most advanced capabilities quickly become accessible to the mass market.
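To make the steepness concrete, a back-of-envelope calculation using only the two figures above (and treating the interval as exactly one year, which is approximate):

```python
# Back-of-envelope deflation rate, using only the two cited figures:
# $4,500 per benchmark question in late 2024 vs. $11 roughly a year later.
cost_start = 4_500.0   # USD per question, late 2024
cost_end = 11.0        # USD per question, about one year later
years = 1.0            # assumed elapsed time (approximate)

total_drop = cost_start / cost_end                            # ~409x cheaper
annual_decline = 1 - (cost_end / cost_start) ** (1 / years)   # share of cost shed per year

print(f"Total cost reduction: ~{total_drop:.0f}x")
print(f"Implied annual price decline: ~{annual_decline:.1%}")
```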
