Exponential acceleration towards AGI is currently limited by a human bottleneck: our speed at prompting AI and, more importantly, our capacity to manually validate its work. Hockey-stick growth will begin only when AI can reliably validate its own output, closing the productivity loop.
As AI coding agents generate vast amounts of code, the most tedious part of a developer's job shifts from writing code to reviewing it. This creates a new product opportunity: building tools that help developers validate and build confidence in AI-written code, making the review process less of a chore.
The primary obstacle for tools like OpenAI's Atlas isn't technical capability but the user's workload. Verifying an AI agent's autonomous actions takes time and effort and carries security risk, and that cost often exceeds the time a human would need to perform the task directly, which limits practical use cases.
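The trade-off above can be made concrete with a rough break-even calculation. This is a minimal sketch, not a measured model: the function names, the minute values, and the assumption that a failed run forces the human to redo the task manually are all hypothetical.

```python
def expected_delegation_minutes(prompt_minutes: float,
                                verify_minutes: float,
                                manual_minutes: float,
                                failure_rate: float) -> float:
    """Expected human minutes spent when handing a task to an agent.

    Simplifying assumption (not from the source): every run is verified,
    and a failed run means the human does the task manually afterwards.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    return prompt_minutes + verify_minutes + failure_rate * manual_minutes


def worth_delegating(prompt_minutes: float,
                     verify_minutes: float,
                     manual_minutes: float,
                     failure_rate: float) -> bool:
    """True when delegating is expected to cost less human time than doing it."""
    delegated = expected_delegation_minutes(
        prompt_minutes, verify_minutes, manual_minutes, failure_rate
    )
    return delegated < manual_minutes


# Hypothetical short task: 5 minutes by hand, 1 to prompt, 4 to verify.
short_task = worth_delegating(1, 4, 5, failure_rate=0.2)

# Hypothetical long task: 60 minutes by hand, 5 to prompt, 15 to verify.
long_task = worth_delegating(5, 15, 60, failure_rate=0.2)
```

Under these made-up numbers the short task is not worth delegating while the long one is, which matches the point that verification overhead hurts most on tasks a human could finish quickly.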
AI is not a "set and forget" solution. An agent's effectiveness directly correlates with the amount of time humans invest in training, iteration, and providing fresh context. Performance will ebb and flow with human oversight, with the best results coming from consistent, hands-on management.
Users often make the mistake of judging AI tools by the quality of the first output. Because 90% of the work is iterative, the superior tool is the one that handles a high volume of refinement prompts most effectively, not the one with the best initial result.
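One way to picture this is to score a tool with the first output weighted at 10% and the refinement rounds at 90%. The sketch below is illustrative only: the scoring function, the 0-to-1 quality scale, and the two example tools are assumptions, not an established benchmark.

```python
from statistics import mean


def weighted_tool_score(first_output: float,
                        refinement_scores: list[float],
                        iterative_share: float = 0.9) -> float:
    """Blend first-output quality with average refinement quality.

    iterative_share is the fraction of the work assumed to be iterative
    (0.9 mirrors the 90% figure in the text). Scores are on a
    hypothetical 0-to-1 quality scale.
    """
    if not refinement_scores:
        raise ValueError("need at least one refinement score")
    return ((1 - iterative_share) * first_output
            + iterative_share * mean(refinement_scores))


# Tool A: impressive first draft, weak at follow-up refinements.
tool_a = weighted_tool_score(0.9, [0.5, 0.5, 0.5])

# Tool B: mediocre first draft, strong at follow-up refinements.
tool_b = weighted_tool_score(0.6, [0.8, 0.8, 0.8])
```

With these invented numbers Tool B comes out ahead even though Tool A wins on the first output alone.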
AI can produce scientific claims and codebases thousands of times faster than humans. However, the meticulous work of validating these outputs remains a human task. This growing gap between generation and verification could create a backlog of unproven ideas, slowing true scientific advancement.
A useful mental model for AGI is child development. Just as a child can be left unsupervised for progressively longer periods, AI agents are seeing their autonomous runtimes increase. AGI arrives when it becomes economically profitable to let an AI work continuously without supervision, the way an independent adult does.
The ultimate goal for leading labs isn't just to create AGI but to automate the process of AI research itself. By replacing human researchers with millions of "AI researchers," they aim to trigger a "fast takeoff" or recursive self-improvement. This makes automating high-level programming a key strategic milestone.
The perceived limits of today's AI are not inherent to the models themselves but to our failure to build the right "agentic scaffold" around them. There's a "model capability overhang" where much more potential can be unlocked with better prompting, context engineering, and tool integrations.
Advanced AI tools like "deep research" models can produce vast amounts of information, such as 30-page reports, in minutes. This creates a new productivity paradox: the AI's output capacity far exceeds a human's finite ability to verify sources, apply critical thought, and transform the raw output into authentic, usable insights.
While AI models excel at gathering and synthesizing information ("knowing"), they are not yet reliable at executing actions in the real world ("doing"). True agentic systems require bridging this gap by adding crucial layers of validation and human intervention to ensure tasks are performed correctly and safely.
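The layering described above might look something like the following in code: an automated validation check runs first, and a human approval step gates anything flagged as risky. This is a minimal sketch under stated assumptions. `Action`, `run_with_gates`, and the example callbacks are hypothetical names, not part of any real agent framework.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    """A proposed agent action (hypothetical structure for illustration)."""
    name: str
    payload: dict = field(default_factory=dict)
    risky: bool = False


def run_with_gates(action: Action,
                   validate: Callable[[Action], bool],
                   approve: Callable[[Action], bool],
                   execute: Callable[[Action], None]) -> str:
    """Execute an action only if it passes both gates.

    1. validate: automated check applied to every action.
    2. approve: human sign-off, requested only for risky actions.
    """
    if not validate(action):
        return "rejected: failed validation"
    if action.risky and not approve(action):
        return "rejected: human declined"
    execute(action)
    return "executed"


# Example wiring with stand-in callbacks.
executed_log: list[str] = []


def validate_amount(action: Action) -> bool:
    # Automated rule: any amount in the payload must be non-negative.
    return action.payload.get("amount", 0) >= 0


def human_declines(action: Action) -> bool:
    # Stand-in for a real approval prompt; always declines here.
    return False


def record_execution(action: Action) -> None:
    executed_log.append(action.name)


safe_result = run_with_gates(
    Action("draft_email"), validate_amount, human_declines, record_execution
)
risky_result = run_with_gates(
    Action("send_payment", {"amount": 100}, risky=True),
    validate_amount, human_declines, record_execution,
)
invalid_result = run_with_gates(
    Action("refund", {"amount": -5}), validate_amount, human_declines, record_execution
)
```

Keeping the human step limited to risky actions is one possible design choice: it preserves oversight where mistakes are costly without forcing a review of every routine step.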