An AI tool's failure at a task a month ago says little about what it can do today. The guest notes that Copilot went from producing useless spreadsheet templates to fully functional models within a few months. Users should periodically re-test tools on previously failed tasks to take advantage of rapid, often unannounced improvements.

Related Insights

When developing internal AI tools, adopt a 'fail fast' mantra. Many use cases fail not because the idea is bad, but because the underlying models aren't yet capable. It's critical to regularly revisit these failed projects, as rapid advancements in AI can quickly make a previously unfeasible idea viable.

An AI product's job is never done because user behavior evolves. As users become more comfortable with an AI system, they naturally start pushing its boundaries with more complex queries. This requires product teams to continuously go back and recalibrate the system to meet these new, unanticipated demands.

Users frequently write off an AI's ability to perform a task after a single failure. However, with models improving dramatically every few months, what was impossible yesterday may be trivial today. This "capability blindness" prevents users from unlocking new value.

Users mistakenly evaluate AI tools based on the quality of the first output. However, since 90% of the work is iterative, the superior tool is the one that handles a high volume of refinement prompts most effectively, not the one with the best initial result.

The essential skill for AI PMs is deep intuition, which can only be built through hands-on experimentation. This means actively using every new LLM, image, and video model upon release to objectively understand its capabilities, limitations, and trajectory, rather than relying on second-hand analysis.

When developing AI-powered tools, don't be constrained by current model limitations. Given the exponential improvement curve, design your product for the capabilities you anticipate models will have in six months. This ensures your product is perfectly timed to shine when the underlying tech catches up.

An AI tool's quality is now almost entirely dependent on its underlying model. The guest notes that Windsurf, a top-tier agent just three weeks prior, dropped to "C-tier" simply because it hadn't integrated Claude 4, highlighting the brutal pace of innovation.

Kevin Rose argues against forming fixed opinions on AI capabilities. The technology leaps forward every 4-8 weeks, so a developer who found AI coding assistants "horrible" three months ago is judging a tool that is now 3-4 times better. One must continuously re-evaluate AI tools to stay current.

A significant source of competitive advantage ("alpha") comes from systematically testing various AI models for different tasks. This creates a personal map of which tools are best for specific use cases, ensuring you always use the optimal solution.

To stay on the cutting edge, maintain a list of complex tasks that current AI models can't perform well. Whenever a new model is released, run it against this suite. This practice provides an intuitive feel for the model's leap in capability and helps you identify when a previously impossible workflow becomes feasible.
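
One way to operationalize this practice is a small script that replays a saved suite of "impossible" prompts against each newly released model and saves the outputs for manual review. The following is a minimal sketch, assuming an OpenAI-compatible chat-completions client; the file names, task format, and model name are illustrative placeholders, not details from the episode.

```python
"""Minimal sketch of a personal capability-regression suite.

Assumes an OpenAI-compatible chat-completions client and a local JSON file
of tasks that current models handle poorly, e.g.:
[{"name": "long-horizon refactor", "prompt": "..."}]
All names here are placeholders, not from the episode.
"""

import json
from datetime import date
from pathlib import Path

from openai import OpenAI  # any OpenAI-compatible SDK exposes the same call

client = OpenAI()  # reads OPENAI_API_KEY from the environment
SUITE_PATH = Path("impossible_tasks.json")  # tasks models currently fail at


def run_suite(model: str) -> None:
    """Run every saved 'impossible' task against a new model and write the
    raw outputs to a dated file for side-by-side manual review."""
    tasks = json.loads(SUITE_PATH.read_text())
    results = []
    for task in tasks:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": task["prompt"]}],
        )
        results.append(
            {"task": task["name"], "output": response.choices[0].message.content}
        )

    out_path = Path(f"results_{model}_{date.today()}.json")
    out_path.write_text(json.dumps(results, indent=2))
    print(f"Wrote {len(results)} outputs to {out_path}; review by hand to spot new capabilities.")


if __name__ == "__main__":
    run_suite("newly-released-model")  # placeholder: swap in whichever model just shipped
```

Keeping the suite in a plain JSON file makes it trivial to append a new "impossible" task the moment you hit one, and the dated result files give you a rough timeline of when each workflow became feasible.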