To stay on the cutting edge, maintain a list of complex tasks that current AI models can't perform well. Whenever a new model is released, run it against this suite. This practice provides an intuitive feel for the model's leap in capability and helps you identify when a previously impossible workflow becomes feasible.
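As a rough illustration, such a suite can be as simple as a list of hard prompts and a small script that replays them against whatever model just shipped. The sketch below assumes the OpenAI Python SDK and uses placeholder tasks and a placeholder model name; the point is the practice, not this specific API.

```python
# Minimal sketch of a personal "frontier" suite, assuming the OpenAI Python SDK.
# The tasks and model name are placeholders; swap in your own hard problems.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Tasks that current models handle poorly, described in your own words.
HARD_TASKS = [
    "Refactor this 400-line legacy module without changing its public behavior: ...",
    "Summarize this 60-page contract and flag every clause that shifts liability: ...",
    "Plan a data migration for this schema, including rollback steps: ...",
]

def run_suite(model: str) -> None:
    """Replay every hard task against a newly released model and print the output for manual review."""
    for i, task in enumerate(HARD_TASKS, start=1):
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": task}],
        )
        print(f"--- Task {i} ({model}) ---")
        print(response.choices[0].message.content)

if __name__ == "__main__":
    run_suite("gpt-4o")  # placeholder: replace with the model you want to probe
```

Reading the raw outputs side by side, release after release, is what builds the intuitive feel for how far the frontier has moved.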
AI's capabilities are inconsistent; it excels at some tasks and fails surprisingly at others. This is the 'jagged frontier.' You can only discover where AI is useful and where it's useless by applying it directly to your own work, as you are the only one who can accurately judge its performance in your domain.
Building an AI-native product requires betting on the trajectory of model improvement, much like developers once bet on Moore's Law. Instead of designing for today's LLM constraints, assume rapid progress and build for the capabilities that will exist tomorrow. This prevents creating an application that is quickly outdated.
Users frequently write off an AI's ability to perform a task after a single failure. However, with models improving dramatically every few months, what was impossible yesterday may be trivial today. This "capability blindness" prevents users from unlocking new value.
The goal of testing multiple AI models isn't to crown a universal winner, but to build your own subjective "rule of thumb" for which model works best for the specific tasks you frequently perform. This personal topography is more valuable than any generic benchmark.
The essential skill for AI PMs is deep intuition, which can only be built through hands-on experimentation. This means actively using every new language, image, and video model as it is released to understand its capabilities, limitations, and trajectory firsthand, rather than relying on second-hand analysis.
The primary bottleneck in improving AI is no longer data or compute, but the creation of 'evals': tests that measure a model's capabilities. These evals act as product requirements documents (PRDs) for researchers, defining what success looks like and guiding the training process.
When developing AI-powered tools, don't be constrained by current model limitations. Given the exponential improvement curve, design your product for the capabilities you anticipate models will have in six months. That way, your product is ready the moment the underlying technology catches up.
Instead of guessing where AI can help, use AI itself as a consultant. Detail your daily workflows, tasks, and existing tools in a prompt, and ask it to generate an "opportunity map." This meta-approach lets the AI identify the highest-impact places to apply itself.
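One way to make this concrete is to serialize your recurring work into a structured prompt before asking for the opportunity map. The snippet below is only a sketch of that prompt-building step; the workflow entries and the wording of the ask are hypothetical, not a prescribed format.

```python
# Sketch: turn a structured description of your work into an "opportunity map" prompt.
# The workflow entries below are illustrative placeholders.
workflows = [
    {"task": "Weekly competitive analysis", "tools": ["Sheets", "Notion"], "hours_per_week": 4},
    {"task": "Drafting PRDs from customer interviews", "tools": ["Docs"], "hours_per_week": 6},
    {"task": "Triaging support tickets into themes", "tools": ["Zendesk"], "hours_per_week": 3},
]

lines = [
    f"- {w['task']} (tools: {', '.join(w['tools'])}; ~{w['hours_per_week']} h/week)"
    for w in workflows
]

prompt = (
    "Here are my recurring workflows, the tools I use, and the time each takes:\n"
    + "\n".join(lines)
    + "\n\nActing as a consultant, produce an opportunity map: rank the workflows "
    "where AI could save the most time, and explain what I should try first for each."
)
print(prompt)  # paste into the model of your choice, or send via its API
```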
A significant source of competitive advantage ("alpha") comes from systematically testing various AI models on different tasks. This creates a personal map of which tools are best for specific use cases, so you consistently reach for the strongest option for each job.
Instead of waiting for external reports, companies should develop their own AI model evaluations. By defining key tasks for specific roles and testing new models against them with standard prompts, businesses can create a relevant, internal benchmark.
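A hedged sketch of what such an internal benchmark definition might look like: role-keyed tasks, each with a standard prompt and a plain-language success criterion that a human reviewer scores. The roles, prompts, and criteria below are illustrative examples, not a recommended taxonomy.

```python
# Sketch of an internal, role-specific benchmark: standard prompts plus human-scored criteria.
# Every role, task, and criterion here is a hypothetical example.
INTERNAL_BENCHMARK = {
    "support_engineer": [
        {
            "task": "Diagnose a failed deployment from raw logs",
            "prompt": "Given the following deployment logs, identify the root cause and the fix: ...",
            "success_criterion": "Names the actual root cause and a workable fix without inventing log lines.",
        },
    ],
    "product_manager": [
        {
            "task": "Compress ten customer interviews into a decision memo",
            "prompt": "Summarize these interview notes into a one-page memo with a clear recommendation: ...",
            "success_criterion": "The recommendation is supported by quotes that actually appear in the notes.",
        },
    ],
}

def score_sheet(model_name: str) -> list[dict]:
    """Produce a blank scoring sheet for one model run, to be filled in by a reviewer."""
    return [
        {"role": role, "task": item["task"], "model": model_name, "pass": None}
        for role, items in INTERNAL_BENCHMARK.items()
        for item in items
    ]
```

Keeping the prompts fixed across model releases is what makes the scores comparable over time, even though the grading itself stays manual.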