Assigning error analysis to engineers or external teams is a huge pitfall. The process of reviewing traces and identifying failures is where product taste, domain expertise, and unique user understanding are embedded into the AI. It is a core product management function, not a technical task to be delegated.
Systematically review production traces ("open coding"), categorize the observed errors ("axial coding"), and then count them. This simple process transforms subjective "vibe checks" and messy logs into a prioritized, data-backed roadmap for improving your AI application, giving PMs a superpower.
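To make the counting step concrete, here is a minimal Python sketch, assuming traces have already been annotated with a free-form note (open coding) and a failure category (axial coding); the trace records and category names are purely illustrative.

```python
from collections import Counter

# Hypothetical annotated traces: each record holds the open-coding note
# plus the failure category assigned during axial coding.
annotated_traces = [
    {"trace_id": "t1", "note": "ignored the user's date range", "category": "ignored_constraint"},
    {"trace_id": "t2", "note": "made up a store location", "category": "hallucinated_fact"},
    {"trace_id": "t3", "note": "also invented an address", "category": "hallucinated_fact"},
    {"trace_id": "t4", "note": "replied in markdown over SMS", "category": "wrong_formatting"},
]

# The "count them" step: tally categories and sort by frequency to get a
# data-backed priority order for fixes.
counts = Counter(t["category"] for t in annotated_traces)
for category, n in counts.most_common():
    share = n / len(annotated_traces)
    print(f"{category}: {n} traces ({share:.0%})")
```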
When conducting manual "open coding" for AI evals, teams often get bogged down trying to reach consensus. Instead, appoint a single person with deep domain expertise (often the product manager) as the "benevolent dictator" who makes the final judgment calls on error categorization. This keeps the process tractable and fast.
Product leaders must personally engage with AI development. Direct experience reveals unique, non-human failure modes. Unlike a human developer who learns from mistakes, an AI can cheerfully and repeatedly make the same error—a critical insight for managing AI projects and team workflow.
The common mistake in building AI evals is jumping straight to writing automated tests. The correct first step is a manual process called "error analysis" or "open coding," where a product expert reviews real user interaction logs to understand what's actually going wrong. This grounds your entire evaluation process in reality.
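As a sketch of how a team might pull that first batch of logs for manual review, here is a hedged example that samples recent production traces into a spreadsheet with an empty notes column; the file names and trace fields ("production_traces.jsonl", "input", "output") are assumptions about your logging setup, not any specific tool's API.

```python
import csv
import json
import random
from pathlib import Path

# Hypothetical log export: one JSON object per line with the user input and model output.
traces = [json.loads(line) for line in Path("production_traces.jsonl").read_text().splitlines()]

# Open coding starts with real, recent interactions, not hand-picked demos:
# take a random sample so the reviewer sees typical (messy) usage.
sample = random.sample(traces, k=min(100, len(traces)))

with open("open_coding_sample.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["trace_id", "user_input", "model_output", "note"])
    writer.writeheader()
    for t in sample:
        # The empty "note" column is where the reviewer writes free-form observations.
        writer.writerow({
            "trace_id": t.get("id"),
            "user_input": t["input"],
            "model_output": t["output"],
            "note": "",
        })
```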
Because PMs deeply understand the customer's job, needs, and alternatives, they are the only ones qualified to write the evaluation criteria for what a successful AI output looks like. This critical task goes beyond technical metrics and is core to the PM's role in the AI era.
Developers often test AI systems with well-formed, correctly spelled questions. However, real users submit vague, typo-ridden, and ambiguous prompts. Directly analyzing these raw logs is the most crucial first step to understanding how your product fails in the real world and where to focus quality improvements.
In AI development, trace analysis is a point of tension. Product Managers should become fluent enough to ask intelligent questions and participate in debugging. However, they should avoid owning the process or tooling, respecting it as engineering's domain to maintain a healthy division of labor.
While senior leaders are trained to delegate execution, AI is an exception. Direct, hands-on use is non-negotiable for leadership. It demystifies the technology, reveals its counterintuitive flaws, and builds the empathy required to understand team challenges. Leaders who remain hands-off will be unable to guide strategy effectively.
AI tools like ChatGPT can analyze traces for basic correctness but miss subtle product experience failures. A product manager's contextual knowledge is essential to identify issues like improper formatting for a specific channel (e.g., markdown in SMS) or failures in user experience that an LLM would deem acceptable.
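One way to capture that contextual knowledge is to turn it into a channel-aware check once the failure mode has been identified. The sketch below flags markdown syntax in an SMS reply, something a generic correctness grader would likely wave through; the trace schema and field names are hypothetical.

```python
import re

# Hypothetical trace record: delivery channel plus the model's reply.
trace = {"channel": "sms", "response": "**Sure!** Here's your *confirmation code*: 4821"}

def flags_markdown_in_sms(trace: dict) -> bool:
    """Flag replies that use markdown on a channel that renders plain text.

    A generic LLM grader would likely score this reply as correct; this check
    encodes product context (SMS cannot render markdown) instead.
    """
    if trace["channel"] != "sms":
        return False
    # Crude markers of markdown formatting: bold/italic asterisks, headings, links.
    markdown_pattern = r"(\*\*.+?\*\*|(?<!\*)\*[^*]+\*(?!\*)|^#{1,6}\s|\[.+?\]\(.+?\))"
    return bool(re.search(markdown_pattern, trace["response"], re.MULTILINE))

print(flags_markdown_in_sms(trace))  # True: markdown formatting in an SMS reply
```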
Reviewing user interaction data is the highest ROI activity for improving an AI product. Instead of relying solely on third-party observability tools, high-performing teams build simple, custom internal applications. These tools are tailored to their specific data and workflow, removing all friction from the process of looking at and annotating traces.
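As one possible shape for such a tool, here is a minimal sketch of an internal trace-annotation app built with Streamlit (one option among many); the file paths, trace fields, and category list are hypothetical and would be tailored to your own data.

```python
import json
from pathlib import Path

import streamlit as st

# Hypothetical inputs/outputs: traces.jsonl holds one trace per line,
# annotations.jsonl accumulates the reviewer's notes.
TRACES = [json.loads(line) for line in Path("traces.jsonl").read_text().splitlines()]
OUT = Path("annotations.jsonl")

# Keep a cursor in session state so the reviewer can step through traces.
if "idx" not in st.session_state:
    st.session_state["idx"] = 0
idx = st.session_state["idx"]
trace = TRACES[idx]

st.write(f"Trace {idx + 1} of {len(TRACES)}")
st.text_area("User input", trace["input"], disabled=True)
st.text_area("Model output", trace["output"], disabled=True)

note = st.text_input("Open-coding note (what went wrong, in your own words)")
category = st.selectbox("Category", ["ok", "hallucination", "ignored_constraint", "formatting", "other"])

if st.button("Save and next"):
    with OUT.open("a") as f:
        f.write(json.dumps({"trace_id": trace.get("id"), "note": note, "category": category}) + "\n")
    st.session_state["idx"] = (idx + 1) % len(TRACES)
    st.rerun()
```

A reviewer runs this with `streamlit run annotate.py` and steps through traces, saving one note and one category per trace, which feeds directly into the counting step described above.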