Evals transform product specs from ambiguous documents into testable, measurable criteria. This gives product managers more leverage and provides clear targets for engineers, improving alignment and the quality of the final product.
AI models and frameworks change constantly. A deep understanding of user needs, encoded into a robust evaluation suite, is a lasting asset. This allows you to continuously iterate and improve quality, regardless of which new model or agent framework becomes popular.
When developers are their own users (e.g., building coding tools), intuition is a reliable guide. However, in specialized domains like healthcare, where developers lack subject matter expertise, structured evals are essential to bridge the knowledge gap.
If all your evals pass, you don't know the current limits of your system. Evals that consistently fail mark the frontier of your system's capabilities. When a new foundation model is released, rerunning these failing tests immediately reveals whether it has overcome previous limitations.
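A minimal sketch of this "frontier suite" idea: keep the currently-failing cases in their own suite and rerun them whenever a new model ships. The function names (`run_suite`) and the toy model stand-ins are illustrative assumptions, not any particular framework's API.

```python
def run_suite(suite, model_fn):
    """Run every case in the suite through the model; return the pass rate
    and the list of cases the model now gets right."""
    passed = [case for case in suite if model_fn(case["input"]) == case["expected"]]
    return len(passed) / len(suite), passed

# A suite of evals the current system consistently fails.
frontier_suite = [
    {"input": "2+2", "expected": "4"},
    {"input": "hard multi-step task", "expected": "correct answer"},
]

# Toy stand-ins for the old and new foundation models.
old_model = lambda q: "wrong"
new_model = lambda q: "4" if q == "2+2" else "wrong"

old_rate, _ = run_suite(frontier_suite, old_model)
new_rate, newly_passed = run_suite(frontier_suite, new_model)
```

Cases in `newly_passed` have crossed the frontier: they can graduate from the failing suite into the regular regression set, and the gap between `old_rate` and `new_rate` quantifies what the model upgrade actually bought you.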
A "vibe check" is simply using your brain as a scoring function to intuit if an AI output is good. This aligns with the "do things that don't scale" startup principle and is a necessary first step before building more robust, scalable evaluation systems.
Effective teams discuss production examples and eval scores in daily stand-ups. This ritual helps them identify novel failure patterns from real usage, add them to test datasets, and then prioritize daily work to improve performance on those specific issues.
This framework demystifies building an eval. Define your input data (e.g., user queries), specify the task your AI performs (from an LLM call to a complex agent), and create scoring functions that normalize outputs to a 0-1 range for consistent comparison.
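The three parts above can be sketched in a few lines. Everything here (the `task` stand-in, the scorer names, the dataset schema) is a hypothetical illustration of the pattern, not a specific eval library's API:

```python
def task(query: str) -> str:
    """Stand-in for your AI system: a single LLM call or a full agent run."""
    return "Paris" if "capital of France" in query else "I don't know"

def exact_match(output: str, expected: str) -> float:
    """A scorer normalized to the 0-1 range."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0

def brevity(output: str, expected: str) -> float:
    """Another 0-1 scorer: penalizes answers much longer than the reference."""
    return min(1.0, len(expected) / max(len(output), 1))

# Input data: user queries paired with expected answers.
dataset = [
    {"input": "What is the capital of France?", "expected": "Paris"},
    {"input": "What is the capital of Atlantis?", "expected": "Unknown"},
]

def run_evals(dataset, task, scorers):
    """Run the task on each input and apply every scoring function."""
    results = []
    for example in dataset:
        output = task(example["input"])
        scores = {name: fn(output, example["expected"]) for name, fn in scorers.items()}
        results.append({"input": example["input"], "output": output, "scores": scores})
    return results

results = run_evals(dataset, task, {"exact_match": exact_match, "brevity": brevity})
```

Because every scorer lands in the same 0-1 range, scores from different metrics (and different eval runs) can be averaged and compared directly.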
Don't treat your test dataset as static. Monitor online eval scores in production. When you see poor performance, filter for those failing examples and add them to your offline dataset. This ensures your testing evolves with real-world usage patterns.
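One way this feedback loop might look in code. The log field names (`input`, `score`) and the 0.5 cutoff are assumptions about your logging schema, not a prescribed standard:

```python
FAIL_THRESHOLD = 0.5  # assumed cutoff; tune per metric

def harvest_failures(production_logs, offline_dataset, threshold=FAIL_THRESHOLD):
    """Promote production examples whose online eval score fell below the
    threshold into the offline dataset, skipping inputs already present.
    Returns the list of newly added inputs."""
    known_inputs = {ex["input"] for ex in offline_dataset}
    added = []
    for log in production_logs:
        if log["score"] < threshold and log["input"] not in known_inputs:
            # Expected output is unknown at harvest time; label it later.
            offline_dataset.append({"input": log["input"], "expected": None})
            known_inputs.add(log["input"])
            added.append(log["input"])
    return added

logs = [
    {"input": "q1", "score": 0.9},  # passing: ignored
    {"input": "q2", "score": 0.2},  # failing: harvested
    {"input": "q3", "score": 0.1},  # failing but already in the dataset
]
dataset = [{"input": "q3", "expected": "gold answer"}]
newly_added = harvest_failures(logs, dataset)
```

Run on a schedule (or triggered by an online-score alert), this keeps the offline suite growing in exactly the directions where production is currently weakest.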
