Braintrust Codifies Its Top Designer's 'Taste' into Evals to Scale Quality

Related Insights

AI Evals Should Be Used Strategically to Uncover Opportunities, Not Just for Quality Control

Don't treat evals as a mere checklist. Instead, use them as a creative tool to discover opportunities. A well-designed eval can reveal that a product is underperforming for a specific user segment, pointing directly to areas for high-impact improvement that a simple "vibe check" would miss.

AI Engineering 101 with Chip Huyen (Nvidia, Stanford, Netflix)

Lenny's Podcast: Product | Career | Growth·9 months ago
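
A minimal sketch of the segment-level idea in the insight above, assuming eval results have already been graded pass/fail and tagged with a user segment. The record layout, the segment names, and the 0.15 margin are illustrative assumptions, not details from the episode.

```python
from collections import defaultdict

# Hypothetical graded eval results: one record per test case.
RESULTS = (
    [{"segment": "enterprise", "passed": True}] * 4
    + [{"segment": "smb", "passed": True}] * 3
    + [{"segment": "smb", "passed": False}] * 1
    + [{"segment": "students", "passed": True}] * 1
    + [{"segment": "students", "passed": False}] * 3
)


def pass_rate_by_segment(results):
    """Return {segment: fraction of cases that passed}."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for record in results:
        totals[record["segment"]] += 1
        passes[record["segment"]] += int(record["passed"])
    return {seg: passes[seg] / totals[seg] for seg in totals}


def underperforming(rates, overall, margin=0.15):
    """List segments whose pass rate trails the overall rate by more than `margin`."""
    return sorted(seg for seg, rate in rates.items() if rate < overall - margin)


overall_rate = sum(r["passed"] for r in RESULTS) / len(RESULTS)
segment_rates = pass_rate_by_segment(RESULTS)
flagged = underperforming(segment_rates, overall_rate)
print(f"overall={overall_rate:.2f}", segment_rates, "flagged:", flagged)
```

An aggregate score alone would hide the weak segment here; breaking the same results down by segment is what points to where improvement is most needed.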

Truly Exceptional Product Quality Can't Be Scaled by Process; It Requires a Benevolent Dictator

Frameworks for quality can only get you so far. The final, intangible layer of product greatness seen at companies like Apple or Airbnb comes from a single leader with impeccable taste (like Steve Jobs or Brian Chesky) who personally reviews everything and enforces a singular quality bar.

Figma is not the source of truth | Ryan Lucas (VP of Design, Rippling)

In Depth·6 months ago

Scale Organizational Taste by Making the Judgment and Assumptions Behind Decisions Widely Understood

Instead of relying on a few tastemakers, you can scale taste across an organization. By being transparent about the thought process, judgment calls, and assumptions behind key decisions, more employees can internalize and apply that same framework themselves.

How to build a beloved tech brand | Sheila Joglekar Vashee (CMO, Figma)

In Depth·2 months ago

For AI Products, a PM's Job Shifts From Writing Specs to Grading Outputs

Building non-deterministic AI products fundamentally changes the PM role. Instead of creating detailed, rigid specifications, the PM's primary task becomes defining and codifying "what good looks like." This is done by repeatedly grading AI outputs to train evaluation systems and guide the model's behavior.

Shopify VP of Product on Transforming SaaS to AI-Native and Building $100B+ Agent-Led Commerce | Vanessa Lee | E288

The Product Podcast·4 months ago

Robinhood Scales Design by Promoting Top Practitioners, Not Professional Managers

Robinhood CEO Vlad Tenev says the company maintains design quality by placing its best craftspeople in leadership roles rather than people who are merely good managers. This ensures that leaders have trusted taste and keeps the focus on high-quality work, even during meetings.

THE 2025 TWISTY AWARDS! Biggest Trends, Best Guests, Top Name Drops, and more | E2229

This Week in Startups·7 months ago

Replace Qualitative PRDs with Quantifiable 'Evals' to Guide AI Product Development

Evals transform product specs from ambiguous documents into testable, measurable criteria. This gives product managers more leverage and provides clear targets for engineers, improving alignment and the quality of the final product.

Evals are the new PRD. Here is the playbook with the CEO of the leader in the space (Ankur Goyal, Founder and CEO, Braintrust)

The Growth Podcast·4 months ago
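
One way to picture "evals as testable, measurable criteria" is a spec written as named checks plus a pass threshold. This is a sketch under assumed requirements: the criteria, the example outputs, and the 0.9 threshold are hypothetical and are not taken from Braintrust or the episode.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Criterion:
    name: str
    check: Callable[[str], bool]  # returns True when the output satisfies the criterion


# Hypothetical requirements for a support-bot reply, written as checks instead of prose.
CRITERIA = [
    Criterion("mentions_refund_window", lambda out: "30 days" in out),
    Criterion("under_50_words", lambda out: len(out.split()) <= 50),
    Criterion("at_most_one_apology", lambda out: out.lower().count("sorry") <= 1),
]


def score(output: str) -> float:
    """Fraction of criteria the output satisfies."""
    return sum(c.check(output) for c in CRITERIA) / len(CRITERIA)


def meets_spec(outputs: list[str], threshold: float = 0.9) -> bool:
    """The 'spec' passes when the average score reaches the threshold."""
    average = sum(score(o) for o in outputs) / len(outputs)
    return average >= threshold


good = "You can return items within 30 days."
bad = "Sorry, sorry, we cannot help."
print(score(good), score(bad), meets_spec([good]), meets_spec([good, bad]))
```

The threshold gives engineers a concrete target, and each named criterion documents one requirement that a prose PRD would otherwise leave open to interpretation.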

Shopify Empowers Designers to Judge Work by Taste, Not Just Business Metrics

Shopify's design leadership is pushing back against justifying design solely through business metrics, arguing that doing so signals a lack of confidence in craft. They foster a culture where the primary measure of success is the team's own high bar for taste, trusting that this will ultimately drive long-term value.

Carl Rivera - Shopify’s big bet on design and craft as the differentiator

Dive Club 🤿·10 months ago

Nurture a Collective Taste Profile Through Internal, Self-Regulating Design Critiques

Teams can cultivate a shared sense of taste by encouraging constant and rigorous critique of both internal and external work. This process allows the team to self-regulate, learn from each other, and elevate their collective craft without top-down mandates.

Sara Vienna - Taste, Meaning, and How to Stand Out in an AI world

Dive Club 🤿·a year ago

Teach Creative Taste by Systematically Contrasting Good, Mediocre, and Subpar Work

Developing a team's creative taste isn't abstract. It's a trainable skill built by establishing a ritual of reviewing great, average, and poor creative examples side-by-side. This process of comparison and discussion calibrates the entire team on what quality looks like.

The Human Advantage in an AI World with Emma Robinson, Head of B2B Marketing at Canva | Ep. 394

The Marketing Millennials·5 months ago

AI Evals Are the Modern, Quantifiable Product Requirements Document

Evals shift product development from defining the 'how' to defining the 'what'. By creating quantifiable tests and success criteria, evals act like a modern PRD. This allows an AI model to creatively figure out the implementation while the team focuses on defining the desired outcome through concrete examples.

How Braintrust uses AI agents, evals, and CI to ship better software | Ankur Goyal

How I AI·2 months ago
