Pigford developed a custom AI skill that acts as an adversarial check on the AI's own code. It starts from the premise that the AI "almost certainly screwed some stuff up," forcing the model to re-evaluate and correct its work before human review. This self-review pass consistently finds bugs.
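
As a rough illustration of that idea, here is a minimal sketch of an adversarial self-review pass. Only the quoted premise comes from the insight above. The rest of the prompt wording, the function names, and the `ask_model` callable (a stand-in for whatever model client is in use) are assumptions, not Pigford's actual skill.

```python
from typing import Callable

# Hypothetical prompt wording; only the quoted premise is from the source.
ADVERSARIAL_PREMISE = (
    "You almost certainly screwed some stuff up. "
    "Re-read the code below as a skeptical reviewer, list every bug you find, "
    "then return a corrected version."
)


def build_self_review_prompt(code: str) -> str:
    """Wrap the generated code in the adversarial premise."""
    return f"{ADVERSARIAL_PREMISE}\n\n--- CODE UNDER REVIEW ---\n{code}"


def self_review(code: str, ask_model: Callable[[str], str]) -> str:
    """Send the code back to the model for a second, adversarial pass.

    `ask_model` is a placeholder: any function that takes a prompt string
    and returns the model's reply.
    """
    return ask_model(build_self_review_prompt(code))
```

Passing the model in as a plain callable keeps the sketch independent of any particular vendor SDK.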

Related Insights

Anthropic's Claude Code team reports that AI agent skills designed for "verification" (teaching an agent to test and validate its own output) provide an extremely high return on investment. This suggests that building reliability and correctness into AI workflows is as critical as, if not more critical than, the initial generation capability.

Instead of relying on a single AI model, Josh Pigford's workflow uses Opus for initial code generation and then runs a review pass with a different powerful model like GPT. This adversarial, multi-model process consistently uncovers 3-5 bugs that the primary model overlooks.

To overcome the challenge of reviewing AI-generated code, have different LLMs like Claude and Codex review the code. Then, use a "peer review" prompt that forces the primary LLM to defend its choices or fix the issues raised by its "peers." This adversarial process catches more bugs and improves overall code quality.
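
The two steps above (independent reviews, then a "peer review" prompt) could be wired together roughly as follows. This is a sketch under stated assumptions: each model is represented by a hypothetical callable that maps a prompt to text, and the prompt wording is illustrative rather than taken from the source.

```python
from typing import Callable, Dict

Model = Callable[[str], str]  # hypothetical stand-in: prompt in, reply out


def collect_reviews(code: str, reviewers: Dict[str, Model]) -> Dict[str, str]:
    """Ask each reviewer model for an independent critique of the code."""
    prompt = f"Review this code and list concrete bugs:\n{code}"
    return {name: ask(prompt) for name, ask in reviewers.items()}


def build_peer_review_prompt(code: str, reviews: Dict[str, str]) -> str:
    """Combine the peer critiques into one prompt for the primary model."""
    sections = [f"[{name}]\n{text}" for name, text in reviews.items()]
    return (
        "Your peers reviewed your code. For each issue raised, either defend "
        "your choice with a concrete reason or fix it.\n\n"
        + "\n\n".join(sections)
        + f"\n\n--- YOUR CODE ---\n{code}"
    )


def peer_review(code: str, primary: Model, reviewers: Dict[str, Model]) -> str:
    """Run the review pass, then make the primary model respond to it."""
    reviews = collect_reviews(code, reviewers)
    return primary(build_peer_review_prompt(code, reviews))
```

Labeling each critique with its reviewer's name lets the primary model address the objections one by one.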

Move beyond using AI as an assistant and program it to be a critical sparring partner. Pendo's Field CPO had his AI analyze his codebase and brutally call him out for building a system for himself, not for others, forcing a strategic realignment.

To improve the quality and accuracy of an AI agent's output, spawn multiple sub-agents with competing or adversarial roles. For example, a code review agent finds bugs, while several "auditor" agents check for false positives, resulting in a more reliable final analysis.
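
One way to picture the reviewer-plus-auditors setup is as a voting filter: a finding survives only if enough auditors confirm it. The sketch below assumes hypothetical `reviewer` and `auditors` callables and a `min_votes` threshold. None of these names come from the source.

```python
from typing import Callable, List

Finding = str
Reviewer = Callable[[str], List[Finding]]   # code -> candidate bugs
Auditor = Callable[[str, Finding], bool]    # True means "this is a real issue"


def audited_findings(
    code: str,
    reviewer: Reviewer,
    auditors: List[Auditor],
    min_votes: int,
) -> List[Finding]:
    """Keep only the findings that at least `min_votes` auditors confirm."""
    confirmed = []
    for finding in reviewer(code):
        votes = sum(1 for audit in auditors if audit(code, finding))
        if votes >= min_votes:
            confirmed.append(finding)
    return confirmed
```

Raising `min_votes` trades recall for precision: fewer false positives get through, at the risk of dropping a real bug that only one auditor recognized.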

An effective method for refining AI output is to instruct the model to adopt an expert persona, such as a "PhD economist," and critically evaluate its own work. This often leads the model to self-identify and correct its own flaws without further prompting.

Pigford built a meta-skill that reviews each development session, including conversations where he repeatedly corrected the AI. It then distills these corrections into a central project document, effectively teaching the AI agent not to make the same mistakes in future sessions.
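
The mechanics of such a meta-skill might look like the sketch below. In the described workflow the AI itself distills the corrections. Here a crude keyword heuristic stands in for that step, purely as an assumption. The transcript format, marker list, and function names are all hypothetical.

```python
from typing import Iterable, List, Tuple

Turn = Tuple[str, str]  # (speaker, text); hypothetical transcript format

# Stand-in heuristic for "the user corrected the AI"; a real skill would
# more likely ask a model to identify and summarize the corrections.
CORRECTION_MARKERS = ("no,", "don't", "do not", "instead")


def extract_corrections(transcript: Iterable[Turn]) -> List[str]:
    """Collect user turns that look like corrections."""
    corrections = []
    for speaker, text in transcript:
        if speaker == "user" and any(m in text.lower() for m in CORRECTION_MARKERS):
            corrections.append(text.strip())
    return corrections


def merge_into_project_doc(doc: str, corrections: Iterable[str]) -> str:
    """Append new corrections as bullets, skipping ones already recorded."""
    lines = doc.splitlines()
    seen = {line[2:] for line in lines if line.startswith("- ")}
    for correction in corrections:
        if correction not in seen:
            lines.append(f"- {correction}")
            seen.add(correction)
    return "\n".join(lines) + "\n"
```

Because the merge skips entries that are already present, running it after every session keeps the project document from accumulating duplicates.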

A powerful technique for creating robust software plans is to use AI as an adversarial partner. After drafting a specification, prompt an AI to "tear it apart" by identifying underspecified or inconsistent points. Iterate on this process until the AI's feedback becomes niche, indicating a solid spec.
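
The iterate-until-niche loop can be sketched as follows. It assumes the critic labels each issue with a severity and that anything other than `"major"` counts as niche. That labeling scheme, the `critique` and `revise` callables, and the round limit are illustrative assumptions.

```python
from typing import Callable, List, Tuple

Issue = Tuple[str, str]                     # (severity, description)
Critic = Callable[[str], List[Issue]]       # spec -> issues found
Reviser = Callable[[str, List[str]], str]   # (spec, major issues) -> new spec


def harden_spec(
    spec: str,
    critique: Critic,
    revise: Reviser,
    max_rounds: int = 5,
) -> Tuple[str, int]:
    """Revise the spec until the critic raises no major issues.

    Returns the final spec and the number of revisions made. Stops after
    `max_rounds` critique passes even if major issues remain.
    """
    revisions = 0
    for _ in range(max_rounds):
        major = [text for severity, text in critique(spec) if severity == "major"]
        if not major:
            break  # only niche feedback is left
        spec = revise(spec, major)
        revisions += 1
    return spec, revisions
```

The round limit guards against a critic that never runs out of objections.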

The most valuable part of an AI agent skill is a 'gotcha' section. This is where you explicitly instruct the model on its typical failure patterns and wrong assumptions for a given task, preventing common errors before they happen.
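
As a small illustration, a skill document with a dedicated gotcha section could be assembled like this. The Markdown layout, the `## Gotchas` heading, and the function name are assumptions made for the example, not a documented skill format.

```python
from typing import List


def render_skill(name: str, instructions: str, gotchas: List[str]) -> str:
    """Render a skill document that ends with an explicit gotcha list.

    The layout here is hypothetical; adapt it to whatever format the
    agent framework in use expects.
    """
    lines = [f"# {name}", "", instructions.strip(), "", "## Gotchas", ""]
    lines += [f"- {gotcha}" for gotcha in gotchas]
    return "\n".join(lines) + "\n"
```

Each gotcha should name a specific failure pattern or wrong assumption the model tends to make on this task, so the warning is read before the mistake happens.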

An Intercom AI skill for fixing flaky tests goes beyond a simple script. It updates its own internal checklist when it encounters a new type of fix and then proactively searches the codebase for similar problems, creating a 100x impact.
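
The learn-then-search behavior might be approximated as below. This is not Intercom's implementation. It is a sketch that assumes each known fix can be described by a regular expression and that the codebase is available as an in-memory mapping of path to file contents. The class and method names are hypothetical.

```python
import re
from typing import Dict, List, Tuple


class FlakyTestChecklist:
    """A checklist of known flaky-test patterns that grows as fixes are made."""

    def __init__(self) -> None:
        self.patterns: Dict[str, re.Pattern] = {}

    def learn(self, name: str, regex: str) -> bool:
        """Record a pattern; return True if it was not on the checklist yet."""
        is_new = name not in self.patterns
        self.patterns[name] = re.compile(regex)
        return is_new

    def scan(self, files: Dict[str, str]) -> List[Tuple[str, int, str]]:
        """Return (path, line number, pattern name) for every match."""
        hits = []
        for path, text in files.items():
            for lineno, line in enumerate(text.splitlines(), start=1):
                for name, pattern in self.patterns.items():
                    if pattern.search(line):
                        hits.append((path, lineno, name))
        return hits
```

A typical flow under these assumptions: after fixing one test, call `learn` with a pattern for the underlying problem, and if it returns `True`, run `scan` over the rest of the test suite to find other places with the same issue.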