Instead of writing lengthy Product Requirement Documents, PMs at OpenAI build functional prototypes directly in Codex. Each prototype is paired with a short "companion doc" or FAQ, making the product itself, not the document, the centerpiece for discussion and alignment.
An international growth PM at OpenAI built Codex automations to manage his core daily workflow. This includes a daily Slack inbox triage that summarizes unread and unanswered messages from key people, as well as a generator for weekly stakeholder updates.
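The selection step of such a triage can be sketched in a few lines. This is only an illustration, not the PM's actual automation: the `Message` shape, the `triage` function, and the sample inbox are all hypothetical, no real Slack API is called, and the sketch assumes "unread and unanswered" means a message qualifies if it is either unread or still unanswered.

```python
from dataclasses import dataclass


@dataclass
class Message:
    # Hypothetical message record; a real automation would fill this from Slack.
    sender: str
    text: str
    unread: bool
    answered: bool


def triage(messages, key_people):
    """Group messages from key people that are unread or unanswered, by sender."""
    by_sender = {}
    for m in messages:
        if m.sender in key_people and (m.unread or not m.answered):
            by_sender.setdefault(m.sender, []).append(m.text)
    return by_sender


# Made-up sample data for illustration only.
inbox = [
    Message("priya", "Can you review the launch plan?", unread=True, answered=False),
    Message("sam", "Lunch?", unread=True, answered=False),
    Message("priya", "Thanks!", unread=False, answered=True),
    Message("lee", "Need the Q3 numbers", unread=False, answered=False),
]
key_people = {"priya", "lee"}

digest = triage(inbox, key_people)
```

In the workflow described, the grouped messages would then be handed to the model to summarize; the sketch covers only the filtering.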
At companies like OpenAI, the "currency of progress" with research teams is "evals" (evaluations). To get researchers excited about solving a specific problem, a PM must be able to frame it as a measurable eval with a clear rubric, test scenarios, and a target state.
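One way to picture the three ingredients (rubric, test scenarios, target state) is as a small data structure. The sketch below is an assumption-laden illustration, not OpenAI's eval format: the class names, the `target_pass_rate` field, the sample rubric, and the `stub_model` function are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Scenario:
    # A single test case: the input the model is given.
    name: str
    prompt: str


@dataclass
class RubricItem:
    # One pass/fail criterion applied to a model response.
    description: str
    check: Callable[[str], bool]


@dataclass
class Eval:
    name: str
    scenarios: list[Scenario]
    rubric: list[RubricItem]
    target_pass_rate: float  # the "target state", as a fraction of scenarios

    def score(self, respond: Callable[[str], str]) -> float:
        """Fraction of scenarios whose response satisfies every rubric item."""
        if not self.scenarios:
            raise ValueError("an eval needs at least one scenario")
        passed = 0
        for scenario in self.scenarios:
            output = respond(scenario.prompt)
            if all(item.check(output) for item in self.rubric):
                passed += 1
        return passed / len(self.scenarios)

    def meets_target(self, respond: Callable[[str], str]) -> bool:
        return self.score(respond) >= self.target_pass_rate


def stub_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model call.
    if "refund" in prompt.lower():
        return "We can offer a refund."
    return "Please contact support."


refund_eval = Eval(
    name="refund-handling",
    scenarios=[
        Scenario("explicit", "I want a refund for my order."),
        Scenario("implicit", "Where is my money?"),
    ],
    rubric=[
        RubricItem("mentions a refund", lambda out: "refund" in out.lower()),
        RubricItem("stays concise", lambda out: len(out) < 200),
    ],
    target_pass_rate=0.9,
)

current_score = refund_eval.score(stub_model)
```

Framed this way, the gap between `current_score` and `target_pass_rate` is the measurable problem a research team can work against.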
Product management at OpenAI is defined by ambiguity because the full capabilities and emergent behaviors of the next model are unknown even to the team building it. This requires PMs to maintain extremely flexible roadmaps that can adapt quickly as research breakthroughs occur.
The key value of Codex for a growth PM at OpenAI wasn't just viewing a single dashboard, but building a unified web app that pulls from multiple scattered sources (Databricks, Tableau). The app pairs the synthesized data with a TL;DR summary, which reduces cognitive overload.
Rather than each using Codex on their own, engineers on OpenAI's growth team created a shared, reusable Codex "skill" for the entire experiment review process. When pointed at a Statsig experiment, the skill writes hypotheses, monitors progress, and generates a post-mortem with recommendations.
The significant recent advance in AI agent capabilities comes from the "harness": the system of connectors and tools that allows the core model to interact with external data sources like Slack and Databricks. This infrastructure is seen as a key differentiator, more so than the model itself.
OpenAI realized that "knowledge workers" are a minority in high-growth markets like India (<10% of workers). To scale internationally, they focused on features with universal appeal, such as Search and Image Generation, which resonate beyond text-heavy professional use cases.
A key tactic for using Codex on an existing codebase is asking an engineer, "What's the most similar thing we have done to this?" The PM then points Codex to that specific part of the repo as a reference, which helps it build on existing patterns instead of navigating the entire codebase from scratch.
At OpenAI, PMs who aren't strong coders use Codex to build features 70-80% of the way to completion, especially when engineering has no bandwidth. This transforms the PM role from spec-writer to builder: the PM delivers functional prototypes instead of just documents.
To test Codex's capabilities, Abhi Muchhal built a web app that could ingest tax documents and output a completed 1040 form. When he compared its output to his professional accountant's work, Codex's version was more accurate, identifying a forgotten income source.
