A key new feature in the Opus 4.6 API is "Adaptive Thinking," which lets developers specify the level of effort the model applies to a task. Setting the effort to 'max' forces the model to think without constraints on depth, a powerful but resource-intensive option exclusive to the new version.
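To make the idea concrete, here is a minimal sketch of how a request with an effort setting might be assembled. The field names (`model`, `effort`) and the model identifier are assumptions based on the description above, not confirmed API details; check the official reference before relying on them.

```python
# Hypothetical sketch of an Opus 4.6 request using the "Adaptive Thinking"
# effort setting. Field names and the model identifier are assumptions.

def build_request(prompt: str, effort: str = "medium") -> dict:
    """Assemble a chat request; effort tiers here are assumed, with
    'max' removing constraints on thinking depth per the text above."""
    allowed = {"low", "medium", "high", "max"}
    if effort not in allowed:
        raise ValueError(f"effort must be one of {sorted(allowed)}")
    return {
        "model": "claude-opus-4-6",   # assumed identifier
        "effort": effort,             # 'max' = unbounded thinking depth
        "messages": [{"role": "user", "content": prompt}],
    }

request = build_request("Refactor the billing module", effort="max")
```

Because `max` is resource-intensive, a wrapper like this is a natural place to gate the highest tier behind an explicit opt-in.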
The latest models from Anthropic (Opus 4.6) and OpenAI (Codex 5.3) represent two distinct engineering methodologies. Opus is an autonomous agent you delegate to, while Codex is an interactive collaborator you pair-program with. Choosing a model is now a workflow decision, not just a performance one.
Unlike previous models, which frequently failed partway through long tasks, Opus 4.5 allows for a fluid, uninterrupted coding process. The AI can build complex applications from a simple prompt and autonomously fix its own errors, representing a significant leap in capability and reliability for developers.
Including the special keyword "ultra think" in a prompt signals Claude Code to engage in a more intensive thought process. This undocumented feature can significantly improve the quality of output for difficult bugs or complex tasks without noticeably impacting token usage.
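Since the behavior is undocumented, the simplest way to experiment is a small helper that prepends the trigger phrase. This is a toy sketch, not official guidance; the keyword's effect may change or disappear.

```python
# Toy helper that prepends the reported "ultra think" trigger to a
# Claude Code prompt. The keyword is undocumented; treat this as an
# experiment, not a guaranteed behavior.

def with_ultra_think(prompt: str) -> str:
    """Return the prompt prefixed with the reported thinking trigger."""
    return f"ultra think: {prompt}"

print(with_ultra_think("Track down the race condition in worker.py"))
```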
The new multi-agent architecture in Opus 4.6, while powerful, dramatically increases token consumption. Each agent runs its own process, multiplying token usage for a single prompt. This is a savvy business strategy, as the model's most advanced feature is also its most lucrative for Anthropic.
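A back-of-the-envelope model makes the cost multiplication visible. The agent count and per-agent token figures below are illustrative placeholders, not measured numbers; only the multiplicative structure comes from the paragraph above.

```python
# Rough estimate of how a multi-agent run multiplies token spend:
# each agent consumes its own tokens, plus optional coordination overhead.
# All numeric inputs here are illustrative, not measured.

def multi_agent_tokens(per_agent_tokens: int,
                       num_agents: int,
                       coordinator_overhead: int = 0) -> int:
    """Total tokens when each agent runs its own process."""
    return per_agent_tokens * num_agents + coordinator_overhead

single = multi_agent_tokens(20_000, 1)          # a solo run
team = multi_agent_tokens(20_000, 5, 10_000)    # five agents plus coordination
print(f"team run costs {team / single:.1f}x a solo run")
```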
Many developers fail to access key new features like "Agent Teams" in Anthropic's Opus 4.6, and the cause is often a simple configuration oversight: experimental features must be enabled manually in your settings.json file, and your packages must be up to date, before the model's full capabilities are available.
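As a rough illustration, an experimental-features toggle in settings.json might look like the following. The key names here are guesses for illustration only, since the text does not specify them; check the tool's documentation for the actual flags.

```json
{
  "experimental": {
    "agentTeams": true
  }
}
```

After editing the file, updating the relevant packages (for example, the CLI itself) completes the checklist the paragraph describes.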
Beyond standard benchmarks, Anthropic fine-tunes its models based on their "eagerness." An AI can be "too eager," over-delivering and making unwanted changes, or "too lazy," requiring constant prodding. Finding the right balance is a critical, non-obvious aspect of creating a useful and steerable AI assistant.
The differing capabilities of new AI models align with distinct engineering roles. Anthropic's Opus 4.6 acts like a thoughtful "staff engineer," excelling at code comprehension and architectural refactors. In contrast, OpenAI's Codex 5.3 is the scrappy "founding engineer," optimized for rapid, end-to-end application generation.
The traditional lever of `temperature` for controlling model creativity has been superseded in modern reasoning models, where it's often fixed. The new critical parameter is the "thinking budget"—the amount of reasoning tokens a model can use before responding. A larger budget allows for more internal review and higher-quality outputs.
Effective prompting requires adapting your language to the AI's core design. For Anthropic's agent-based Opus 4.6, the optimal prompt is to "create an agent team" with defined roles. For OpenAI's monolithic Codex 5.3, the equivalent prompt is to instruct it to "think deeply" about those same roles itself.
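The split above can be encoded in a small dispatch helper. The phrasing of both prompt templates is a paraphrase of the text, and the model-name prefixes are illustrative, not official identifiers.

```python
# Illustrative dispatch for the prompting split described above:
# agent-based models get an explicit team, monolithic models get a
# deep-thinking instruction over the same roles. Templates are paraphrases.

def team_prompt(task: str, roles: list[str], model: str) -> str:
    role_list = ", ".join(roles)
    if model.startswith("opus"):
        # agent-based: spell out the team structure
        return f"Create an agent team with these roles: {role_list}. Task: {task}"
    # monolithic: fold the roles into one deep-thinking instruction
    return f"Think deeply about the perspectives of {role_list}, then: {task}"

print(team_prompt("build a REST API", ["architect", "tester"], "opus-4.6"))
print(team_prompt("build a REST API", ["architect", "tester"], "codex-5.3"))
```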
In a head-to-head test to build a Polymarket clone, Anthropic's Opus 4.6 produced a visually polished, feature-rich app. OpenAI's Codex 5.3 was faster but delivered a basic MVP that required multiple design revisions. The multi-agent "research first" approach of Opus resulted in a superior initial product.