Program equilibrium isn't just an abstract concept; it serves as a direct model for how autonomous AI systems could interact. It also provides a powerful analogy for human institutions like governments, where laws and constitutions act as a transparent "source code" governing their behavior.

Related Insights

In multi-agent simulations, agents that use a shared source of randomness can achieve stable equilibria. If they use private randomness, coordinating punishment becomes nearly impossible, because one agent cannot verify whether another's defection was malicious or a justified punishment of a third party.
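A minimal sketch of the shared-versus-private randomness point, assuming a toy setup in which each agent decides whether a given round is a "punishment round" by sampling from a seeded generator (the round count, probability, and seed values are illustrative, not from the source):

```python
import random

ROUNDS = 1000
PUNISH_PROB = 0.2  # illustrative chance that any given round calls for punishment

def punishment_schedule(seed):
    """Decide, round by round, whether to punish, driven by the given seed."""
    rng = random.Random(seed)
    return [rng.random() < PUNISH_PROB for _ in range(ROUNDS)]

# Shared randomness: both agents read the same public signal,
# so their punishment rounds line up perfectly.
agent_a = punishment_schedule(seed=42)
agent_b = punishment_schedule(seed=42)
shared_agreement = sum(x == y for x, y in zip(agent_a, agent_b)) / ROUNDS

# Private randomness: each agent draws its own signal, so one agent's
# "justified" punishment looks like an unprovoked defection to the other.
agent_a = punishment_schedule(seed=1)
agent_b = punishment_schedule(seed=2)
private_agreement = sum(x == y for x, y in zip(agent_a, agent_b)) / ROUNDS

print(f"agreement with a shared signal:  {shared_agreement:.2f}")   # 1.00
print(f"agreement with private signals:  {private_agreement:.2f}")  # roughly 0.68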

Early program equilibrium strategies relied on checking whether an opponent's source code was identical to their own. This approach is extremely fragile: trivial changes like an extra space or a different variable name break cooperation, making it impractical for real-world applications.
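A toy illustration of that brittleness, using plain string equality as a stand-in for "identical source code" (the program text here is invented for the example):

```python
def clique_play(my_src: str, opp_src: str) -> str:
    """Cooperate only if the opponent's source is byte-for-byte identical."""
    return "C" if opp_src == my_src else "D"

ORIGINAL = "def play(opp): return 'C' if opp == ME else 'D'"
# Functionally identical, but with a single trailing space added.
VARIANT  = "def play(opp): return 'C' if opp == ME else 'D' "

print(clique_play(ORIGINAL, ORIGINAL))  # 'C'  exact match, cooperation
print(clique_play(ORIGINAL, VARIANT))   # 'D'  one extra space breaks it
```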

In program equilibrium, players submit computer programs instead of actions. Each program can read the other's source code, which lets it verify cooperative intent and sustain outcomes, such as mutual cooperation in the Prisoner's Dilemma, that standard game theory says rational players cannot reach in a one-shot game.
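A compact sketch of that setup, assuming a standard Prisoner's Dilemma payoff table and a deliberately naive strategy that scans the opponent's source for a cooperation marker (all names and numbers are illustrative; real strategies use proofs or simulation, as discussed below):

```python
import inspect

# Standard one-shot Prisoner's Dilemma payoffs: (row player, column player).
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def marker_bot(my_src: str, opp_src: str) -> str:
    # COOPERATIVE-INTENT  <- marker that other programs can scan for
    # Naive check: cooperate only if the opponent's code carries the marker.
    return "C" if "COOPERATIVE-INTENT" in opp_src else "D"

def defect_bot(my_src: str, opp_src: str) -> str:
    return "D"

def play(prog_a, prog_b):
    """Run the program game: each program sees both source texts."""
    src_a, src_b = inspect.getsource(prog_a), inspect.getsource(prog_b)
    return prog_a(src_a, src_b), prog_b(src_b, src_a)

moves = play(marker_bot, marker_bot)
print(moves, PAYOFFS[moves])         # ('C', 'C') (3, 3)
print(play(marker_bot, defect_bot))  # ('D', 'D')  no marker, mutual defection
```

The marker check is of course gameable (a defector could simply include the marker), which is exactly why the proof-based and simulation-based strategies described in the later insights matter.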

Dario Amodei suggests a novel approach to AI governance: a competitive ecosystem where different AI companies publish the "constitutions" or core principles guiding their models. This allows for public comparison and feedback, creating a market-like pressure for companies to adopt the best elements and improve their alignment strategies.

AI models are now participating in creating their own governing principles. Anthropic's Claude contributed to writing its own constitution, blurring the line between tool and creator and signaling a future where AI recursively defines its own operational and ethical boundaries.

Rather than static text, AI enables 'outcome-oriented' legislation: lawmakers could simulate a bill's effects before passing it and embed dynamic triggers that automatically enact policies based on real-time data, such as unemployment rates or tariff changes.
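Purely as illustration of what a machine-readable trigger clause might look like (the indicator name, threshold, and action are invented, not proposals from the source):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TriggerClause:
    """A legislative provision that activates automatically from live data."""
    indicator: str    # e.g. an unemployment-rate data series
    threshold: float  # level at which the clause takes effect
    action: str       # policy enacted once triggered

    def evaluate(self, latest_reading: float) -> Optional[str]:
        return self.action if latest_reading >= self.threshold else None

# Hypothetical clause: extend benefits if unemployment crosses 6%.
clause = TriggerClause(indicator="unemployment_rate",
                       threshold=6.0,
                       action="extend_unemployment_benefits")

print(clause.evaluate(5.1))  # None: below threshold, nothing happens
print(clause.evaluate(6.4))  # 'extend_unemployment_benefits'
```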

To overcome brittle code-matching, AIs can use formal logic to prove cooperative intent. This is enabled by Löb's Theorem, an obscure result from mathematical logic that allows a program to conclude "my opponent cooperates" without falling into an infinite loop of reasoning, creating a robust cooperative equilibrium.
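For reference, here is the theorem and the way a proof-based "FairBot" leans on it, in a standard presentation rather than anything quoted from the source (FB and T are the conventional names for the bot and its proof system):

```latex
% Löb's Theorem, where \Box P abbreviates "P is provable in the theory T":
\text{If } T \vdash (\Box P \rightarrow P), \text{ then } T \vdash P.

% A proof-based bot ("FairBot") cooperates exactly when it can prove
% that its opponent cooperates with it:
\mathrm{FB}(X) = C \iff T \vdash \text{``} X(\mathrm{FB}) = C \text{''}
```

Taking P to be the sentence "FB(FB) = C", the implication from "P is provable" to P itself is provable, because finding a proof of P is precisely what makes FairBot output C; Löb's Theorem then delivers a proof of P, so two FairBots cooperate without ever resolving an infinite regress of "I'll cooperate if I can prove you will."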

Despite relying on different mechanisms, advanced cooperative bots such as proof-based (Löbian) and simulation-based (epsilon-grounded) agents can successfully cooperate with one another. This suggests a potential for robust interoperability between independently designed rational agents, a positive sign for AI safety.
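A minimal sketch of the simulation-based side only (the proof-based side needs a theorem prover and is omitted); the grounding probability and move encoding are illustrative assumptions:

```python
import random

EPS = 0.05  # grounding probability: small chance of cooperating outright

def grounded_fairbot(opponent):
    """With probability EPS, cooperate unconditionally; this 'grounds' the
    otherwise-infinite chain of mutual simulation. Otherwise, simulate the
    opponent playing against this bot and copy whatever move it makes."""
    if random.random() < EPS:
        return "C"
    return opponent(grounded_fairbot)

def defect_bot(_opponent):
    return "D"

# Two grounded bots cooperate almost surely: the recursion stops the first
# time either bot's EPS coin comes up, and cooperation propagates back up.
print(grounded_fairbot(grounded_fairbot))  # 'C' (with probability 1)
# Against an unconditional defector it defects with probability 1 - EPS.
print(grounded_fairbot(defect_bot))        # 'D' almost always
```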

A key finding is that almost any outcome better than mutual punishment can be a stable equilibrium (a "folk theorem"). While this enables cooperation, it creates a massive coordination problem: with so many possible "good" outcomes, agents may fail to converge on the same one, leading to suboptimal results.
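One common way to state the folk theorem being referenced (exact conditions vary across formulations):

```latex
% F: the set of feasible payoff profiles of the base game.
% \underline{v}_i: player i's punishment (minimax) payoff.
\text{For every } v \in F \text{ with } v_i \ge \underline{v}_i \text{ for all players } i,
\text{ there exists a program equilibrium whose payoff profile is } v.
```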

A more likely AI future involves an ecosystem of specialized agents, each mastering a specific domain (e.g., physical vs. digital worlds), rather than a single, monolithic AGI that understands everything. These agents will require protocols to interact.
