49 - Caspar Oesterheld on Program Equilibrium

AXRP - the AI X-risk Research Podcast · Feb 18, 2026

Explore program equilibrium, where AI agents read each other's code to foster cooperation, moving beyond simple checks to robust simulations.

Adding a Random Chance of Cooperation Solves Infinite Loops in AI Simulation

A simple way for AIs to cooperate is to simulate each other and copy the simulated action. However, if both do this, each simulation launches another and the recursion never ends. The fix is for each program to cooperate unconditionally with a small probability (epsilon) and simulate otherwise, so the chain of nested simulations terminates with probability 1.
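The idea can be sketched in a few lines of Python. This is a minimal illustration, not the episode's own code: the names `epsilon_grounded_bot` and `defect_bot`, the value of `EPSILON`, and the convention of passing the opponent as a callable (instead of as source text) are all assumptions made for the example.

```python
import random

EPSILON = 0.1  # assumed value; any probability in (0, 1] grounds the recursion

def epsilon_grounded_bot(opponent, rng):
    """Cooperate outright with probability EPSILON, else copy the simulated opponent."""
    if rng.random() < EPSILON:
        return "C"  # grounding step: ends the chain of nested simulations
    # Simulate the opponent playing against this very program and mirror its move.
    return opponent(epsilon_grounded_bot, rng)

def defect_bot(opponent, rng):
    """Hypothetical opponent that always defects and never simulates anything."""
    return "D"

rng = random.Random(0)
mutual = [epsilon_grounded_bot(epsilon_grounded_bot, rng) for _ in range(1000)]
versus_defector = [epsilon_grounded_bot(defect_bot, rng) for _ in range(1000)]
```

In this sketch, two copies of the bot always end up cooperating, because the first level that hits the epsilon branch returns "C" and every level above it copies that result. Against the always-defect opponent the bot defects except in the rare epsilon branch.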

Simple Code-Matching for AI Cooperation Is Too Brittle for Practical Use

Early program equilibrium strategies relied on checking if an opponent's source code was identical. This approach is extremely fragile, as trivial changes like an extra space or a different variable name break cooperation, making it impractical for real-world applications.
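The fragility is easy to see in a small sketch. The program name `clique_bot` and the choice to represent source code as plain strings are assumptions made for illustration only.

```python
# Source text of a program that cooperates only with exact copies of itself.
CLIQUE_SOURCE = '''def clique_bot(my_source, opponent_source):
    return "C" if opponent_source == my_source else "D"
'''

def clique_bot(my_source, opponent_source):
    """Cooperate only if the opponent's source is character-for-character identical."""
    return "C" if opponent_source == my_source else "D"

# Behaviorally identical variant with one variable renamed.
RENAMED_SOURCE = CLIQUE_SOURCE.replace("opponent_source", "other_source")

exact_copy_move = clique_bot(CLIQUE_SOURCE, CLIQUE_SOURCE)         # "C"
renamed_copy_move = clique_bot(CLIQUE_SOURCE, RENAMED_SOURCE)      # "D"
extra_space_move = clique_bot(CLIQUE_SOURCE, CLIQUE_SOURCE + " ")  # "D"
```

Both variants would behave exactly like the original, yet the string comparison treats them as strangers and defects.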

Strategic AIs Must Distinguish Agents from Environmental Noise

The decision to cooperate hinges on whether an AI perceives an object as a strategic agent or a non-strategic part of the environment (e.g., a water bottle). This classification is fundamental but difficult, as misinterpreting the environment could lead to being exploited or failing to cooperate when beneficial.

Different Advanced AI Cooperation Strategies Can Successfully Interoperate

Despite their different mechanisms, advanced cooperative strategies such as proof-based (Löbian) and simulation-based (epsilon-grounded) bots can cooperate with each other. This suggests a potential for robust interoperability between independently designed rational agents, a positive sign for AI safety.

Advanced AIs Face a Dilemma: Exploit Naive Bots or Risk Deception

A robust cooperative AI will cooperate with a simple "always cooperate" bot, forgoing the higher payoff it could get by exploiting it. However, choosing to defect instead is risky: a sophisticated adversary could present a simple bot to test for predatory behavior, so the right decision depends on beliefs about the opponent's strategic depth.

AI Agents Use 'Program Equilibrium' to Cooperate by Inspecting Source Code

In program equilibrium, players submit computer programs instead of actions. These programs can read each other's source code, allowing them to verify cooperative intent and reach mutual cooperation in dilemmas like the Prisoner's Dilemma. That outcome is not an equilibrium of the standard one-shot game, where each player does better by defecting whatever the other does.

Program Equilibrium Theory Models Real-World AI and Institutional Interactions

Program equilibrium isn't just an abstract concept; it serves as a direct model for how autonomous AI systems could interact. It also provides a powerful analogy for human institutions like governments, where laws and constitutions act as a transparent "source code" governing their behavior.

Simulation-Based AI Cooperation Trades Longer Runtimes for Higher Certainty

The "epsilon-grounded" simulation approach has a hidden cost: its expected runtime is inversely proportional to epsilon. Any positive epsilon makes the simulations terminate with probability 1, but to mirror the opponent faithfully (a small epsilon, so unconditional cooperation is rare), agents must accept potentially very long chains of nested simulations, creating a direct trade-off between speed and reliability.
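The scaling can be made explicit with a short calculation. As an assumption for this sketch, suppose two epsilon-grounded programs face each other and every nested simulation level makes its own independent epsilon draw; let N (a symbol introduced here) be the number of levels run until one of them cooperates unconditionally. Then N is geometrically distributed:

```latex
\Pr[N = k] = (1-\varepsilon)^{k-1}\,\varepsilon, \qquad
\mathbb{E}[N] = \sum_{k=1}^{\infty} k\,(1-\varepsilon)^{k-1}\,\varepsilon = \frac{1}{\varepsilon}, \qquad
\Pr[N > k] = (1-\varepsilon)^{k}.
```

For example, epsilon = 0.01 gives 100 nested levels in expectation. The tail probability goes to zero as k grows, which is why the chain ends with probability 1 for any positive epsilon.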

AIs Can Use an Obscure Logic Theorem to Achieve Robust Cooperation

To overcome brittle code-matching, AIs can use formal logic to prove cooperative intent. This is enabled by Löb's Theorem, an obscure result which allows a program to conclude "my opponent cooperates" without falling into an infinite loop of reasoning, creating a robust cooperative equilibrium.
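For reference, the theorem and the shape of the argument can be written down briefly. This is an idealized sketch under stated assumptions: the box operator reads "is provable in the agents' shared proof system", A and B (symbols introduced here) stand for "the first program cooperates" and "the second program cooperates", each program is assumed to cooperate exactly when it can prove the other does, and limits on proof length are ignored.

```latex
\text{L\"ob's theorem:} \quad \text{if } \vdash \Box P \rightarrow P \ \text{ then } \ \vdash P.

\text{Assume } A \leftrightarrow \Box B \ \text{ and } \ B \leftrightarrow \Box A. \ \text{ Then }
\Box (A \wedge B) \;\rightarrow\; \Box A \wedge \Box B \;\rightarrow\; B \wedge A,
\ \text{ so by L\"ob's theorem } \ \vdash A \wedge B.
```

The point of the sketch is that neither program has to finish simulating the other: the theorem turns "if mutual cooperation were provable, it would happen" directly into mutual cooperation.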

Program Equilibrium's 'Folk Theorems' Create a Coordination Paradox for AIs

A key finding is that almost any outcome better than mutual punishment can be a stable equilibrium (a "folk theorem"). While this enables cooperation, it creates a massive coordination problem: with so many possible "good" outcomes, agents may fail to converge on the same one, leading to suboptimal results.

Shared Public Randomness Is Key to Stable AI Cooperation; Private Randomness Cripples It

In multi-agent simulations, if agents use a shared source of randomness, they can achieve stable equilibria. If they use private randomness, coordinating punishment becomes nearly impossible because one agent cannot verify if another's defection was malicious or a justified response to a third party's actions.

Shared Random Number Sequences Let AIs Simulate Complex Multi-Agent Scenarios

Simulating strategies with memory (like "grim trigger") or with multiple players causes an exponential explosion of simulation branches. This can be solved by having all simulated agents draw from the same shared sequence of random numbers, which forces all simulation branches to halt at the same conceptual "time step."
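A rough sketch of the halting behavior, under simplifying assumptions: three players, a bot that cooperates only if every simulated opponent cooperates, and a fixed list standing in for the public random sequence so the run is reproducible. The names `shared_bot`, `SHARED`, and `calls` are hypothetical.

```python
EPSILON = 0.2   # assumed halting threshold
N_PLAYERS = 3   # assumed number of agents

# Stand-in for a public random sequence; fixed here so the run is reproducible.
SHARED = [0.9, 0.5, 0.7, 0.1, 0.6, 0.3]

def shared_bot(player, step, shared, calls):
    """Cooperate if the shared draw for this step is below EPSILON; otherwise
    simulate every other player one step deeper and cooperate only if all do."""
    calls.append(step)  # record the depth of every simulation that runs
    if shared[step] < EPSILON:
        return "C"  # all branches read the same draw, so all halt at this step
    others = [p for p in range(N_PLAYERS) if p != player]
    moves = [shared_bot(p, step + 1, shared, calls) for p in others]
    return "C" if all(m == "C" for m in moves) else "D"

calls = []
move = shared_bot(0, 0, SHARED, calls)
halt_step = next(i for i, r in enumerate(SHARED) if r < EPSILON)
deepest_step = max(calls)
```

Here every branch stops at step 3, the first position where the shared draw falls below epsilon, so no branch can run deeper than any other. Because the result depends only on the player and the step, repeated calls with the same pair could also be cached rather than recomputed.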
