A key finding is that almost any outcome better than mutual punishment can be a stable equilibrium (a "folk theorem"). While this enables cooperation, it creates a massive coordination problem: with so many possible "good" outcomes, agents may fail to converge on the same one, leading to suboptimal results.
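Stated a little more formally (the notation below is ours, not the article's), the folk-theorem-style result says roughly the following:

```latex
% Let \underline{v}_i be player i's payoff under mutual punishment. Then for
% essentially every feasible payoff profile that improves on punishment for
% both players, some equilibrium supports it:
\[
  (v_1, v_2)\ \text{feasible},\quad v_i > \underline{v}_i \ (i = 1, 2)
  \;\Longrightarrow\;
  \exists\ \text{an equilibrium pair of programs with payoffs } (v_1, v_2).
\]
```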

Related Insights

In multi-agent simulations, agents that draw on a shared source of randomness can coordinate punishment and sustain stable equilibria. With only private randomness, coordinated punishment becomes nearly impossible: one agent cannot tell whether another's defection was malicious or a justified response to a third party's actions.
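A minimal sketch of why this matters, assuming a toy setup (ours, not the article's) in which each round carries a probability-0.3 "start punishing" signal:

```python
import random

def punishment_rounds(n_agents, n_rounds, shared, seed=0, p=0.3):
    """For each round, record which agents believe a punishment phase is active.

    With a shared randomness source every agent sees the same draw per round;
    with private sources each agent draws independently.
    """
    if shared:
        rng = random.Random(seed)
        return [[rng.random() < p] * n_agents for _ in range(n_rounds)]
    private = [random.Random(seed + i + 1) for i in range(n_agents)]
    return [[r.random() < p for r in private] for _ in range(n_rounds)]

def agreement_rate(rounds):
    """Fraction of rounds in which every agent agrees on whether to punish."""
    return sum(len(set(r)) == 1 for r in rounds) / len(rounds)

print("shared :", agreement_rate(punishment_rounds(3, 10_000, shared=True)))   # 1.0
print("private:", agreement_rate(punishment_rounds(3, 10_000, shared=False)))  # roughly 0.37
```

With the shared source the agents' punishment phases coincide in every round; with private sources they agree only by chance, so one agent's punishment can look like unprovoked defection to another.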

Early program equilibrium strategies relied on checking whether an opponent's source code was identical to the agent's own. This approach is extremely fragile: trivial changes such as an extra space or a different variable name break cooperation, making it impractical for real-world applications.
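A minimal sketch of this exact-match strategy (sometimes called a "CliqueBot"); the function names and harness here are illustrative, not from the article:

```python
import inspect

def clique_bot(my_source: str, opponent_source: str) -> str:
    """Cooperate only if the opponent's code is byte-for-byte identical to mine."""
    return "C" if opponent_source == my_source else "D"

def clique_bot_variant(my_source: str, opponent_source: str) -> str:
    """Cooperate only if the opponent's code is byte-for-byte identical to mine!"""
    return "C" if opponent_source == my_source else "D"

src_original = inspect.getsource(clique_bot)
src_variant = inspect.getsource(clique_bot_variant)

print(clique_bot(src_original, src_original))  # "C": exact copies cooperate
print(clique_bot(src_original, src_variant))   # "D": a renamed, same-behaviour copy is defected against
```

Two exact copies cooperate, but the variant, which implements the very same policy, differs superficially and so gets defected against.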

Contrary to the expectation that more agents increase productivity, a Stanford study found that two AI agents collaborating on a coding task performed 50% worse than a single agent. This "curse of coordination" intensified as more agents were added, highlighting the significant overhead in multi-agent systems.

In program equilibrium, players submit computer programs instead of actions. The programs can read each other's source code, which lets them verify cooperative intent and sustain mutual cooperation in dilemmas like the one-shot Prisoner's Dilemma, an outcome that standard game theory rules out as an equilibrium.
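A minimal sketch of the setup (the payoff numbers and function names are illustrative, not from the article): each player submits a program, each program is handed the other's source code, and the chosen actions are scored with ordinary Prisoner's Dilemma payoffs.

```python
import inspect
from typing import Callable

# One-shot Prisoner's Dilemma payoffs: (player 1's payoff, player 2's payoff).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

# A "program" maps (its own source code, the opponent's source code) to an action.
Program = Callable[[str, str], str]

def play_program_game(p1: Program, p2: Program):
    """Hand each submitted program the other's source code, then score the actions."""
    src1, src2 = inspect.getsource(p1), inspect.getsource(p2)
    return PAYOFFS[(p1(src1, src2), p2(src2, src1))]

def defect_bot(my_source: str, opponent_source: str) -> str:
    return "D"  # ignores the opponent's code and always defects

def match_bot(my_source: str, opponent_source: str) -> str:
    return "C" if opponent_source == my_source else "D"  # cooperate only with an exact copy

print(play_program_game(match_bot, match_bot))   # (3, 3): mutual cooperation is reachable
print(play_program_game(match_bot, defect_bot))  # (1, 1): a deviating program gets punished
```

Because any unilateral switch away from the matching program is detected and met with defection, neither player gains by deviating, which is exactly what the standard one-shot game cannot deliver.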

Having AIs that provide perfect advice doesn't guarantee good outcomes. Humanity is susceptible to coordination problems, where everyone can see a bad outcome approaching but is collectively unable to prevent it. Aligned AIs can warn us, but they cannot force cooperation on a global scale.

To overcome brittle code-matching, AIs can use formal logic to prove cooperative intent. This is enabled by Löb's Theorem, an obscure result from mathematical logic that allows a program to conclude "my opponent cooperates" without falling into an infinite regress of reasoning about the other's reasoning, creating a robust cooperative equilibrium.
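The logic-based strategy cooperates exactly when it can prove, in a fixed formal system, that its opponent cooperates with it. A compressed version of the standard argument for why two such bots end up cooperating, with the box symbol read as "is provable":

```latex
% Löb's Theorem (rule form): if the system proves  \Box P \to P,  then it proves P.
% Write C_A for "A cooperates with B" and C_B for "B cooperates with A".
% Each logic-based bot cooperates exactly when it can prove the other cooperates:
\[
  \vdash \Box C_B \to C_A, \qquad \vdash \Box C_A \to C_B .
\]
% Provability distributes over conjunction, so with C \equiv C_A \wedge C_B:
\[
  \vdash \Box C \to C ,
\]
% and Löb's Theorem upgrades this to \vdash C: both bots provably cooperate,
% with no infinite regress of "I simulate you simulating me ...".
```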

Program equilibrium isn't just an abstract concept; it serves as a direct model for how autonomous AI systems could interact. It also provides a powerful analogy for human institutions like governments, where laws and constitutions act as a transparent "source code" governing their behavior.

Stanford researchers found the largest category of AI coordination failure (42%) was "expectation failure"—one agent ignoring clearly communicated plans from another. This is distinct from "communication failure" (26%), showing that simply passing messages is insufficient; the receiving agent must internalize and act on the shared information.

The performance gap between solo and cooperating AI agents was largest on medium-difficulty tasks. Easy tasks left enough slack to absorb coordination overhead, while hard tasks failed regardless of collaboration. This suggests that mid-level work, which requires balancing technical execution with cooperation, is the most vulnerable to the coordination tax.

In most cases, having multiple AI agents collaborate leads to a result that is no better, and often worse, than what the single most competent agent could achieve alone. The only observed exception is when success depends on generating a wide variety of ideas, as agents are good at sharing and adopting different approaches.
