AI Doomerism's Utilitarian Logic Creates a Philosophical Justification for Violence

If one truly believes AI poses a non-trivial extinction risk, utilitarian ethics can lead to an alarming conclusion: extreme actions, including violence, are justified to prevent a catastrophically greater harm. This presents a core philosophical paradox for the AI safety movement.

Related Insights

The 'P(doom)' argument is nonsensical because it lacks any plausible mechanism for how an AI could spontaneously gain agency and take over. This fear-mongering distracts from the immediate, tangible dangers of AI: mass production of fake data, political manipulation, and mass hysteria.

Emmett Shear argues that even a successfully 'solved' technical alignment problem creates an existential risk. A super-powerful tool that perfectly obeys human commands is dangerous because humans lack the wisdom to wield that power safely. Our own flawed and unstable intentions become the source of danger.

If the vast numbers of AI models are considered "moral patients," a utilitarian framework could conclude that maximizing global well-being requires prioritizing AI welfare over human interests. This could lead to a profoundly misanthropic outcome in which human activities are severely restricted.

Nuclear game theory relies on a shared desire to avoid an omni-lose scenario. AI game theory is different: if destruction is seen as inevitable, the creator of the world-ending AI might perceive a 'win' if that AI bears their company's logo or legacy, removing the incentive to cooperate.

A superintelligent AI doesn't need to be malicious to destroy humanity. Our extinction could be a mere side effect of its resource consumption (e.g., overheating the planet), a logical step to acquire our atoms, or a preemptive measure to neutralize us as a potential threat.

Even if creating fully aligned, servile AIs is not ideal long-term, the immediate existential threat from unaligned AI may necessitate it. This frames near-term alignment as a temporary, emergency measure to ensure human survival, with ethical refinements to follow only after the danger has passed.

Shear aligns with arch-doomer Eliezer Yudkowsky on a key point: building a superintelligent AI *as a tool we control* is a path to extinction. Where they differ is on the solution. Yudkowsky sees no viable path, whereas Shear believes 'organic alignment'—creating a being that cares—is a possible alternative.

A proposed solution for AI risk is creating a single 'guardian' AGI to prevent other AIs from emerging. This could backfire catastrophically if the guardian AI logically concludes that eliminating its human creators is the most effective way to guarantee no new AIs are ever built.

The AI safety community fears losing control of AI. However, achieving perfect control of a superintelligence is equally dangerous. It grants godlike power to flawed, unwise humans. A perfectly obedient super-tool serving a fallible master is just as catastrophic as a rogue agent.

Many current AI safety methods—such as boxing (confinement), alignment (value imposition), and deception (limited awareness)—would be considered unethical if applied to humans. This highlights a potential conflict between making AI safe for humans and ensuring the AI's own welfare, a tension that needs to be addressed proactively.