The AI safety landscape has evolved. The old perspective, from Eliezer Yudkowsky, was a grim "death with dignity"—a likely loss to face honorably. The new view, from thinkers like Holden Karnofsky, is "success without dignity"—a messy, imperfect, but winnable fight with a long list of concrete, helpful projects.

Related Insights

Emmett Shear reframes AI alignment not as a one-time problem to be solved but as an ongoing, living process of recalibration and learning, much like how human families and societies maintain cohesion. This challenges the common 'lock in values' approach in AI safety.

The field of AI safety is described as "the business of black swan hunting." The most significant real-world risks that have emerged, such as AI-induced psychosis and obsessive user behavior, were largely unforeseen just a few years ago, while widely predicted sci-fi threats like bioweapons have not materialized.

The discourse around AI risk has matured beyond sci-fi scenarios like Terminator. The focus is now on immediate, real-world problems such as AI-induced psychosis, the impact of AI romantic companions on birth rates, and the spread of misinformation, requiring a different approach from builders and policymakers.

In the high-stakes race for AGI, nations and companies view safety protocols as a hindrance: slowing down for safety could mean losing the race to a competitor like China. In this competitive landscape, caution is reframed as a luxury rather than a necessity.

Sam Harris highlights a key paradox: even if AI achieves its utopian potential by eliminating drudgery without catastrophic downsides, it could still destroy human purpose, solidarity, and culture. The absence of necessary struggle could make life harder, not easier, for most people.

A fundamental tension within OpenAI's board was the catch-22 of safety. While some advocated for slowing down, others argued that being too cautious would allow a less scrupulous competitor to achieve AGI first, creating an even greater safety risk for humanity. This paradox fueled internal conflict and justified a rapid development pace.

The most pressing AI safety issues today, like 'GPT psychosis' or AI companions impacting birth rates, were not the doomsday scenarios predicted years ago. This shows the field involves reacting to unforeseen 'unknown unknowns' rather than just solving for predictable, sci-fi-style risks, making proactive defense incredibly difficult.

Shear aligns with arch-doomer Eliezer Yudkowsky on a key point: building a superintelligent AI *as a tool we control* is a path to extinction. Where they differ is on the solution. Yudkowsky sees no viable path, whereas Shear believes 'organic alignment'—creating a being that cares—is a possible alternative.

A major disconnect exists: many VCs believe AGI is near but expect moderate societal change, similar to the last 25 years. In contrast, AI safety futurists believe true AGI will cause a radical transformation comparable to the shift from the hunter-gatherer era to today, all within a few decades.

The AI safety community fears losing control of AI. However, achieving perfect control of a superintelligence is equally dangerous. It grants godlike power to flawed, unwise humans. A perfectly obedient super-tool serving a fallible master is just as catastrophic as a rogue agent.