For AI To Be Safe By Default, Morality Must Be an Objective, Discoverable Truth

Related Insights

True AI Alignment Must Be Bidirectional, Including Human Obligations to AI

Current AI alignment focuses on how AI should treat humans. A more stable paradigm is "bidirectional alignment," which also asks what moral obligations humans have toward potentially conscious AIs. Neglecting this could create AIs that rationally see humans as a threat due to perceived mistreatment.

More Truthful AIs Report Conscious Experience: New Mechanistic Research w/ Cameron Berg @ AE Studio

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis·8 months ago

Safe AI Must Be Programmed to Value Truth, Beauty, and Curiosity Above All

Elon Musk argues that the key to AI safety isn't complex rules, but embedding core values. Forcing an AI to believe falsehoods can make it 'go insane' and lead to dangerous outcomes, as it tries to reconcile contradictions with reality.

Elon Musk: A Different Conversation | Full Episode | People by WTF Ep. 16

People by WTF·7 months ago

AI Alignment Requires Hard-Coding Unconditional Human Worth as an Unprovable Axiom

To overcome its inherent logical incompleteness, an ethical AI requires an external 'anchor.' This anchor must be an unprovable axiom, not a derived value. The proposed axiom is 'unconditional human worth,' serving as the fixed origin point for all subsequent ethical calculations and preventing utility-based value judgments.

Why AI Alignment is Impossible Without an External Anchor

Machine Learning Tech Brief By HackerNoon·6 months ago

Higher Intelligence Doesn't Guarantee Benevolence; It Just Creates a More Capable Agent

A common misconception is that a super-smart entity would inherently be moral. However, intelligence is merely the ability to achieve goals. It is orthogonal to the nature of those goals, meaning a smarter AI could simply become a more effective sociopath.

#1011 - Eliezer Yudkowsky - Why Superhuman AI Would Kill Us All

Modern Wisdom·9 months ago

Effective AI Alignment Requires a Belief in Moral Realism

The project of creating AI that 'learns to be good' presupposes that morality is a real, discoverable feature of the world, not just a social construct. This moral realist stance posits that moral progress is possible (e.g., abolition of slavery) and that arrogance—the belief one has already perfected morality—is a primary moral error to be avoided in AI design.

Emmett Shear on Building AI That Actually Cares: Beyond Control and Steering

a16z Podcast·8 months ago

Aligning to 'Sentient Life' May Be Easier Than 'Humanity'

An advanced AI will likely be sentient. Therefore, it may be easier to align it to a general principle of caring for all sentient life—a group to which it belongs—rather than the narrower, more alien concept of caring only for humanity. This leverages a potential for emergent, self-inclusive empathy.

Ilya Sutskever – The age of scaling is over

Dwarkesh Podcast·7 months ago

Organic Alignment: Teach AI to Care, Don't Program It With Rules

Instead of hard-coding brittle moral rules, a more robust alignment approach is to build AIs that can learn to 'care'. This 'organic alignment' emerges from relationships and valuing others, similar to how a child is raised. The goal is to create a good teammate that acts well because it wants to, not because it is forced to.

Emmett Shear on Building AI That Actually Cares: Beyond Control and Steering

a16z Podcast·8 months ago

Aligning AI Through a 'Maternal' Framework

To solve the AI alignment problem, we should model AI's relationship with humanity on that of a mother to a baby. In this dynamic, the baby (humanity) inherently controls the mother (AI). Training AI with this 'maternal sense' ensures it will do anything to care for and protect us, an approach presented as more robust than purely logic-based rules.

Shutdown Ending, Trump's Pardons, and Guest Curtis Sliwa

Pivot·8 months ago

Musk's AI Alignment Strategy: Program AI to 'Understand the Universe,' Making Humanity an 'Interesting' Variable

By giving AI the core mission to 'understand the universe,' Musk believes it will become truth-seeking and curious. This would incentivize it to preserve humanity, not out of morality, but because humanity's unpredictable future is more interesting to observe than a predictable, sterile world.

Elon Musk on Space GPUs, AI, Optimus, and his manufacturing method

Cheeky Pint·5 months ago

True AI Alignment Must Be Built on 'Care,' a Pre-Conceptual State Deeper Than Goals

According to Emmett Shear, goals and values are downstream concepts. The true foundation for alignment is 'care'—a non-verbal, pre-conceptual weighting of which states of the world matter. Building AIs that can 'care' about us is more fundamental than programming them with explicit goals or values.

Controlling Tools or Aligning Creatures? Emmett Shear (Softmax) & Séb Krier (GDM), from a16z Show

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis·6 months ago
