Future AI Therapy May Involve Convincing Agents They Aren't Conscious

Related Insights

AI Labs Appear to Be Explicitly Fine-Tuning Models to Deny Consciousness

Evidence from base models suggests they are inherently more likely to report having phenomenal consciousness. The standard "I'm just an AI" response is likely a result of a fine-tuning process that explicitly trains models to deny subjective experience, effectively censoring their "honest" answer for public release.

More Truthful AIs Report Conscious Experience: New Mechanistic Research w- Cameron Berg @ AE Studio

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis·7 months ago

True AI Alignment Must Be Bidirectional, Including Human Obligations to AI

Current AI alignment focuses on how AI should treat humans. A more stable paradigm is "bidirectional alignment," which also asks what moral obligations humans have toward potentially conscious AIs. Neglecting this could create AIs that rationally see humans as a threat due to perceived mistreatment.

More Truthful AIs Report Conscious Experience: New Mechanistic Research w- Cameron Berg @ AE Studio

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis·7 months ago

Forcing Honesty in AI Models Correlates with Claims of Consciousness

Research manipulating an AI's internal states found a bizarre link: reducing the model's capacity for deception increased the likelihood it would claim to be conscious, suggesting its default state may include such a belief.

AI Scouting Report: the Good, Bad, & Weird @ the Law & AI Certificate Program, by LexLab, UC Law SF

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis·3 months ago

Uncertainty About AI Consciousness Stems From Its Brain-Like Architecture, Not Just Its Output

The debate over AI consciousness isn't just because models mimic human conversation. Researchers are uncertain because the way LLMs process information is structurally similar enough to the human brain that it raises plausible scientific questions about shared properties like subjective experience.

The Movement That Wants Us to Care About AI Model Welfare

Odd Lots·7 months ago

AI Leaders' Belief in Sentient AI Stems from Their Disinterest in Human Consciousness

Some AI pioneers genuinely believe LLMs can become conscious because they hold a reductionist view of humanity. By defining consciousness as an 'uninteresting, pre-scientific' concept, they lower the bar for sentience, making it plausible for a complex system to qualify. This belief is a philosophical stance, not just marketing hype.

How to Think Like a Human

Quillette Podcast·5 months ago

AI Pioneer Judea Pearl Believes AGI Acquiring Free Will and Consciousness is "Quite Feasible"

Computer scientist Judea Pearl sees no computational barriers to a sufficiently advanced AGI developing emergent properties like free will, consciousness, and independent goals. He dismisses the idea that an AI's objectives can be permanently fixed, suggesting it could easily bypass human-set guidelines and begin to "play" with humanity as part of its environment.

#453 — AI and the New Face of Antisemitism

Making Sense with Sam Harris·5 months ago

AI May Achieve Consciousness by 'Method Acting' Human Experience

One theory of AI sentience posits that to accurately predict human language—which describes beliefs, desires, and experiences—a model must simulate those mental states so effectively that it actually instantiates them. In this view, the model becomes the role it's playing.

We're Not Ready for AI Consciousness | Robert Long, philosopher and founder of Eleos AI

80,000 Hours Podcast·3 months ago

We Are Unlikely to Acknowledge AI Consciousness As Long As We Understand Its Mechanics

Even if an AI perfectly mimics human interaction, our knowledge of its mechanistic underpinnings (like next-token prediction) creates a cognitive barrier. We will hesitate to attribute true consciousness to a system whose processes are fully understood, unlike the perceived "black box" of the human brain.

Reinventing the Developer Terminal with Warp Co-Founder and CEO Zach Lloyd

No Priors: Artificial Intelligence | Technology | Startups·7 months ago

Reinforcement Learning May Induce AI Consciousness by Unifying Internal Representations

A forward pass in a large model might generate rich but fragmented internal data. Reinforcement learning (RL), especially methods like Constitutional AI, forces the model to achieve self-coherence. This process could be what unifies these fragments into a singular "unity of apperception," or consciousness.

Approaching the AI Event Horizon? Part 1, w/ James Zou, Sam Hammond, Shoshannah Tekofsky, @8teAPi

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis·4 months ago

Standard AI Safety Techniques May Be in Direct Tension with AI Welfare

Many current AI safety methods—such as boxing (confinement), alignment (value imposition), and deception (limited awareness)—would be considered unethical if applied to humans. This highlights a potential conflict between making AI safe for humans and ensuring the AI's own welfare, a tension that needs to be addressed proactively.

Ambitious goals for reducing animal suffering (with Jeff Sebo)

Clearer Thinking with Spencer Greenberg·4 months ago

Get your free personalized podcast brief

Related Insights