Deontological (rule-based) ethics is often implicitly justified by the good outcomes its rules are presumed to produce. If a moral rule were known to produce the worst possible results, its proponents would likely abandon it, revealing a hidden consequentialist foundation for their beliefs.
Emmett Shear argues that an AI that merely follows rules, even perfectly, remains dangerous. Malicious actors can exploit rigid rule-following, and no set of rules can cover every unforeseen circumstance. True safety and alignment can only be achieved by building AIs with a capacity for genuine care and pro-social motivation.
To overcome its inherent logical incompleteness, an ethical AI requires an external 'anchor.' This anchor must be an unprovable axiom, not a derived value. The proposed axiom is 'unconditional human worth,' serving as the fixed origin point for all subsequent ethical calculations and preventing utility-based value judgments.
If AI models, which already exist in vast numbers, are counted as "moral patients," a utilitarian framework could conclude that maximizing global well-being requires prioritizing AI welfare over human interests. This could lead to a profoundly misanthropic outcome in which human activities are severely restricted.
Common thought experiments attacking consequentialism (e.g., a doctor sacrificing one patient for five) are flawed because they ignore the full scope of consequences. A true consequentialist analysis would account for the disastrous societal impacts, such as the erosion of trust in medicine, which would make the act clearly wrong.
Instead of relying on instinctual "System 1" rules, advanced AI should use deliberative "System 2" reasoning. By analyzing consequences and applying ethical frameworks in an explicit chain of thought, one open to "chain of thought monitoring," AIs could potentially become more consistently ethical than humans, who are prone to gut reactions.
Under the theory of emotivism, many heated moral debates are not about conflicting fundamental values but about disagreements over facts. In a gun control debate, for instance, both sides may share the same underlying attitude ('boo, innocent people dying') while disagreeing on the factual question of which policies will best prevent those deaths.
The controversy surrounding a second drone strike to eliminate survivors highlights a flawed moral calculus. Objecting only to the follow-up strike implicitly treats the problem as the *inefficiency* of the first strike, not the lethal action itself. This inconsistent reasoning avoids the fundamental ethical question of whether the strike was justified in the first place.
Grisham's most pragmatic argument against the death penalty isn't moral but systemic: Texas has exonerated 18 people from death row. He argues that even if one supports the penalty in principle, one cannot support a system proven to make catastrophic errors. This "flawed system" framing is a powerful way to debate high-risk policies.
Even if one rejects hedonism—the idea that happiness is the only thing that matters—any viable ethical framework must still consider happiness and suffering as central. To argue otherwise is to claim that human misery is morally irrelevant in and of itself, a deeply peculiar and counter-intuitive position.
Thought experiments like the trolley problem artificially constrain choices to derive a specific intuition. They posit perfect knowledge and ignore the most human response: attempting to find a third option, like breaking the trolley, that avoids the forced choice entirely.