Critics argue that moral thought experiments are too unrealistic to be useful. However, their artificiality is a deliberate design choice. By stripping away real-world complexities and extraneous factors, philosophers can test whether a single, specific variable is what actually drives the moral difference in our judgments.
Deontological (rule-based) ethics are often implicitly justified by the good outcomes their rules are presumed to create. If a moral rule were known to produce the worst possible results, its proponents would likely abandon it, revealing a hidden consequentialist foundation for their beliefs.
To overcome its own inherent logical incompleteness, an ethical AI requires an external 'anchor.' This anchor must be an unprovable axiom, not a derived value. The proposed axiom is 'unconditional human worth,' which serves as the fixed origin point for all subsequent ethical calculations and prevents a person's worth from ever being treated as a quantity in a utility calculation.
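One way to read 'fixed origin point' in engineering terms is as a hard constraint rather than a weighted term. The sketch below is a minimal illustration under that assumption; the `Option` dataclass, the `choose` function, and the flag names are hypothetical, not part of any proposed system.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Option:
    """A hypothetical candidate action (all fields are illustrative)."""
    name: str
    utility: float              # estimated aggregate benefit
    trades_away_a_person: bool  # does it price a human life into the sum?

def choose(options: List[Option]) -> Optional[Option]:
    """Select an action with 'unconditional human worth' treated as an axiom.

    The axiom is not a large weight inside the utility sum; it is a filter
    applied before any comparison, so no amount of utility can outbid it.
    """
    permissible = [o for o in options if not o.trades_away_a_person]
    if not permissible:
        return None  # refuse to act rather than violate the axiom
    return max(permissible, key=lambda o: o.utility)

# The high-utility option that treats a person as tradeable is never
# even entered into the comparison.
options = [
    Option("harvest one patient to save five", utility=5.0, trades_away_a_person=True),
    Option("treat all patients normally", utility=1.0, trades_away_a_person=False),
]
print(choose(options).name)  # -> "treat all patients normally"
```

The design choice the sketch makes explicit: an axiom functions as a veto applied before optimization, whereas a derived value is just another number that a sufficiently large utility could outweigh.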
The project of creating AI that 'learns to be good' presupposes that morality is a real, discoverable feature of the world, not just a social construct. This moral realist stance posits that moral progress is possible (e.g., abolition of slavery) and that arrogance—the belief one has already perfected morality—is a primary moral error to be avoided in AI design.
Common thought experiments attacking consequentialism (e.g., a doctor sacrificing one patient to save five) are flawed because they ignore the full scope of consequences. A true consequentialist analysis would account for the disastrous societal impacts, such as the erosion of trust in medicine, and once those are counted the act comes out clearly wrong.
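A toy tally, with purely illustrative numbers, shows how widening the scope of counted consequences flips the verdict; neither the figures nor the variable names come from the source.

```python
# Narrow tally: only the lives immediately at stake.
lives_saved, lives_lost = 5, 1
narrow_score = lives_saved - lives_lost  # +4: the sacrifice looks "right"

# Full-scope tally: add a (hypothetical) estimate of the downstream harm once
# people learn doctors might sacrifice patients, e.g. avoided check-ups and
# delayed treatment across the whole population.
trust_erosion_cost = 50
full_scope_score = lives_saved - lives_lost - trust_erosion_cost  # -46: clearly wrong

print(narrow_score, full_scope_score)
```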
The famous Trolley Problem isn't just one scenario. Philosophers create subtle variations, like replacing the act of pushing a person with flipping a switch to drop them through a trapdoor. This isolates variables and reveals that our moral objection isn't just about physical contact, but about intentionally using a person as an instrument to achieve a goal.
Instead of relying on instinctual "System 1" rules, advanced AI should use deliberative "System 2" reasoning: analyzing consequences and applying ethical frameworks step by step, with the resulting reasoning trace open to inspection ("chain-of-thought monitoring"). Reasoning this way, AIs could potentially become more consistently ethical than humans, who are prone to gut reactions.
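As a rough sketch of what such a pipeline could look like, the snippet below pairs a deliberative step that emits an explicit reasoning trace with a monitor that inspects the trace before any action is taken. All names, framework checks, and red-flag phrases are invented for illustration.

```python
from typing import List, Tuple

def deliberate(scenario: str) -> Tuple[str, List[str]]:
    """'System 2' step: produce a decision together with the explicit chain of
    reasoning behind it (hard-coded here purely for illustration; a real
    system would generate these steps)."""
    chain = [
        f"Scenario: {scenario}",
        "Consequentialist check: which option minimizes overall harm?",
        "Deontological check: does any option treat a person merely as an instrument?",
        "Conclusion: prefer the option that passes both checks.",
    ]
    return "divert along the path that passes both checks", chain

def monitor(chain: List[str], red_flags: List[str]) -> bool:
    """Chain-of-thought monitoring: veto any decision whose reasoning trace
    contains a flagged pattern."""
    return not any(flag in step.lower() for step in chain for flag in red_flags)

decision, chain = deliberate("runaway trolley, five people on the main track")
if monitor(chain, red_flags=["acceptable to sacrifice", "hide this reasoning"]):
    print("act:", decision)
else:
    print("escalate to human review")
```

The point of the split is that the gut-reaction shortcut never fires on its own: every decision carries a trace that can be audited before it is acted on.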
Contrary to popular belief, economists don't assume perfect rationality because they think people are flawless calculators. It's a simplifying assumption that makes models mathematically tractable. The goal is often to establish a theoretical benchmark, not to accurately describe psychological reality.
Under the theory of emotivism, many heated moral debates are not clashes of fundamental values but disagreements over facts. For instance, in a gun control debate, both sides may share the same basic attitude ('boo to innocent people dying') but disagree on the factual question of which policies will best prevent such deaths.
The core reason we treat the Trolley Problem's two scenarios differently lies in the distinction between intending harm and merely foreseeing it. Pushing the man means you *intend* for him to block the trolley (using him as a means). Flipping the switch means you merely *foresee* a death as a side effect. This principle, known as the doctrine of double effect, is a cornerstone of military and medical ethics.
Thought experiments like the trolley problem artificially constrain choices to elicit a specific intuition. They posit perfect knowledge and ignore the most human response: attempting to find a third option, such as stopping the trolley, that avoids the forced choice entirely.