Large Language Models struggle with obvious, real-world facts because their training data (text) over-represents uncertain topics open to debate—the 'maybe sphere.' Bedrock, common-sense knowledge is rarely written down, leaving a significant gap in the AI's world model and creating a need for human oversight on obvious matters.
A core debate in AI is whether LLMs, which are text prediction engines, can achieve true intelligence. Critics argue they cannot because they lack a model of the real world. This prevents them from making meaningful, context-aware predictions about future events—a limitation that more data alone may not solve.
Salesforce's AI Chief warns of "jagged intelligence," where LLMs can perform brilliant, complex tasks but fail at simple common-sense ones. This inconsistency is a significant business risk, as a failure in a basic but crucial task (e.g., loan calculation) can have severe consequences.
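To make the stakes concrete, here is a minimal sketch of the kind of basic but crucial calculation the passage mentions: a fixed-rate loan payment computed with the standard amortization formula. The figures and function name are illustrative, not from the source; the point is that a deterministic task like this has exactly one right answer, so a "jagged" failure is easy to spot and expensive to miss.

```python
# Illustrative only: the standard fixed-rate amortization formula,
# M = P * r * (1 + r)^n / ((1 + r)^n - 1), with hypothetical figures.
def monthly_payment(principal: float, annual_rate: float, years: int) -> float:
    """Monthly payment on a fixed-rate loan."""
    r = annual_rate / 12          # monthly interest rate
    n = years * 12                # total number of monthly payments
    if r == 0:
        return principal / n
    return principal * r * (1 + r) ** n / ((1 + r) ** n - 1)

# $300,000 at 6% over 30 years -> about $1,798.65 per month.
print(round(monthly_payment(300_000, 0.06, 30), 2))
```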
MIT research reveals that large language models develop "spurious correlations" by associating sentence patterns with topics. This cognitive shortcut causes them to give domain-appropriate answers to nonsensical queries if the grammatical structure is familiar, bypassing logical analysis of the actual words.
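To illustrate the failure mode, the toy sketch below (a deliberately crude construction, not the MIT study's method) "learns" from a handful of queries whose syntactic template correlates perfectly with a domain, then answers a nonsensical query purely from its template:

```python
# Toy illustration of a spurious syntax-topic correlation: the "model" keys on
# a sentence's function-word skeleton, not its content words.
from collections import Counter, defaultdict

FUNCTION_WORDS = {"what", "is", "the", "of", "how", "do", "i", "my"}

def template(sentence: str) -> tuple:
    """Reduce a sentence to its function-word skeleton (a crude stand-in for syntax)."""
    return tuple(w if w in FUNCTION_WORDS else "<X>" for w in sentence.lower().split())

# Tiny hypothetical training set: each template maps cleanly to one domain.
training = [
    ("what is the dosage of ibuprofen", "medical"),
    ("what is the dosage of amoxicillin", "medical"),
    ("how do i refinance my mortgage", "finance"),
    ("how do i consolidate my loans", "finance"),
]

counts = defaultdict(Counter)
for sentence, domain in training:
    counts[template(sentence)][domain] += 1

def predict(sentence: str) -> str:
    t = template(sentence)
    if t in counts:
        return counts[t].most_common(1)[0][0]  # decide from structure alone
    return "unknown"

# A nonsensical query with the same grammatical skeleton still gets a
# confident domain-appropriate label, because only the template was learned.
print(predict("what is the dosage of moonlight"))  # -> medical
```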
LLMs learn two things from pre-training: factual knowledge and intelligent algorithms (the "cognitive core"). Karpathy argues the vast memorized knowledge is a hindrance, making models rely on memory instead of reasoning. The goal should be to strip away this knowledge to create a pure, problem-solving cognitive entity.
The way LLMs generate confident but incorrect answers mirrors the neurological phenomenon of confabulation, where patients with memory gaps invent plausible stories. This behavior is fundamentally misleading, as humans aren't cognitively prepared to interact with a system that constantly "fills in the blanks" with fiction.
Richard Sutton, author of "The Bitter Lesson," argues that today's LLMs are not truly "bitter lesson-pilled." Their reliance on finite, human-generated data introduces inherent biases and limitations, contrasting with systems that learn from scratch purely through computational scaling and environmental interaction.
When pressed for sources on factual data, ChatGPT defaults to citing "general knowledge," providing misleading information with unearned confidence. This lack of verifiable sourcing makes it a liability for detail-oriented professions like journalism, requiring more time for correction than it saves in research.
AI can process vast information but cannot replicate human common sense, which is the sum of lived experiences. This gap makes it unreliable for tasks requiring nuanced judgment, authenticity, and emotional understanding, posing a significant risk to brand trust when used without oversight.
Traditional benchmarks incentivize guessing by only rewarding correct answers. The Omniscience Index directly combats hallucination by subtracting points for incorrect factual answers. This creates a powerful incentive for model developers to train their systems to admit when they lack knowledge, improving reliability.
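A minimal sketch of how such a penalty flips the incentive, assuming a scoring rule of +1 for a correct answer, -1 for an incorrect one, and 0 for an abstention (the Omniscience Index's exact weighting and normalization may differ):

```python
# Compare accuracy-only scoring with a penalty-based scheme that treats
# abstention as neutral. Answers are labeled "correct", "incorrect", or "abstain".
def penalty_score(answers):
    """Assumed Omniscience-style rule: +1 correct, -1 incorrect, 0 abstain."""
    return sum(1 if a == "correct" else -1 if a == "incorrect" else 0 for a in answers)

def accuracy_only_score(answers):
    """Traditional benchmark: only correct answers count, so guessing is free."""
    return sum(1 for a in answers if a == "correct")

# Ten questions the model doesn't actually know.
guesser   = ["correct"] * 2 + ["incorrect"] * 8   # guesses, gets 2 right by luck
abstainer = ["abstain"] * 10                      # admits it doesn't know

print(accuracy_only_score(guesser), accuracy_only_score(abstainer))  # 2 vs 0: guessing wins
print(penalty_score(guesser), penalty_score(abstainer))              # -6 vs 0: abstaining wins
```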
Contrary to popular belief, generative AI like LLMs may not get significantly more accurate. As statistical engines that predict the next most likely word, they lack true reasoning or an understanding of "accuracy." This fundamental limitation means they will always be prone to making unfixable mistakes.
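A toy sketch of that next-word mechanism, using a made-up probability table rather than a real model: the procedure only ranks continuations by likelihood, and nothing in it checks whether the chosen word is true.

```python
# Illustrative only: sample the next word from an assumed probability table.
# There is no step that verifies factual accuracy, only relative likelihood.
import random

# Hypothetical learned probabilities for the word after the prompt below.
next_word_probs = {
    "1969": 0.55,  # correct continuation
    "1970": 0.30,  # fluent but wrong
    "1959": 0.15,  # fluent but wrong
}

def sample_next(probs: dict, temperature: float = 1.0) -> str:
    """Sample proportionally to probability; higher temperature flattens the distribution."""
    words = list(probs)
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return random.choices(words, weights=weights, k=1)[0]

prompt = "The first Moon landing took place in"
print(prompt, sample_next(next_word_probs))  # sometimes right, sometimes wrong, never checked
```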