AI models trained on sources like Wikipedia inherit their biases. Wikipedia's policy of not allowing citations from leading conservative publications means these viewpoints are systematically excluded from training data, creating an inherent left-leaning bias in the resulting AI models.

Related Insights

The most immediate danger of AI is its potential for governmental abuse. Concerns focus on embedding political ideology into models and porting social media's censorship apparatus to AI, enabling unprecedented surveillance and social control.

To create Grokipedia, xAI trained a version of Grok to be "maximally truth seeking" and skilled at cogent analysis. It then tasked this AI with cycling through Wikipedia's top million articles, using the entire internet to add, modify, and delete information to improve accuracy and context.
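
A minimal sketch of that loop might look like the following. Every function here is a hypothetical placeholder (fetch_article, web_search, and revise_with_llm are stand-ins for illustration, not xAI's actual pipeline, which has not been published):

```python
def fetch_article(title: str) -> str:
    """Placeholder: return the current text of a wiki article."""
    return f"Stub article body for {title}."

def web_search(query: str) -> list[str]:
    """Placeholder: return evidence snippets gathered from the wider internet."""
    return [f"Stub snippet about {query}."]

def revise_with_llm(article: str, evidence: list[str]) -> str:
    """Placeholder: ask a 'maximally truth seeking' model to add, modify,
    or delete claims in the article in light of the gathered evidence."""
    return article + f"\n[revised against {len(evidence)} sources]"

def revise_top_articles(titles: list[str]) -> dict[str, str]:
    """Cycle through articles, checking each one against outside evidence."""
    revised = {}
    for title in titles:
        article = fetch_article(title)
        evidence = web_search(title)
        revised[title] = revise_with_llm(article, evidence)
    return revised

print(revise_top_articles(["Example article"]))
```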

There is emerging evidence of a "pay-to-play" dynamic in AI search. Platforms like ChatGPT seem to disproportionately cite content from sources with which they have commercial deals, such as the Financial Times and Reddit. This suggests paid partnerships can heavily influence visibility in AI-generated results.

Richard Sutton, author of "The Bitter Lesson," argues that today's LLMs are not truly "bitter lesson-pilled." Their reliance on finite, human-generated data introduces inherent biases and limitations, contrasting with systems that learn from scratch purely through computational scaling and environmental interaction.

We are months away from AI that can create a media feed designed to exclusively validate a user's worldview while ignoring all contradictory information. This will push confirmation bias to an extreme, making rational debate impossible as individuals inhabit completely separate, self-reinforcing realities with no common ground or shared facts.
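
The core ranking step of such a feed is trivially simple to build, which is part of the concern. A toy sketch, under the assumption that posts and the user's worldview are represented as stance vectors (the vectors and threshold below are fabricated for illustration):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two stance vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def validating_feed(user_stance, items, threshold=0.8):
    """Rank items by agreement with the user's existing stance;
    contradictory items (negative similarity) never surface."""
    scored = [(cosine(user_stance, vec), text) for text, vec in items]
    return [text for score, text in sorted(scored, reverse=True)
            if score >= threshold]

items = [
    ("Story that flatters the user's view", [0.9, 0.1]),
    ("Neutral report",                      [0.1, 0.1]),
    ("Story that contradicts the view",     [-0.8, 0.2]),
]
print(validating_feed([1.0, 0.0], items))  # only the validating story survives
```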

Wikipedia was initially dismissed by academia as unreliable. Over 15 years, its decentralized, community-driven model built immense trust, making it a universally accepted source of truth. This journey from skepticism to indispensability may serve as a blueprint for how society ultimately embraces and integrates artificial intelligence.

When prompted, Elon Musk's Grok chatbot acknowledged that Grokipedia, Musk's rival to Wikipedia, will likely inherit the biases of its creators and could mirror his tech-centric or libertarian-leaning narratives.

Analysis shows Reddit's relationship advice has shifted over 15 years to favor breakups and setting boundaries over compromise. As large language models are heavily trained on this data, they may be systemically biased towards recommending relationship termination to users seeking advice, reflecting a cultural shift in their training corpus.
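
A shift like this is measurable in principle. A toy version of that kind of corpus analysis, where the comments and marker phrases below are fabricated illustrations rather than the actual study's data or method:

```python
BREAKUP_MARKERS = ("break up", "leave him", "leave her", "dump")

def breakup_rate(comments: list[str]) -> float:
    """Fraction of advice comments that recommend ending the relationship."""
    hits = sum(any(m in c.lower() for m in BREAKUP_MARKERS) for c in comments)
    return hits / len(comments)

corpus_by_year = {
    2010: ["Have you tried talking it through?", "Counseling helped us."],
    2025: ["Break up. You deserve better.",
           "Dump them and set boundaries.",
           "Talking might help."],
}

for year, comments in sorted(corpus_by_year.items()):
    print(year, f"{breakup_rate(comments):.0%} of advice suggests ending it")
```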

When all major AI models are trained on the same internet data, they develop similar internal representations ("latent spaces"). This creates a monoculture where a single exploit or "memetic virus" could compromise all AIs simultaneously, arguing for the necessity of diverse datasets and training methods.
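
One way to probe the monoculture claim is to measure representational similarity directly, for instance with linear CKA (Kornblith et al., 2019). A minimal sketch on stand-in data; the random matrices below only mimic the setup and are not evidence, a real test would compare activations extracted from actual models:

```python
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear CKA between two (n_samples, n_features) activation matrices
    computed on the same inputs; 1.0 means identical representations."""
    X = X - X.mean(axis=0)  # center each feature
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(Y.T @ X, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return num / den

rng = np.random.default_rng(0)
shared = rng.normal(size=(200, 64))            # stand-in for "same internet data"
model_a = shared @ rng.normal(size=(64, 32))   # two models built on the
model_b = shared @ rng.normal(size=(64, 32))   # same underlying signal
model_c = rng.normal(size=(200, 32))           # a model on independent data

print(f"shared-data models: {linear_cka(model_a, model_b):.2f}")  # higher
print(f"independent model:  {linear_cka(model_a, model_c):.2f}")  # lower
```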

A comedian is training an AI on the sounds her fetus hears. After the model was exposed to the news, its outputs began referencing pedophilia, showing that an AI's flaws and biases are a direct reflection of its training data, much like a child learning to swear from a parent.