
During his trial against OpenAI, Elon Musk admitted under oath that using one AI model to train another, a practice known as distillation, is something "all the companies do." The admission confirms that a legally and ethically gray practice is widespread across the industry.

Related Insights

When a company distills knowledge from a competitor's AI, it's not just scraping pre-training data. It's a highly efficient process of extracting the model's intelligence, reasoning patterns, and skills. This is more akin to an apprentice directly interacting with and learning from a world-class expert than simply reading the same textbooks the expert used.
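The mechanics behind that "apprentice" analogy can be sketched in a few lines. This is a minimal, illustrative toy (the teacher, its decision boundary, and all parameters are hypothetical, not any real lab's setup): a small "student" model learns purely by querying a "teacher" for soft probability outputs, never touching the teacher's original training data.

```python
import math
import random

def teacher(x):
    # Hypothetical stand-in for a large model's API: it "knows" the
    # decision boundary sits at x = 0.5 and returns P(class 1 | x).
    return 1.0 / (1.0 + math.exp(-(8.0 * x - 4.0)))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def distill(num_queries=5000, lr=0.5, seed=0):
    """Train a tiny student, sigmoid(w*x + c), on the teacher's answers."""
    rng = random.Random(seed)
    w, c = 0.0, 0.0
    for _ in range(num_queries):
        x = rng.random()            # choose an input to query
        target = teacher(x)         # extract the teacher's soft label
        pred = sigmoid(w * x + c)
        err = pred - target         # gradient of cross-entropy w.r.t. the logit
        w -= lr * err * x
        c -= lr * err
    return w, c

w, c = distill()
boundary = -c / w  # the student's learned decision boundary
```

After a few thousand queries the student's boundary converges toward the teacher's at 0.5. Scaled up to language models, the same loop (query, collect outputs, train on them) is why distillation transfers capability so cheaply.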

Large, centralized AI models are vulnerable to "distillation attacks," where a smaller model can be trained cheaply by querying the larger one. This technical reality, combined with the hypocrisy of creators restricting copying after themselves scraping the internet for training data, strongly suggests a future dominated by decentralized, open-source models.

In his lawsuit against OpenAI, Elon Musk's credibility as an AI safety champion was undermined during cross-examination. He was reportedly clueless about basic industry safety practices like "system cards" and OpenAI's own safety protocols, revealing a significant gap between his public pronouncements and his technical knowledge.

The common practice of model distillation suggests that AI capabilities will eventually be commoditized. As smaller models can cheaply mimic larger ones, differentiation will shift away from raw performance to product integration and price, likely triggering a massive price war among providers.

OpenAI's legal team strategically revealed that Musk's xAI is "partly distilling" OpenAI's technology. This was used to portray him as a hypocrite: simultaneously claiming the technology is world-ending while breaking terms of service to improve his own for-profit competitor.

The public-facing models from major labs are likely efficient Mixture-of-Experts (MoE) versions distilled from much larger, private, and computationally expensive dense models. This means the model users interact with is a smaller, optimized copy, not the original frontier model.
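The efficiency claim rests on how MoE inference works: a router activates only a few experts per token, so most parameters sit idle on any given forward pass. A toy sketch of top-k routing (all sizes, weights, and the dot-product "experts" are illustrative stand-ins for real MLP experts):

```python
import math
import random

NUM_EXPERTS, TOP_K, DIM = 8, 2, 4  # illustrative sizes

random.seed(0)
experts = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
router  = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def moe_forward(x):
    # The router scores every expert, but only the top-k are actually run.
    scores = [dot(r, x) for r in router]
    top = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    # Softmax over the selected experts' scores gives the mixing weights.
    exps = [math.exp(scores[i]) for i in top]
    weights = [e / sum(exps) for e in exps]
    # Each "expert" here is a single dot product; real experts are full MLPs.
    return sum(w * dot(experts[i], x) for w, i in zip(weights, top)), top

output, active = moe_forward([1.0, -0.5, 0.2, 0.7])
```

For this token only 2 of the 8 experts execute, which is why a distilled MoE copy can serve traffic far more cheaply than the dense original.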

As developers increasingly use AI coding assistants like Claude Code, they flood public repositories like GitHub with high-quality, AI-generated outputs. This effectively turns the internet into a massive, unavoidable training dataset for competing models, making it difficult to police "distillation" as a violation of terms.

API providers like Anthropic struggle to differentiate between users distilling models for competitive purposes and those conducting large-scale evaluations. Both activities generate similar high-volume, repetitive API calls, creating a detection challenge that also raises user privacy concerns.
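The detection problem can be made concrete with a simple sketch. Assuming a hypothetical provider-side heuristic (not any real company's system) that scores an account's query stream by how templated it is, a distillation run and a large benchmark evaluation look identical:

```python
from collections import Counter

def repetitiveness(prompts):
    """Fraction of calls reusing an already-seen prompt template.

    Crude heuristic: treat everything before the first ':' as the template.
    """
    templates = Counter(p.split(":")[0] for p in prompts)
    total = sum(templates.values())
    return 1.0 - len(templates) / total

# A distillation run: one fixed instruction, thousands of filled-in inputs.
distill_calls = [f"Translate to French: sample {i}" for i in range(5000)]

# A benchmark evaluation: the same shape, one rubric over thousands of items.
eval_calls = [f"Grade this answer from 1 to 5: item {i}" for i in range(5000)]

d_score = repetitiveness(distill_calls)
e_score = repetitiveness(eval_calls)
```

Both streams score near 1.0 on this signal, so any threshold that flags the distiller also flags the legitimate evaluator; distinguishing them would require inspecting prompt content, which is where the privacy concern comes in.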

US officials and AI labs allege Chinese firms are engaged in industrial-scale IP theft. They reportedly use fraudulent accounts to extract capabilities from US models like Claude to train their own, creating a facade of domestic innovation.

The US accuses China of "distillation"—querying American AI models millions of times to reverse-engineer their logic and capabilities. This marks a shift from commercial competition to industrial-scale intellectual property theft, escalating the geopolitical conflict beyond government rhetoric.