The distinction between "open-source" and "open-weight" is critical. Without access to the training data, users cannot know what biases or censorship have been built into an AI model. DeepSeek's pro-China stance on Taiwan is a clear example of this hidden influence.

Related Insights

AI models trained on sources like Wikipedia inherit their biases. Wikipedia's policy of not allowing citations from leading conservative publications means these viewpoints are systematically excluded from training data, creating an inherent left-leaning bias in the resulting AI models.

A key disincentive for open-sourcing frontier AI models is that the released model weights contain residual information about the training process. Competitors could potentially reverse-engineer the training data set or proprietary algorithms, eroding the creator's competitive advantage.

When buying AI solutions, demand transparency from vendors about the specific models and prompts they use. Mollick argues that 'we use a prompt' is not a defensible 'secret sauce' and that this transparency is crucial for auditing results and ensuring you aren't paying for outdated or flawed technology.

The open-source model ecosystem enables a community dedicated to removing safety features. A simple search for 'uncensored' on platforms like Hugging Face reveals thousands of models that have been intentionally fine-tuned to generate harmful content, creating a significant challenge for risk mitigation efforts.

DeepSeek's V4 model, while not frontier-level, is drastically cheaper than its US counterparts, making it highly attractive for most business use cases. This creates a national security risk: if US companies become dependent on Chinese-controlled, open-source AI infrastructure, that infrastructure could later be altered or restricted, leaving them strategically vulnerable.

Marc Andreessen posits that Chinese firms release strong open-source AI models as a strategic loss leader. Unable to directly sell commercial AI in the West, they offer free models to build global influence and funnel users towards their paid domestic services and related products.

China remains committed to open-weight models, seeing them as beneficial for innovation. Its primary safety strategy is to remove hazardous knowledge (e.g., bioweapons information) from the training data itself. This makes the public model inherently safer, rather than relying solely on post-training refusal mechanisms that can be circumvented.

A common misconception is that Chinese AI models are fully open-source. In reality, they are often "open-weight": the trained parameters (weights) are shared, but the training code and proprietary datasets are not. This provides a competitive advantage, enabling broad adoption while maintaining some control.

To clarify the ambiguous "open source" label, the Openness Index scores models across multiple dimensions. It evaluates not just if the weights are available, but also the degree to which training data, methodology, and code are disclosed. This creates a more useful spectrum of openness, distinguishing "open weights" from true "open science."
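As a rough illustration of the idea (the dimension names and weightings below are invented for this sketch, not the actual Openness Index rubric), a multi-dimensional openness score might be computed as a weighted sum over disclosure levels:

```python
# Hypothetical sketch of a multi-dimensional openness score.
# Dimensions and weights are invented for illustration; they are
# not the real Openness Index rubric.

DIMENSIONS = {
    "weights_released": 0.25,
    "training_data_disclosed": 0.30,
    "methodology_documented": 0.25,
    "code_released": 0.20,
}

def openness_score(disclosures: dict) -> float:
    """Weighted sum of per-dimension disclosure levels (each 0.0-1.0).

    A model can score high on one axis ("open weights") while
    scoring low overall, which is what separates it from
    fully disclosed "open science" releases.
    """
    return sum(
        weight * disclosures.get(dim, 0.0)
        for dim, weight in DIMENSIONS.items()
    )

# An "open-weight" release: weights public, nothing else disclosed.
open_weight = openness_score({"weights_released": 1.0})

# A fully open release scores across every dimension.
open_science = openness_score({d: 1.0 for d in DIMENSIONS})
```

The point of scoring per dimension rather than applying a binary "open source" label is that it places releases on a spectrum: a weights-only release and a fully documented one no longer collapse into the same category.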

The business model for powerful, free, open-source AI models from Chinese companies may not be direct profit. Instead, it could be a strategy to globally distribute an AI trained on a specific worldview, competing with American models on an ideological rather than purely commercial level.