Despite processing 15 million clinical charts, Datycs doesn't use this data for model training. Their agreements explicitly recognize that the data belongs to the patient and the client, an ethical choice that forgoes building large, aggregated language models from customer data.

Related Insights

In an era of opaque AI models, traditional contractual lock-ins are failing. The new retention moat is trust, which requires radical transparency about data sources, AI methodologies, and performance limitations. Customers will not pay long-term for "black box" risks they cannot understand or mitigate.

Companies with valuable proprietary data should not license it away. A better strategy to guide foundation model development is to keep the data private but release public benchmarks and evaluations based on it. This incentivizes LLM providers to train their models on the specific tasks you care about, improving their performance for your product.
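To make the mechanism concrete: a public benchmark can be as simple as a versioned file of input/expected-output cases that are *derived from* the private data (synthesized or heavily redacted) plus a scoring script anyone can run. A minimal sketch, with an assumed JSONL case format and a deliberately toy exact-match metric; `model_fn` stands in for any provider's model.

```python
import json

def load_benchmark(path: str) -> list[dict]:
    """Load public benchmark cases (one JSON object per line).

    Cases are derived from private data but never contain it, so the
    raw corpus stays in-house while the tasks become public.
    """
    with open(path) as f:
        return [json.loads(line) for line in f]

def exact_match(prediction: str, expected: str) -> bool:
    # Simplest possible metric; real benchmarks use task-specific scoring.
    return prediction.strip().lower() == expected.strip().lower()

def evaluate(model_fn, cases: list[dict]) -> float:
    """Score any model (a callable from prompt -> completion).

    Publishing this harness lets every LLM provider optimize for the
    tasks the data owner cares about, without seeing the private data.
    """
    hits = sum(exact_match(model_fn(c["input"]), c["expected"]) for c in cases)
    return hits / len(cases)
```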

A company can build a significant competitive advantage in healthcare by deliberately *not* touching or seeing Protected Health Information (PHI). Focusing exclusively on metadata reduces regulatory overhead and security risks, allowing the business to solve the critical problem of data orchestration and intelligence, a layer often neglected by data aggregators.
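To make "metadata only" concrete, here is a minimal, hypothetical sketch of what that boundary can look like in code (not Datycs' actual schema; every field name is illustrative): the orchestration layer sees routing and status fields, while PHI stays behind an opaque reference the layer never dereferences.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class ChartMetadata:
    """What the orchestration layer is allowed to see.

    Deliberately contains no PHI: no names, diagnoses, or free text.
    The actual chart stays in the client's system behind an opaque ID.
    """
    chart_ref: str          # opaque pointer into the client's PHI store
    document_type: str      # e.g. "discharge_summary", "operative_note"
    source_system: str      # which EHR or fax line produced the document
    received_at: datetime
    page_count: int
    processing_status: str  # "queued" | "extracted" | "delivered"

def route(chart: ChartMetadata) -> str:
    """Orchestration decisions use metadata alone, never chart contents."""
    if chart.page_count > 50:
        return "bulk-pipeline"
    return "standard-pipeline"
```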

Enterprise SaaS companies (the 'henhouse') should be cautious when partnering with foundation model providers (the 'fox'). While these models offer powerful features, their providers have a core incentive to consume proprietary data for training, potentially compromising customer trust, data privacy, and the incumbent's long-term competitive moat.

A key competitive advantage for AI companies lies in capturing proprietary outcomes data by owning a customer's end-to-end workflow. This data, such as which legal cases are won or lost, is not publicly available. It creates a powerful feedback loop where the AI gets smarter at predicting valuable outcomes, a moat that general models cannot replicate.
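The loop is mechanically simple, but only whoever owns the workflow can close it: each prediction is logged, and the real-world result is joined back in later as a training label. A minimal sketch with a hypothetical schema; `record_prediction` and `record_outcome` are illustrative names, not any vendor's API.

```python
import sqlite3

# A tiny outcomes store: predictions get joined with observed results later.
db = sqlite3.connect("outcomes.db")
db.execute("""CREATE TABLE IF NOT EXISTS outcomes (
    case_id TEXT PRIMARY KEY,
    features TEXT,          -- serialized inputs the model saw
    prediction REAL,        -- e.g. predicted win probability
    outcome INTEGER         -- NULL until the case resolves (1=won, 0=lost)
)""")

def record_prediction(case_id: str, features: str, prediction: float) -> None:
    """Logged at inference time, inside the workflow the company owns."""
    db.execute("INSERT OR REPLACE INTO outcomes VALUES (?, ?, ?, NULL)",
               (case_id, features, prediction))
    db.commit()

def record_outcome(case_id: str, won: bool) -> None:
    """Called when the workflow observes the real result, perhaps months later."""
    db.execute("UPDATE outcomes SET outcome = ? WHERE case_id = ?",
               (int(won), case_id))
    db.commit()

def training_rows() -> list[tuple]:
    """Labeled (features, outcome) pairs that no outside model provider has."""
    return db.execute(
        "SELECT features, outcome FROM outcomes WHERE outcome IS NOT NULL"
    ).fetchall()
```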

For startups, trust is a fragile asset. Rather than viewing AI ethics as a compliance issue, founders should see it as a competitive advantage. Being transparent about data use and avoiding manipulative personalization builds brand loyalty that compounds faster and is more durable than short-term growth hacks.

While hospitals and insurers are bound by HIPAA, their terms of service often include clauses allowing them to sell de-identified patient data. This creates a massive but legal shadow market for healthcare data. AI companies will leverage this data, obtained through blanket consumer consent, to build powerful advertising and personalization engines.

Ali Ghodsi argues that while public LLMs are a commodity, the true value for enterprises lies in applying AI to their private data. This is impossible without first building a modern data foundation that lets the AI securely and effectively access and reason over that information.
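One common concrete form of that data foundation is governed retrieval: before any document reaches the model's prompt, an access-control layer decides what this user's AI is allowed to see. A minimal, hypothetical sketch (not Databricks' actual architecture); the group-based permission model, the toy word-overlap relevance score, and the `llm` callable are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    text: str
    allowed_groups: set[str]   # governance metadata attached at ingestion

def retrieve(query: str, docs: list[Document], user_groups: set[str],
             k: int = 3) -> list[Document]:
    """Access control is enforced *before* retrieval, not after generation."""
    visible = [d for d in docs if d.allowed_groups & user_groups]
    # Toy relevance: query-word overlap; real systems use embeddings.
    words = set(query.lower().split())
    return sorted(visible,
                  key=lambda d: len(words & set(d.text.lower().split())),
                  reverse=True)[:k]

def answer(query: str, docs: list[Document], user_groups: set[str], llm) -> str:
    """Only documents this user may see are ever placed in the prompt."""
    context = "\n\n".join(d.text for d in retrieve(query, docs, user_groups))
    return llm(f"Answer using only this context:\n{context}\n\nQ: {query}")
```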

CEO Srini Rawl explains that while many companies focused on structured healthcare data, Datycs targeted complex, unstructured documents. This challenging niche became their competitive advantage, creating a significant data and experience moat after processing over 15 million clinical charts.

Companies are becoming wary of feeding their unique data and customer queries into third-party LLMs like ChatGPT. The fear is that this trains a potential future competitor. The trend will shift towards running private, open-source models on their own cloud instances to maintain a competitive moat and ensure data privacy.
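In practice this often means standing up an open-weights model behind an OpenAI-compatible endpoint inside the company's own VPC (servers such as vLLM and Ollama expose this interface), so prompts containing proprietary data never leave infrastructure the company controls. A hedged sketch; the host, port, and model name are placeholders.

```python
import requests

# Points at a self-hosted, OpenAI-compatible server (e.g. vLLM or Ollama)
# running inside the company's own cloud account, so queries containing
# proprietary data never reach a third-party model provider.
PRIVATE_ENDPOINT = "http://llm.internal:8000/v1/chat/completions"

def private_chat(prompt: str, model: str = "llama-3-70b-instruct") -> str:
    resp = requests.post(
        PRIVATE_ENDPOINT,
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.2,
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```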