For tools like Harvey AI, the primary technical challenge is connecting all necessary context for a lawyer's task—emails, private documents, case law—before even considering model customization. The data plumbing is paramount and precedes personalization.
The industry has effectively exhausted the public web data used to train foundation models ("we've already run out of data"). The next leap in AI capability and business value will come from harnessing the vast proprietary data currently locked behind corporate firewalls.
Companies struggle with AI not because of the models, but because their data is siloed. Adopting an "integration-first" mindset is crucial for creating the unified data foundation AI requires.
The primary barrier to deploying AI agents at scale isn't the models but poor data infrastructure. The vast majority of organizations have immature data systems—uncatalogued, siloed, or outdated—making them unprepared for advanced AI and setting them up for failure.
A major hurdle for enterprise AI is messy, siloed data. A synergistic solution is emerging where AI software agents are used for the data engineering tasks of cleansing, normalization, and linking. This creates a powerful feedback loop where AI helps prepare the very data it needs to function effectively.
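The data-engineering tasks mentioned above can be made concrete. Below is a minimal, deterministic sketch of the normalization and record-linking step an AI agent might automate: fields are cleaned, then records from two silos are matched by exact email or fuzzy name similarity. All record shapes and the threshold are hypothetical, and the similarity metric (stdlib `difflib`) stands in for whatever matcher a real pipeline would use.

```python
from difflib import SequenceMatcher

def normalize(record: dict) -> dict:
    """Normalize free-text fields so records from different silos are comparable."""
    return {
        "name": record.get("name", "").strip().lower(),
        "email": record.get("email", "").strip().lower(),
    }

def link(records_a: list[dict], records_b: list[dict],
         threshold: float = 0.85) -> list[tuple[int, int]]:
    """Link records across two silos: exact email match, else fuzzy name match."""
    pairs = []
    for i, a in enumerate(map(normalize, records_a)):
        for j, b in enumerate(map(normalize, records_b)):
            if a["email"] and a["email"] == b["email"]:
                pairs.append((i, j))
            elif SequenceMatcher(None, a["name"], b["name"]).ratio() >= threshold:
                pairs.append((i, j))
    return pairs

crm = [{"name": "Jane Doe ", "email": "JANE@EXAMPLE.COM"}]
erp = [{"name": "jane doe", "email": "jane@example.com"}]
print(link(crm, erp))  # → [(0, 0)]
```

The feedback loop in the summary is that an AI agent can propose the normalization rules and candidate links, while a deterministic pipeline like this applies them at scale.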
A critical hurdle for enterprise AI is managing context and permissions. Just as people silo work friends from personal friends, AI systems must prevent sensitive information from one context (e.g., CEO chats) from leaking into another (e.g., company-wide queries). This complex data siloing is a core, unsolved product problem.
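One common pattern for this siloing is to enforce context boundaries at retrieval time, so out-of-scope documents never enter the model's prompt at all. The sketch below assumes a hypothetical ACL mapping users to the contexts they may read; the user names, context labels, and documents are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Doc:
    text: str
    context: str  # e.g. "exec-chat", "company-wide"

# Hypothetical ACL: which contexts each user may read.
ACL = {
    "ceo": {"exec-chat", "company-wide"},
    "employee": {"company-wide"},
}

def retrieve(user: str, corpus: list[Doc]) -> list[Doc]:
    """Filter by the querying user's allowed contexts *before* retrieval,
    so sensitive documents cannot leak into another context's answers."""
    allowed = ACL.get(user, set())
    return [d for d in corpus if d.context in allowed]

corpus = [Doc("Q3 layoffs draft", "exec-chat"),
          Doc("Holiday schedule", "company-wide")]
print([d.text for d in retrieve("employee", corpus)])  # → ['Holiday schedule']
```

Filtering before retrieval, rather than asking the model to withhold information, is the safer design: the model cannot leak what it never sees.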
Harvey's initial product was a tool for individual lawyers. The company found greater value by shifting focus to the productivity of entire legal teams and firms, tackling enterprise-level challenges like workflow orchestration, governance, and secure collaboration, which go far beyond simple model intelligence.
Companies struggle to get value from AI because their data is fragmented across different systems (ERP, CRM, finance) with poor integrity. The primary challenge isn't the AI models themselves, but integrating these disparate data sets into a unified platform that agents can act upon.
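A minimal sketch of that unification step, assuming the siloed systems share a common key: records from hypothetical ERP, CRM, and finance extracts are merged into one customer view an agent could act on. System names, fields, and the `customer_id` key are all illustrative.

```python
from collections import defaultdict

# Hypothetical extracts from three siloed systems, keyed on customer_id.
erp = [{"customer_id": "C1", "open_orders": 3}]
crm = [{"customer_id": "C1", "owner": "jane"}]
fin = [{"customer_id": "C1", "balance_due": 1200.0}]

def unify(*sources: list[dict]) -> dict[str, dict]:
    """Merge records sharing a customer_id into a single unified record."""
    unified: dict[str, dict] = defaultdict(dict)
    for source in sources:
        for rec in source:
            unified[rec["customer_id"]].update(rec)
    return dict(unified)

view = unify(erp, crm, fin)
print(view["C1"])
```

Note the last-writer-wins merge on conflicting fields: a real platform would also need conflict resolution, identity matching when keys differ across systems, and data lineage.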
The primary reason multi-million dollar AI initiatives stall or fail is not the sophistication of the models but the underlying data layer. Traditional data infrastructure introduces delays by moving and duplicating information, preventing the real-time, comprehensive data access AI needs to deliver business value. The focus on algorithms misses this foundational roadblock.
According to Salesforce's AI chief, the primary challenge for large companies deploying AI is harmonizing data across siloed departments, like sales and marketing. AI cannot operate effectively without connected, unified data, making data integration the crucial first step before any advanced AI implementation.
Many companies focus on AI models first, only to hit a wall. An "integration-first" approach is a strategic imperative. Connecting disparate systems *before* building agents ensures they have the necessary data to be effective, avoiding the "garbage in, garbage out" trap at a foundational level.
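One lightweight way to operationalize "integration first" is a preflight gate: before an agent is enabled, verify that the integrated store actually exposes every field the agent depends on, catching "garbage in" at deploy time rather than in production answers. The required fields below are hypothetical.

```python
# Hypothetical set of fields a customer-service agent depends on.
REQUIRED_FIELDS = {"customer_id", "owner", "balance_due"}

def ready_for_agent(record: dict) -> set:
    """Return the set of required fields missing or null in the unified
    record; an empty set means the agent has the data it needs."""
    return {f for f in REQUIRED_FIELDS if record.get(f) is None}

complete = {"customer_id": "C1", "owner": "jane", "balance_due": 0.0}
partial = {"customer_id": "C1"}
print(ready_for_agent(complete))  # → set()
print(ready_for_agent(partial))   # → {'owner', 'balance_due'}
```

A check this simple will not catch stale or inconsistent values, but it enforces the ordering the summary argues for: integration completeness is verified before the agent ships.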