Enterprise AI Search Requires a Hybrid of Lexical and Vector Retrieval

Related Insights

Atlassian’s Teamwork Graph Outperforms RAG for Complex Enterprise Queries

For enterprise AI, standard RAG struggles with granular permissions and relationship-based questions. Atlassian's "teamwork graph" maps entities such as teams, tasks, and documents. This lets it answer complex queries like "What did my team do last week?", a task where simple vector search would fail because it only returns the top-matching documents.

Escaping AI Slop: How Atlassian Gives AI Teammates Taste, Knowledge, & Workflows, w/ Sherif Mansour

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis·3 months ago

Effective Enterprise AI Requires an "LLM Agnostic Orchestrator" to Deploy the Best Model

Recognizing there is no single "best" LLM, AlphaSense built a system to test and deploy various models for different tasks. This allows them to optimize for performance and even stylistic preferences, using different models for their buy-side finance clients versus their corporate users.

Jack Kokko – Building the Google of Finance at AlphaSense (EP.461)

Capital Allocators – Inside the Institutional Investment Industry·5 months ago

Fuse Image and Text Vector Embeddings to Create Powerful Semantic Search

To move beyond keyword search in their media archive, Tim McLear's system generates two vector embeddings for each asset: one from the image thumbnail and another from its AI-generated text description. Fusing these enables a powerful semantic search that understands visual similarity and conceptual relationships, not just exact text matches.

“Nobody wanted to do this work”: How Emmy Award–winning filmmakers use AI to automate the tedious parts of documentaries

How I AI·3 months ago
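The fusion step described in this insight can be sketched in a few lines. The source does not say how the two embeddings are combined, so this is only one plausible approach under stated assumptions: each vector is L2-normalized, scaled by a weight, concatenated, and renormalized. The function names and the `image_weight` parameter are hypothetical, and the vectors would come from whatever image and text embedding models are in use.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return list(vec)
    return [x / norm for x in vec]


def fuse(image_vec, text_vec, image_weight=0.5):
    """Assumed fusion scheme: weighted concatenation of the normalized vectors."""
    img = [image_weight * x for x in l2_normalize(image_vec)]
    txt = [(1.0 - image_weight) * x for x in l2_normalize(text_vec)]
    return l2_normalize(img + txt)


def cosine(a, b):
    """Cosine similarity of two unit-length vectors, computed as a dot product."""
    return sum(x * y for x, y in zip(a, b))
```

With this scheme, two assets that look alike but are described differently still score above zero, because the image half of the fused vector matches even when the text half does not.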

LexisNexis Uses "Agentic AI" to Route Tasks to the Best-Performing LLM

Rather than relying on a single LLM, LexisNexis employs a "planning agent" that decomposes a complex legal query into sub-tasks. It then assigns each task (e.g., deep research, document drafting) to the specific LLM best suited for it, demonstrating a sophisticated, model-agnostic approach for enterprise AI.

LexisNexis CEO says the AI law era is already here

Decoder with Nilay Patel·4 months ago
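The routing idea in this insight can be illustrated with a small lookup table. This is a sketch under assumptions, not LexisNexis's system: the task types mirror the examples in the summary, the model identifiers are placeholders, and the decomposition of the query into sub-tasks (which a planning agent would perform) is taken as already done.

```python
# Hypothetical routing table: task type -> model identifier (placeholder names).
MODEL_FOR_TASK = {
    "deep_research": "model-a",
    "document_drafting": "model-b",
}
DEFAULT_MODEL = "model-general"


def route(subtasks):
    """Pair each (task_type, prompt) sub-task with the model registered for its type."""
    return [
        {"task_type": task_type, "prompt": prompt,
         "model": MODEL_FOR_TASK.get(task_type, DEFAULT_MODEL)}
        for task_type, prompt in subtasks
    ]
```

Keeping the table separate from the planner means a model can be swapped for one task type without touching the rest of the pipeline.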

The High Cost of Vector Search Creates an Economic Bottleneck for AI Products

AI's hunger for context is making search a critical but expensive component. As illustrated by Turbopuffer's origin story, a single recommendation feature using vector embeddings can cost tens of thousands of dollars per month, forcing companies to find cheaper solutions to make AI features economically viable at scale.

Sora 2 Launch Reactions, DoorDash CEO Live in The Ultradome | Tony Xu, Simon Eskildsen, Patrick O’Shaughnessy, Zach Abrams, Andrew Feldman, Brandon Millman, Stanley Tang, Alex Albert, Arthur Querou

TBPN·5 months ago

Anthropic's Claude Code Ditched Vector Search for More Accurate "Agentic Search"

While vector search is a common approach for RAG, Anthropic found it difficult to maintain and a security risk for enterprise codebases. They switched to "agentic search," where the AI model actively uses tools like grep or find to locate code, achieving similar accuracy with a cleaner deployment.

Inside Claude Code From the Engineers Who Built It

AI & I·4 months ago
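To make the contrast with vector search concrete, here is a minimal sketch of the kind of grep-like tool a model could call during agentic search. It is not Claude Code's implementation; the `grep_tool` name, its parameters, and the return shape are assumptions for illustration. The point is that the tool reads files directly at query time, so there is no embedding index to build, host, or keep in sync.

```python
import os
import re


def grep_tool(root, pattern, suffix=".py"):
    """Return (relative_path, line_number, line) for each line matching `pattern`."""
    regex = re.compile(pattern)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(suffix):
                continue  # only scan files with the requested suffix
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if regex.search(line):
                        rel = os.path.relpath(path, root)
                        hits.append((rel, lineno, line.rstrip("\n")))
    return hits
```

In an agent loop, the model would pick the pattern, read the hits, and decide whether to refine the search or open a file, repeating until it has located the relevant code.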

AI's Value in Legal Tech Is Understanding Semantics, Not Just Keyword Searching

Unlike simple "Ctrl+F" searches, modern language models analyze and attribute semantic meaning to legal phrases. This allows platforms to track a single legal concept (like a "J.Crew blocker") even when it's phrased a thousand different ways across complex documents, enabling true market-wide quantification for the first time.

AI Can Tell Us Something About Credit Market Weakness

Odd Lots·3 months ago

Better Data Preparation, Not Vector Databases, Unlocks RAG System Performance

Teams often agonize over which vector database to use for their Retrieval-Augmented Generation (RAG) system. However, the most significant performance gains come from superior data preparation, such as optimizing chunking strategies, adding contextual metadata, and rewriting documents into a Q&A format.

AI Engineering 101 with Chip Huyen (Nvidia, Stanford, Netflix)

Lenny's Podcast: Product | Career | Growth·4 months ago
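Two of the data preparation steps named in this insight, chunking and contextual metadata, can be sketched as follows. This is an illustrative example rather than a recommended configuration: the word-window sizes, the `doc_title > section:` prefix format, and the function name are all assumptions.

```python
def chunk_with_context(text, doc_title, section, max_words=50, overlap=10):
    """Split text into overlapping word windows, each tagged with contextual metadata."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        body = " ".join(words[start:start + max_words])
        chunks.append({
            "doc_title": doc_title,
            "section": section,
            "position": len(chunks),
            # Prefix the context so the embedded text carries its origin.
            "text": f"{doc_title} > {section}: {body}",
        })
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

Because the title and section are part of the text that gets embedded, a chunk that says only "it expires after 30 days" can still be retrieved for a question about a specific policy.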

Enterprise Search Is Broken; A Universal Layer Is the Only Fix

The fragmentation of knowledge across 12-20 work apps renders individual search bars inefficient. A universal search tool like Dropbox Dash, which ingests and connects content from all sources, is necessary to restore productivity for knowledge workers.

951: Context Engineering, Multiplayer AI and Effective Search, with Dropbox’s Josh Clemm

Super Data Science: ML & AI Podcast with Jon Krohn·2 months ago

AI Companies Should Create Branded 'Composite Models' to Improve Performance and Decouple from Labs

Instead of offering a model selector, creating a proprietary, branded model allows a company to chain different specialized models for various sub-tasks (e.g., search, generation). This not only improves overall performance but also provides business independence from the pricing and launch cycles of a single frontier model lab.

⚡ Inside GitHub’s AI Revolution: Jared Palmer Reveals Agent HQ & The Future of Coding Agents

Latent Space: The AI Engineer Podcast·3 months ago