Lexical Search (Keyword)
How it works: Statistical matching on exact terms and their frequencies (BM25).
Strengths:
- Low-latency on well-indexed corpora (actual latency depends on engine, index, and hardware)
- Excellent for exact matches and rare terms
- No model training required
- Interpretable results
Weaknesses:
- Misses synonyms (“car” ≠ “automobile”)
- Struggles with conceptual queries
- Language-specific (requires stemming/lemmatization)
Best for:
- Legal document search (exact statute numbers)
- Code search (function names, error codes)
- Product SKU lookup
- Any domain with precise terminology
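To make the term-frequency idea concrete, here is a minimal BM25 sketch in plain Python. It assumes a lowercase whitespace tokenizer, the commonly used defaults `k1=1.5` and `b=0.75`, and the Lucene-style IDF variant. The function names and the three sample documents are hypothetical, and a production engine would add stemming, stop-word handling, and an inverted index.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase whitespace tokenizer (assumption); real engines also stem.
    return text.lower().split()

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # exact-match only: unseen terms contribute nothing
            # Lucene-style IDF: rare terms receive a larger weight.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1 and is length-normalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical sample corpus.
docs = [
    "reset your password from the login page",
    "error E1234 raised when the payment gateway times out",
    "the automobile dealership opens at nine",
]
scores = bm25_scores("error E1234", docs)
best = max(range(len(docs)), key=scores.__getitem__)
car_scores = bm25_scores("car", docs)  # no document contains the token "car"
```

In this toy corpus the error-code query matches only the second document, while the query "car" scores zero everywhere, including against the "automobile" document. That is the synonym weakness listed above.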
Semantic Search (Vector)
How it works: Converts text to embedding vectors in a semantic space, where texts with similar meaning map to nearby vectors.
Strengths:
- Handles synonyms and paraphrasing
- Works across languages (multilingual models)
- Captures conceptual similarity
- No query engineering needed
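Weaknesses:
- Typically slower than BM25 (on the order of 100-500ms for large collections, depending on index type and hardware)
- Can miss exact matches, such as IDs or codes, that carry little semantic signal
- Black box (hard to debug why something matched)
- Embedding a large collection is compute-intensive and usually benefits from a GPU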
Best for:
- Customer support (intent-based)
- Research papers (conceptual queries)
- Multilingual search
- FAQ matching
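The "nearby vectors" idea can be sketched with cosine similarity. The three-dimensional vectors below are hand-made stand-ins, not output from any real embedding model (real embeddings typically have hundreds of dimensions), and `TOY_EMBEDDINGS` and `nearest` are hypothetical names used only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hand-made toy vectors (assumption): a real system would obtain these
# from an embedding model instead.
TOY_EMBEDDINGS = {
    "car": [0.90, 0.10, 0.05],
    "automobile": [0.85, 0.15, 0.10],
    "banana": [0.05, 0.10, 0.95],
}

def nearest(query, k=2):
    """Return the k entries most similar to `query`, best first."""
    q = TOY_EMBEDDINGS[query]
    ranked = sorted(
        ((cosine(q, vec), word)
         for word, vec in TOY_EMBEDDINGS.items() if word != query),
        reverse=True,
    )
    return [(word, round(score, 3)) for score, word in ranked[:k]]

neighbors = nearest("car")
```

Here "automobile" ranks first for "car" even though the two strings share no tokens, which is the behavior a keyword index cannot provide. At scale, the exhaustive comparison in `nearest` is usually replaced by an approximate nearest-neighbor index.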
Hybrid Search (Lexical + Semantic)
The Production Standard: Combine lexical (keyword) and semantic (vector) search.
Why hybrid wins:
- Catches exact matches lexical search excels at
- Catches semantic matches vector search excels at
- Often improves retrieval quality across diverse corpora (magnitude varies by dataset and metric)
- Commonly used in production systems
Implementation approaches:
- Weighted fusion: Normalize each method's scores, then combine them with tuned or learned weights
- Rank fusion: Merge ranked lists, for example with Reciprocal Rank Fusion (RRF)
- Two-stage: Lexical first pass → semantic reranking
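The first two fusion approaches can be sketched in a few lines. This is an illustration under assumptions: RRF uses the conventional constant `k=60`, weighted fusion uses min-max normalization with a fixed `alpha` (a real system would tune or learn the weight), and the document IDs and scores are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each list adds 1 / (k + rank) to a document's score."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(fused, key=lambda d: (-fused[d], d))

def weighted_fusion(lexical, semantic, alpha=0.5):
    """Combine min-max normalized scores; alpha is the lexical weight."""
    def normalize(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # avoid division by zero when all scores match
        return {d: (s - lo) / span for d, s in scores.items()}

    lex, sem = normalize(lexical), normalize(semantic)
    combined = {
        d: alpha * lex.get(d, 0.0) + (1 - alpha) * sem.get(d, 0.0)
        for d in set(lex) | set(sem)
    }
    return sorted(combined, key=lambda d: (-combined[d], d))

# Hypothetical results from the two retrievers.
lexical_ranking = ["doc_sku", "doc_a", "doc_b"]
semantic_ranking = ["doc_a", "doc_c", "doc_sku"]
fused = reciprocal_rank_fusion([lexical_ranking, semantic_ranking])
print(fused)  # ['doc_a', 'doc_sku', 'doc_c', 'doc_b']

lexical_scores = {"doc_sku": 12.0, "doc_a": 4.0, "doc_b": 2.0}
semantic_scores = {"doc_a": 0.82, "doc_c": 0.80, "doc_sku": 0.40}
balanced = weighted_fusion(lexical_scores, semantic_scores, alpha=0.5)
lexical_heavy = weighted_fusion(lexical_scores, semantic_scores, alpha=0.8)
```

RRF needs only ranks, so it sidesteps the problem that BM25 scores and cosine similarities live on different scales. Weighted fusion has to normalize first, but it lets the weight shift the outcome: with `alpha=0.8` the exact-match document `doc_sku` moves to the top.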
When to Use Each Strategy
| Query Type | Best Strategy | Example |
|---|---|---|
| Exact terminology | Lexical | “ICD-10 code M54.5” (medical) |
| Product codes/IDs | Lexical | “SKU-2847-B” |
| Conceptual question | Semantic | “How do I improve sleep?” |
| Paraphrased intent | Semantic | “Can’t sign in” → password reset |
| Mixed (most production) | Hybrid | “Latest Python security updates” |
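As a rough illustration of the table, the sketch below routes a query to a strategy using two regular expressions. Both patterns and the `choose_strategy` function are hypothetical heuristics written for these example queries, not a recommended classifier. Many systems skip routing entirely and run hybrid search for every query.

```python
import re

# Hypothetical pattern for identifier-like tokens such as "SKU-2847-B",
# "ICD-10", or "M54.5".
ID_PATTERN = re.compile(r"\b(?:[A-Z]{2,}-\d[\w-]*|[A-Z]\d{2}\.\d+)\b")
# Hypothetical pattern for queries phrased as a question or a complaint.
QUESTION_PATTERN = re.compile(
    r"^(how|why|what|when|where|can|should)\b", re.IGNORECASE
)

def choose_strategy(query):
    """Pick a retrieval strategy from surface features of the query."""
    if ID_PATTERN.search(query):
        return "lexical"   # exact codes and IDs need exact matching
    if QUESTION_PATTERN.search(query):
        return "semantic"  # natural-language intent
    return "hybrid"        # default for mixed or ambiguous queries

examples = [
    "ICD-10 code M54.5",
    "SKU-2847-B",
    "How do I improve sleep?",
    "Can’t sign in",
    "Latest Python security updates",
]
routes = [choose_strategy(q) for q in examples]
```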
In Production
Cost impact (rough per-query estimates; actual costs depend on provider pricing and infrastructure):
- Lexical: ~$0.0001 per query (compute only, no API calls)
- Semantic: ~$0.001-0.01 per query (embedding API + vector DB)
- Hybrid: ~$0.002-0.015 per query (both methods)
Latency:
- Lexical: typically lower latency at moderate scales
- Semantic: generally higher latency than lexical; depends on index type and hardware
- Hybrid: adds overhead; parallel execution helps
Accuracy (illustrative figures; results vary by dataset and relevance metric):
- Lexical alone: 70-75% relevant results
- Semantic alone: 72-78% relevant results
- Hybrid: 85-92% relevant results
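To see what the per-query cost figures in this section mean at volume, the sketch below multiplies them out to a monthly range. The traffic level of 100,000 queries per day is an assumed example, and the per-query numbers are the rough estimates from this section, not measured prices.

```python
# (low, high) cost per query in USD, copied from the estimates in this section.
COST_PER_QUERY = {
    "lexical": (0.0001, 0.0001),
    "semantic": (0.001, 0.01),
    "hybrid": (0.002, 0.015),
}

def monthly_cost(strategy, queries_per_day, days=30):
    """Return the (low, high) monthly cost range in USD for a strategy."""
    low, high = COST_PER_QUERY[strategy]
    total_queries = queries_per_day * days
    return low * total_queries, high * total_queries

# Assumed traffic: 100,000 queries per day.
for name in COST_PER_QUERY:
    low, high = monthly_cost(name, 100_000)
    print(f"{name:8s} ${low:,.0f} to ${high:,.0f} per month")
```

At that assumed volume, the estimates work out to about $300 per month for lexical, $3,000 to $30,000 for semantic, and $6,000 to $45,000 for hybrid, so the quality gain from hybrid has to be weighed against a cost difference of one to two orders of magnitude.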
Practical Exercise
Implement and compare all three approaches on the same dataset:
Pseudocode