In depth
A vector database stores millions to billions of embeddings and answers similarity queries efficiently. The key challenge: a brute-force search compares the query against every stored vector, so computing cosine similarity over 100M vectors is too slow for interactive queries. Vector DBs instead use approximate nearest neighbor (ANN) methods (HNSW, IVF, ScaNN) that give up exact results in exchange for 100-1000x faster lookups.
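To make the cost concrete, here is a minimal sketch of the brute-force baseline that ANN indexes are designed to avoid. It uses only the standard library; the function names and the three toy vectors are illustrative assumptions, not part of any particular vector DB.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_top_k(query, vectors, k):
    """Score every stored vector against the query: O(N * d) work per query."""
    scored = ((cosine_similarity(query, vec), doc_id) for doc_id, vec in vectors.items())
    return heapq.nlargest(k, scored)


# Hypothetical toy data: three 2-dimensional "embeddings".
vectors = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
}
results = brute_force_top_k([1.0, 0.1], vectors, k=2)
print([doc_id for _, doc_id in results])  # ['a', 'c']
```

Every query touches all N vectors, so the work grows linearly with collection size. ANN indexes such as HNSW or IVF avoid this by visiting only a small fraction of the collection per query.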
Category leaders in 2026: **Pinecone** (managed SaaS, most mature), **Weaviate** (open-source + cloud), **Qdrant** (open-source Rust), **Milvus** (open-source, enterprise-focused), **pgvector** (Postgres extension — good enough for most apps), **Redis Stack** (vector + KV + full-text in one engine).
Vector DBs also support **hybrid search**: combining vector similarity with keyword scoring (BM25) and metadata filters. Hybrid usually outperforms pure vector search for real-world RAG because users often mix semantic intent with specific constraints ('find pages about X from 2024').
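One way to picture hybrid search is a metadata filter followed by a weighted blend of two scores. The sketch below is an assumption-laden illustration: `keyword_score` is a crude term-overlap stand-in for BM25, and `alpha`, the `year` field, and the sample documents are all hypothetical. Production engines typically use real BM25 and often a rank-fusion step instead of a linear blend.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def keyword_score(query, text):
    """Fraction of query terms present in the text (simplified stand-in for BM25)."""
    terms = query.lower().split()
    words = set(text.lower().split())
    return sum(t in words for t in terms) / len(terms) if terms else 0.0


def hybrid_search(query_text, query_vec, docs, year=None, alpha=0.5, k=3):
    """Filter on metadata first, then rank by a weighted blend of both scores."""
    results = []
    for doc in docs:
        if year is not None and doc["year"] != year:
            continue  # metadata filter: drop documents from other years
        semantic = cosine(query_vec, doc["vector"])
        lexical = keyword_score(query_text, doc["text"])
        results.append((alpha * semantic + (1 - alpha) * lexical, doc["id"]))
    results.sort(reverse=True)
    return results[:k]


# Hypothetical documents with precomputed 2-dimensional embeddings.
docs = [
    {"id": "p1", "text": "vector index tuning guide", "year": 2024, "vector": [0.9, 0.1]},
    {"id": "p2", "text": "cooking with cast iron", "year": 2024, "vector": [0.1, 0.9]},
    {"id": "p3", "text": "vector index internals", "year": 2022, "vector": [1.0, 0.0]},
]
hits = hybrid_search("vector index", [1.0, 0.0], docs, year=2024)
print([doc_id for _, doc_id in hits])  # ['p1', 'p2']
```

Here `p3` is the closest vector match but is excluded by the year filter, which is the kind of constraint pure vector search cannot express on its own.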
In MCP (Model Context Protocol), vector DBs are typically exposed as a pair of tools: `upsert_documents` (embed documents and store them) and `semantic_search` (embed a query and return the top-k matches). The agent calls these as needed during RAG workflows.