MCP Glossary

Vector Database

TL;DR

A vector database is a specialized store optimized for searching high-dimensional vectors using approximate nearest neighbor (ANN) algorithms. It's the storage layer for embeddings, powering semantic search, RAG, recommendation systems, and similarity matching in AI applications.

In depth

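A vector database stores millions to billions of embeddings and answers similarity queries efficiently. The key challenge: brute-force search compares the query against every stored vector, so cosine similarity over 100M vectors is too slow for interactive queries. Vector DBs use approximate nearest neighbor algorithms (HNSW, IVF, ScaNN) that search only a small part of the index, giving up some recall (a true nearest neighbor is occasionally missed) in exchange for speedups on the order of 100-1000x.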
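To make the trade-off concrete, here is a minimal sketch in plain Python that compares exact brute-force search with a toy IVF-style index. It is an illustration under stated assumptions, not how any particular vector DB is implemented: the centroids are hand-picked rather than learned with k-means, the vectors are 3-dimensional, and the names `build_ivf`, `ivf_top_k`, and `nprobe` are chosen for this example.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention here: zero vectors have no direction
    return dot / (norm_a * norm_b)


def brute_force_top_k(query, vectors, k):
    """Exact search: score every stored vector, O(n * d) work per query."""
    scored = ((cosine_similarity(query, vec), doc_id) for doc_id, vec in vectors.items())
    return heapq.nlargest(k, scored)


def build_ivf(vectors, centroids):
    """Assign each vector to the bucket of its most similar centroid."""
    buckets = {i: [] for i in range(len(centroids))}
    for doc_id, vec in vectors.items():
        nearest = max(range(len(centroids)),
                      key=lambda i: cosine_similarity(vec, centroids[i]))
        buckets[nearest].append(doc_id)
    return buckets


def ivf_top_k(query, vectors, centroids, buckets, k, nprobe=1):
    """Approximate search: score only vectors in the nprobe closest buckets."""
    ranked = sorted(range(len(centroids)),
                    key=lambda i: cosine_similarity(query, centroids[i]),
                    reverse=True)
    candidates = {doc_id: vectors[doc_id]
                  for i in ranked[:nprobe] for doc_id in buckets[i]}
    return brute_force_top_k(query, candidates, k)


# Toy data; real embeddings have hundreds or thousands of dimensions.
vectors = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
    "doc_d": [0.0, 0.0, 1.0],
}
centroids = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # assumed, not trained
buckets = build_ivf(vectors, centroids)
query = [1.0, 0.05, 0.0]

exact = brute_force_top_k(query, vectors, k=3)
approx = ivf_top_k(query, vectors, centroids, buckets, k=3, nprobe=1)
wider = ivf_top_k(query, vectors, centroids, buckets, k=3, nprobe=2)

print("exact :", [doc_id for _, doc_id in exact])
print("approx:", [doc_id for _, doc_id in approx])
print("wider :", [doc_id for _, doc_id in wider])
```

With `nprobe=1` the approximate search scores only the two vectors in the closest bucket and misses the third-best match; probing a second bucket recovers it at the cost of more comparisons. Production indexes expose a similar knob for trading recall against latency.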

Category leaders in 2026: **Pinecone** (managed SaaS, most mature), **Weaviate** (open-source + cloud), **Qdrant** (open-source Rust), **Milvus** (open-source, enterprise-focused), **pgvector** (Postgres extension — good enough for most apps), **Redis Stack** (vector + KV + full-text in one engine).

Vector DBs also support **hybrid search** — combining vector similarity with keyword (BM25) or metadata filters. Hybrid usually outperforms pure vector for real-world RAG because users often mix semantic intent with specific filters ('find pages about X from 2024').
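One common way to combine the two signals is reciprocal rank fusion (RRF) applied after a metadata filter. The sketch below assumes the vector and keyword rankings have already been computed elsewhere (for example by an ANN index and a BM25 scorer); the document ids, the `year` field, and the `hybrid_search` helper are hypothetical, and real engines may fuse scores differently.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of ids: each list contributes 1 / (k + rank) per id."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def hybrid_search(vector_ranking, keyword_ranking, metadata, year):
    """Drop documents that fail the metadata filter, then fuse both rankings."""
    allowed = {doc_id for doc_id, meta in metadata.items() if meta["year"] == year}
    filtered = [[doc_id for doc_id in ranking if doc_id in allowed]
                for ranking in (vector_ranking, keyword_ranking)]
    return reciprocal_rank_fusion(filtered)


# Hypothetical inputs for "find pages about X from 2024".
metadata = {
    "p1": {"year": 2024},
    "p2": {"year": 2023},
    "p3": {"year": 2024},
    "p4": {"year": 2024},
}
vector_ranking = ["p2", "p1", "p3", "p4"]   # best semantic match first
keyword_ranking = ["p3", "p2", "p4", "p1"]  # best keyword match first

hybrid_results = hybrid_search(vector_ranking, keyword_ranking, metadata, year=2024)
print(hybrid_results)
```

Here `p2` is the top semantic match but is excluded by the year filter, and `p3` ends up first because it ranks well in both lists even though it tops only one of them.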

In MCP, vector DBs are typically exposed as a pair of tools: `upsert_documents` (add + embed) and `semantic_search` (query + return top-k). The agent calls these as needed during RAG workflows.
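As a rough sketch of what that pair of tools does, the plain-Python functions below mimic `upsert_documents` and `semantic_search` over an in-memory store. Only the two tool names come from the text above; the argument shapes, the return values, and the word-count `embed` function (a stand-in for a real embedding model) are assumptions, and the code that registers these functions with an MCP server is left out.

```python
import math

# Hypothetical toy vocabulary; a real setup would call an embedding model.
VOCAB = ["vector", "database", "postgres", "cache", "search"]

_store = {}  # doc_id -> (text, embedding); stands in for the vector DB


def embed(text):
    """Placeholder embedding: count occurrences of each vocabulary word."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def upsert_documents(documents):
    """Tool 1 (add + embed): store each document; reusing an id overwrites it."""
    for doc in documents:
        _store[doc["id"]] = (doc["text"], embed(doc["text"]))
    return {"upserted": len(documents)}


def semantic_search(query, top_k=3):
    """Tool 2 (query + return top-k): rank stored documents by cosine similarity."""
    query_vec = embed(query)
    scored = [(_cosine(query_vec, emb), doc_id, text)
              for doc_id, (text, emb) in _store.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [{"id": doc_id, "text": text, "score": round(score, 3)}
            for score, doc_id, text in scored[:top_k]]


upsert_result = upsert_documents([
    {"id": "a", "text": "postgres database with vector search"},
    {"id": "b", "text": "cache layer for a database"},
    {"id": "c", "text": "vector search over embeddings"},
])
hits = semantic_search("vector search", top_k=2)
for hit in hits:
    print(hit["id"], hit["score"], hit["text"])
```

In a RAG workflow the agent would call the first function when ingesting content and the second when it needs context for an answer, then pass the returned text to the model.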

Examples

1. Pinecone: managed vector DB with autoscaling
2. Weaviate: open-source, supports hybrid search out of the box
3. Qdrant: Rust-based, fast and resource-efficient
4. pgvector on Supabase: Postgres + vectors, simple and cheap
5. Redis Stack: vectors + caching + full-text in one

What it's NOT

✗ A dedicated vector DB is NOT required for RAG: pgvector on an existing Postgres handles millions of docs fine.
✗ Vector DBs are NOT just for text: image, audio, code, and multimodal embeddings all work.
✗ Vector DBs are NOT inherently better than full-text search: hybrid often wins.
✗ Vector DBs are NOT automatic: you still need to choose a chunking strategy, an embedding model, and a distance metric.
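The choice of metric matters because different metrics can rank the same vectors differently. The small sketch below uses made-up 2-dimensional vectors and hypothetical names to show cosine similarity, dot product, and Euclidean distance each picking a different best match for the same query.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 1.0]
candidates = {
    "short_aligned": [0.5, 0.5],    # same direction as the query, small magnitude
    "long_off_axis": [3.0, 1.0],    # large magnitude, different direction
    "near_but_tilted": [1.0, 1.2],  # close in space, slightly different direction
}

best = {
    # Similarities: higher is better.
    "cosine": max(candidates, key=lambda name: cosine(query, candidates[name])),
    "dot": max(candidates, key=lambda name: dot(query, candidates[name])),
    # Distance: lower is better.
    "euclidean": min(candidates, key=lambda name: euclidean(query, candidates[name])),
}

for metric, name in best.items():
    print(f"{metric}: {name}")
```

Cosine ignores magnitude, dot product rewards it, and Euclidean distance rewards closeness in space. When vectors are normalized to unit length the three produce the same ordering, which is why it is worth checking which metric an embedding model was trained for.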

Frequently asked questions

Which vector DB should I choose?

Start with pgvector on your existing Postgres. Move to Pinecone or Weaviate when you hit scale limits (~100M vectors).

How much does Pinecone cost?

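Pricing changes often, so check the current plans. At the time of writing there is a free tier for hobby projects, and production plans start around $70/mo. Cost scales with stored vector count and query volume.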

Can I do hybrid search?

Yes — Weaviate, Qdrant, and Elasticsearch all support hybrid (BM25 + vector). Often better than pure vector for real apps.
