In depth
An embedding model converts a variable-length input (a sentence, a document, a code snippet) into a fixed-size numerical vector. The magic: semantically similar inputs produce vectors that are close under cosine similarity, even when they share no keywords. 'How do I fix a flat tire?' and 'tire repair tutorial' land near each other in embedding space.
Embeddings are the mechanism behind semantic search. To find the most relevant documents for a query, you embed each document once ahead of time, embed the query at search time, and rank by cosine similarity — the closest vectors are the most semantically relevant hits. This is the retrieval step behind features like ChatGPT's connected knowledge bases.
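The retrieval step above can be sketched in a few lines. This is a toy illustration: the 4-dimensional vectors are made up to stand in for real embedding-model output (real models produce hundreds or thousands of dimensions), and the document titles are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embedding-model output.
docs = {
    "tire repair tutorial":      [0.9, 0.1, 0.0, 0.1],
    "chocolate cake recipe":     [0.0, 0.2, 0.9, 0.1],
    "bicycle maintenance guide": [0.7, 0.3, 0.1, 0.2],
}
query = [0.8, 0.2, 0.1, 0.0]  # embedding of "How do I fix a flat tire?"

# Rank documents by similarity to the query — the closest vector wins.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]),
                reverse=True)
print(ranked[0])  # → tire repair tutorial
```

Note that the top hit shares no keywords with the query — the similarity lives entirely in the vectors.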
Popular embedding models: OpenAI `text-embedding-3-large` (3072 dimensions), Voyage AI's `voyage-3` (the embedding provider Anthropic recommends), Cohere `embed-v3`, and open-weights options like `bge-large` or `e5-mistral`. Dimension count is a trade-off: larger vectors are more expressive, but cost more to store and are slower to search.
The MCP ecosystem includes embedding-focused servers: Pinecone, Weaviate, pgvector on Postgres, Voyage. Combine one with a knowledge-base MCP server (Notion, Confluence) to build an end-to-end RAG pipeline.