MCP Glossary

Large Language Model (LLM)

TL;DR

A Large Language Model (LLM) is a neural network trained on vast text corpora to predict the next token given a context. Modern LLMs like GPT-4, Claude 4, Gemini 2.5, and Llama 3 have billions to trillions of parameters and can reason, code, translate, summarize, and use tools — the foundation of the current AI wave.

In depth

An LLM is a statistical model that learns patterns in language by predicting the next token over massive training datasets (typically trillions of tokens scraped from the web, books, code, and more). Once trained, the model can generate text, answer questions, write code, and — with the right prompting — reason through complex problems.
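The training objective above can be illustrated with a toy sketch. This is a bigram frequency table, not a neural network — real LLMs learn the same "predict the next token" signal with a transformer over trillions of tokens, but the shape of the task is the same:

```python
from collections import Counter, defaultdict

# Toy next-token predictor: count which word follows which in a tiny corpus,
# then predict the most frequent continuation. Purely illustrative.
corpus = "the cat sat on the mat the cat ate the fish".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(token: str) -> str:
    """Return the most frequent continuation seen after `token`."""
    return counts[token].most_common(1)[0][0]

print(predict_next("the"))  # "cat" — seen twice after "the" in this corpus
```

An LLM replaces the count table with billions of learned parameters and predicts a probability distribution over its whole vocabulary, but it is still, at bottom, scoring continuations.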

Modern LLMs fall into two categories: **base models** (raw next-token predictors, e.g. Llama 3 base) and **instruction-tuned chat models** (post-trained with supervised fine-tuning + RLHF to follow instructions and be helpful, e.g. Claude, GPT-4). Most user-facing apps use chat models.
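The practical difference shows up in how you prompt each kind. A minimal sketch — the `role: content` layout below is generic and illustrative, not any model family's real chat template (each provider serializes turns with its own special tokens):

```python
# Base models just continue raw text; chat models expect structured turns
# that the host serializes into the model's chat template.

def base_prompt(text: str) -> str:
    # With a base model, you engineer the continuation yourself.
    return text

def chat_prompt(messages: list[dict]) -> str:
    # With a chat model, the host formats role-tagged messages and
    # cues the model to produce the assistant's next turn.
    lines = [f"{m['role']}: {m['content']}" for m in messages]
    return "\n".join(lines) + "\nassistant:"

print(base_prompt("The capital of France is"))
print(chat_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]))
```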

Leading models in 2026: Anthropic Claude (Opus, Sonnet, Haiku), OpenAI GPT (4o, 4.1, o1/o3 reasoning models), Google Gemini (2.5 Pro, Flash), Meta Llama (3.3, 4), Mistral, DeepSeek. Each has distinct strengths — Claude excels at writing and tool use; GPT at coding; Gemini at long context; Llama at on-device deployment.

MCP is model-agnostic. Any LLM that supports tool use (function calling) can consume MCP-provided tools via a compatible host. The server doesn't know or care which model is on the other end.
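Why model-agnosticism works can be seen in the host-side dispatch loop. The sketch below is schematic, not real MCP plumbing — the names (`TOOLS`, `dispatch`, the tool-call JSON shape) are illustrative assumptions — but any LLM that can emit a structured tool call fits this pattern:

```python
import json

# Tools the server exposes; the host routes model-emitted calls to them.
# `get_weather` is a made-up example tool, not a real MCP server.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
}

def dispatch(tool_call_json: str) -> str:
    """Parse a model-emitted tool call and invoke the matching tool."""
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Whatever model sits on the other end, it only needs to emit something like:
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
print(dispatch(model_output))  # Sunny in Paris
```

The server sees only tool names and arguments, never which model produced them — which is exactly why swapping Claude for GPT or a local Llama requires no server changes.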

Examples

  1. Anthropic Claude Opus 4 — frontier reasoning + tool use
  2. OpenAI GPT-4o — multimodal (text + image + audio)
  3. Google Gemini 2.5 Pro — 2M-token context, multimodal
  4. Meta Llama 3.3 70B — open-weights, runs on-device via Ollama
  5. Mistral Large — open-weights European frontier model

What it's NOT

  • ✗ LLMs are NOT databases — they generate, they don't retrieve exact facts.
  • ✗ LLMs are NOT reasoning engines in the classical sense — they pattern-match at scale.
  • ✗ LLMs are NOT deterministic by default — the same prompt often gives different outputs.
  • ✗ LLMs are NOT all the same — capabilities, costs, and speeds vary by 10x+ across providers.

Related terms

AI Agent · Context Window · Function Calling · Embedding · Prompt Engineering

See also

  • Anthropic Claude
  • OpenAI Models

Frequently asked questions

Which LLM is best for MCP tool use?

Claude models are widely regarded as the most reliable at tool use. GPT-4o and Gemini 2.5 Pro are also strong.

Can I run an LLM locally?

Yes — Llama 3, Mistral, and Qwen run on consumer hardware via Ollama or LM Studio. Quality is behind frontier models but improving.

How do LLMs hallucinate?

They generate plausible-sounding but factually wrong output. Retrieval-augmented generation (RAG), tool use, and lower temperature can all reduce hallucination.
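Why lower temperature helps can be shown directly: logits are divided by the temperature before softmax, so low temperature concentrates probability on the top token and makes sampling more predictable. A self-contained sketch (the logit values are made up for illustration):

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Scale logits by 1/temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                            # subtract max to avoid overflow
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
print(softmax_with_temperature(logits, 1.0))   # probability spread across tokens
print(softmax_with_temperature(logits, 0.1))   # nearly all mass on the top token
```

Note that temperature only narrows the output distribution; it does not make the top token any more factual — that is why RAG and tool use, which ground the model in retrieved data, matter more for accuracy.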

Build with MCP

Browse 300+ MCP servers, explore recipes, or continue learning the MCP vocabulary.

Browse Marketplace · All terms