MCP Glossary

Prompt Engineering

TL;DR

Prompt engineering is the practice of designing inputs (prompts) to LLMs to reliably produce desired outputs. It spans system prompts, few-shot examples, structured formats, tool descriptions, and chain-of-thought patterns. Good prompts make unreliable models reliable.

In depth

Prompt engineering is the art and science of getting consistent, high-quality output from an LLM. The same task can yield wildly different results depending on how it's phrased. Practitioners iterate on word choice, structure, examples, and format to minimize error rates.

Core techniques include: **system prompts** (stable instructions set once per session), **few-shot examples** (show the model 3-5 examples of desired output), **chain-of-thought** (ask the model to think step-by-step before answering), **XML / structured formats** (Claude responds well to `<thinking>` and `<answer>` tags), and **role-playing** ('You are a senior SRE...').

Tool descriptions inside MCP are a form of prompt engineering. A well-written tool description — clear purpose, when to use, argument semantics — dramatically increases the LLM's accuracy at picking the right tool. Bad descriptions cause tools to go unused or be called inappropriately.

Prompt engineering is not static — best practices evolve with model versions. Prompts optimized for GPT-3.5 often underperform on GPT-4o; Claude-specific patterns (XML tags) differ from GPT-specific patterns (markdown headers).

Examples

1
A system prompt: 'You are a senior SRE. Respond concisely.'
2
Few-shot examples showing the exact JSON format to return
3
Chain-of-thought: 'Think step by step before answering.'
4
Claude XML pattern: 'Wrap your analysis in <thinking>...</thinking>'
5
MCP tool descriptions that explain WHEN to call each tool

What it's NOT

✗Prompt engineering is NOT magic — it's systematic iteration based on evaluation.
✗Prompts are NOT universal — what works for GPT often fails for Claude, and vice versa.
✗Longer prompts are NOT always better — concise, well-structured prompts often outperform verbose ones.
✗Prompt engineering will NOT make a weak model strong — frontier models benefit most.

Frequently asked questions

Is prompt engineering still relevant?

Yes — frontier models are more forgiving, but prompts still matter enormously for agent reliability and cost.

How do I test a prompt?

Build an eval set of 20-50 input/output pairs. Measure accuracy, cost, and latency. Iterate until metrics are acceptable.

Does MCP require prompt engineering?

The tool descriptions inside MCP servers are a form of prompt engineering — they determine how well LLMs use your tools.

Build with MCP

Browse 300+ MCP servers, explore recipes, or continue learning the MCP vocabulary.

Browse Marketplace All terms

In depth

Examples

A system prompt: 'You are a senior SRE. Respond concisely.'

Few-shot examples showing the exact JSON format to return

Chain-of-thought: 'Think step by step before answering.'

Claude XML pattern: 'Wrap your analysis in <thinking>...</thinking>'

MCP tool descriptions that explain WHEN to call each tool

Frequently asked questions

Is prompt engineering still relevant?

Yes — frontier models are more forgiving, but prompts still matter enormously for agent reliability and cost.

How do I test a prompt?

Build an eval set of 20-50 input/output pairs. Measure accuracy, cost, and latency. Iterate until metrics are acceptable.

Does MCP require prompt engineering?

The tool descriptions inside MCP servers are a form of prompt engineering — they determine how well LLMs use your tools.

Prompt Engineering

TL;DR

In depth

Examples

What it's NOT

Related terms

See also

Frequently asked questions

Build with MCP

Prompt Engineering

TL;DR

In depth

Examples

What it's NOT

Related terms

See also

Frequently asked questions

Build with MCP