Prompt Engineering Glossary
Every term you need, with working example prompts and practical notes on when each technique earns its keep.
10 terms in Development
Context Window
The maximum number of tokens (input + output combined) a model can process in a single request. Content past this limit is truncated or requires chunking.
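A minimal pure-Python sketch of chunking text to fit a context window. The 4-characters-per-token ratio is a rough heuristic for English; for exact counts you would use the provider's tokenizer.

```python
def chunk_text(text: str, max_tokens: int, chars_per_token: int = 4) -> list[str]:
    """Split text into pieces that each fit (approximately) within max_tokens.

    chars_per_token=4 is a rough English-text heuristic, not an exact count.
    """
    max_chars = max_tokens * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

doc = "word " * 4000                     # 20,000 characters, ~5,000 tokens
chunks = chunk_text(doc, max_tokens=1000)
print(len(chunks))                       # 5 chunks of ~4,000 characters
```

Naive character slicing can split mid-sentence; production chunkers break on paragraph or sentence boundaries and often overlap chunks to preserve context.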
Embedding
A fixed-length vector representation of a piece of text (or image, audio, etc.) produced by an embedding model -- where semantic similarity maps to geometric proximity.
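The "semantic similarity maps to geometric proximity" claim is usually measured with cosine similarity. A toy sketch with hand-made 3-dimensional vectors (real embedding models emit hundreds to thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors, invented for illustration -- a real model would produce these.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # True
```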
Fine-Tuning
Additional training of a pretrained LLM on task-specific or domain-specific data -- updating the model's weights to specialize it, rather than prompting a generic model.
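Fine-tuning data is typically prepared as JSONL, one prompt/completion pair per line. The messages-array layout below follows OpenAI's chat fine-tuning format; other providers use similar but not identical schemas, and the ticket-classifier content is invented for illustration:

```python
import json

# Each training example pairs an input with the exact output you want
# the specialized model to produce.
examples = [
    {"messages": [
        {"role": "system", "content": "You classify support tickets."},
        {"role": "user", "content": "My invoice is wrong."},
        {"role": "assistant", "content": "billing"},
    ]},
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")   # one JSON object per line
```

Hundreds to thousands of such examples are usually needed before fine-tuning beats a well-written prompt.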
Function Calling
A provider-specific API feature where the model returns a structured tool-call request (function name + JSON arguments) that your runtime executes and feeds back.
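The runtime side of that loop can be sketched in a few lines. The `model_reply` dict below is a stand-in for a real API response, and its exact shape varies by provider; only the dispatch pattern is the point:

```python
import json

def get_weather(city: str) -> str:
    """Stub tool; a real implementation would call a weather API."""
    return f"18C and cloudy in {city}"

TOOLS = {"get_weather": get_weather}     # registry of callable tools

# Pretend the model replied with this structured tool-call request.
model_reply = {"name": "get_weather", "arguments": '{"city": "Oslo"}'}

fn = TOOLS[model_reply["name"]]          # look up the requested function
args = json.loads(model_reply["arguments"])
result = fn(**args)                      # execute it in your runtime
print(result)                            # fed back to the model as a tool result
```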
LoRA (Low-Rank Adaptation)
A parameter-efficient fine-tuning technique that freezes the base model and trains small low-rank adapter matrices alongside it -- cutting GPU memory and training cost by 10-100x.
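The core idea fits in toy arithmetic: the frozen weight matrix W is augmented by a low-rank product B @ A, and only A and B are trained. A pure-Python sketch on tiny matrices, just to show the shapes and the parameter savings:

```python
def matmul(m, n):
    """Naive matrix multiply on nested lists (fine for toy sizes)."""
    return [[sum(m[i][k] * n[k][j] for k in range(len(n)))
             for j in range(len(n[0]))] for i in range(len(m))]

d, r = 4, 1                              # hidden size 4, adapter rank 1
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen
B = [[0.5], [0.0], [0.0], [0.0]]         # d x r, trained
A = [[0.0, 1.0, 0.0, 0.0]]               # r x d, trained

delta = matmul(B, A)                     # rank-r update, d x d
W_adapted = [[W[i][j] + delta[i][j] for j in range(d)] for i in range(d)]
print(W_adapted[0])                      # [1.0, 0.5, 0.0, 0.0]
```

Only 2 * d * r = 8 values were trained here versus d * d = 16 for full fine-tuning; at real model sizes (d in the thousands, r of 8 to 64) that gap is where the 10-100x savings comes from.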
Model Context Protocol (MCP)
An open protocol by Anthropic (2024+) for exposing tools, resources, and prompts from external servers to any LLM client, standardizing the integration layer.
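On the wire, MCP is JSON-RPC 2.0. A client discovering a server's tools sends a request like the one below (`tools/list` is a method from the MCP spec; the `id` value is arbitrary):

```python
import json

# A JSON-RPC 2.0 request asking an MCP server which tools it exposes.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}
wire_message = json.dumps(request)
print(wire_message)
```

The server replies with a list of tool schemas, and the client invokes one via a `tools/call` request carrying the tool name and arguments.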
Temperature
A sampling parameter (typically 0.0-2.0) that controls how deterministic vs. creative an LLM's output is. Lower = more predictable, higher = more varied.
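Mechanically, temperature divides the model's logits before the softmax: low values sharpen the distribution toward the top token, high values flatten it. A self-contained sketch with made-up logits:

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Scale logits by 1/T, then softmax. Low T sharpens, high T flattens."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                     # invented logits for three tokens
cold = softmax_with_temperature(logits, 0.2) # top token dominates
hot = softmax_with_temperature(logits, 2.0)  # probability spreads out
print(round(cold[0], 3), round(hot[0], 3))
```

At temperature 0 most APIs skip sampling entirely and take the argmax (greedy decoding).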
Token
The atomic unit of input and output in an LLM. Not a word or a character -- a chunk produced by the model's tokenizer, roughly 3-4 characters of English.
Top-P (Nucleus Sampling)
An alternative to temperature: the model samples only from the smallest set of tokens whose cumulative probability exceeds P, ignoring everything below that threshold.
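The filter itself is simple: sort tokens by probability, keep the top ones until their cumulative mass reaches P, renormalize, and sample only from what survives. A sketch with an invented four-token distribution:

```python
def nucleus_filter(probs: dict[str, float], p: float) -> dict[str, float]:
    """Keep the smallest set of tokens whose cumulative probability
    reaches p, then renormalize. Everything else is never sampled."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}

probs = {"the": 0.5, "a": 0.3, "banana": 0.15, "zebra": 0.05}
print(nucleus_filter(probs, p=0.9))      # "zebra" is cut from the nucleus
```

Unlike a fixed top-K cutoff, the nucleus adapts: a confident distribution keeps few tokens, a flat one keeps many.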
Vector Database
A storage system optimized for similarity search over high-dimensional embeddings -- returning the K nearest neighbors of a query vector in sublinear time.
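Conceptually the query is just K-nearest-neighbors over stored embeddings. The brute-force sketch below is O(n) per query; real vector databases earn the "sublinear" claim with approximate indexes (HNSW graphs, IVF partitions, and the like). The document names and 2-D vectors are invented:

```python
import math

def k_nearest(query: list[float], store: dict[str, list[float]], k: int) -> list[str]:
    """Return the ids of the k stored vectors closest to the query."""
    return sorted(store, key=lambda key: math.dist(query, store[key]))[:k]

store = {
    "doc_cats": [0.9, 0.1],
    "doc_dogs": [0.8, 0.3],
    "doc_tax":  [0.1, 0.9],
}
print(k_nearest([0.88, 0.15], store, k=2))   # ['doc_cats', 'doc_dogs']
```

A typical retrieval-augmented pipeline embeds the user's query with the same model used at indexing time, fetches the top-K neighbors, and stuffs them into the prompt as context.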
