In-context learning is what makes few-shot prompting work. The model isn't actually "learning" in the training sense -- it's pattern-matching from the examples in its context to produce similar-shaped output.
ICL underpins most LLM prompt engineering. It works well for classification, extraction, formatting, and translation, and less well for tasks the model has no pretrained prior for -- and when ICL fails, the fix is usually fine-tuning, retrieval augmentation, or a different model, not more examples.
Example Prompt
You're defining a made-up labeling task via in-context examples:
Input: "The soup was cold." Label: negative
Input: "Arrived on time." Label: positive
Input: "No complaints, shipped." Label: positive
Input: "Never buying again." Label: negative
Input: "Packaging was torn." Label:

When to use it
- Task shape is consistent and you have 2-5 good example pairs
- The pattern is learnable from examples (classification, extraction)
- Fine-tuning isn't justified by volume or cost
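A few-shot prompt like the one above is just string assembly: example pairs rendered in a fixed shape, with the final label slot left blank for the model to complete. A minimal sketch in Python (the example pairs are from the prompt above; the function name is illustrative):

```python
def build_few_shot_prompt(examples, query):
    """Assemble a few-shot classification prompt from (input, label) pairs."""
    lines = [f'Input: "{text}" Label: {label}' for text, label in examples]
    # Leave the final label blank so the model completes the pattern
    lines.append(f'Input: "{query}" Label:')
    return "\n".join(lines)

examples = [
    ("The soup was cold.", "negative"),
    ("Arrived on time.", "positive"),
    ("No complaints, shipped.", "positive"),
    ("Never buying again.", "negative"),
]

prompt = build_few_shot_prompt(examples, "Packaging was torn.")
```

Send the resulting string as a completion prompt or user message; because every example ends with a bare label, a well-behaved model tends to emit just the label for the final line.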
When NOT to use it
- The task requires knowledge the model lacks (use RAG or fine-tuning)
- Examples would take more context than you have budget for
- Output is open-ended; examples over-constrain creativity
