Techniques

Self-Consistency

Generating N independent answers to the same prompt (with temperature > 0) and picking the majority answer -- trading compute for reliability.

First published April 14, 2026

Self-consistency (Wang et al., 2022) is one of the cheapest robustness techniques to implement. Sample the model 5-10 times and take the mode of the final answers. Errors from individual generations wash out; correct answers cluster.

Works best when there's a discrete final answer (a number, a label, a yes/no). Prose outputs don't "vote" cleanly. In 2026 practice: cheapest reliability upgrade for classification, math, and extraction tasks. Modern reasoning models blunt the gains somewhat because they already reason multiple ways internally.

Example Prompt

Call the API 5 times at temperature=0.7 with this same prompt:
"Classify the sentiment (positive/neutral/negative) of this review: {review}"

Then return the label that appears most often across the 5 calls. If tied, default to neutral.
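The loop above is a few lines of Python. A minimal sketch, assuming a `call_model(prompt, temperature)` function that wraps your provider's API and returns a bare label string (that helper is hypothetical, not a real library call):

```python
from collections import Counter

def classify_with_self_consistency(review, call_model, n=5):
    """Sample the model n times and return the majority sentiment label."""
    prompt = (
        "Classify the sentiment (positive/neutral/negative) "
        f"of this review: {review}"
    )
    # Independent samples at temperature > 0, per the recipe above.
    labels = [call_model(prompt, temperature=0.7) for _ in range(n)]
    ranked = Counter(labels).most_common()
    # Tie on the top count -> default to neutral, per the rule above.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "neutral"
    return ranked[0][0]
```

In production you would also normalize the model's raw output (strip whitespace, lowercase) before counting, since "Positive" and "positive" would otherwise split the vote.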

When to use it

  • The final answer is a discrete value (number, label, enum)
  • You can afford 5-10x the cost for a meaningful accuracy bump
  • The task has a known single-sample failure rate above 5%
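The "meaningful accuracy bump" can be estimated with a binomial majority calculation. This sketch idealizes the problem as binary (each sample is simply right or wrong, votes are independent; real samples from one model are correlated, so treat the result as an upper bound):

```python
from math import comb

def majority_accuracy(p, n):
    """Probability that more than half of n independent samples are
    correct, given single-sample accuracy p (binary, odd n)."""
    need = n // 2 + 1  # strict majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))

# A 10% single-sample failure rate drops to under 1% with 5 votes:
print(majority_accuracy(0.9, 5))  # 0.99144
```

This is why the >5% failure-rate threshold matters: at 99%+ single-sample accuracy there is almost nothing left for the vote to recover, which is the last bullet under "When NOT to use it".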

When NOT to use it

  • Prose / open-ended outputs where "most common answer" doesn't apply
  • Latency-sensitive paths
  • The base single-sample accuracy is already >99%