Techniques

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou

Published January 28, 2022 · arXiv: 2201.11903

TL;DR

Showing worked examples with explicit intermediate reasoning steps in the prompt dramatically improves LLM accuracy on math and multi-step problems. This is the paper that named and formalized chain-of-thought prompting (the zero-shot "let's think step by step" trigger came from a separate follow-up paper the same year). Still the standard citation for CoT despite dating from 2022.

Why it matters

This work is the practical foundation under every reasoning-heavy LLM deployment today. Even reasoning-tuned 2026 models (Claude 4.6, GPT-5, o-series) owe their internal thinking behavior to the CoT family of techniques.

For practitioners, the lesson isn't "add let's think step by step" -- it's that explicit intermediate reasoning can be engineered, measured, and improved, and that non-reasoning models benefit enormously from it.

How you'd use this

If you're using a non-reasoning model (Haiku, GPT-4o-mini, smaller open-weight), always enable CoT for math, logic, and multi-hop Q&A. If you're using a reasoning model, CoT in the prompt is redundant -- the model is already doing it.

Read the authors' abstract

We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning.