TL;DR
Generalizes chain-of-thought prompting by having the model explore multiple reasoning branches, score each partial solution, and prune weak ones. Dramatically better on puzzle-like problems (Game of 24, mini crosswords, constrained creative writing) at the cost of 5-10x the tokens.
Why it matters
ToT is the reference point for any "search-over-reasoning" technique. It made clear that the bottleneck on hard reasoning tasks wasn't model capability but the greedy, linear reasoning chain. Modern reasoning models internalize this kind of search -- but if you're implementing an agent, ToT-style exploration remains a useful tool for branching search problems.
How you'd use this
Use ToT when the task has genuine branching structure (planning, game-like tasks, constrained optimization). For everyday reasoning, self-consistency (sample N chains, majority-vote the final answer) is cheaper and nearly as good.
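The branch/score/prune loop can be sketched as a small beam search. This is a toy illustration, not the paper's implementation: `propose` and `score` are hypothetical stand-ins for the two LLM calls (thought generation and state evaluation); here they just build digit sequences summing toward a target so the control flow is runnable as-is.

```python
TARGET = 9  # toy goal: pick digits whose sum hits this target

def propose(state):
    # In real ToT, an LLM samples candidate next "thoughts" for a state.
    return [state + [d] for d in range(1, 4)]

def score(state):
    # In real ToT, an LLM rates each partial solution (e.g. sure/maybe/impossible).
    s = sum(state)
    return -abs(TARGET - s) if s <= TARGET else float("-inf")

def tree_of_thoughts(steps=4, beam_width=2):
    frontier = [[]]  # start from an empty chain of thoughts
    for _ in range(steps):
        # Branch: expand every surviving state.
        candidates = [c for state in frontier for c in propose(state)]
        # Score and prune: keep only the best beam_width branches.
        frontier = sorted(candidates, key=score, reverse=True)[:beam_width]
    return max(frontier, key=score)

print(tree_of_thoughts())  # one best chain of "thoughts"
```

Swapping the breadth-first beam for depth-first search with backtracking (the paper's other variant) only changes the traversal order; the propose/score interface stays the same.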
Read the authors' abstract
We introduce Tree of Thoughts, a new framework that generalizes over chain-of-thought prompting and enables exploration over coherent units of text that serve as intermediate steps toward problem solving.
