TL;DR
Generalizes chain-of-thought prompting by having the model explore multiple reasoning branches, score each partial solution, and prune weak ones. Dramatically better on puzzle-like problems (Game of 24, mini crosswords, constrained creative writing) at the cost of 5-10x the tokens.
Why it matters
ToT is the reference point for any "search-over-reasoning" technique. It made clear that the bottleneck on hard reasoning tasks wasn't model capability but the greedy, linear reasoning chain. Modern reasoning models internalize this kind of search -- but if you're implementing an agent, ToT-style exploration remains a useful tool for branching search problems.
How you'd use this
Use ToT when the task has genuine branching structure (planning, game-like tasks, constrained optimization). For everyday reasoning, self-consistency (sample N chains, majority-vote the final answer) is cheaper and nearly as good.
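The branch/score/prune loop can be sketched as a small beam search. This is a toy illustration, not the paper's implementation: `propose` and `score` are hypothetical stand-ins for the two LLM calls (thought generation and state evaluation); here they just build digit sequences summing toward a target so the control flow is runnable as-is.

```python
TARGET = 9  # toy goal: pick digits whose sum hits this target

def propose(state):
    # In real ToT, an LLM samples candidate next "thoughts" for a state.
    return [state + [d] for d in range(1, 4)]

def score(state):
    # In real ToT, an LLM rates each partial solution (e.g. sure/maybe/impossible).
    s = sum(state)
    return -abs(TARGET - s) if s <= TARGET else float("-inf")

def tree_of_thoughts(steps=4, beam_width=2):
    frontier = [[]]  # start from an empty chain of thoughts
    for _ in range(steps):
        # Branch: expand every surviving state.
        candidates = [c for state in frontier for c in propose(state)]
        # Score and prune: keep only the best beam_width branches.
        frontier = sorted(candidates, key=score, reverse=True)[:beam_width]
    return max(frontier, key=score)

print(tree_of_thoughts())  # one best chain of "thoughts"
```

Swapping the breadth-first beam for depth-first search with backtracking (the paper's other variant) only changes the traversal order; the propose/score interface stays the same.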
Read the authors' abstract
We introduce Tree of Thoughts, a new framework that generalizes over chain-of-thought prompting and enables exploration over coherent units of text that serve as intermediate steps toward problem solving.
