Tree of Thoughts (Yao et al., 2023) generalizes chain-of-thought. Instead of a single reasoning chain, the model generates several alternative steps at each decision point, scores them, and prunes. Implementation: loop over "propose N candidate next steps → score each → pick best (or keep top-k) → continue."
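The propose → score → prune loop above can be sketched as a small beam search. This is a minimal illustration, not the paper's implementation: `propose` and `score` are hypothetical stand-ins that would each be an LLM call in practice (one prompt to generate candidate next steps, another to rate a partial reasoning chain).

```python
def tree_of_thoughts(root, propose, score, depth, n_candidates=3, top_k=2):
    """Beam search over partial reasoning states; returns the best final state.

    propose(state, n) -> list of n successor states (an LLM call in practice)
    score(state)      -> float rating of a partial chain (also an LLM call)
    """
    frontier = [root]
    for _ in range(depth):
        candidates = []
        for state in frontier:
            # Branch: generate several alternative next steps per state.
            candidates.extend(propose(state, n_candidates))
        # Prune: keep only the top-k highest-scoring partial chains.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:top_k]
    return max(frontier, key=score)
```

With `top_k=1` this degenerates to greedy step-by-step CoT; larger `top_k` is what buys the "don't commit too early" behavior, at roughly `n_candidates * top_k` LLM calls per step.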
Effective on problems with branching search (puzzles, planning, game-like tasks). Expensive: easily 5-10x the tokens of CoT. In practice, most production uses are either (a) reasoning-model internal (they already do this), (b) multi-sample + vote (self-consistency, a cheaper approximation), or (c) narrow domains where the branching payoff is huge.
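The multi-sample + vote option is much simpler: sample several independent chains at temperature > 0, extract each final answer, and take the majority. A minimal sketch, where `sample_answer` is a hypothetical stand-in for one full LLM call that returns only the final answer:

```python
from collections import Counter

def self_consistency(sample_answer, n_samples=5):
    """Sample n independent reasoning chains and majority-vote the answers."""
    answers = [sample_answer() for _ in range(n_samples)]
    # Vote over final answers only; the intermediate reasoning is discarded.
    return Counter(answers).most_common(1)[0][0]
```

Unlike ToT, there is no per-step scoring or pruning, so cost scales linearly with `n_samples` and the calls can run in parallel.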
Example Prompt
Task: Plan a 3-day hiking trip to Acadia in October. Budget $600.
Use Tree of Thoughts:
1. Generate 3 distinct trip archetypes (relaxed / ambitious / budget-focused).
2. For each, estimate feasibility against the budget and weather constraints.
3. Pick the strongest archetype and expand into a daily plan.
4. Critique the plan, propose 2 refinements, commit to one.
When to use it
- Problems with genuine branching / search (puzzles, planning, code paths)
- Tasks where committing too early to one reasoning path is the failure mode
- Offline / non-latency-sensitive workloads
When NOT to use it
- Straightforward tasks where plain CoT works -- ToT is 5-10x the cost
- Production latency budgets
- Reasoning models that already explore internally
