Research
Papers
A curated digest of recent prompt-engineering, agentic, and AI-security research. Each paper: a 3-sentence TL;DR, why it matters for practitioners, and how to put it to work.
3 papers in Techniques
- Techniques May 2023 arXiv: 2305.10601
Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Generalizes chain-of-thought by having the model explore multiple reasoning branches, score each with a self-evaluation prompt, and prune the weak ones. Dramatically better on puzzle-like tasks (Game of 24, mini crosswords, constrained creative writing) at the cost of 5-10x the tokens. A minimal sketch of the search loop appears after this list.
- Techniques Mar 2022 arXiv: 2203.11171
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Sample the model N times at nonzero temperature and take the most common (mode) final answer, discarding the differing rationales. The lowest-effort reliability upgrade for discrete-answer tasks: the original paper reports double-digit absolute accuracy gains on math reasoning at the cost of N× the compute. See the self-consistency sketch after this list.
- Techniques Jan 2022 arXiv: 2201.11903
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Appending "let's think step by step" or showing worked-example reasoning in the prompt dramatically improves LLM accuracy on math and multi-step problems. The paper that named and formalized chain-of-thought. Still the cited reference for CoT despite being from 2022.
