TL;DR
Introduced the Reason-Act-Observe loop: models that interleave reasoning traces with tool calls dramatically outperform models that only reason or only act. Widely regarded as the founding paper of modern agentic LLM architectures.
Why it matters
Every LangChain agent, every Claude with tools, every "copilot" loop descends from this pattern. Before ReAct, tool-using models either hallucinated tool outputs or jumped straight to action without planning. ReAct showed that explicit reasoning before each action catches plan errors earlier and produces more reliable behavior.
Modern frameworks hide the Thought/Action/Observation labels behind function-calling APIs, but the underlying shape is unchanged.
How you'd use this
When designing a tool-using agent: structure the loop as reason → act → observe, at least implicitly. Expose the reasoning trace for debuggability. If you're using a reasoning model, much of this happens internally and you can skip explicit ReAct scaffolding.
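The loop described above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the paper's implementation: the model is stubbed with a scripted `fake_llm` function, and the `calculator` tool and the `Action: tool[input]` / `Final Answer:` text format are assumptions standing in for whatever tools and function-calling API a real agent would use.

```python
import re

# Toy tool registry; a real agent would wire in search, code execution, etc.
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def fake_llm(transcript):
    """Scripted stand-in for a real model call.
    A real implementation would send `transcript` to an LLM API
    and return its next Thought/Action (or Final Answer) text."""
    if "Observation:" not in transcript:
        return "Thought: I need to compute the product.\nAction: calculator[6 * 7]"
    return "Thought: The observation gives the result.\nFinal Answer: 42"

def react_loop(question, llm=fake_llm, max_steps=5):
    """Reason -> act -> observe until the model emits a Final Answer."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)          # reason (and possibly propose an action)
        transcript += step + "\n"       # the full trace stays visible for debugging
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        match = re.search(r"Action: (\w+)\[(.*)\]", step)
        if match:
            tool, arg = match.groups()
            observation = TOOLS[tool](arg)              # act
            transcript += f"Observation: {observation}\n"  # observe
    return None
```

Keeping the whole transcript (thoughts, actions, observations) as the model's context is the core design choice: each new reasoning step can react to the latest observation, which is what lets the loop catch plan errors early.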
Read the authors' abstract
We explore the use of LLMs to generate both reasoning traces and task-specific actions in an interleaved manner, allowing for greater synergy between the two.
