Agentic RAG -- Qurtoo Glossary

Classic RAG: one query, one retrieval, one answer. Agentic RAG: the model drives the retrieval loop. If the first batch of results doesn't cover the question, it issues a new query. If a retrieved doc is low-quality, it skips that doc. It can call multiple retrieval backends, combine their results, and decide when it has enough context to answer.

Requires a tool-using model and a retrieval tool exposed via function calling. More flexible than one-shot RAG, but more expensive and harder to reason about. Usually earns its keep on ambiguous or multi-part questions where a single retrieval can't cover the scope.
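
The loop described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the two backends are in-memory stubs, `scripted_model` is a hypothetical stand-in for the LLM's tool-calling decisions, and the `max_steps` cap is an assumed safeguard rather than part of any particular framework.

```python
from typing import Callable

# Hypothetical retrieval backends; real ones would query an index or an API.
def search_docs(query: str) -> list[str]:
    corpus = {"refund window": ["Refunds are accepted within 30 days."]}
    return corpus.get(query, [])

def search_faq(query: str) -> list[str]:
    corpus = {"refund shipping cost": ["Return shipping is paid by the customer."]}
    return corpus.get(query, [])

TOOLS: dict[str, Callable[[str], list[str]]] = {
    "search_docs": search_docs,
    "search_faq": search_faq,
}

def scripted_model(question: str, context: list[str]) -> dict:
    """Stand-in for the LLM: chooses the next action from the context so far."""
    if not any("30 days" in c for c in context):
        return {"action": "search", "tool": "search_docs", "query": "refund window"}
    if not any("shipping" in c for c in context):
        return {"action": "search", "tool": "search_faq", "query": "refund shipping cost"}
    return {"action": "answer", "text": " ".join(context)}

def agentic_rag(question, model, tools, max_steps=5):
    """Let the model retrieve repeatedly until it answers or runs out of steps."""
    context: list[str] = []
    for _ in range(max_steps):
        step = model(question, context)
        if step["action"] == "answer":
            return step["text"], context
        # The model asked for another retrieval; run it and grow the context.
        context.extend(tools[step["tool"]](step["query"]))
    return None, context  # step budget exhausted without an answer

answer, context = agentic_rag(
    "What is the refund window, and who pays return shipping?",
    scripted_model,
    TOOLS,
)
print(answer)
```

The two-part question needs two retrievals from two different backends, which is the case a single-shot pipeline would miss. In a real system the `model` callable would wrap an LLM call, and the returned `context` would double as a retrieval trace.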

Example prompt

You have retrieval tools: search_docs(query), search_faq(query),
search_knowledge_base(query). The user will ask a question.

You can retrieve as many times as you need. For each question:
1. Plan which tool(s) are most likely to have the answer.
2. Query them.
3. If results don't cover the question, reformulate and try again.
4. When you have enough context, answer grounded in the retrieved docs.
5. Cite each factual claim.

Do NOT answer from your own training data.
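
For the prompt above to work, the three tools have to be declared to the model through function calling. The sketch below shows one plausible shape for those declarations using JSON-Schema-style parameters. The field layout (`name`, `description`, `parameters`) is an assumption, since the exact wrapper differs between providers, and the descriptions are hypothetical placeholders for whatever each backend actually holds.

```python
import json

def make_search_tool(name: str, description: str) -> dict:
    """Build a tool declaration with a single required string argument, `query`."""
    return {
        "name": name,
        "description": description,  # hypothetical wording; describe the real backend
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query text."}
            },
            "required": ["query"],
        },
    }

TOOL_SPECS = [
    make_search_tool("search_docs", "Search the documentation."),
    make_search_tool("search_faq", "Search frequently asked questions."),
    make_search_tool("search_knowledge_base", "Search the knowledge base."),
]

print(json.dumps(TOOL_SPECS[0], indent=2))
```

Specific descriptions matter here: step 1 of the prompt asks the model to plan which tool is most likely to hold the answer, and the description is the main signal it has for that choice.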

When to use it

Multi-part questions where one retrieval won't cover the full scope
Ambiguous queries that benefit from reformulation
Multiple knowledge sources with different content types

When NOT to use it

Simple direct queries -- classic RAG is simpler and faster
Cost-sensitive paths (agentic loops burn tokens)
Setups where you can't observe or debug the retrieval trace at runtime
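
Where an agentic loop is adopted despite the cost and observability cautions above, a thin wrapper around the retrieval tools can address both. The sketch below is one possible approach under stated assumptions: `TracedTools`, `TraceEntry`, `BudgetExceeded`, and the `max_calls` limit are hypothetical names introduced for illustration, and a call count is only a rough proxy for token spend.

```python
import time
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class TraceEntry:
    tool: str
    query: str
    num_results: int
    elapsed_s: float

class BudgetExceeded(RuntimeError):
    """Raised when the loop asks for more retrievals than the budget allows."""

@dataclass
class TracedTools:
    tools: dict[str, Callable[[str], list[str]]]
    max_calls: int = 4
    trace: list[TraceEntry] = field(default_factory=list)

    def call(self, tool: str, query: str) -> list[str]:
        # Refuse the call once the retrieval budget is spent.
        if len(self.trace) >= self.max_calls:
            raise BudgetExceeded(f"retrieval budget of {self.max_calls} calls used up")
        start = time.perf_counter()
        results = self.tools[tool](query)
        # Record what was asked, where, and how much came back.
        self.trace.append(
            TraceEntry(tool, query, len(results), time.perf_counter() - start)
        )
        return results

# Hypothetical backend used only to exercise the wrapper.
def fake_search(query: str) -> list[str]:
    return [f"result for {query}"]

traced = TracedTools({"search_docs": fake_search}, max_calls=2)
traced.call("search_docs", "refund window")
traced.call("search_docs", "refund window for digital goods")
try:
    traced.call("search_docs", "one query too many")
except BudgetExceeded as exc:
    print("stopped:", exc)

for entry in traced.trace:
    print(entry.tool, repr(entry.query), entry.num_results)
```

The trace shows each reformulation the model tried, which is usually the first thing to inspect when an agentic answer looks wrong.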