The supervisor handles reasoning and coordination. Workers handle execution. The economics: frontier model is ~20x the cost of a small one, so you want the frontier doing the hard thinking and cheap models doing the grunt work.
Concrete example: a customer-service supervisor (Claude 4 / GPT-5) decides what each ticket needs, then dispatches to workers -- one summarizes the ticket, one pulls the customer history, one drafts a response, one checks for compliance. Supervisor stitches the outputs together. 70-90% cost reduction vs supervisor doing all steps.
Example Prompt
You are a supervisor agent. Given a user request:
1. Decompose into 2-5 sub-tasks.
2. For each, choose a worker tool: `summarize`, `extract_entities`, `draft_response`, `check_policy`.
3. Specify the input each worker needs.
Output JSON of the form: [{worker, input}, ...]
Do NOT attempt any of the sub-tasks yourself.When to use it
- Pipelines with mixed complexity (some easy steps, some hard)
- Cost pressure where running everything on the supervisor is wasteful
- Tasks that parallelize across workers (huge latency win)
When NOT to use it
- Tasks that don't decompose cleanly -- splitting adds overhead with no benefit
- Workers are tiny enough that their errors dominate (quality regression)
- Single-step tasks where supervisor overhead beats the savings
