Structured output turns LLM generation into an API you can actually call from code. Modern providers (OpenAI, Anthropic, Google) all support some form of schema-guided output; in the strictest implementations, decoding is constrained so the model cannot emit tokens that would violate the schema.
There are two modes: JSON mode, which guarantees syntactically valid JSON but not your specific shape, and schema mode, which guarantees output matching your exact JSON Schema. Use schema mode whenever you have a concrete shape; JSON mode is a weaker fallback.
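The difference matters in code. A minimal sketch in Python (stdlib only, no provider SDK; the reply strings below are hypothetical stand-ins for real API responses):

```python
import json

# Hypothetical replies; real ones would come back from a provider API call.
json_mode_reply = '{"headline": "Team standup", "when": "9am"}'  # valid JSON, wrong fields
schema_mode_reply = '{"title": "Team standup", "start": "2025-06-03T09:00:00"}'

REQUIRED = {"title", "start"}  # from the schema's "required" list

def satisfies_schema(raw: str) -> bool:
    """True if the reply parses as JSON AND carries every required field."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and REQUIRED <= data.keys()

print(satisfies_schema(json_mode_reply))    # False: parses fine, misses "title"/"start"
print(satisfies_schema(schema_mode_reply))  # True
```

JSON mode only guarantees the `json.loads` step succeeds; schema mode guarantees the required-fields check passes too.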
Example Prompt
Extract event information as JSON matching this schema:
{
  "type": "object",
  "properties": {
    "title": {"type": "string"},
    "start": {"type": "string", "format": "date-time"},
    "end": {"type": "string", "format": "date-time"},
    "location": {"type": "string"}
  },
  "required": ["title", "start"]
}
Text: "Team standup tomorrow at 9am for 30 minutes in the Oak conference room."

When to use it
- You're calling the model from code that parses the output
- Output drives downstream actions (DB writes, API calls)
- You need consistent fields across many invocations
- Human review isn't in the loop for every response
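When the output drives downstream actions, it pays to validate before acting even in schema mode, since JSON Schema annotations like `"format": "date-time"` are not always enforced by providers. A minimal guard, assuming the reply text is already in hand (function and field names follow the example schema above):

```python
import json
from datetime import datetime

def parse_event(raw: str) -> dict:
    """Parse a model reply against the event schema; raise on violations."""
    event = json.loads(raw)  # raises on malformed JSON
    for field in ("title", "start"):
        if field not in event:
            raise ValueError(f"missing required field: {field}")
    # "format": "date-time" is advisory in JSON Schema; check it explicitly
    # before this timestamp reaches a DB write or calendar API call.
    datetime.fromisoformat(event["start"])
    return event

event = parse_event('{"title": "Team standup", "start": "2025-06-03T09:00:00"}')
print(event["title"])  # Team standup
```

Raising early keeps a malformed timestamp from propagating into the database write or API call downstream.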
When NOT to use it
- Output is meant to be read by humans as prose
- The schema would be so complex it limits the model's expressiveness
- The task is genuinely unstructured (creative writing, exploration)
