
Prompt Architecture for Managed Agent Runtimes: What Changes When You Stop Owning the Orchestration Loop
Anthropic's Managed Agents, Google ADK, and Microsoft Agent Framework 1.0 all abstract away the orchestration loop -- your prompts need to catch up
You wrote a solid agent prompt. It handles retries, manages state, catches errors gracefully. Then you deploy it on a managed runtime and half of it becomes dead weight -- or worse, it fights the platform.
Managed agent runtimes are converging fast. Microsoft Agent Framework 1.0 shipped on April 3, 2026. Anthropic Managed Agents followed on April 8. Google ADK has been evolving steadily in the same direction through its Agent Platform and Agents CLI. All three pull the orchestration loop out of your hands. Container lifecycle, state persistence, tool discovery, error recovery -- the runtime owns those now.
Your prompts were written for a world where they controlled all of that. They don't anymore.
The Harness Ate Your Orchestration Logic
The industry term for this shift is "harness engineering." The core insight is simple: using the same model and the same prompt, one team watched their benchmark success rate jump from 42% to 78% by changing only the runtime environment. LangChain improved a coding agent by 13.7 points on Terminal Bench 2.0 by tweaking the harness alone. Microsoft's Azure SRE Agent went from 45% to 75% intent-met on novel incidents by switching to filesystem-based context engineering.
The prompt didn't change. The model didn't change. The environment around them changed.
This matters for prompt engineers because it redefines where your work starts and stops. When you owned the loop, your system prompt was the entire control surface. Now it's one component inside a larger harness that manages retries, context windows, tool registries, and session persistence on your behalf.
Philipp Schmid at Google DeepMind put it plainly: "Most agent failures aren't model failures -- they're context failures." Wrong documents retrieved, too much history crammed into the window, tool definitions missing. The prompt was fine. The harness wasn't.
What Your Prompts Used to Do (That the Runtime Does Now)
Here's the concrete shift. If you're writing prompts for managed runtimes the same way you wrote them for self-hosted loops, you're duplicating work that the platform already handles -- and sometimes contradicting it.
Retry logic. Your old prompt probably included something like "if the API call fails, wait and try again up to 3 times." Managed runtimes handle retries at the infrastructure level. Your prompt's job changes from specifying retry behavior to classifying errors so the runtime knows which retry strategy to use.
Tool schemas. You used to paste full JSON schemas for every tool into your system prompt. With MCP-based tool discovery, the runtime resolves available tools dynamically. All three runtimes support MCP. Your prompt references capabilities, not schemas.
Credentials. If your prompt ever mentioned API keys, tokens, or auth headers, that's now handled by MCP servers and credential gateways. Cequence launched "Agent Personas" on April 28 specifically for infrastructure-level privilege scoping across providers. Your prompts should be completely credential-agnostic.
Conversation state. You used to tell the agent how to track what it had done. Managed runtimes provide durable session logs. Anthropic's runtime exposes a getEvents() API. Your prompt defines what to remember, not the mechanics of remembering it.
Context window management. OpenAI's Codex team found that shorter system prompts (around 100 lines that point to deeper docs) outperform massive instruction dumps. The repository becomes the knowledge base. This directly contradicts the instinct to pack everything into the system prompt, and it's especially relevant when you don't control context windowing.
Before and After: Error Classification
This is where the change is most concrete. Here's a self-hosted prompt that manages its own error handling:
The Prompt (self-hosted loop):
You are a data processing agent. When calling the analytics API:
- If you get a 429 rate limit error, wait 30 seconds and retry up to 3 times
- If you get a 500 error, wait 60 seconds and retry up to 2 times
- If you get a 401 error, stop immediately and report the auth failure
- If you get a 404 error, log the missing resource and skip to the next item
- Track your retry count in your response and never exceed the limits above
That prompt is doing the runtime's job. Here's the managed runtime version:
The Prompt (managed runtime):
You are a data processing agent. When a tool call fails, classify the error:
- TRANSIENT: rate limits, timeouts, temporary server errors. The runtime will retry these automatically.
- PERSISTENT: missing resources, malformed data, unsupported operations. Log what failed and why, then continue with remaining work.
- CRITICAL: authentication failures, permission denials, data integrity violations. Stop execution and report the issue with full context.
Never retry on your own. Report the classification and let the runtime handle recovery.
Why This Works: The managed runtime already has retry logic with exponential backoff, circuit breakers, and timeout handling built in. Your prompt trying to manage retries will conflict with the platform's retry behavior. Instead, you teach the agent to classify failures so the runtime applies the right strategy. This separation of concerns means the same prompt works across different runtime configurations without modification.
Expected Output (on a 404 error):
Error classification: PERSISTENT
Tool: analytics_api.get_report
Resource: report_id=Q3-2026-west not found (404)
Action: Skipping this report. Continuing with remaining 4 reports in the batch.
No retry needed -- this resource does not exist in the current dataset.
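What the runtime does with that classification is platform-specific, and none of the three platforms documents its retry internals at the prompt level. But the division of labor looks roughly like this. A minimal sketch in Python -- the policy table, backoff numbers, and function names are all illustrative, not any platform's actual internals:

```python
import random
import time

# Hypothetical sketch of the runtime side of the contract. The policy
# table, backoff numbers, and function shape are illustrative -- not any
# platform's documented behavior.

RETRY_POLICY = {
    # classification -> (max_retries, base_backoff_seconds)
    "TRANSIENT": (3, 2.0),   # retry with exponential backoff plus jitter
    "PERSISTENT": (0, 0.0),  # never retry; log and continue the batch
    "CRITICAL": (0, 0.0),    # never retry; halt and surface to an operator
}

def handle_tool_failure(classification: str, retry_fn, logger):
    """Apply the retry strategy matching the agent's error classification."""
    max_retries, base = RETRY_POLICY.get(classification, (0, 0.0))
    for attempt in range(1, max_retries + 1):
        # Exponential backoff with jitter -- the kind of logic the old
        # prompt tried to spell out in natural language.
        time.sleep(base * 2 ** (attempt - 1) + random.uniform(0, 1))
        try:
            return retry_fn()
        except Exception as exc:
            logger.warning("retry %d/%d failed: %s", attempt, max_retries, exc)
    if classification == "CRITICAL":
        raise RuntimeError("critical failure -- halting run for operator review")
    logger.info("%s failure: no retry applicable or retries exhausted, continuing", classification)
    return None
```

The agent supplies one word of judgment; the infrastructure supplies everything else. That is the whole contract.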
Before and After: Tool Discovery
Self-hosted agents need every tool defined upfront:
The Prompt (self-hosted loop):
You have access to the following tools:
1. search_documents(query: string, limit: int) - Searches the document store
   Returns: {results: [{id, title, snippet, score}]}
2. read_document(id: string) - Retrieves full document content
   Returns: {content: string, metadata: {author, date, tags}}
3. update_record(table: string, id: string, fields: object) - Updates a database record
   Returns: {success: boolean, updated_fields: string[]}

API Key: Use the ANALYTICS_KEY environment variable
On a managed runtime with MCP:
The Prompt (managed runtime):
You are a research and data management agent. You have access to document search, document retrieval, and database update capabilities through the connected tool servers.
Discover available tools before starting work. Use tool descriptions to select the right one for each step. Do not assume tool parameter names or response shapes -- read them from the tool schema at invocation time.
If a capability you need is not available in the current tool set, report what's missing rather than attempting a workaround.
Why This Works: MCP servers register their tools dynamically, including full parameter schemas and descriptions. The runtime resolves which tools are available at invocation time, and credentials are handled at the server level, not in the prompt. This means the same agent prompt works whether it's connected to one MCP server or five, and adding new tools doesn't require prompt changes.
Expected Output (on first invocation):
Discovered 3 available tools via connected servers:
- document_store.search (query, limit, filters)
- document_store.read (document_id)
- analytics_db.update_record (table, record_id, fields)

Starting research task. Searching for quarterly reports matching the specified criteria.
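You can watch this discovery step yourself with the MCP Python SDK -- it is the same listing call the runtimes rely on. A sketch, assuming a local stdio tool server at ./tool_server.py (the server path is a placeholder; ClientSession, initialize(), and list_tools() are the SDK's standard client calls):

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Sketch of the discovery step a managed runtime performs before the agent
# runs. The server path is illustrative; list_tools() returns each tool's
# name, description, and input schema.
server = StdioServerParameters(command="python", args=["./tool_server.py"])

async def discover_tools():
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            for tool in tools.tools:
                # The prompt references capabilities; the schema arrives
                # here, at invocation time, not in the system prompt.
                print(tool.name, "-", tool.description)

asyncio.run(discover_tools())
```

Because the schemas live on the server, swapping a tool implementation or adding a fourth server changes nothing in the prompt -- which is exactly the point.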
Before and After: State Handoff
This one trips people up the most. Self-hosted agents track their own state:
The Prompt (self-hosted loop):
Maintain a JSON state object throughout execution:
{
  "completed_steps": [],
  "pending_steps": [],
  "collected_data": {},
  "error_log": []
}

Update this state after every action. If the conversation is interrupted, use the last state object to resume.
On a managed runtime:
The Prompt (managed runtime):
The runtime maintains a persistent session log of all your actions and their results. You do not need to track execution state manually.
When resuming interrupted work, review the session history to determine:
1. What steps completed successfully
2. What step was in progress when execution stopped
3. Whether the in-progress step's side effects need cleanup before retrying
Focus your working memory on the current task's decision context, not on bookkeeping.
Why This Works: Managed runtimes persist session state across interruptions automatically. Anthropic's runtime dropped p50 time-to-first-token by roughly 60% partly because the agent doesn't waste tokens on state management. Your prompt asking the agent to maintain a JSON state object is now redundant work that consumes context window space for no benefit.
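What "review the session history" looks like mechanically depends on the platform -- Anthropic exposes getEvents(), the others have their own session APIs. Here's a hedged sketch with a made-up event shape, just to show the resume logic the prompt above is asking for:

```python
# Hedged sketch: reconstructing a resume point from a runtime session log.
# The event shape is hypothetical -- each runtime's session API has its own
# schema. The point is that the agent reads history instead of maintaining
# its own JSON state object.

def resume_point(events: list[dict]) -> dict:
    completed, in_progress = [], None
    for event in events:
        if event["type"] == "tool_result":
            completed.append(event["step"])
        elif event["type"] == "tool_call":
            in_progress = event["step"]  # most recent call; may lack a result
    if in_progress in completed:
        in_progress = None
    return {
        "completed_steps": completed,
        "interrupted_step": in_progress,
        # Mirrors item 3 in the prompt above: does the interrupted step
        # have side effects that need cleanup before retrying?
        "needs_cleanup": in_progress is not None,
    }

events = [
    {"type": "tool_call", "step": "fetch_q3_report"},
    {"type": "tool_result", "step": "fetch_q3_report"},
    {"type": "tool_call", "step": "update_summary_table"},  # interrupted here
]
print(resume_point(events))
# {'completed_steps': ['fetch_q3_report'],
#  'interrupted_step': 'update_summary_table', 'needs_cleanup': True}
```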
The "Map, Not Manual" Principle
OpenAI's Codex team discovered something counterintuitive: shorter system prompts that reference deeper documentation outperform long prompts that try to contain everything. About 100 lines of pointers beat 1,000 lines of instructions.
This maps directly to managed runtimes. You don't control context window management anymore. The runtime decides what fits. A compact prompt with references to discoverable resources (tool schemas via MCP, documentation in the repo, session history via the runtime API) gives the platform room to manage context effectively.
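Concretely, a map-style prompt for a managed runtime can be startlingly short. A sketch -- the file path, task, and capability names are placeholders, not anyone's production prompt:

The Prompt (map, not manual):

You are a research and reporting agent for this repository.
- Domain rules and writing conventions: read docs/AGENTS.md before starting
- Tools: discover from the connected servers; read schemas at invocation time
- Prior work: review the runtime session history before resuming anything
If a document or capability you need is missing, report it rather than improvising. Keep the current task's decision context in working memory; look everything else up.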
Martin Fowler's April 2026 framework breaks this into two categories. Guides are feedforward controls that steer the agent before it acts -- your prompt instructions, tool descriptions, structured output schemas. Sensors are feedback controls that help the agent self-correct after acting -- evaluation steps, runtime telemetry, session history review.
Your prompt should define the guides. The runtime provides the sensors.
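To make the split concrete, here's a small sketch -- my own illustration of Fowler's categories, not his code. The guide is a structured output schema that constrains the agent before it acts; the sensor is a post-hoc check over what the runtime recorded. The schema, field names, and event shape are all hypothetical:

```python
from dataclasses import dataclass

@dataclass
class RemediationPlan:
    """GUIDE (feedforward): a structured output schema steers the agent
    toward a well-formed answer before it acts."""
    severity: str          # expected: "low" | "medium" | "high"
    affected_service: str
    proposed_action: str

def plan_is_grounded(plan: RemediationPlan, session_events: list[dict]) -> bool:
    """SENSOR (feedback): after the fact, check the runtime's session log
    to confirm the agent actually consulted logs before proposing action."""
    tools_used = {e["tool"] for e in session_events if e.get("type") == "tool_call"}
    return "log_search" in tools_used and plan.severity in {"low", "medium", "high"}
```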
The Security Layer You're Not Writing Anymore
One more thing that moves out of your prompts: credential management. All three runtimes support MCP for tool connectivity, and MCP gateways are becoming the security control plane.
This means your prompts should never reference API keys, tokens, or authentication mechanisms. Not because it's a security risk (though it is), but because the gateway handles credential injection, rotation, and revocation. If you need to revoke access to a tool, you do it at the gateway. Every dependent agent loses access instantly without touching a single prompt.
Gartner projects that 50% of AI agent deployment failures by 2030 will trace to insufficient governance and runtime enforcement. The managed runtime handles enforcement. Your prompt defines intent.
What This Means for Your Workflow
If you're building agents on managed runtimes today, audit your existing prompts for any of these patterns:
- Retry instructions with specific wait times or attempt counts
- Inline tool schemas with parameter definitions
- Credential references or API key handling
- Manual state tracking or JSON bookkeeping
- Conversation history summarization instructions
Strip them out. Replace them with error classification rules, capability descriptions (not schemas), and decision context that helps the agent make better choices about when to proceed and when to stop.
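A rough first pass at that audit can be automated. This sketch flags the patterns above with regexes -- the patterns are heuristics I'm assuming, a starting point rather than an exhaustive linter:

```python
import re
import sys
from pathlib import Path

# Heuristic sketch: flag orchestration logic a managed runtime now owns.
# The regexes are illustrative starting points, not a complete rule set.
ANTI_PATTERNS = {
    "retry instructions": r"retry (up to|after|\d+)|wait \d+ seconds",
    "inline tool schemas": r"Returns:\s*\{|\(\w+:\s*(string|int|object)",
    "credential references": r"API[ _-]?key|Bearer token|auth header",
    "manual state tracking": r"state object|completed_steps|track your",
    "history summarization": r"summarize (the )?(conversation|history)",
}

def audit(prompt_path: Path) -> list[str]:
    text = prompt_path.read_text()
    hits = []
    for label, pattern in ANTI_PATTERNS.items():
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            line_no = text[: match.start()].count("\n") + 1
            hits.append(f"{prompt_path}:{line_no}: {label} ({match.group(0)!r})")
    return hits

if __name__ == "__main__":
    for path in sys.argv[1:]:
        for hit in audit(Path(path)):
            print(hit)
```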
The skill hasn't disappeared. It's shifted from writing orchestration logic in natural language to writing intent and judgment rules the runtime can act on. That's a harder prompt to write, honestly. Telling an agent how to retry is straightforward. Teaching it how to classify whether a failure is worth retrying requires understanding the domain.
That's where prompt engineering still matters in 2026.
If your team is adapting agent prompts for managed runtimes and wants structured training on harness-aware prompt architecture, connect with Kief Studio on Discord or schedule a session.
