Problem
An agent that only drafts text can fail with little damage. An agent that executes code, queries private systems, installs dependencies, writes files, or touches infrastructure fails at a different level.
The error is no longer a bad response. It is a side effect.
Many companies try to solve it with prompts: “don’t do anything dangerous”. That is not a control. It is a hope written in natural language.
Thesis
Sandboxed Work will be the new perimeter for agents that do real work.
The point is not to cage the model. The model does not execute. The point is to cage the action: filesystem, network, secrets, tools, processes, time, cost, and permissions.
A mature architecture separates brain, orchestrator, sandbox, and target systems.
Framework
An agentic sandbox must define five limits:
- Filesystem: what it can read, create, or modify.
- Network: which endpoints it may touch and from where.
- Secrets: which credentials are injected and for how long.
- Tools: which commands, APIs, or MCP servers it may invoke.
- Timebox: how long the work runs before the environment is killed.
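The five limits can be written down as data before any sandbox technology is chosen. The sketch below is illustrative only: `SandboxPolicy`, its field names, and the example values (`/work/job-1`, `api.internal.example`) are hypothetical, and the checks only decide whether a request falls inside the declared limits. Enforcement still has to happen in the isolation layer itself.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical declaration of the five limits for one sandbox."""

    writable_root: str              # filesystem: the only tree the agent may touch
    allowed_hosts: frozenset[str]   # network: allowlist of hostnames
    secret_ttl_seconds: int         # secrets: lifetime of injected credentials
    allowed_tools: frozenset[str]   # tools: commands, APIs, or MCP servers by name
    timeout_seconds: int            # timebox: hard limit before the environment is killed

    def allows_path(self, path: str) -> bool:
        root = PurePosixPath(self.writable_root)
        candidate = PurePosixPath(path)
        # Refuse relative paths and any ".." component instead of trying to resolve them.
        if not candidate.is_absolute() or ".." in candidate.parts:
            return False
        return candidate == root or root in candidate.parents

    def allows_host(self, host: str) -> bool:
        return host.lower() in self.allowed_hosts

    def allows_tool(self, tool: str) -> bool:
        return tool in self.allowed_tools


# Example values are assumptions made up for this sketch.
policy = SandboxPolicy(
    writable_root="/work/job-1",
    allowed_hosts=frozenset({"api.internal.example"}),
    secret_ttl_seconds=900,
    allowed_tools=frozenset({"git", "pytest"}),
    timeout_seconds=600,
)

print(policy.allows_path("/work/job-1/src/app.py"))   # True
print(policy.allows_path("/work/job-1/../secrets"))   # False
print(policy.allows_tool("curl"))                     # False
```

Keeping the policy immutable (`frozen=True`) means the limits are fixed when the sandbox is created and cannot be widened by the code that runs inside it.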
Mini-case: a support agent reproduces a client bug. In a sandbox it can clone the repo, install dependencies, run tests, read sanitized logs, and propose a patch. On a shared machine it could contaminate the environment, leak secrets, or leave processes alive.
Measurable signal: percentage of agent actions that run in an ephemeral environment, with logs and declared limits.
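One possible way to compute that signal, assuming an action counts only when it meets all three conditions at once. The `ActionRecord` shape and the sample data are invented for illustration; in practice the records would come from whatever action log the orchestrator keeps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionRecord:
    """Hypothetical log entry for one agent action."""

    ephemeral_env: bool
    logged: bool
    declared_limits: bool


def contained_share(actions: list[ActionRecord]) -> float:
    """Percentage of actions that meet all three conditions."""
    if not actions:
        return 0.0
    contained = sum(
        1 for a in actions if a.ephemeral_env and a.logged and a.declared_limits
    )
    return 100.0 * contained / len(actions)


# Made-up sample: two of the four actions satisfy every condition.
sample = [
    ActionRecord(True, True, True),
    ActionRecord(True, True, False),
    ActionRecord(False, True, True),
    ActionRecord(True, True, True),
]
print(contained_share(sample))  # 50.0
```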
Stance: if the agent can execute, it must also be containable.
Why it matters now
Cloudflare has pushed sandboxes for agents as isolated and scalable environments. Anthropic has introduced Managed Agents with external sandboxes and MCP tunnels. AWS has taken MCP to cloud operations with IAM, CloudWatch, CloudTrail, and bounded execution.
The trend is not just “more agents”. It is agents with hands.
And when a system has hands, it needs gloves, a cleanroom, and movement logs.
Anti-example
“We have a shared runner for all agents.”
That mixes contexts, permissions, and execution residue. An exploration agent should not live in the same perimeter as an agent that touches production.
Protocol (3 steps)
- Classify actions by risk: read, write, execute, private network, secrets, and production.
- Create sandboxes by class. Do not use the same environment for investigation, build, and sensitive systems.
- Destroy by default. The environment must expire, log, and clean up.
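The first two steps can be sketched as a lookup from risk class to sandbox class. The ordering of the risks and the three sandbox names (`investigation`, `build`, `sensitive`) are assumptions chosen for this example, not a prescribed taxonomy. The one rule the sketch encodes is that an action is routed by its most dangerous capability.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Risk classes from the protocol; the numeric ordering is an assumption."""

    READ = 1
    WRITE = 2
    EXECUTE = 3
    PRIVATE_NETWORK = 4
    SECRETS = 5
    PRODUCTION = 6


# Hypothetical mapping: which sandbox class handles which risk.
SANDBOX_CLASS = {
    Risk.READ: "investigation",
    Risk.WRITE: "build",
    Risk.EXECUTE: "build",
    Risk.PRIVATE_NETWORK: "sensitive",
    Risk.SECRETS: "sensitive",
    Risk.PRODUCTION: "sensitive",
}


def sandbox_for(risks: set[Risk]) -> str:
    """Route an action to the sandbox class of its highest declared risk."""
    if not risks:
        raise ValueError("an action must declare at least one risk")
    return SANDBOX_CLASS[max(risks)]


print(sandbox_for({Risk.READ}))                 # investigation
print(sandbox_for({Risk.WRITE, Risk.EXECUTE}))  # build
print(sandbox_for({Risk.READ, Risk.SECRETS}))   # sensitive
```

Refusing an empty risk set forces every action to be classified before it is routed, so an unclassified action never falls through to a default environment.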
| Layer | Minimum control | Operational question |
|---|---|---|
| Filesystem | Isolated directory | What can it read? |
| Network | Allowlist | Where can it call? |
| Secrets | Ephemeral tokens | How long do they last? |
| Tools | Permissions per tool | What can it do? |
| Time | Timeout | When does it die? |
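The "destroy by default" idea can be illustrated with the Python standard library alone. This is a minimal sketch, not a real sandbox: a temporary directory and a timeout give no isolation from the host's files or network, which in practice requires a container, a microVM, or a managed sandbox service. The function name `run_ephemeral` and the shape of the returned dictionary are assumptions, and the empty environment assumes a POSIX host.

```python
import os
import subprocess
import sys
import tempfile


def run_ephemeral(argv: list[str], timeout_seconds: float) -> dict:
    """Run one command in a throwaway directory with a hard time limit."""
    with tempfile.TemporaryDirectory(prefix="agent-sandbox-") as workdir:
        outcome = {"workdir": workdir}
        try:
            result = subprocess.run(
                argv,
                cwd=workdir,            # filesystem: start in an empty directory
                env={},                 # secrets: inherit no environment variables
                capture_output=True,    # logs: keep stdout and stderr
                text=True,
                timeout=timeout_seconds,  # timebox: the child is killed on expiry
            )
            outcome.update(
                status="finished",
                returncode=result.returncode,
                stdout=result.stdout,
            )
        except subprocess.TimeoutExpired:
            outcome.update(status="killed_on_timeout", returncode=None, stdout="")
    # Leaving the "with" block deletes the directory, whatever the outcome was.
    return outcome


ok = run_ephemeral([sys.executable, "-c", "print('ok')"], timeout_seconds=10)
slow = run_ephemeral(
    [sys.executable, "-c", "import time; time.sleep(5)"], timeout_seconds=0.5
)
print(ok["status"], slow["status"], os.path.exists(ok["workdir"]))
```

Note that `subprocess.run` kills only the direct child on timeout. Processes the child spawned may survive, which is one reason real sandboxes tear down the whole environment rather than a single process.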
Related
- Claude Managed Agents + Cloudflare: the perimeter becomes a product
- AWS MCP Server GA: when coding agents enter cloud with guardrails
- AI Traces: the layer that turns agents into auditable systems
Next step
Draw an execution diagram of your most dangerous agent: model, orchestrator, sandbox, secrets, network, tools, logs, and target system. If you don’t know where the limit is, you still don’t have a perimeter.
Translated from the Spanish original with AI assistance and reviewed for accuracy.