Sandboxing AI Agents: Isolation Strategies for Safe Code Execution

When an agent needs to run code it generated, "only execute safe code" is a false premise — LLMs cannot reliably judge whether their own output is safe. The only viable approach: execute all code in an isolated environment, assuming every snippet is malicious.

Why Agents Need Sandboxing

Many teams' first reaction is "we will make the agent generate safe code." That path fails for three reasons:

LLMs cannot reliably judge code safety: A seemingly harmless os.listdir() can be used for reconnaissance, and eval() can turn any string into executable code
Indirect injection can hijack code generation: Attackers use prompt injection to make agents generate malicious code rather than injecting it directly
Even legitimate needs can go wrong: Agent-generated code may contain infinite loops, memory leaks, or accidental file deletions

Sandboxing is not an "extra security layer" — it is the foundational prerequisite for agent code execution.

Three Sandboxing Approaches Compared

Approach	Isolation Strength	Startup Latency	Language Support	Complexity
Docker container	High	1-5s	All	Medium
WebAssembly	Medium-High	<100ms	Limited (Rust/C/Go/JS)	High
Process-level	Medium	<50ms	All	Low

Approach 1: Docker Container Sandbox

The most versatile option. Each code execution request spins up an isolated Docker container that is destroyed after completion.

import os
import tempfile

import docker
import requests  # installed as a dependency of the docker SDK

class DockerSandbox:
    # Interpreter and file extension per language; the image must ship the interpreter
    RUNNERS = {"python": ("python", ".py"), "javascript": ("node", ".js")}

    def __init__(
        self,
        image: str = "python:3.12-slim",
        memory_limit: str = "128m",
        cpu_period: int = 100000,
        cpu_quota: int = 50000,  # 50% CPU
        timeout: int = 30,
        network_disabled: bool = True,
    ):
        self.client = docker.from_env()
        self.image = image
        self.memory_limit = memory_limit
        self.cpu_period = cpu_period
        self.cpu_quota = cpu_quota
        self.timeout = timeout
        self.network_disabled = network_disabled

    def execute(self, code: str, language: str = "python") -> dict:
        if language not in self.RUNNERS:
            raise ValueError(f"Unsupported language: {language}")
        interpreter, ext = self.RUNNERS[language]
        with tempfile.NamedTemporaryFile(mode="w", suffix=ext, delete=False) as f:
            f.write(code)
            host_path = f.name

        container = None
        try:
            container = self.client.containers.run(
                image=self.image,
                command=[interpreter, f"/code/main{ext}"],
                volumes={host_path: {"bind": f"/code/main{ext}", "mode": "ro"}},
                mem_limit=self.memory_limit,
                memswap_limit=self.memory_limit,  # Disable swap
                cpu_period=self.cpu_period,
                cpu_quota=self.cpu_quota,
                network_disabled=self.network_disabled,
                read_only=True,  # Read-only filesystem
                tmpfs={"/tmp": "size=10m"},
                pids_limit=64,  # Prevent fork bombs
                detach=True,  # Return immediately so wait() can enforce the timeout
            )

            try:
                result = container.wait(timeout=self.timeout)
            except (requests.exceptions.ReadTimeout, requests.exceptions.ConnectionError):
                # wait() surfaces the HTTP client timeout rather than a docker APIError
                try:
                    container.kill()
                except docker.errors.APIError:
                    pass
                return {"exit_code": -1, "stdout": "", "stderr": "Execution timed out", "timed_out": True}

            stdout = container.logs(stdout=True, stderr=False).decode("utf-8", errors="replace")
            stderr = container.logs(stdout=False, stderr=True).decode("utf-8", errors="replace")

            return {
                "exit_code": result.get("StatusCode", -1),
                "stdout": stdout[:10000],
                "stderr": stderr[:10000],
                "timed_out": False,
            }
        except docker.errors.DockerException as e:
            return {"exit_code": -1, "stdout": "", "stderr": str(e), "timed_out": False}
        finally:
            # Remove the container here instead of remove=True, so logs stay readable after exit
            if container is not None:
                try:
                    container.remove(force=True)
                except docker.errors.APIError:
                    pass
            os.unlink(host_path)

Critical security settings:

network_disabled=True — Fully block network access, preventing data exfiltration and remote code downloads
read_only=True — Read-only filesystem, preventing malicious file writes
mem_limit + memswap_limit — Cap memory usage, stopping memory bombs
pids_limit — Limit process count, preventing fork bombs
cpu_quota — Throttle CPU, preventing infinite loops from monopolizing resources
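The "50% CPU" comment on cpu_quota follows from how the two scheduler settings relate: the container may use cpu_quota microseconds of CPU time in every cpu_period microseconds. A minimal sketch of that arithmetic, where cpu_quota_for is a hypothetical helper name and not part of the Docker SDK:

```python
def cpu_quota_for(cpu_fraction: float, cpu_period: int = 100_000) -> int:
    """Return the cpu_quota (microseconds) that grants cpu_fraction of one core."""
    if cpu_fraction <= 0:
        raise ValueError("cpu_fraction must be positive")
    return round(cpu_fraction * cpu_period)


print(cpu_quota_for(0.5))  # 50000, the default used by DockerSandbox
print(cpu_quota_for(2.0))  # 200000, up to two full cores
```

The throttle only slows a busy loop down; it is the timeout that ends it.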

Approach 2: Process-Level Isolation (Low-Latency Scenarios)

For scenarios requiring minimal latency (e.g., online code assistants), process-level isolation is the practical choice.

import subprocess
import resource

class ProcessSandbox:
    def __init__(self, timeout: int = 10, max_memory_mb: int = 64):
        self.timeout = timeout
        self.max_memory = max_memory_mb * 1024 * 1024

    def execute(self, code: str) -> dict:
        import tempfile, os
        with tempfile.NamedTemporaryFile(mode="w", suffix=".py", delete=False) as f:
            safe_code = self._inject_safety(code)
            f.write(safe_code)
            path = f.name

        try:
            proc = subprocess.Popen(
                ["python", "-S", path],  # -S skips the site module, so site-packages is not added to sys.path
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                preexec_fn=self._set_limits,
                env={"PATH": "/usr/bin:/bin"},
            )
            try:
                stdout, stderr = proc.communicate(timeout=self.timeout)
                return {
                    "exit_code": proc.returncode,
                    "stdout": stdout.decode("utf-8", errors="replace")[:10000],
                    "stderr": stderr.decode("utf-8", errors="replace")[:10000],
                    "timed_out": False,
                }
            except subprocess.TimeoutExpired:
                proc.kill()
                return {"exit_code": -1, "stdout": "", "stderr": "Timeout", "timed_out": True}
        finally:
            os.unlink(path)

    def _set_limits(self):
        """Set resource limits in the child process"""
        resource.setrlimit(resource.RLIMIT_AS, (self.max_memory, self.max_memory))
        resource.setrlimit(resource.RLIMIT_CPU, (self.timeout, self.timeout))
        resource.setrlimit(resource.RLIMIT_NOFILE, (10, 10))

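    def _inject_safety(self, code: str) -> str:
        """Wrap the code so it runs with blocked modules and a reduced set of builtins"""
        blocked_imports = ["os", "subprocess", "socket", "http", "urllib", "requests", "shutil"]
        blocked_builtins = ["exec", "eval", "compile", "open", "input"]
        prelude = [
            "import builtins, sys",
            # A None entry in sys.modules makes a later import of that name raise ImportError
            f"for _mod in {blocked_imports!r}: sys.modules[_mod] = None",
            # In __main__, __builtins__ is a module, so read its namespace through vars()
            f"_safe = {{k: v for k, v in vars(builtins).items() if k not in {blocked_builtins!r}}}",
            # Reassigning __builtins__ does not affect the module that is already running,
            # so the code is executed in a fresh namespace that starts with the reduced set
            f"exec(compile({code!r}, '<sandboxed>', 'exec'), {{'__builtins__': _safe, '__name__': '__main__'}})",
        ]
        return "\n".join(prelude) + "\n"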

Advantage: Startup latency under 50ms, suitable for fast-feedback scenarios.

Limitation: Process-level isolation is weaker than containers. The Python documentation warns that preexec_fn is not safe when the parent process uses threads, and module blacklisting can be undone from inside the sandboxed code.
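To make the blacklist weakness concrete, here is a small sketch of one bypass. It blocks socket with the same sys.modules technique used in _inject_safety, then shows that the blocked code can simply delete the entry and import again. The CHILD program and run_bypass_demo are illustrative names; the demo runs in a separate interpreter so the caller's own module table is untouched.

```python
import subprocess
import sys

# Child program: block `socket` through sys.modules, then undo the block.
CHILD = """
import sys
sys.modules['socket'] = None      # blacklist entry
try:
    import socket
except ImportError:
    print('blocked')
del sys.modules['socket']         # the sandboxed code removes the entry...
import socket                     # ...and the import succeeds again
print('bypassed', hasattr(socket, 'socket'))
"""


def run_bypass_demo() -> str:
    """Run the child in a separate interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True, text=True, timeout=30,
    )
    return result.stdout


if __name__ == "__main__":
    print(run_bypass_demo())
```

The first import is refused and the second succeeds, which is why the blacklist should be treated as a speed bump and the OS-level limits as the actual boundary.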

Approach 3: WebAssembly (Browser + Server)

WebAssembly provides a true least-privilege sandbox — WASM modules cannot access filesystem, network, or syscalls by default.

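import os
import tempfile

from wasmtime import Engine, Linker, Module, Store, WasiConfig

class WasmSandbox:
    def __init__(self, wasm_bytes: bytes):
        self.engine = Engine()
        self.module = Module(self.engine, wasm_bytes)
        # The linker supplies the WASI imports the module expects
        self.linker = Linker(self.engine)
        self.linker.define_wasi()

    def execute(self, input_data: str) -> str:
        # WasiConfig takes file paths for stdio, so stdin and stdout go through temp files
        with tempfile.TemporaryDirectory() as tmp:
            stdin_path = os.path.join(tmp, "stdin")
            stdout_path = os.path.join(tmp, "stdout")
            with open(stdin_path, "w", encoding="utf-8") as f:
                f.write(input_data)

            wasi_config = WasiConfig()
            wasi_config.stdin_file = stdin_path
            wasi_config.stdout_file = stdout_path
            # Only this directory is visible to the guest; how to mark it read-only
            # depends on the installed wasmtime version
            wasi_config.preopen_dir(".", "/sandbox")

            store = Store(self.engine)  # Fresh store per run, so no state carries over
            store.set_wasi(wasi_config)
            instance = self.linker.instantiate(store, self.module)

            start = instance.exports(store)["_start"]
            start(store)

            return self._read_stdout(stdout_path)

    def _read_stdout(self, path: str) -> str:
        with open(path, encoding="utf-8", errors="replace") as f:
            return f.read()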

Advantage: Theoretically the most secure sandbox — WASM modules can only do what you explicitly authorize.

Limitation: Limited language support (requires compilation to WASM), not suitable for Python code needing the full standard library.

How to Choose

Requirement	Recommended Approach	Reason
Execute arbitrary Python code	Docker container	Strong isolation, broadest language support
Online code assistant (low latency)	Process-level	Fast startup, acceptable weaker isolation
Specific algorithms (no I/O)	WebAssembly	Strongest security, instant startup
Frontend agent executing user code	WebAssembly (browser)	Zero server cost, natural isolation
Filesystem read/write needed	Docker + tmpfs	Provides temporary writable space in container

Common Mistakes

Mistake 1: "Static analysis is enough, no sandbox needed" Python's eval, exec, __import__, ctypes, and subprocess can all bypass static analysis. Even if you check import statements, code can still reach a blocked module at runtime through __import__ or by manipulating __builtins__, using names assembled from strings the analyzer never sees. Sandboxing and static analysis complement each other; they are not substitutes.
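A short sketch of why import checking falls short. The checker below is a deliberately naive example written for this illustration (find_blocked_imports and BLOCKED are hypothetical names): it walks the syntax tree looking for import statements, which is enough to catch the direct form and not enough to catch a module name built at runtime.

```python
import ast

BLOCKED = {"os", "subprocess", "socket"}


def find_blocked_imports(source: str) -> set[str]:
    """Return blocked top-level module names that appear in import statements."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        # Compare only the top-level package, so "os.path" counts as "os"
        found.update(n.split(".")[0] for n in names if n.split(".")[0] in BLOCKED)
    return found


direct = "import os\nos.listdir('.')"
obfuscated = "m = __import__('o' + 's')\nm.listdir('.')"

print(find_blocked_imports(direct))      # {'os'}
print(find_blocked_imports(obfuscated))  # set()
```

Both snippets list the current directory when executed; only the first is flagged.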

Mistake 2: "Docker is secure by default" A default-configured Docker container is not secure. Without network_disabled, read_only, mem_limit, and pids_limit, code inside can mine cryptocurrency, scan internal networks, or attempt container escape. Security must be configured.
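One way to keep these settings from being dropped by accident is to audit the keyword arguments before they reach containers.run. The sketch below is a hypothetical pre-flight check, not part of the Docker SDK; missing_hardening and REQUIRED_SETTINGS are names chosen for this example, and the set of checks mirrors the four settings named above.

```python
# Each required setting maps to a predicate that decides whether its value is safe.
REQUIRED_SETTINGS = {
    "network_disabled": lambda v: v is True,
    "read_only": lambda v: v is True,
    "mem_limit": lambda v: bool(v),
    "pids_limit": lambda v: isinstance(v, int) and v > 0,
}


def missing_hardening(run_kwargs: dict) -> list[str]:
    """List the hardening settings that are absent or set to an unsafe value."""
    return [
        name for name, is_safe in REQUIRED_SETTINGS.items()
        if name not in run_kwargs or not is_safe(run_kwargs[name])
    ]


default_like = {"image": "python:3.12-slim", "detach": True}
hardened = {**default_like, "network_disabled": True, "read_only": True,
            "mem_limit": "128m", "pids_limit": 64}

print(missing_hardening(default_like))  # ['network_disabled', 'read_only', 'mem_limit', 'pids_limit']
print(missing_hardening(hardened))      # []
```

A caller could refuse to start the container whenever the returned list is non-empty.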

Mistake 3: "Timeouts are enough, resource limits are unnecessary" timeout=30 cannot stop memory bombs: a single line like [0] * 10**10 requests roughly 80 GB on a 64-bit build and can exhaust memory within seconds, triggering the OOM Killer and affecting other services on the host. Memory limits are mandatory.
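The effect of a memory cap can be seen with a small experiment. This is a sketch that assumes a Linux host, where RLIMIT_AS is enforced; the 512 MiB cap, the 2 GiB allocation, and the helper names are arbitrary choices for the demonstration, and the allocation is kept far smaller than the example above so the demo itself stays harmless.

```python
import resource
import subprocess
import sys

LIMIT_BYTES = 512 * 1024 * 1024         # address-space cap for the child
BOMB = "data = bytearray(2 * 1024**3)"  # tries to allocate 2 GiB in one call


def _cap_memory() -> None:
    # Runs in the child between fork and exec, so the cap applies only to the child
    resource.setrlimit(resource.RLIMIT_AS, (LIMIT_BYTES, LIMIT_BYTES))


def run_capped(code: str) -> subprocess.CompletedProcess:
    """Run code in a fresh interpreter whose address space is capped."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=30,
        preexec_fn=_cap_memory,
    )


if __name__ == "__main__":
    result = run_capped(BOMB)
    print(result.returncode)
    print(result.stderr.strip().splitlines()[-1])
```

With the cap in place the allocation fails inside the child with a MemoryError well before the 30 second timeout, and the host is never asked for the memory.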

Summary

Sandboxing is a foundational prerequisite for agent code execution, not an optional extra
Docker containers are the most versatile approach, but must be configured: disable networking, read-only filesystem, memory limits, process count limits
Process-level isolation suits low-latency scenarios but provides weaker security — combine with module blacklisting and resource limits
WebAssembly offers the strongest security guarantees but has limited language support
Timeouts alone cannot stop memory bombs and fork bombs — resource limits are mandatory

Prepared by AgentList. Explore more agent sandboxing projects in our directory.

Projects in this article

ZeroBox

LLM Sandbox

Dify Sandbox

WebContainer

OpenSandbox