Sandboxing AI Agents: Isolation Strategies for Safe Code Execution

Comparing container, WebAssembly, and process-level isolation approaches, with practical code for safely executing agent-generated code.

AgentList Team · April 21, 2026
AI Agent · Sandbox · Code Execution · Docker · Security

When an agent needs to run code it generated, "only execute safe code" is a false premise — LLMs cannot reliably judge whether their own output is safe. The only viable approach: execute all code in an isolated environment, assuming every snippet is malicious.

Why Agents Need Sandboxing

Many teams' first reaction is "we will make the agent generate safe code." That path fails for three reasons:

  • LLMs cannot reliably judge code safety: A seemingly harmless os.listdir() can be used for reconnaissance, and eval() executes whatever string it is handed
  • Indirect injection can hijack code generation: Attackers use prompt injection to make agents generate malicious code rather than injecting it directly
  • Even legitimate needs can go wrong: Agent-generated code may contain infinite loops, memory leaks, or accidental file deletions

Sandboxing is not an "extra security layer" — it is the foundational prerequisite for agent code execution.

Three Sandboxing Approaches Compared

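| Approach         | Isolation Strength | Startup Latency | Language Support       | Complexity |
|------------------|--------------------|-----------------|------------------------|------------|
| Docker container | High               | 1-5s            | All                    | Medium     |
| WebAssembly      | Medium-High        | <100ms          | Limited (Rust/C/Go/JS) | High       |
| Process-level    | Medium             | <50ms           | All                    | Low        |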

Approach 1: Docker Container Sandbox

The most versatile option. Each code execution request spins up an isolated Docker container that is destroyed after completion.

import os
import tempfile

import docker
import requests  # docker-py reports wait() timeouts as requests exceptions


class DockerSandbox:
    # (interpreter, file extension) per supported language.
    # The chosen image must ship the interpreter: python:3.12-slim has no node.
    RUNNERS = {"python": ("python", ".py"), "javascript": ("node", ".js")}

    def __init__(
        self,
        image: str = "python:3.12-slim",
        memory_limit: str = "128m",
        cpu_period: int = 100000,
        cpu_quota: int = 50000,  # 50% of one CPU
        timeout: int = 30,
        network_disabled: bool = True,
    ):
        self.client = docker.from_env()
        self.image = image
        self.memory_limit = memory_limit
        self.cpu_period = cpu_period
        self.cpu_quota = cpu_quota
        self.timeout = timeout
        self.network_disabled = network_disabled

    def execute(self, code: str, language: str = "python") -> dict:
        if language not in self.RUNNERS:
            raise ValueError(f"Unsupported language: {language}")
        interpreter, ext = self.RUNNERS[language]
        with tempfile.NamedTemporaryFile(mode="w", suffix=ext, delete=False) as f:
            f.write(code)
            host_path = f.name

        container = None
        try:
            container = self.client.containers.run(
                image=self.image,
                command=[interpreter, f"/code/main{ext}"],
                volumes={host_path: {"bind": f"/code/main{ext}", "mode": "ro"}},
                mem_limit=self.memory_limit,
                memswap_limit=self.memory_limit,  # Equal to mem_limit, so no swap
                cpu_period=self.cpu_period,
                cpu_quota=self.cpu_quota,
                network_disabled=self.network_disabled,
                read_only=True,  # Read-only root filesystem
                tmpfs={"/tmp": "size=10m"},  # Small writable scratch space
                pids_limit=64,  # Prevent fork bombs
                detach=True,
                # No remove=True: auto-removal can race with the logs() calls below
            )
            try:
                result = container.wait(timeout=self.timeout)
            except (requests.exceptions.ReadTimeout, requests.exceptions.ConnectionError):
                # The container is still running; the finally block force-removes it
                return {"exit_code": -1, "stdout": "", "stderr": "Execution timed out", "timed_out": True}

            stdout = container.logs(stdout=True, stderr=False).decode("utf-8", errors="replace")
            stderr = container.logs(stdout=False, stderr=True).decode("utf-8", errors="replace")
            return {
                "exit_code": result.get("StatusCode", -1),
                "stdout": stdout[:10000],
                "stderr": stderr[:10000],
                "timed_out": False,
            }
        except docker.errors.DockerException as e:
            return {"exit_code": -1, "stdout": "", "stderr": str(e), "timed_out": False}
        finally:
            if container is not None:
                try:
                    container.remove(force=True)  # Kills the container if it is still running
                except docker.errors.DockerException:
                    pass
            os.unlink(host_path)

Critical security settings:

  • network_disabled=True — Fully block network access, preventing data exfiltration and remote code downloads
  • read_only=True — Read-only filesystem, preventing malicious file writes
  • mem_limit + memswap_limit — Cap memory usage, stopping memory bombs
  • pids_limit — Limit process count, preventing fork bombs
  • cpu_quota — Throttle CPU, preventing infinite loops from monopolizing resources
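One way to keep these settings from being dropped by accident is to build them in one place and check them before every run. The sketch below is illustrative only: hardened_run_kwargs and missing_settings are hypothetical helper names, and the cap_drop and security_opt entries are extra docker-py options that go beyond the list above.

```python
# Settings from the list above that every sandboxed run is expected to carry.
REQUIRED_SETTINGS = ("network_disabled", "read_only", "mem_limit", "pids_limit", "cpu_quota")


def hardened_run_kwargs(memory_limit: str = "128m", cpu_quota: int = 50000, pids_limit: int = 64) -> dict:
    """Keyword arguments intended for docker-py's containers.run()."""
    return {
        "network_disabled": True,          # no network access
        "read_only": True,                 # read-only root filesystem
        "tmpfs": {"/tmp": "size=10m"},     # small writable scratch space
        "mem_limit": memory_limit,
        "memswap_limit": memory_limit,     # equal to mem_limit, so no swap
        "cpu_period": 100000,
        "cpu_quota": cpu_quota,
        "pids_limit": pids_limit,
        # Assumptions beyond the list above: drop Linux capabilities and
        # forbid privilege escalation inside the container.
        "cap_drop": ["ALL"],
        "security_opt": ["no-new-privileges"],
    }


def missing_settings(run_kwargs: dict) -> list:
    """Return the required settings that are absent or explicitly switched off."""
    return [name for name in REQUIRED_SETTINGS if not run_kwargs.get(name)]
```

A caller could refuse to start a container whenever missing_settings returns a non-empty list, which turns a forgotten flag into an immediate error instead of a silent gap.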

Approach 2: Process-Level Isolation (Low-Latency Scenarios)

For scenarios requiring minimal latency (e.g., online code assistants), process-level isolation is the practical choice.

import os
import resource
import subprocess
import sys
import tempfile


class ProcessSandbox:
    def __init__(self, timeout: int = 10, max_memory_mb: int = 64):
        self.timeout = timeout
        self.max_memory = max_memory_mb * 1024 * 1024

    def execute(self, code: str) -> dict:
        with tempfile.NamedTemporaryFile(mode="w", suffix=".py", delete=False) as f:
            f.write(self._inject_safety(code))
            path = f.name

        try:
            proc = subprocess.Popen(
                [sys.executable, "-S", path],  # -S: do not import the site module
                stdin=subprocess.DEVNULL,  # do not inherit the parent's stdin
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                preexec_fn=self._set_limits,
                env={"PATH": "/usr/bin:/bin"},
            )
            try:
                stdout, stderr = proc.communicate(timeout=self.timeout)
                return {
                    "exit_code": proc.returncode,
                    "stdout": stdout.decode("utf-8", errors="replace")[:10000],
                    "stderr": stderr.decode("utf-8", errors="replace")[:10000],
                    "timed_out": False,
                }
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.communicate()  # reap the killed child and close its pipes
                return {"exit_code": -1, "stdout": "", "stderr": "Timeout", "timed_out": True}
        finally:
            os.unlink(path)

    def _set_limits(self):
        """Set resource limits in the child process (runs between fork and exec)"""
        resource.setrlimit(resource.RLIMIT_AS, (self.max_memory, self.max_memory))
        resource.setrlimit(resource.RLIMIT_CPU, (self.timeout, self.timeout))
        resource.setrlimit(resource.RLIMIT_NOFILE, (10, 10))

    def _inject_safety(self, code: str) -> str:
        """Wrap the code in a preamble that applies best-effort restrictions"""
        blocked_imports = ["os", "subprocess", "socket", "http", "urllib", "requests", "shutil"]
        blocked_builtins = ["exec", "eval", "compile", "open", "input"]
        preamble = [
            "import builtins as _b, sys as _sys",
            # A None entry in sys.modules makes any later import of that name fail
            f"for _m in {blocked_imports!r}: _sys.modules[_m] = None",
            f"_safe = {{k: v for k, v in vars(_b).items() if k not in {blocked_builtins!r}}}",
            # Run the agent code in a fresh namespace that only sees the reduced builtins
            f"exec(compile({code!r}, '<agent>', 'exec'), {{'__builtins__': _safe, '__name__': '__main__'}})",
        ]
        return "\n".join(preamble) + "\n"

Advantage: Startup latency under 50ms, suitable for fast-feedback scenarios.

Limitation: Process-level isolation is less secure than containers. Python's documentation warns that preexec_fn is not safe when the parent process uses threads, so the resource limits may not be applied reliably, and module blacklisting can be bypassed.
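To make the bypass concrete, the sketch below applies the same sys.modules trick to a single module and then reaches the same functionality through the C extension behind it. It assumes a standard CPython build where socket is backed by the _socket extension; blocklist_demo is a hypothetical helper name, and the function restores sys.modules before returning.

```python
import importlib
import sys


def blocklist_demo(blocked: str = "socket", backing: str = "_socket") -> dict:
    """Block one module name, then show that its backing module still imports."""
    had_entry = blocked in sys.modules
    saved = sys.modules.get(blocked)   # remember the real entry so it can be restored
    sys.modules[blocked] = None        # a None entry makes imports of this name fail
    try:
        try:
            importlib.import_module(blocked)
            direct_import_blocked = False
        except ImportError:
            direct_import_blocked = True
        # The extension module has a different name, so the blocklist never sees it.
        backing_module = importlib.import_module(backing)
        bypassed = hasattr(backing_module, "socket")
    finally:
        if had_entry:
            sys.modules[blocked] = saved
        else:
            del sys.modules[blocked]
    return {"direct_import_blocked": direct_import_blocked, "bypassed": bypassed}
```

A name-based blocklist has to anticipate every alias, extension module, and re-export, which is why it works as a speed bump but not as a security boundary.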

Approach 3: WebAssembly (Browser + Server)

WebAssembly provides a true least-privilege sandbox: by default, WASM modules cannot access the filesystem, the network, or host syscalls.

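import os
import tempfile

from wasmtime import Engine, Linker, Module, Store, WasiConfig


class WasmSandbox:
    def __init__(self, wasm_bytes: bytes):
        self.engine = Engine()
        self.module = Module(self.engine, wasm_bytes)
        self.linker = Linker(self.engine)
        self.linker.define_wasi()  # Expose WASI imports; access is still governed by WasiConfig

    def execute(self, input_data: str) -> str:
        with tempfile.TemporaryDirectory() as workdir:
            stdin_path = os.path.join(workdir, "stdin")
            stdout_path = os.path.join(workdir, "stdout")
            with open(stdin_path, "w", encoding="utf-8") as f:
                f.write(input_data)

            wasi_config = WasiConfig()
            wasi_config.stdin_file = stdin_path    # Input is passed through a file
            wasi_config.stdout_file = stdout_path  # Output is captured to a file
            # No preopen_dir() call: the module gets no filesystem access at all.
            # Read-only preopens depend on the installed wasmtime-py version.

            store = Store(self.engine)  # Fresh store per run, so no state carries over
            store.set_wasi(wasi_config)
            instance = self.linker.instantiate(store, self.module)

            start = instance.exports(store)["_start"]
            start(store)  # A module that exits instead of returning raises here

            return self._read_stdout(stdout_path)

    def _read_stdout(self, path: str) -> str:
        with open(path, encoding="utf-8", errors="replace") as f:
            return f.read()[:10000]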

Advantage: Theoretically the most secure sandbox — WASM modules can only do what you explicitly authorize.

Limitation: Limited language support (requires compilation to WASM), not suitable for Python code needing the full standard library.

How to Choose

| Requirement                         | Recommended Approach  | Reason                                         |
|-------------------------------------|-----------------------|------------------------------------------------|
| Execute arbitrary Python code       | Docker container      | Strongest isolation, broadest language support |
| Online code assistant (low latency) | Process-level         | Fast startup, acceptable weaker isolation      |
| Specific algorithms (no I/O)        | WebAssembly           | Strongest security, instant startup            |
| Frontend agent executing user code  | WebAssembly (browser) | Zero server cost, natural isolation            |
| Filesystem read/write needed        | Docker + tmpfs        | Provides temporary writable space in container |

Common Mistakes

Mistake 1: "Static analysis is enough, no sandbox needed"

Python's eval, exec, and __import__ build behavior at runtime from strings, and ctypes and subprocess hand control to code outside the interpreter, so none of them can be fully vetted by static analysis. Even if you check import statements, runtime manipulation of __builtins__ can reintroduce whatever you filtered out. Sandboxing and static analysis complement each other; they are not substitutes.
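The gap between what an import check sees and what actually runs can be shown with a few lines of ast. This is a minimal sketch, assuming a checker that only inspects import statements; find_blocked_imports and the BLOCKED set are hypothetical names chosen for illustration.

```python
import ast

BLOCKED = {"os", "subprocess", "socket"}


def find_blocked_imports(source: str) -> set:
    """Return blocked top-level module names that appear in import statements."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found & BLOCKED


# A plain import is caught by the checker.
obvious = "import os\nprint(os.listdir('.'))"

# The module name is assembled at runtime, so there is no import node to find.
evasive = "mod = __import__('o' + 's')\nprint(mod.listdir('.'))"

obvious_hits = find_blocked_imports(obvious)
evasive_hits = find_blocked_imports(evasive)
```

Here obvious_hits contains "os" while evasive_hits is empty, even though both snippets would list the current directory if executed. Extending the checker to flag __import__ only moves the problem to getattr, string concatenation, and similar indirection.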

Mistake 2: "Docker is secure by default"

A default-configured Docker container is not secure. Without network_disabled, read_only, mem_limit, and pids_limit, code inside can mine cryptocurrency, scan internal networks, or attempt container escape. Security must be configured.

Mistake 3: "Timeouts are enough, resource limits are unnecessary"

timeout=30 cannot stop memory bombs: a single line like [0] * 10**10 asks for roughly 80 GB on a 64-bit build (8 bytes per list slot) and can exhaust memory within seconds, triggering the OOM Killer and affecting other services on the host. Memory limits are mandatory.
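The effect of a memory cap on that one-liner can be sketched with the standard library alone. The example assumes a Linux host where RLIMIT_AS is enforced; run_with_memory_cap is a hypothetical helper name, and the 512 MiB cap is an arbitrary illustrative value.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(snippet: str, max_bytes: int, timeout: int = 10) -> int:
    """Run snippet in a child interpreter whose address space is capped."""

    def apply_cap() -> None:
        # Runs in the child between fork and exec; the parent is unaffected.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    proc = subprocess.run(
        [sys.executable, "-S", "-c", snippet],
        preexec_fn=apply_cap,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    return proc.returncode


CAP = 512 * 1024 * 1024  # 512 MiB, illustrative only

bomb_status = run_with_memory_cap("x = [0] * 10**10", CAP)   # the memory bomb
small_status = run_with_memory_cap("x = [0] * 10", CAP)      # a harmless allocation
```

With the cap in place the oversized allocation should fail inside the child with a MemoryError and a nonzero exit status, long before any timeout fires, while the small allocation exits cleanly. The timeout still matters for CPU-bound loops; the two limits cover different failure modes.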

Summary

  • Sandboxing is a foundational prerequisite for agent code execution, not an optional extra
  • Docker containers are the most versatile approach, but must be configured: disable networking, read-only filesystem, memory limits, process count limits
  • Process-level isolation suits low-latency scenarios but provides weaker security — combine with module blacklisting and resource limits
  • WebAssembly offers the strongest security guarantees but has limited language support
  • Timeouts alone cannot stop memory bombs and fork bombs — resource limits are mandatory

Prepared by AgentList. Explore more agent sandboxing projects in our directory.