langchain - feat(community): Add exec-sandbox integration -- self-hosted hardware-isolated code execution (QEMU microVMs) [2 comments, 3 participants]

langchain · 2026-03-04 15:15:18

langchain-ai/langchain#35555 • Fetched 2026-04-08 00:25:38

Timeline: commented ×2, mentioned ×2, subscribed ×2, closed ×1

LangChain's code execution story currently has a gap between local-but-limited sandboxes (the now-archived langchain-sandbox / Pyodide) and cloud-managed sandboxes (E2B, Modal, Daytona, Runloop). There is no self-hosted option that provides hardware-level isolation, multi-language support, and production-grade security without requiring a cloud account or external API key.

exec-sandbox fills this gap. It runs each Python, JavaScript, or shell execution in a dedicated QEMU microVM with hardware virtualization (KVM on Linux, HVF on macOS). The VM boots, runs code, and is destroyed -- no state leaks between executions. Apache-2.0 licensed.

This issue proposes adding exec-sandbox as a LangChain community integration, covering three surfaces:

ExecSandboxTool -- a BaseTool subclass for ReAct agents
ExecSandboxBackend -- a SandboxBackendProtocol implementation for Deep Agents
exec_sandbox_eval -- an eval function for langgraph-codeact

Motivation

The current landscape

| Solution | Isolation | Self-hosted | Languages | Maintained |
| --- | --- | --- | --- | --- |
| `langchain-sandbox` (Pyodide) | WASM + Deno | Yes | Python only | Archived (Jan 2026) |
| E2B | Firecracker VM | Complex (Terraform + Nomad) | Python, JS/TS, R, Java, Bash | Yes |
| Modal | gVisor (userspace kernel) | No (cloud only) | Python | Yes |
| Daytona | Docker / Kata / Sysbox | Yes (self-hosted via K8s, AGPL-3.0) | Multi | Yes |
| Runloop | Cloud sandbox | No (cloud API) | Multi | Yes |
| exec-sandbox | Hardware VM (KVM/HVF) | Yes (`pip install`) | Python, JS/TS, Shell | Yes |

Why exec-sandbox complements existing integrations

Data sovereignty -- Code and data never leave the machine. Required for regulated industries and enterprise security policies that prohibit sending code to external services.
No cloud dependency -- pip install exec-sandbox + brew install qemu is the entire setup. No API key, no cloud account.
macOS development -- Develop and test locally on Mac (HVF) with the same isolation model as production Linux (KVM). E2B, Modal, Daytona, and Runloop require Linux/cloud for execution.
Hardware-level isolation -- Each execution runs in a dedicated QEMU microVM with its own kernel. The security model includes hardware virtualization (KVM/HVF), a hardened kernel (~360 subsystems stripped), EROFS read-only rootfs, seccomp, cgroups v2, namespaces, and unprivileged QEMU. See the security documentation for details.
Full runtime support -- Runs native CPython 3.14 / Bun 1.3 / Bash. pip install works for any package with a prebuilt musllinux wheel, so far more of the ecosystem is available than under Pyodide (300+ compatible packages).

Both cloud-managed sandboxes (E2B, Modal, Daytona, Runloop) and exec-sandbox can coexist as LangChain integrations. Cloud sandboxes are ideal for teams that want managed infrastructure. exec-sandbox is for teams that need self-hosted execution with full control over the security boundary.

Proposed Implementations

1. `ExecSandboxTool` -- BaseTool for ReAct Agents

Follows the same pattern as E2BDataAnalysisTool in langchain_community.tools. Key difference: native async support (_arun).

from langchain_core.tools import BaseTool
from pydantic import BaseModel, Field
from exec_sandbox import ExecutionResult, Scheduler, SchedulerConfig


class ExecSandboxInput(BaseModel):
    code: str = Field(..., description="The code to execute.")
    language: str = Field(default="python", description="'python', 'javascript', or 'raw' (shell).")
    packages: list[str] = Field(default_factory=list, description="Packages to install (e.g., ['pandas==2.2.0']).")


class ExecSandboxTool(BaseTool):
    """Execute code in a hardware-isolated QEMU microVM."""

    name: str = "exec_sandbox"
    description: str = (
        "Execute code in a secure, hardware-isolated sandbox (QEMU microVM). "
        "Supports Python, JavaScript/TypeScript, and shell commands. "
        "Returns stdout, stderr, and exit code."
    )
    args_schema: type[BaseModel] = ExecSandboxInput

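    # allow_network, timeout_seconds, _ensure_scheduler(), and _format_result() are not
    # defined in this excerpt; they belong to the full implementation promised for the PR.
    async def _arun(self, code: str, language: str = "python", packages: list[str] | None = None, **kwargs) -> str: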
        scheduler = await self._ensure_scheduler()
        result = await scheduler.run(code=code, language=language, packages=packages or [],
                                     allow_network=self.allow_network, timeout_seconds=self.timeout_seconds)
        return _format_result(result)

Full implementation with _run, _arun, lifecycle management, and error handling will be provided in the PR.

Usage with LangGraph ReAct agent:

from langchain.chat_models import init_chat_model
from langgraph.prebuilt import create_react_agent

model = init_chat_model("claude-sonnet-4-20250514", model_provider="anthropic")
tool = ExecSandboxTool(timeout_seconds=60, memory_mb=512)
agent = create_react_agent(model, [tool])

result = await agent.ainvoke({
    "messages": [{"role": "user", "content": "Calculate the first 20 Fibonacci numbers"}]
})

2. `ExecSandboxBackend` -- SandboxBackendProtocol for Deep Agents

Integrates with the Deep Agents sandbox system. The only required method is execute() -- all filesystem operations are built on top of it by BaseSandbox. The excerpt shows the async variant, aexecute().

from deepagents.backends.protocol import ExecuteResponse, SandboxBackendProtocol
from exec_sandbox import Scheduler, SchedulerConfig, Session


class ExecSandboxBackend(SandboxBackendProtocol):
    """Self-hosted QEMU microVM sandbox for Deep Agents."""

    @property
    def id(self) -> str:
        return "exec-sandbox"

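    async def aexecute(self, command: str, *, timeout: int | None = None) -> ExecuteResponse:
        # _ensure_session() and _timeout_seconds come from the full implementation (not shown).
        session = await self._ensure_session()
        # Note: `timeout or ...` also falls back to the default when timeout is 0.
        result = await session.exec(code=command, timeout_seconds=timeout or self._timeout_seconds)
        # Only stdout is returned here; truncation handling is deferred to the PR.
        return ExecuteResponse(output=result.stdout or "", exit_code=result.exit_code, truncated=False)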

Full implementation with lifecycle management, truncation handling, and execute() sync wrapper will be provided in the PR.

Usage with Deep Agents:

from deepagents import create_deep_agent
from langchain.chat_models import init_chat_model
from langchain_exec_sandbox import ExecSandboxBackend

model = init_chat_model("claude-sonnet-4-20250514", model_provider="anthropic")
backend = ExecSandboxBackend(memory_mb=512, allow_network=True, allowed_domains=["api.github.com"])

agent = create_deep_agent(
    model=model,
    backend=backend,
    system_prompt="You are a software engineer. Use the sandbox to write, test, and debug code.",
)

3. `exec_sandbox_eval` -- Eval Function for langgraph-codeact

The CodeAct architecture requires a sandbox function with signature (code: str, _locals: dict) -> tuple[str, dict]. The default eval() using Python's exec() is explicitly documented as unsafe for production.

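import asyncio
from collections.abc import Callable

from exec_sandbox import Scheduler

def create_exec_sandbox_eval(scheduler: Scheduler, timeout_seconds: int = 30) -> Callable[[str, dict], tuple[str, dict]]:
    """Create a CodeAct-compatible eval function backed by exec-sandbox."""
    def eval_fn(code: str, _locals: dict) -> tuple[str, dict]:
        # Blocks until the run finishes. run_until_complete() raises RuntimeError
        # if this event loop is already running, so call eval_fn from sync code.
        result = asyncio.get_event_loop().run_until_complete(
            scheduler.run(code=code, language="python", timeout_seconds=timeout_seconds)
        )
        output = result.stdout or ""
        if result.stderr:
            output += f"\n[stderr] {result.stderr}"
        # Stateless variant: no variables are carried over to the next turn.
        return output, {}
    return eval_fn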

Full implementation with variable serialization across turns and a session-backed stateful variant will be provided in the PR.
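
Because each execution runs in a fresh VM, carrying variables across turns means serializing them on the host and replaying them in the next run. The sketch below shows one possible approach, assuming only JSON-compatible values are worth preserving; `serializable_locals` and `prelude` are hypothetical helpers, not the implementation planned for the PR.

```python
import json
from typing import Any


def serializable_locals(namespace: dict[str, Any]) -> dict[str, Any]:
    """Keep only public variables whose values survive a JSON round trip."""
    kept: dict[str, Any] = {}
    for name, value in namespace.items():
        if name.startswith("_"):
            continue  # skip private names and dunders such as __builtins__
        try:
            # allow_nan=False rejects NaN/Infinity, whose repr is not valid Python.
            kept[name] = json.loads(json.dumps(value, allow_nan=False))
        except (TypeError, ValueError):
            continue  # modules, functions, open files, and similar objects
    return kept


def prelude(variables: dict[str, Any]) -> str:
    """Emit Python source that restores the variables in a fresh interpreter."""
    return "".join(f"{name} = {value!r}\n" for name, value in variables.items())
```

The generated prelude could be prepended to the next code snippet before it is sent to the sandbox. Note that the JSON round trip turns tuples into lists and non-string dict keys into strings.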

Usage with CodeAct:

from exec_sandbox import Scheduler
from langgraph_codeact import create_codeact

async with Scheduler() as scheduler:
    eval_fn = create_exec_sandbox_eval(scheduler, timeout_seconds=60)
    agent = create_codeact(model, tools=[], eval_fn=eval_fn)

Performance

| Path | Latency |
| --- | --- |
| Warm pool hit | 1-2 ms |
| L1 memory snapshot | ~100 ms |
| Cold boot | ~400 ms boot + interpreter startup |

See the benchmarks documentation for throughput numbers under load.
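
To check figures like these on your own hardware, a small timing harness is enough. This is a sketch under the assumption that one execution can be wrapped in a zero-argument async callable; `measure_latency_ms` and `fake_execution` are hypothetical and do not call exec-sandbox.

```python
import asyncio
import statistics
import time
from collections.abc import Awaitable, Callable


async def measure_latency_ms(call: Callable[[], Awaitable[object]], runs: int = 20) -> dict[str, float]:
    """Time repeated awaits of call() and summarize the samples in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        await call()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {"min": min(samples), "median": statistics.median(samples), "max": max(samples)}


async def fake_execution() -> None:
    """Hypothetical stand-in for one sandbox execution."""
    await asyncio.sleep(0.001)
```

Reporting the median alongside min and max helps separate warm-path latency from occasional cold boots.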

Prior Art

langchain-sandbox -- archived Jan 2026, Pyodide/Deno sandbox
E2BDataAnalysisTool -- cloud-managed sandbox (BaseTool pattern)
Deep Agents sandboxes -- SandboxBackendProtocol with Modal, Daytona, Runloop
langgraph-codeact -- CodeAct architecture
exec-sandbox -- Apache-2.0, PyPI

Environment

exec-sandbox: latest (PyPI)
langchain-core: >=0.3
Python: 3.12+
QEMU: 8.0+
Platforms: macOS (HVF), Linux (KVM)
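
The requirements above can be partially checked before installing anything. The sketch below is a hypothetical preflight helper: the QEMU binary names are assumptions about a typical install, and it does not verify the QEMU version or HVF availability.

```python
import os
import platform
import shutil
import sys


def preflight() -> dict[str, bool]:
    """Report which documented prerequisites appear to be met on this host."""
    system = platform.system()
    # Assumed binary names; a given install may ship only one of them.
    qemu_found = any(
        shutil.which(name) for name in ("qemu-system-x86_64", "qemu-system-aarch64")
    )
    # On Linux, KVM is exposed through /dev/kvm and needs read/write access.
    kvm_usable = system == "Linux" and os.access("/dev/kvm", os.R_OK | os.W_OK)
    return {
        "python_3_12_plus": sys.version_info >= (3, 12),
        "qemu_on_path": bool(qemu_found),
        "linux_kvm_usable": kvm_usable,
        "macos_host": system == "Darwin",
    }
```

A caller could print the returned dictionary and stop early if neither `linux_kvm_usable` nor `macos_host` is true.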

extent analysis

Problem Summary

Integrate exec-sandbox as a LangChain community integration to provide a self-hosted, hardware-isolated sandbox for code execution.

Root Cause Analysis

The current LangChain code execution story has a gap between local-but-limited sandboxes (e.g., langchain-sandbox) and cloud-managed sandboxes (e.g., E2B, Modal, Daytona, Runloop). exec-sandbox fills this gap by providing a self-hosted, hardware-isolated sandbox for code execution.

Fix Plan

1. `ExecSandboxTool` -- BaseTool for ReAct Agents

Implement the ExecSandboxTool class, which follows the same pattern as E2BDataAnalysisTool in langchain_community.tools.
Add native async support by implementing the _arun method, which executes code in the sandbox.
Set args_schema to ExecSandboxInput to define the input schema for the tool.

class ExecSandboxTool(BaseTool):
    # ...

    async def _arun(self, code: str, language: str = "python", packages: list[str] | None = None, **kwargs) -> str:
        scheduler = await self._ensure_scheduler()
        result = await scheduler.run(code=code, language=language, packages=packages or [],
                                     allow_network=self.allow_network, timeout_seconds=self.timeout_seconds)
        return _format_result(result)

2. `ExecSandboxBackend` -- SandboxBackendProtocol for Deep Agents

Implement the ExecSandboxBackend class, which integrates with the Deep Agents sandbox system.
Implement the aexecute method to run commands in the sandbox, along with a synchronous execute() wrapper.

class ExecSandboxBackend(SandboxBackendProtocol):
    # ...

    async def aexecute(self, command: str, *, timeout: int | None = None) -> ExecuteResponse:
        # ...
