crewai - ✅(Solved) Fix Feature: Add QEMU microVM execution strategy (exec-sandbox) as Docker alternative for CodeInterpreterTool [1 pull request, 1 comment, 2 participants]

crewai • 2026-03-04 15:15:09

GitHub stats

crewAIInc/crewAI#4702•Fetched 2026-04-08 00:40:33

Author

clemlesne

Participants

clemlesne

greysonlalonde

Timeline (top)

closed ×1, commented ×1, cross-referenced ×1, mentioned ×1

Fix Action

Fix / Workaround

Internally, CodeInterpreterTool would gain a run_code_in_microvm() method parallel to the existing run_code_in_docker() and run_code_in_restricted_sandbox(), dispatched from _run().

PR fix notes

PR #4736: feat: add microvm code execution mode via exec-sandbox (#4702)

Repository: crewAIInc/crewAI
Author: yuweuii
State: closed | merged: False
Link: https://github.com/crewAIInc/crewAI/pull/4736

Description (problem / solution / changelog)

Summary

This PR implements issue #4702.

Scope: Feature: Add QEMU microVM execution strategy (exec-sandbox) as Docker alternative for CodeInterpreterTool
Source branch: yuweuii:codex/issue-4702
Commit: 1b5a5022

Linked Issue

Closes #4702

Note (Medium Risk): Introduces a new code-execution backend and changes when Docker validation runs, which could affect safety/isolation and runtime behavior in production environments.

Overview: Adds a new code_execution_mode="microvm" option for agents and execution_mode="microvm" for CodeInterpreterTool, dispatching Python execution through exec-sandbox (QEMU microVM) instead of Docker or host execution.

Updates agent/tool wiring so Agent.get_code_execution_tools() configures the interpreter with execution_mode vs unsafe_mode, and Docker installation validation now runs only for code_execution_mode="safe". Includes new microVM execution implementation (async scheduler wrapper), tests for the new dispatch/path, and documentation updates describing the new mode and installation requirements.

Written by Cursor Bugbot for commit d520f161512c50d1052f84314658832bffda0e5d.
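
The agent-to-tool wiring described in the overview might look roughly like the sketch below. Only the mode names ("safe", "unsafe", "microvm") and the execution_mode / unsafe_mode keywords come from the PR description; the helper names build_interpreter_kwargs and docker_available are hypothetical, and raising RuntimeError when Docker is missing is an assumption about how the validation behaves.

```python
import shutil
from typing import Callable, Literal

CodeExecutionMode = Literal["safe", "unsafe", "microvm"]


def docker_available() -> bool:
    """Coarse check: is a `docker` executable on PATH?"""
    return shutil.which("docker") is not None


def build_interpreter_kwargs(
    mode: CodeExecutionMode,
    docker_check: Callable[[], bool] = docker_available,
) -> dict:
    """Hypothetical helper: map an agent-level mode to interpreter kwargs."""
    if mode not in ("safe", "unsafe", "microvm"):
        raise ValueError(f"Unknown code_execution_mode: {mode!r}")
    # Per the PR description, Docker validation runs only for "safe".
    # Raising here is an assumption made for this sketch.
    if mode == "safe" and not docker_check():
        raise RuntimeError("Docker is required for code_execution_mode='safe'")
    if mode == "microvm":
        return {"execution_mode": "microvm"}
    return {"unsafe_mode": mode == "unsafe"}
```

Injecting docker_check keeps the sketch testable on machines without Docker.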

Changed files

docs/en/concepts/agents.mdx (modified, +5/-4)
docs/en/tools/ai-ml/codeinterpretertool.mdx (modified, +34/-12)
lib/crewai-tools/src/crewai_tools/tools/code_interpreter_tool/code_interpreter_tool.py (modified, +70/-1)
lib/crewai-tools/tests/tools/test_code_interpreter_tool.py (modified, +63/-1)
lib/crewai/src/crewai/agent/core.py (modified, +15/-5)
lib/crewai/src/crewai/cli/templates/AGENTS.md (modified, +1/-1)
lib/crewai/src/crewai/project/crew_base.py (modified, +1/-1)
lib/crewai/tests/test_crew.py (modified, +53/-0)

Problem

CrewAI's CodeInterpreterTool exposes two execution modes that cover three code paths:

Safe mode (default) -- tries Docker container execution first (recommended) and automatically falls back to a restricted sandbox when Docker is unavailable. The sandbox is described as "very limited", with strict restrictions on many modules and built-in functions.
Unsafe mode -- executes code directly on the host and is explicitly not recommended for production.
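
As a rough illustration of how these two modes map onto three code paths, the selection logic can be sketched as follows. The function name choose_backend and the returned labels are hypothetical, and probing PATH for a docker binary is a simplification of whatever check the tool actually performs.

```python
import shutil
from typing import Callable


def _docker_on_path() -> bool:
    """Simplified probe: a `docker` binary exists on PATH."""
    return shutil.which("docker") is not None


def choose_backend(
    unsafe_mode: bool = False,
    docker_probe: Callable[[], bool] = _docker_on_path,
) -> str:
    """Return a label for the code path that would handle the execution."""
    if unsafe_mode:
        # Unsafe mode: run directly on the host.
        return "host"
    # Safe mode: prefer Docker, fall back to the restricted sandbox.
    return "docker" if docker_probe() else "restricted_sandbox"
```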

This leaves a real gap. Users who cannot or prefer not to run Docker (CI environments, macOS enterprise license constraints, Docker-in-Docker headaches per #3028, or containerized deployments) are stuck choosing between a severely restricted sandbox and running untrusted LLM-generated code directly on their host machine. The community forum thread and issue #1983 show this is a recurring pain point.

Proposal

Add exec-sandbox (pip install exec-sandbox) as a fourth execution strategy alongside Docker, the restricted sandbox, and direct host execution. It runs code in hardware-isolated QEMU microVMs, which provide stronger isolation than Docker containers and do not require a Docker daemon.

What exec-sandbox provides:

| Aspect | Docker (current) | exec-sandbox (proposed) |
| --- | --- | --- |
| Isolation | Container (shared kernel) | Hardware VM (KVM/HVF, own kernel) |
| Daemon required | Yes (Docker Desktop) | No (just QEMU binary) |
| Docker Desktop license cost | Paid for orgs with 250+ employees or $10M+ revenue | Free (Apache-2.0 + QEMU GPL) |
| Warm start latency | Container startup (~1s) | 1-2ms (pre-booted VM pool) |
| Cold start latency | Image pull + boot | ~100ms (L1 memory snapshot) |
| Languages | Python (current impl) | Python, JavaScript, RAW |
| State leakage | Possible (shared layers) | None (fresh VM per execution, destroyed after) |
| Docker-in-Docker | Problematic (#3028) | N/A (no daemon) |
| Network control | Manual config | Disabled by default, domain allowlisting |
| File I/O | Mounts host CWD | Explicit upload/download (no host filesystem exposure) |

How integration could work

Option A: As a custom CrewAI Tool (works today, no core changes needed)

from crewai import Agent, Task, Crew
from crewai.tools import BaseTool
from pydantic import BaseModel, Field
from exec_sandbox import Scheduler

class ExecSandboxInput(BaseModel):
    code: str = Field(..., description="Python 3 code to execute")
    packages: list[str] = Field(
        default_factory=list,
        description="pip packages to install before execution (e.g. ['pandas==2.2.0'])",
    )

class ExecSandboxTool(BaseTool):
    name: str = "Secure Code Interpreter"
    description: str = (
        "Executes Python 3 code in a hardware-isolated VM sandbox. "
        "Each execution gets a fresh VM that is destroyed after. "
        "Use for computations, data analysis, or any code that needs "
        "to run securely. Always end with a print() statement for output."
    )
    args_schema: type[BaseModel] = ExecSandboxInput

    def _run(self, code: str, packages: list[str] | None = None) -> str:
        import asyncio

        async def _execute():
            async with Scheduler() as scheduler:
                result = await scheduler.run(
                    code=code,
                    language="python",
                    packages=packages or [],
                    timeout_seconds=60,
                )
                if result.exit_code != 0:
                    return f"Error (exit code {result.exit_code}):\n{result.stderr}"
                return result.stdout

        return asyncio.run(_execute())

# Usage with a CrewAI agent
agent = Agent(
    role="Data Analyst",
    goal="Analyze data and produce insights",
    backstory="Expert data analyst with strong Python skills",
    tools=[ExecSandboxTool()],
)
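
One caveat with the tool above: asyncio.run() raises RuntimeError when the calling thread already has a running event loop, which can happen if the tool is invoked from a notebook or from async application code. A possible workaround, sketched here with a hypothetical helper named run_coro_sync, is to fall back to a worker thread with its own loop.

```python
import asyncio
import threading
from typing import Any, Coroutine


def run_coro_sync(coro: Coroutine[Any, Any, Any]) -> Any:
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run() is safe.
        return asyncio.run(coro)

    # A loop is already running in this thread: run the coroutine on a
    # fresh loop in a worker thread and block until it finishes.
    outcome: dict[str, Any] = {}

    def worker() -> None:
        try:
            outcome["value"] = asyncio.run(coro)
        except BaseException as exc:  # re-raised in the calling thread
            outcome["error"] = exc

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

Inside _run, `return run_coro_sync(_execute())` would replace the direct asyncio.run() call. Note that the fallback path blocks the calling thread (and therefore its event loop) until the execution completes.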

Option B: As a native CodeInterpreterTool strategy (requires core changes)

This would add "microvm" as a new code_execution_mode alongside "safe" and "unsafe":

agent = Agent(
    role="Data Analyst",
    goal="Analyze data and produce insights",
    backstory="Expert data analyst",
    allow_code_execution=True,
    code_execution_mode="microvm",  # New mode: QEMU microVM via exec-sandbox
)

Internally, CodeInterpreterTool would gain a run_code_in_microvm() method parallel to the existing run_code_in_docker() and run_code_in_restricted_sandbox(), dispatched from _run().
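
A minimal sketch of that dispatch, assuming an execution_mode attribute on the tool. InterpreterSketch is a hypothetical stand-in rather than the real CodeInterpreterTool; run_code_on_host and the docker_available flag are invented for illustration, and the method bodies are stubs that only tag which path was taken.

```python
class InterpreterSketch:
    """Hypothetical stand-in for CodeInterpreterTool's dispatch logic."""

    def __init__(self, execution_mode: str = "safe", docker_available: bool = True) -> None:
        self.execution_mode = execution_mode
        self.docker_available = docker_available

    def _run(self, code: str) -> str:
        if self.execution_mode == "microvm":
            return self.run_code_in_microvm(code)
        if self.execution_mode == "unsafe":
            return self.run_code_on_host(code)
        if self.execution_mode == "safe":
            # Safe mode keeps its existing Docker-then-sandbox fallback.
            if self.docker_available:
                return self.run_code_in_docker(code)
            return self.run_code_in_restricted_sandbox(code)
        raise ValueError(f"Unknown execution_mode: {self.execution_mode!r}")

    # Stubs: each returns a tag instead of executing anything.
    def run_code_in_docker(self, code: str) -> str:
        return f"[docker] {code}"

    def run_code_in_restricted_sandbox(self, code: str) -> str:
        return f"[restricted_sandbox] {code}"

    def run_code_in_microvm(self, code: str) -> str:
        return f"[microvm] {code}"

    def run_code_on_host(self, code: str) -> str:
        return f"[host] {code}"
```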

Why not just use Docker?

Docker works well for many users, and this proposal does not replace it. But there are legitimate cases where Docker is not viable:

Enterprise macOS teams: Docker Desktop requires a paid subscription for organizations with 250+ employees or $10M+ revenue. QEMU is free.
CI/CD and containerized deployments: Running Docker-inside-Docker is fragile and requires privileged containers or socket mounting (#3028). QEMU microVMs need no daemon.
Stronger isolation needs: Containers share the host kernel. A kernel exploit in a container can compromise the host. MicroVMs run their own kernel with hardware virtualization.
Restricted sandbox is too restricted: The current fallback blocks many standard modules (os, sys, subprocess, tempfile, etc.), making it impractical for real data analysis or file processing tasks.
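
To make the last point concrete, here is a toy illustration of how an import blocklist of this kind typically works. This is not CrewAI's actual sandbox implementation; the names BLOCKED_MODULES and run_restricted are hypothetical, and an in-process filter like this should not be treated as a real security boundary.

```python
import builtins

# Module names taken from the list in the issue text.
BLOCKED_MODULES = {"os", "sys", "subprocess", "tempfile"}


def _guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Reject blocked top-level modules, delegate everything else."""
    if name.split(".")[0] in BLOCKED_MODULES:
        raise ImportError(f"Importing '{name}' is not allowed in this sandbox")
    return builtins.__import__(name, globals, locals, fromlist, level)


def run_restricted(code: str) -> dict:
    """Execute code with a reduced set of builtins and a guarded import."""
    safe_builtins = {
        name: getattr(builtins, name)
        for name in ("print", "len", "range", "sum", "min", "max", "sorted")
    }
    safe_builtins["__import__"] = _guarded_import
    namespace = {"__builtins__": safe_builtins}
    exec(code, namespace)
    return namespace
```

With this filter, `import math` works but `import os` raises ImportError and `open(...)` raises NameError, which shows why file-processing tasks become impractical.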

About exec-sandbox

GitHub: dualeai/exec-sandbox -- Apache-2.0 license
PyPI: exec-sandbox
How it works: Each execution boots a lightweight QEMU microVM (or grabs one from a warm pool in 1-2ms), runs code via a Rust guest-agent, streams stdout/stderr back, and destroys the VM. No state persists between executions.
Platforms: macOS (HVF) + Linux (KVM)
Languages: Python, JavaScript, RAW
Features: Package installation with snapshot caching, file I/O, streaming output, network domain filtering, port forwarding, sessions for stateful multi-step workflows
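
The "warm pool, run, destroy" lifecycle described under "How it works" can be pictured with the toy model below. It is a conceptual sketch only: FakeVM and WarmPool are hypothetical classes that boot nothing, and they do not reflect exec-sandbox's internals.

```python
import itertools
from collections import deque
from dataclasses import dataclass, field

_vm_ids = itertools.count(1)


@dataclass
class FakeVM:
    """Placeholder for a microVM; it only records its own lifecycle."""
    vm_id: int = field(default_factory=lambda: next(_vm_ids))
    destroyed: bool = False

    def run(self, code: str) -> str:
        return f"vm{self.vm_id}: ran {len(code)} chars"

    def destroy(self) -> None:
        self.destroyed = True


class WarmPool:
    """Keeps up to `size` pre-created VMs and never reuses one."""

    def __init__(self, size: int) -> None:
        self._size = size
        self._ready = deque(FakeVM() for _ in range(size))

    def ready_count(self) -> int:
        return len(self._ready)

    def execute(self, code: str) -> str:
        # Warm path: take a pre-created VM. Cold path: create one now.
        vm = self._ready.popleft() if self._ready else FakeVM()
        try:
            return vm.run(code)
        finally:
            # The VM is destroyed after a single execution, then the
            # pool is topped up with a fresh one.
            vm.destroy()
            if len(self._ready) < self._size:
                self._ready.append(FakeVM())
```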

extent analysis

Fix Plan

To integrate exec-sandbox as a new execution strategy, follow these steps:

1. Install exec-sandbox using pip: pip install exec-sandbox
2. Implement a custom ExecSandboxTool class that inherits from BaseTool
3. Define the ExecSandboxInput model to handle code and package installation
4. Implement the _run method to execute code in a QEMU microVM using exec-sandbox

Verification

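To verify the fix, create a test agent with the ExecSandboxTool and have it execute a small snippet with a known result, for example print(1 + 1). Confirm that the tool returns the expected stdout, and that a failing snippet returns the "Error (exit code ...)" message together with stderr.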
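
The result handling can also be checked without QEMU installed by substituting a stub for the scheduler. In the sketch below, FakeScheduler and FakeResult are hypothetical stand-ins that mimic only the calls and fields used in the issue's example (run(code=..., language=..., packages=..., timeout_seconds=...) and exit_code / stdout / stderr); they say nothing about how the real exec-sandbox behaves.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeResult:
    exit_code: int
    stdout: str = ""
    stderr: str = ""


class FakeScheduler:
    """Stub with the same call shape as the Scheduler usage in the issue."""

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc_info):
        return None

    async def run(self, code, language, packages, timeout_seconds):
        # Canned responses: any code containing "raise" is treated as failing.
        if "raise" in code:
            return FakeResult(exit_code=1, stderr="boom")
        return FakeResult(exit_code=0, stdout="2\n")


async def execute(code: str, scheduler_factory=FakeScheduler) -> str:
    """Same result formatting as the tool's _execute(), with an injectable scheduler."""
    async with scheduler_factory() as scheduler:
        result = await scheduler.run(
            code=code, language="python", packages=[], timeout_seconds=60
        )
    if result.exit_code != 0:
        return f"Error (exit code {result.exit_code}):\n{result.stderr}"
    return result.stdout
```

Running `asyncio.run(execute("print(1 + 1)"))` against the stub exercises the success path, and a snippet containing `raise` exercises the error path; an end-to-end check still needs the real Scheduler.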

Extra Tips

Ensure the required dependencies are installed, including exec-sandbox and QEMU.
Pass language="python" and the list of pip packages your code needs; the example uses timeout_seconds=60, which may need adjusting for longer jobs.
Test the tool with different code snippets and edge cases, such as a non-zero exit code, a missing package, or a timeout.
