LlamaIndex: Add exec-sandbox as a CodeActAgent code executor (self-hosted QEMU microVMs)

run-llama/llama_index#20812 (fetched 2026-04-08 00:30:47)

Proposing exec-sandbox as a sandboxed code_execute_fn for CodeActAgent. It runs code in ephemeral QEMU microVMs with hardware-level isolation (KVM on Linux, HVF on macOS) — self-hosted, no cloud account needed.

Related: #14049 (user question about sandboxed code execution)

Why

  • Sandboxed — The default SimpleCodeExecutor example uses bare exec(), which is unsafe. exec-sandbox runs code in isolated VMs. Fills the gap discussed in #14049.
  • Self-hosted — AzureCodeInterpreterToolSpec requires Azure, AgentCoreCodeInterpreterToolSpec requires AWS. exec-sandbox runs on bare metal with just QEMU. Data never leaves the machine.
  • macOS + Linux — HVF on macOS, KVM on Linux. No Docker dependency.
  • Fast — ~1-2ms from warm pool, ~200ms from memory snapshots.
  • Apache-2.0
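To make the first point concrete, here is a small illustration of why an executor built on bare exec() offers no isolation. The function name unsafe_execute is hypothetical; it is not LlamaIndex's SimpleCodeExecutor, only a minimal stand-in for the same pattern.

```python
import contextlib
import io
import os


def unsafe_execute(code: str) -> str:
    """Hypothetical bare-exec() executor: runs code inside the host process."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})  # no isolation: the code shares the host interpreter
    return buffer.getvalue()


# Generated code can read and mutate host state, such as environment variables.
os.environ["DEMO_SECRET"] = "hunter2"
leaked = unsafe_execute("import os; print(os.environ['DEMO_SECRET'])")
unsafe_execute("import os; os.environ['DEMO_SECRET'] = 'overwritten'")
```

After these calls, leaked holds the secret and the host's environment variable has been changed by the "sandboxed" code, which is the class of problem a VM boundary is meant to prevent.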

How it maps

exec-sandbox is a Python library with an async API. CodeActAgent accepts any code_execute_fn: Callable | Awaitable:

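import asyncio

from exec_sandbox import Scheduler
from llama_index.core.agent.workflow import CodeActAgent


async def main() -> None:
    # The Scheduler owns the microVMs; everything that uses the session
    # must run inside this block.
    async with Scheduler() as scheduler:
        session = await scheduler.session(language="python")

        async def exec_sandbox_execute(code: str) -> str:
            # Run the agent-generated code in the sandbox, not in this process.
            result = await session.exec(code)
            if result.exit_code != 0:
                return f"Error:\n{result.stderr}"
            return result.stdout or "Code executed successfully (no output)"

        agent = CodeActAgent(
            code_execute_fn=exec_sandbox_execute,
            llm=llm,  # any LlamaIndex LLM, configured elsewhere
            tools=[...],
        )
        # Use the agent here, while the scheduler is still open.


asyncio.run(main())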

Could also ship as a llama-index-tools-exec-sandbox integration package with a helper class managing the lifecycle.
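One possible shape for such a helper class is sketched below. SandboxExecutor and its scheduler_factory parameter are hypothetical names, not part of exec-sandbox or LlamaIndex. The sketch assumes only what the snippet above shows: the scheduler is an async context manager, scheduler.session(language=...) is awaitable, and results expose exit_code, stdout, and stderr. Whether a session needs its own cleanup is not stated in the issue, so the sketch leaves that to the scheduler.

```python
from contextlib import AsyncExitStack
from typing import Any, Callable


class SandboxExecutor:
    """Hypothetical helper: owns the scheduler and exposes an execute function."""

    def __init__(self, scheduler_factory: Callable[[], Any], language: str = "python") -> None:
        self._scheduler_factory = scheduler_factory
        self._language = language
        self._stack = AsyncExitStack()
        self._session: Any = None

    async def __aenter__(self) -> "SandboxExecutor":
        try:
            # Enter the scheduler through the stack so cleanup is guaranteed.
            scheduler = await self._stack.enter_async_context(self._scheduler_factory())
            self._session = await scheduler.session(language=self._language)
        except BaseException:
            await self._stack.aclose()
            raise
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        self._session = None
        await self._stack.aclose()  # runs the scheduler's __aexit__

    async def execute(self, code: str) -> str:
        """Same contract as exec_sandbox_execute in the snippet above."""
        if self._session is None:
            raise RuntimeError("SandboxExecutor used outside its 'async with' block")
        result = await self._session.exec(code)
        if result.exit_code != 0:
            return f"Error:\n{result.stderr}"
        return result.stdout or "Code executed successfully (no output)"
```

With this in place, a caller would write async with SandboxExecutor(Scheduler) as executor and pass executor.execute as code_execute_fn. Taking a factory rather than a Scheduler instance keeps the helper testable with a stub.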

Open questions

  • Lifecycle management: The sketch scopes the Scheduler with async with, so the agent is only usable inside that block. A longer-lived integration would have to enter the Scheduler itself (__aenter__) and guarantee cleanup (__aexit__). Should the integration manage this per-agent, or use a singleton Scheduler?
  • Packaging: Standalone llama-index-tools-exec-sandbox package vs contributed example?
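For the singleton option, one way to share a single Scheduler across agents is reference counting: start it on first use and close it when the last user releases it. SharedScheduler below is a hypothetical name, and the sketch assumes only that the scheduler is an async context manager, as in the snippet above. It is one possible answer to the lifecycle question, not a settled design.

```python
import asyncio
from typing import Any, Callable


class SharedScheduler:
    """Hypothetical reference-counted holder for one shared scheduler."""

    def __init__(self, scheduler_factory: Callable[[], Any]) -> None:
        self._factory = scheduler_factory
        self._lock = asyncio.Lock()
        self._cm: Any = None         # the context manager that was entered
        self._scheduler: Any = None  # the value returned by __aenter__
        self._users = 0

    async def acquire(self) -> Any:
        """Return the shared scheduler, starting it on first use."""
        async with self._lock:
            if self._scheduler is None:
                cm = self._factory()
                self._scheduler = await cm.__aenter__()
                self._cm = cm
            self._users += 1
            return self._scheduler

    async def release(self) -> None:
        """Drop one reference; close the scheduler when none remain."""
        async with self._lock:
            if self._users == 0:
                raise RuntimeError("release() called without a matching acquire()")
            self._users -= 1
            if self._users == 0:
                cm, self._cm, self._scheduler = self._cm, None, None
                await cm.__aexit__(None, None, None)
```

The trade-off is the usual one: a shared scheduler can amortise startup across agents, while a per-agent scheduler is simpler to reason about and cannot leak state between agents through the scheduler object.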

Links

Happy to submit a PR if there's interest.

extent analysis

Problem Summary

Add exec-sandbox as a CodeActAgent code executor

Root Cause Analysis

The default SimpleCodeExecutor example uses bare exec(), which is unsafe.

Fix Plan

Step 1: Install exec-sandbox

pip install exec-sandbox

Step 2: Import Scheduler and CodeActAgent in your application code

from exec_sandbox import Scheduler
from llama_index.core.agent.workflow import CodeActAgent

Step 3: Open a Scheduler and create a session (inside an async function)

async with Scheduler() as scheduler:
    session = await scheduler.session(language="python")

Step 4: Inside the async with block, define the exec_sandbox_execute function

    async def exec_sandbox_execute(code: str) -> str:
        result = await session.exec(code)
        if result.exit_code != 0:
            return f"Error:\n{result.stderr}"
        return result.stdout or "Code executed successfully (no output)"

Step 5: Still inside the block, pass exec_sandbox_execute to CodeActAgent

    agent = CodeActAgent(
        code_execute_fn=exec_sandbox_execute,
        llm=llm,
        tools=[...],
    )

Verification

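Run the agent on a task that requires code execution and check what the executor returns: a successful snippet should return its stdout, a snippet with no output should return the "no output" message, and a failing snippet should return a string starting with "Error:".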
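The executor's string contract can also be checked without booting a VM. The sketch below swaps the real session for a stub. FakeResult, FakeSession, and make_execute_fn are hypothetical test helpers, with field names taken from the Step 4 snippet.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Stand-in for the sandbox result; field names follow the Step 4 snippet."""
    exit_code: int
    stdout: str = ""
    stderr: str = ""


class FakeSession:
    """Stand-in session that returns canned results instead of booting a VM."""

    def __init__(self, results: dict) -> None:
        self._results = results

    async def exec(self, code: str) -> FakeResult:
        return self._results[code]


def make_execute_fn(session):
    """Build the Step 4 function around any object with an async exec()."""
    async def exec_sandbox_execute(code: str) -> str:
        result = await session.exec(code)
        if result.exit_code != 0:
            return f"Error:\n{result.stderr}"
        return result.stdout or "Code executed successfully (no output)"
    return exec_sandbox_execute


async def check_executor() -> list:
    session = FakeSession({
        "print(1 + 1)": FakeResult(0, stdout="2\n"),
        "x = 1": FakeResult(0),
        "1 / 0": FakeResult(1, stderr="ZeroDivisionError: division by zero"),
    })
    execute = make_execute_fn(session)
    return [await execute(code) for code in ("print(1 + 1)", "x = 1", "1 / 0")]


outputs = asyncio.run(check_executor())
```

This only verifies the mapping from result to string. Confirming that code really runs inside the microVM still requires an end-to-end run against a real Scheduler.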

Extra Tips

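  • Decide who owns the Scheduler and make sure it is always closed, for example by keeping the agent's whole run inside the async with block.
  • Consider shipping the integration as a separate package (the issue suggests llama-index-tools-exec-sandbox) or as a contributed example.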
