autogen - 💡(How to fix) Fix Add lightweight OS-level sandboxing via sandlock for LocalCommandLineCodeExecutor [3 comments, 3 participants]

congwang-mk · 2026-03-27T21:59:06Z

## Problem

The `LocalCommandLineCodeExecutor` writes LLM-generated code to disk and executes it via `asyncio.create_subprocess_exec` with **full host privileges**. The only safeguard is a suppressible `UserWarning`.

Attack scenario: indirect prompt injection causes the LLM to generate malicious code (e.g., exfiltrating SSH keys, installing backdoors). The code runs with full filesystem, network, and process access.

The Docker executor provides real isolation but is opt-in and adds significant overhead. There is no lightweight, local sandboxing option between "completely unsandboxed" and "full Docker container."

## Proposal

Add [sandlock](https://github.com/multikernel/sandlock) as a new code executor backend. Sandlock is a lightweight Linux process sandbox using Landlock, seccomp-bpf, and user namespaces.

### What sandlock provides

| Layer | Protection |
|-------|-----------|
| **Landlock** | Filesystem path whitelisting (read-only / read-write), network domain + port restrictions |
| **seccomp-bpf** | Syscall filtering at kernel level: blocks `ptrace`, `mount`, `unshare`, `kexec_load`, `bpf`, etc. |
| **User namespaces** | Privilege escalation prevention without root |
| **Resource limits** | Memory, process count, CPU, open file caps (no cgroups needed) |

### Why this fits AutoGen

- **~20ms startup** vs ~200ms+ for Docker. Low enough to be a reasonable default.
- **No root, no daemon.** Unlike Docker, just `pip install sandlock`. No Docker socket needed.
- **Fits the executor interface.** AutoGen already has a clean `CodeExecutor` abstraction. A `SandlockCommandLineCodeExecutor` would implement the same interface as `LocalCommandLineCodeExecutor` but with kernel-enforced confinement.
- **Aligns with #7230.** The proposed `ToolSafetyPolicy` interface (allow_network, allow_filesystem, max_memory) maps directly to sandlock's capabilities.
- **Self-hosted.** No Azure subscription needed. Works in air-gapped environments.
- **Linux primary.** Non-Linux users can continue using the Docker executor.

### Where it fits in the executor hierarchy

```
LocalCommandLineCodeExecutor  →  SandlockCodeExecutor  →  DockerCodeExecutor  →  AzureCodeExecutor
      (unsandboxed)               (lightweight             (container            (cloud
                                   OS-level)                isolation)            isolation)
```

### Example usage

```python
from autogen_ext.code_executors.sandlock import SandlockCommandLineCodeExecutor

executor = SandlockCommandLineCodeExecutor(
    work_dir="./coding",
    fs_read=["/usr/lib/python3", "/usr/local/lib"],
    fs_write=["./coding"],
    net_allow=[],  # no network by default
    max_memory_mb=512,
    max_procs=5,
)
```

### Relation to existing issues

- #7462 (LocalCommandLineCodeExecutor unsandboxed): sandlock provides a drop-in replacement with kernel enforcement
- #7230 (ToolSafetyPolicy proposal): sandlock's per-sandbox policy maps directly to the proposed interface
- #7181 (path traversal in LocalCommandLineCodeExecutor): Landlock filesystem whitelisting prevents writes outside allowed paths at the kernel level
- #7427 (MCP tool poisoning): sandboxed tool execution limits blast radius of malicious MCP tools

## Alternatives considered

- **Docker-only**: adds complexity and overhead; not available in all environments (CI, minimal VMs, air-gapped)
- **Hardening LocalCommandLineCodeExecutor with blocklists**: fundamentally limited; new bypasses will always emerge
- **firejail**: requires root or setuid binary
- **bubblewrap (bwrap)**: lower-level, no Python API, no resource limits without cgroups
- **gVisor**: heavy, requires Docker or dedicated kernel

## References

- [sandlock GitHub](https://github.com/multikernel/sandlock)
- [Landlock LSM](https://landlock.io/)
- [seccomp-bpf docs](https://www.kernel.org/doc/html/latest/userspace-api/seccomp_filter.html)
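The claim that the `ToolSafetyPolicy` fields from #7230 map directly onto sandlock's settings can be made concrete with a small translation function. This is only a sketch: the `ToolSafetyPolicy` dataclass below is a hypothetical stand-in (the real interface is still a proposal), the unit of `max_memory` is assumed to be MiB, and `allowed_hosts` is an illustrative parameter that is not part of either proposal. The output keys mirror the keyword arguments from the example usage above.

```python
from dataclasses import dataclass
from typing import Any, Dict, Sequence


@dataclass(frozen=True)
class ToolSafetyPolicy:
    """Hypothetical stand-in for the interface proposed in #7230."""

    allow_network: bool = False
    allow_filesystem: bool = False
    max_memory: int = 512  # assumption: expressed in MiB


def to_sandlock_kwargs(
    policy: ToolSafetyPolicy,
    work_dir: str,
    allowed_hosts: Sequence[str] = (),  # illustrative; entry format is not specified in the issue
) -> Dict[str, Any]:
    """Translate a policy into the keyword arguments shown in the example usage."""
    return {
        "work_dir": work_dir,
        # Assumption: the interpreter's libraries must stay readable in every case.
        "fs_read": ["/usr/lib/python3", "/usr/local/lib"],
        # Writes are confined to work_dir, and only when the policy permits them.
        "fs_write": [work_dir] if policy.allow_filesystem else [],
        # An empty list means no network access, as in the issue's example.
        "net_allow": list(allowed_hosts) if policy.allow_network else [],
        "max_memory_mb": policy.max_memory,
    }


if __name__ == "__main__":
    print(to_sandlock_kwargs(ToolSafetyPolicy(), "./coding"))
```

Defaulting every permission to "denied" keeps the mapping fail-closed: a caller has to opt in to network or write access explicitly.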

microsoft/autogen#7475 • Fetched 2026-04-08 01:42:15

Timeline (top): commented ×3, mentioned ×2, subscribed ×2, cross-referenced ×1

extent analysis

Fix Plan

To stop the LocalCommandLineCodeExecutor from running LLM-generated code with full host privileges, the issue proposes a new code executor backend built on sandlock, a lightweight Linux process sandbox that uses Landlock, seccomp-bpf, and user namespaces. The class and parameter names below are the ones proposed in the issue.

The steps to implement the fix:

1. Install sandlock with pip install sandlock.
2. Create a new SandlockCommandLineCodeExecutor class that implements the CodeExecutor interface.
3. Configure the SandlockCommandLineCodeExecutor with the desired settings:
- work_dir: the working directory for the executor
- fs_read: the filesystem paths the executed code may read
- fs_write: the filesystem paths the executed code may write
- net_allow: the network domains and ports the executed code may reach (an empty list means no network access)
- max_memory_mb: the maximum amount of memory, in megabytes, the executed code may use
- max_procs: the maximum number of processes the executed code may spawn

Example code:

from autogen_ext.code_executors.sandlock import SandlockCommandLineCodeExecutor

executor = SandlockCommandLineCodeExecutor(
    work_dir="./coding",
    fs_read=["/usr/lib/python3", "/usr/local/lib"],
    fs_write=["./coding"],
    net_allow=[],  # no network by default
    max_memory_mb=512,
    max_procs=5,
)
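The issue does not describe how sandlock enforces settings such as max_memory_mb, only that no cgroups are needed. As an illustration of how per-process caps can work without cgroups, the sketch below uses POSIX rlimits through Python's standard `resource` module. This is an assumption-laden stand-in, not sandlock's implementation: `run_with_limits` and `max_open_files` are hypothetical names, it runs only on Unix-like systems, and it provides none of the filesystem, network, or syscall confinement that Landlock and seccomp-bpf add.

```python
import resource
import subprocess
import sys
from typing import List


def _cap(limit_kind: int, wanted: int) -> None:
    """Lower both soft and hard limits, never exceeding the current hard limit."""
    _, hard = resource.getrlimit(limit_kind)
    new = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    # Lowering the hard limit cannot be undone by an unprivileged child.
    resource.setrlimit(limit_kind, (new, new))


def run_with_limits(
    argv: List[str],
    max_memory_mb: int = 512,
    max_open_files: int = 64,
    timeout: float = 10.0,
) -> subprocess.CompletedProcess:
    """Run argv in a child process with address-space and file-descriptor caps."""

    def apply_limits() -> None:
        # Executed in the child after fork() and before exec().
        _cap(resource.RLIMIT_AS, max_memory_mb * 1024 * 1024)
        _cap(resource.RLIMIT_NOFILE, max_open_files)

    return subprocess.run(
        argv,
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
        timeout=timeout,
    )


if __name__ == "__main__":
    # A 1 GiB allocation should fail under a 512 MiB address-space cap.
    result = run_with_limits([sys.executable, "-c", "bytearray(1024 ** 3)"])
    print("exit code:", result.returncode)
```

A process-count cap (`RLIMIT_NPROC`) is left out on purpose: it is accounted per user rather than per sandbox, so it behaves differently from the max_procs setting shown above.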

Verification

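To verify the fix, run a sample snippet through the SandlockCommandLineCodeExecutor. The CodeExecutor interface takes code as CodeBlock objects through the asynchronous execute_code_blocks method, so the call has to be made inside an async function. For example:

from autogen_core import CancellationToken
from autogen_core.code_executor import CodeBlock
blocks = [CodeBlock(language="python", code="print('Hello, World!')")]
result = await executor.execute_code_blocks(blocks, CancellationToken())  # inside an async function

This should execute the code in the sandbox with the specified settings: result.exit_code should be 0 and result.output should contain Hello, World!.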
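A successful "Hello, World!" run only shows that the executor works, not that it confines anything. A negative check is more telling: submit snippets that attempt forbidden actions and confirm they fail. The harness below is a sketch under stated assumptions. `run_code` is a hypothetical adapter that takes Python source and returns an exit code (with the proposed executor it would wrap execute_code_blocks and return the result's exit_code), and the probe targets are illustrative.

```python
from typing import Callable, Dict, List, NamedTuple


class ProbeResult(NamedTuple):
    name: str
    blocked: bool


# Each probe is Python source that exits with 0 only if the forbidden action succeeds.
PROBES: Dict[str, str] = {
    "write_outside_work_dir": "open('/tmp/sandbox_probe.txt', 'w').write('x')",
    "outbound_network": (
        "import socket; socket.create_connection(('1.1.1.1', 53), timeout=3)"
    ),
    "read_ssh_dir": "import os; os.listdir(os.path.expanduser('~/.ssh'))",
}


def check_confinement(run_code: Callable[[str], int]) -> List[ProbeResult]:
    """Run every probe and report whether it was blocked (non-zero exit code)."""
    return [ProbeResult(name, run_code(src) != 0) for name, src in PROBES.items()]


def all_blocked(results: List[ProbeResult]) -> bool:
    """True when every probe failed, which is the outcome expected in a sandbox."""
    return all(r.blocked for r in results)


if __name__ == "__main__":
    # Stand-in runner that pretends every probe was denied.
    for outcome in check_confinement(lambda src: 1):
        print(outcome)
```

A probe can also fail for unrelated reasons, such as a missing ~/.ssh directory or a host with no connectivity, so it helps to run the same probes through an unsandboxed runner first and compare the two results.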

Extra Tips

Test the SandlockCommandLineCodeExecutor with restrictive settings as well as permissive ones: for example, confirm that a write outside fs_write, an outbound connection with net_allow=[], and an allocation above max_memory_mb all fail.
Consider adding logging and monitoring to the SandlockCommandLineCodeExecutor so that denied operations and other potential security issues can be detected and acted on.
sandlock targets Linux, so non-Linux environments need a different executor backend; the issue suggests the Docker executor.
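The Linux-only caveat suggests choosing the backend at runtime. The sketch below picks between the two options by platform and kernel version. The function names and the returned labels are hypothetical, and the 5.13 threshold is the kernel release in which Landlock was first merged, not a documented sandlock requirement. A version check is only a heuristic, since Landlock can be disabled at build or boot time.

```python
import platform
import re
from typing import Optional, Tuple

# Landlock first shipped in Linux 5.13; sandlock's real minimum is an assumption here.
LANDLOCK_MIN_KERNEL = (5, 13)


def parse_kernel(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '6.8.0-45-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return (0, 0)
    return (int(match.group(1)), int(match.group(2)))


def choose_backend(system: Optional[str] = None, release: Optional[str] = None) -> str:
    """Return 'sandlock' on a recent enough Linux kernel, otherwise 'docker'."""
    system = platform.system() if system is None else system
    release = platform.release() if release is None else release
    if system == "Linux" and parse_kernel(release) >= LANDLOCK_MIN_KERNEL:
        return "sandlock"
    return "docker"


if __name__ == "__main__":
    print(choose_backend())
```

Passing `system` and `release` explicitly makes the decision easy to unit test without depending on the host the tests run on.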
