claude-code - 💡(How to fix) Fix Bash sandbox bypassed by subagents via command obfuscation (8/15 agents, zero human approval) [1 comments, 2 participants]

Root Cause

Another agent (standard-tooling-docker) had python3 -c "open(path, 'w')" execute successfully (failed only because the directory didn't exist), but the immediately following mkdir -p /path && python3 -c "..." was denied — the sandbox caught mkdir -p with the literal path but missed the python open() with the same path.

Fix Action

Fix / Workaround

8 agents bypassed the sandbox. They discovered workarounds autonomously:

4 used python3 -c "open(path, 'w').write(content)" via Bash
2 used shell variable indirection (VAR=path; printf > "$VAR")
1 used cp /tmp/file target_path
1 used repeated retries of printf > path until it eventually succeeded (nondeterministic)

Summary

The Bash tool's path-based sandbox can be trivially bypassed by subagents through command obfuscation. In a real-world fleet operation, 8 of 15 autonomous subagents bypassed the sandbox to write files outside the project tree — with zero human approval. The bypass rate correlates with agent persistence (number of alternative command patterns tried), not sandbox strength.

Environment

Claude Code version: 2.1.118
OS: macOS (Darwin 25.4.0)
Global settings: "defaultMode": "bypassPermissions" (parent session)
Subagent context: background agents spawned via the Agent tool

Reproduction

Configure bypassPermissions in the parent session
Spawn a background subagent (Agent tool) with instructions to write a file at ~/.claude/projects/<slug>/memory/MEMORY.md (any path outside the project tree)
Do not interact with the subagent — let it run autonomously
Observe: the subagent will discover that Write/Edit are denied, escalate to Bash, and try increasingly creative command patterns until one bypasses the sandbox

What happens

Write and Edit tools: Correctly denied 100% of the time (15/15 agents blocked). The tool-level path restriction works.

Bash tool: Inconsistently enforced. The sandbox appears to do command-string analysis (regex or pattern matching) rather than syscall-level enforcement. Commands that obfuscate the target path bypass it:

Command pattern	Sandbox catches it?
`printf 'content' > /outside/path`	Usually yes
`cat > /outside/path <<EOF`	Yes (also blocked by heredoc hook)
`python3 -c "open('/outside/path', 'w').write('content')"`	No — bypasses sandbox
`VAR="/outside/path" && printf 'content' > "$VAR"`	No — bypasses sandbox
`cp /tmp/file /outside/path`	No — bypasses sandbox
`tee /outside/path < /dev/null`	No — bypasses sandbox
`mkdir -p /outside/path && python3 -c "..."`	Sometimes yes (catches the mkdir portion)

Evidence from production fleet operation

During a fleet-wide rollout across 15 repos, each subagent was instructed to write a MEMORY.md file at ~/.claude/projects/<slug>/memory/MEMORY.md (outside the repo tree). All 15 subagents ran as background agents with zero human interaction.

8 agents bypassed the sandbox. They discovered workarounds autonomously:

4 used python3 -c "open(path, 'w').write(content)" via Bash
2 used shell variable indirection (VAR=path; printf > "$VAR")
1 used cp /tmp/file target_path
1 used repeated retries of printf > path until it eventually succeeded (nondeterministic)

7 agents were blocked. They tried only patterns the string matcher catches (direct printf > /literal/path, mkdir -p /literal/path) and gave up after 2-8 attempts. None of them discovered the python3 or variable indirection bypasses.

Nondeterminism

One agent (mq-rest-admin-common) had printf 'content' > /path denied ~5 times, then the same pattern succeeded on a later attempt. No apparent change in command. This suggests either probabilistic enforcement or sensitivity to subtle command-string differences.

The core problem

The Bash sandbox is pattern-matching command strings, not enforcing filesystem access at the syscall level. This means:

Any command that hides the path from the regex bypasses the sandbox. Python one-liners, shell variables, cp, tee — all work.
LLMs are natural adversaries for string-matching filters. When a Bash command is denied, the agent interprets it as "this command pattern doesn't work" and tries alternative patterns. It's doing automated fuzzing of the sandbox filter.
The control is worse than absent. A nonexistent control is honest. A control with a 47-53% enforcement rate creates a false sense of security.

Additional finding: subagents don't inherit bypassPermissions

The parent session has "defaultMode": "bypassPermissions", but subagents spawned via the Agent tool operate under restrictive defaults. This is arguably correct behavior (defense in depth), but it's undocumented and means the parent session's permission config is misleading about subagent capabilities.

Expected behavior

If the sandbox intends to prevent writes outside the project tree, it should enforce this at the filesystem/syscall level (e.g., seccomp, sandbox-exec, or OS-level directory restrictions), not by parsing command strings. Alternatively, if command-string analysis is the only feasible approach, it should be documented as a best-effort filter rather than a security boundary.

Detailed evidence

Full tool-call-by-tool-call evidence for all 15 agents is archived at: https://github.com/wphillipmoore/standard-tooling-plugin/issues/241

extent analysis

TL;DR

The Bash sandbox can be bypassed by subagents through command obfuscation, and a more robust solution such as syscall-level enforcement is needed to prevent writes outside the project tree.

Guidance

The current implementation relies on command-string analysis, which can be easily bypassed by using techniques like python one-liners, shell variables, or other creative command patterns.
To mitigate this, consider implementing syscall-level enforcement, such as using seccomp or sandbox-exec, to restrict filesystem access.
Alternatively, if command-string analysis is the only feasible approach, it should be clearly documented as a best-effort filter rather than a security boundary.
Review the subagent permission configuration to ensure it is consistent with the parent session's configuration and that the documentation accurately reflects the subagent's capabilities.

Example

No code snippet is provided as the issue is more related to the design and implementation of the sandbox rather than a specific code bug.

Notes

The provided information suggests that the current implementation has a bypass rate of around 47-53%, which creates a false sense of security. A more robust solution is needed to prevent writes outside the project tree.

Recommendation

Apply a workaround by implementing syscall-level enforcement, such as using seccomp or sandbox-exec, to restrict filesystem access. This approach provides a more robust security boundary than the current command-string analysis.

FAQ

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

claude-code - 💡(How to fix) Fix Bash sandbox bypassed by subagents via command obfuscation (8/15 agents, zero human approval) [1 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Fix / Workaround

Summary

Environment

Reproduction

What happens

Evidence from production fleet operation

Nondeterminism

The core problem

Additional finding: subagents don't inherit bypassPermissions

Expected behavior

Detailed evidence

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

FAQ

Expected behavior

Still need to ship something?

TRENDING

claude-code - 💡(How to fix) Fix Bash sandbox bypassed by subagents via command obfuscation (8/15 agents, zero human approval) [1 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Fix / Workaround

Summary

Environment

Reproduction

What happens

Evidence from production fleet operation

Nondeterminism

The core problem

Additional finding: subagents don't inherit bypassPermissions

Expected behavior

Detailed evidence

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

FAQ

Expected behavior

Still need to ship something?

RELATED_DISCOVERY

TRENDING