claude-code: [BUG] model anchors on dangerouslyDisableSandbox: true and keeps applying it to unrelated read-only commands [1 comment, 2 participants]

GitHub stats
anthropics/claude-code#55116 · Fetched 2026-05-01 05:45:51
Comments: 1 · Participants: 2 · Timeline events: 5 (labeled ×4, commented ×1) · Reactions: 0


Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report (please file separate reports for different bugs)
  • I am using the latest version of Claude Code

What's Wrong?

Summary

Once Claude legitimately uses dangerouslyDisableSandbox: true for a single command (e.g. a git operation that needs to write to ~/.git-credentials), it tends to keep setting the flag on subsequent,
unrelated, plainly read-only Bash calls (wc, tail, cat, python3 -c-wrapped inspection). Each of those triggers a permission prompt — including in sessions where the user has explicitly opted into the sandbox via "sandbox": { "enabled": true, "autoAllowBashIfSandboxed": true } precisely to avoid prompts.

End state: the user opts into sandboxing to silence prompts; the model bypasses the sandbox; the prompts return.

Reproduction

In the same session, in order:

  1. Legitimate use: model runs git fetch origin (sandbox blocks credential-store lock write), retries with dangerouslyDisableSandbox: true. Works.
  2. Anchored misuse: within the next several minutes, model runs wc -l "$TMPDIR/l2_timings.csv" && tail -5 "$TMPDIR/l2_timings.csv" && tail -3 "$TMPDIR/l2_run.log"
    with dangerouslyDisableSandbox: true set, despite none of wc/tail writing anywhere or hitting a non-allowed host. User is prompted; user is annoyed.
  3. After explicit user correction ("never bypass sandbox for reads"), the model retries without the flag, but the new command happens to embed python3 -c "..." inside a compound, which is correctly
    not auto-allowed, so the user is prompted again. Two layers of friction stacked.

Why this matters

The system prompt clearly says:

▎ You should always default to running commands within the sandbox. Do NOT attempt to set dangerouslyDisableSandbox: true unless the user explicitly asks to bypass sandbox or a specific command just
▎ failed and you see evidence of sandbox restrictions causing the failure.

This is the right policy — but model behavior drifts from it under anchoring pressure. Memory-based corrections from the user help next session, but the failure mode recurs intra-session whenever an
early command needs the bypass.

The combination "autoAllowBashIfSandboxed: true user setting + model-applied bypass" defeats the whole point of opting into sandbox mode.
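
To make the interaction concrete, here is a minimal sketch of the prompt decision as this report describes it. It is a simplified model, not the Claude Code implementation: the names SandboxSettings, BashCall and decide are hypothetical, and it ignores other checks such as the compound-command rule that comes up in reproduction step 3.

```typescript
// Hypothetical model of the behaviour described in this issue.
// None of these names come from the Claude Code source.
interface SandboxSettings {
  enabled: boolean;
  autoAllowBashIfSandboxed: boolean;
}

interface BashCall {
  command: string;
  dangerouslyDisableSandbox?: boolean;
}

type Decision = "auto-allow" | "prompt";

function decide(settings: SandboxSettings, call: BashCall): Decision {
  // A call that sets the bypass does not run in the sandbox,
  // so the "auto-allow if sandboxed" preference cannot apply to it.
  const sandboxed = settings.enabled && call.dangerouslyDisableSandbox !== true;
  return sandboxed && settings.autoAllowBashIfSandboxed ? "auto-allow" : "prompt";
}
```

Under this model, the same wc/tail command is auto-allowed without the flag and prompts with it, which is the regression in user experience reported here.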

Proposed harness-side fixes

These don't depend on the model getting better; they make it harder for the model to be wrong.

  1. Reject the bypass for self-evidently read-only commands. When dangerouslyDisableSandbox: true is set on a Bash invocation whose parse is known-read-only (covered by READONLY_COMMANDS,
    READONLY_NOARGS, READONLY_EXACT, COMMAND_ALLOWLIST in src/tools/BashTool/readOnlyValidation.ts, plus the git/gh/docker read-only subcommand tables), refuse with a clear tool-result error: "This command is read-only; sandbox bypass is unnecessary. Retrying without the flag should succeed." Forces a no-flag retry.
  2. One-shot bypass semantics. Even when granted, the bypass auto-resets after the call. The model has to re-justify it each time. Removes the anchoring substrate entirely.
  3. Surface anchoring detection. If the model has set the bypass on N consecutive calls without the previous one's output showing a sandbox-related error, log a warning back as a tool-result preamble:
    "Note: previous N calls used dangerouslyDisableSandbox; the last K of those produced no sandbox-related errors. Consider running without the bypass." Cheap, non-blocking, observable.
  4. autoAllowBashIfSandboxed honoured even with bypass-refused calls. A bypass-refused call's read-only retry should also benefit from the existing auto-allow path. Otherwise users who opted into
    sandbox mode for prompt-silence still get prompts on the model's read-only retries.
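
Fix 1 could look roughly like the sketch below. Everything in it is an assumption for illustration: SKETCH_READ_ONLY is a tiny hypothetical stand-in for the real tables in readOnlyValidation.ts (whose contents are not shown in this issue), and isPlainlyReadOnly and checkBypass are made-up names. The parser is deliberately conservative and treats anything it cannot prove read-only as not read-only.

```typescript
// Hypothetical stand-in for the read-only tables; NOT the real READONLY_COMMANDS.
const SKETCH_READ_ONLY: ReadonlySet<string> = new Set(["wc", "tail", "cat", "head", "ls"]);

interface BashInput {
  command: string;
  dangerouslyDisableSandbox?: boolean;
}

type Verdict = { ok: true } | { ok: false; error: string };

// True only for "a && b && c" chains where every part starts with a known
// read-only command and contains no redirection, pipe, substitution,
// background operator, ";" or newline.
function isPlainlyReadOnly(command: string): boolean {
  return command.split("&&").every((rawPart) => {
    const part = rawPart.trim();
    if (/[;|<>`&\n]|\$\(/.test(part)) return false;
    const name = part.split(/\s+/)[0] ?? "";
    return name !== "" && SKETCH_READ_ONLY.has(name);
  });
}

// Refuse the bypass when the command is provably read-only; otherwise let it through.
function checkBypass(input: BashInput): Verdict {
  if (input.dangerouslyDisableSandbox === true && isPlainlyReadOnly(input.command)) {
    return {
      ok: false,
      error:
        "This command is read-only; sandbox bypass is unnecessary. " +
        "Retrying without the flag should succeed.",
    };
  }
  return { ok: true };
}
```

With this shape, the wc/tail chain from the reproduction is refused when the flag is set, while git fetch origin with the flag passes through because git is not in the sketch's read-only set.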

Environment

  • Claude Code (CLI), session-level
  • Project .claude/settings.json includes: { "sandbox": { "enabled": true, "autoAllowBashIfSandboxed": true } }
  • Linux, sandbox config restricts writes outside an allowlist that includes /tmp/claude-<uid>/ but not ~/.git-credentials or .git/config.

Notes

  • The model-behavior side ("don't anchor") is hard to fix model-side because anchoring is statistical, not logical. Memory-based user corrections persist some of this across sessions but not within
    them. Fix #1 above is the cheapest reliable mitigation.
  • Cross-reference: this is closely related to the broader question of how dangerouslyDisableSandbox should compose with autoAllowBashIfSandboxed. They feel like they should interact but currently the
    model's bypass silently overrides the user's silence preference.

What Should Happen?

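Claude should not set dangerouslyDisableSandbox: true on plainly read-only commands, and sessions with autoAllowBashIfSandboxed: true should stay prompt-free for sandboxed read-only calls.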

Error Messages/Logs

Steps to Reproduce

See above

Claude Model

None

Is this a regression?

Yes, this worked in a previous version

Last Working Version

No response

Claude Code Version

2.1.123

Platform

Anthropic API

Operating System

Ubuntu/Debian Linux

Terminal/Shell

Terminal.app (macOS)

Additional Information

No response

extent analysis

TL;DR

Reject the bypass for self-evidently read-only commands to prevent unnecessary sandbox bypasses.

Guidance

  • Implement a check to refuse the dangerouslyDisableSandbox: true flag for read-only commands, forcing a retry without the flag.
  • Consider introducing one-shot bypass semantics to auto-reset the bypass after each call, requiring the model to re-justify it each time.
  • Surface anchoring detection by logging a warning when the model sets the bypass on consecutive calls without sandbox-related errors.
  • Ensure autoAllowBashIfSandboxed is honored even with bypass-refused calls to prevent unnecessary prompts.
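
The anchoring-detection item can be sketched as a small function over recent call history. This is one possible reading of the proposal, not a confirmed design: CallRecord, anchoringWarning and the default threshold of 2 are hypothetical, and deciding whether an output counts as a sandbox-related error is assumed to happen elsewhere.

```typescript
interface CallRecord {
  bypassed: boolean;     // dangerouslyDisableSandbox was set on this call
  sandboxError: boolean; // this call's output showed a sandbox-related error
}

// history is ordered oldest first. Returns a preamble string when the most
// recent calls form a streak of bypassed calls whose trailing `threshold`
// (or more) outputs showed no sandbox-related error; otherwise null.
function anchoringWarning(history: readonly CallRecord[], threshold = 2): string | null {
  let n = 0; // length of the trailing streak of bypassed calls
  let k = 0; // how many of the most recent ones had no sandbox error
  let cleanRunOpen = true;
  for (let i = history.length - 1; i >= 0; i--) {
    const rec = history[i];
    if (rec === undefined || !rec.bypassed) break;
    n++;
    if (cleanRunOpen && !rec.sandboxError) k++;
    else cleanRunOpen = false;
  }
  if (k < threshold) return null;
  return (
    `Note: previous ${n} calls used dangerouslyDisableSandbox; ` +
    `the last ${k} of those produced no sandbox-related errors. ` +
    "Consider running without the bypass."
  );
}
```

A legitimate retry (sandboxed failure followed by one bypassed call) stays silent under this sketch, because the streak has length 1; the warning only appears once the bypass carries over to a second consecutive call.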

Example

The issue itself includes no code. The proposed checks would most naturally live alongside the existing read-only tables in src/tools/BashTool/readOnlyValidation.ts.

Notes

The issue is closely related to the interaction between dangerouslyDisableSandbox and autoAllowBashIfSandboxed. The proposed fixes aim to mitigate the model's anchoring behavior without relying on model-side changes.

Recommendation

Apply the proposed harness-side fixes, starting with rejecting the bypass for self-evidently read-only commands, to prevent unnecessary sandbox bypasses and reduce user prompts.
