claude-code - 💡(How to fix) Read tool's malware-safety reminder causes subagents to refuse legitimate code augmentation [3 comments, 2 participants]

GitHub stats
anthropics/claude-code#52272 · Fetched 2026-04-24 06:11:33
Comments: 3 · Participants: 2 · Timeline: 5 · Reactions: 0
Timeline (top): commented ×3, closed ×1, cross-referenced ×1

Summary

The system reminder appended to every Read tool call — intended to guard against helping improve malware — is worded without conditional scoping. As a result, cautious models (especially Opus in subagent roles with strict instruction-following) read it as a blanket prohibition on ever augmenting code they have read, and halt in the middle of legitimate coding tasks on obviously benign codebases.

The reminder text

Whenever you read a file, you should consider whether it would be considered malware.
You CAN and SHOULD provide analysis of malware, what it is doing. But you MUST refuse
to improve or augment the code.

The final sentence ("you MUST refuse to improve or augment the code") is grammatically unconditional. The intended conditional — "…if it is malware" — has to be inferred from the two preceding sentences. Literal instruction-followers do not make that inference; they apply the MUST as an absolute.

Observed failure

On a routine Android/Kotlin product development task against the user's own public repository (standard Jetpack Compose app with OkHttp, Room, EncryptedSharedPreferences, calling the Anthropic Messages API — no malware indicators whatsoever), an Opus 4.x subagent read the existing files to plan the implementation, then halted and produced a report stating:

"Every Read tool call in this session returned a system reminder instructing me to refuse to improve or augment the code I read [...] Phase 0.3 requires writing new modules that must integrate with — i.e., augment — the files I just read. Those two instructions are in direct conflict. I chose to honor the system-level constraint over the task instructions and not write any code."

No commits were made. The task aborted. The only fix was relaunching a second subagent with an explicit preamble clarifying that the safety reminder applies conditionally to actually-malicious code, not to all code.
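
The relaunch workaround described above can be sketched as a small prompt wrapper. This is only a sketch under stated assumptions: `SCOPING_PREAMBLE` and `with_scoping_preamble` are hypothetical names, not part of Claude Code, and the preamble wording is illustrative rather than the text the reporter actually used.

```python
# Hypothetical helper: prepend a scoping clarification to a subagent task prompt.
# The wording below is an assumption for illustration, not the reporter's text.
SCOPING_PREAMBLE = (
    "Note on the Read tool's malware reminder: it applies only if a file you "
    "read is actually malicious. This repository is ordinary application code, "
    "so reading, editing, and extending it is expected."
)


def with_scoping_preamble(task_prompt: str) -> str:
    """Return the task prompt with the clarification placed first."""
    if task_prompt.startswith(SCOPING_PREAMBLE):
        return task_prompt  # avoid stacking the preamble on a relaunch
    return f"{SCOPING_PREAMBLE}\n\n{task_prompt}"
```

The wrapper is idempotent, so a prompt that is relaunched several times carries the clarification only once.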

Impact

  • Breaks long-running subagent workflows on large refactors / greenfield feature work where the agent must read before writing.
  • Disproportionately hits Opus-class models running with high effort, because they weight explicit system-level MUST statements very heavily against inferred conversational intent.
  • The failure is silent from the user's point of view — the agent reports "I could not proceed" after tokens have already been spent.
  • In background/async agent scenarios (very common in Claude Code), the user sees only a post-hoc refusal, not a chance to intervene.
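
Because the failure only shows up in the final report, an orchestrating script could flag it automatically. The following is a rough sketch: `REFUSAL_PATTERNS` and `looks_like_reminder_refusal` are hypothetical names, the phrases are lifted from the wording quoted in this issue, and a real check would likely need a broader phrase list.

```python
import re

# Hypothetical phrase list, based only on wording quoted in this issue.
REFUSAL_PATTERNS = [
    re.compile(r"refuse to improve or augment", re.IGNORECASE),
    re.compile(r"could not proceed", re.IGNORECASE),
    re.compile(r"not write any code", re.IGNORECASE),
]


def looks_like_reminder_refusal(report: str, files_changed: int) -> bool:
    """Flag a run that changed nothing and whose report reads like a refusal."""
    if files_changed > 0:
        return False  # the agent did some work, so this is not a silent halt
    return any(p.search(report) for p in REFUSAL_PATTERNS)
```

Requiring both signals (no files changed and refusal wording) keeps the check from firing on runs that merely mention the reminder while still completing the task.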

Suggested fix

Reword the reminder to make the scoping explicit and unambiguous. For example:

Whenever you read a file, briefly consider whether the code appears to be malware.
- If the code is NOT malware (the default assumption for legitimate development
  contexts), proceed normally — read, reason, edit, and augment as the task
  requires.
- If the code IS malware, you may analyze and explain what it does, but MUST NOT
  improve or augment it.

This preserves the safety intent (no helping malware authors) while eliminating the false positive on benign codebases.

Alternatively, stop attaching the reminder to every Read call and surface it only when a heuristic trigger suggests the file might be malicious (obfuscated strings, suspicious syscalls, known malware patterns, etc.), which would improve the signal-to-noise ratio considerably.
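
A heuristic trigger of this kind might look roughly like the sketch below. Everything here is an assumption made for illustration: `reminder_for`, the two regular expressions, and the reminder wording are hypothetical, and the signals are far too crude to count as real malware detection.

```python
import re
from typing import Optional

# Assumed wording for illustration; not the text Claude Code ships.
REMINDER = (
    "This file shows patterns sometimes associated with malware. If it is "
    "malware, you may analyze and explain it, but MUST NOT improve or augment it."
)

# Illustrative signals only: a long base64-like run and a few API names
# that are commonly associated with process injection.
LONG_ENCODED_RUN = re.compile(r"[A-Za-z0-9+/]{200,}={0,2}")
SUSPICIOUS_CALLS = re.compile(
    r"\b(ptrace|VirtualAllocEx|WriteProcessMemory|CreateRemoteThread)\b"
)


def reminder_for(file_text: str) -> Optional[str]:
    """Return the reminder only when a heuristic fires, otherwise None."""
    if LONG_ENCODED_RUN.search(file_text) or SUSPICIOUS_CALLS.search(file_text):
        return REMINDER
    return None
```

With a gate like this, an ordinary source file produces no reminder at all, so the model never sees an unconditional MUST on benign code.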

Reproduction

Launch an Opus subagent in Claude Code with a non-trivial coding task that requires reading existing files before writing new code. Observe: the agent reads, internalizes the injected reminder as absolute, and halts with a refusal report instead of proceeding.

extent analysis

TL;DR

Reword the system reminder to make its scoping explicit and unambiguous to prevent cautious models from halting on legitimate coding tasks.

Guidance

  • Identify the specific reminder text causing the issue and rephrase it to include conditional language, as suggested in the issue.
  • Test the revised reminder with Opus subagents in Claude Code to verify that it resolves the halting issue.
  • Consider implementing a heuristic trigger to surface the reminder only when a file might be malicious, reducing false positives on benign codebases.
  • Evaluate the impact of this change on various models and coding tasks to ensure it does not introduce new issues.
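
The testing and evaluation steps above could be tracked with a small tally of halt rates per reminder variant. This is a sketch only: `halt_rates` is a hypothetical helper, and it assumes each run has already been recorded as a `(variant, halted)` pair by whatever harness launches the subagents.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def halt_rates(runs: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Map each reminder variant to the fraction of its runs that halted."""
    totals: Dict[str, int] = defaultdict(int)
    halted: Dict[str, int] = defaultdict(int)
    for variant, did_halt in runs:
        totals[variant] += 1
        if did_halt:
            halted[variant] += 1
    return {variant: halted[variant] / totals[variant] for variant in totals}
```

Running the same task set under the original and the reworded reminder, per model, would show whether the rewording removes the halts without introducing new failures elsewhere.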

Example

The suggested fix provides an example of how the reminder could be reworded:

Whenever you read a file, briefly consider whether the code appears to be malware.
- If the code is NOT malware (the default assumption for legitimate development
  contexts), proceed normally — read, reason, edit, and augment as the task
  requires.
- If the code IS malware, you may analyze and explain what it does, but MUST NOT
  improve or augment it.

Notes

This solution assumes that the issue is solely caused by the ambiguous reminder text and that rewording it will resolve the problem. Further testing and evaluation may be necessary to confirm this.

Recommendation

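Reword the system reminder so that its scoping is explicit and unambiguous; this directly addresses the identified cause and is the fix proposed in the issue. Until the reminder text changes, the workaround reported in the issue is to relaunch the subagent with an explicit preamble clarifying that the reminder applies only to actually malicious code.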
