claude-code - 💡(How to fix) Fix Intermittent tool-result corruption: stale/replayed output + sibling cancellation (v2.1.143)

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

In long sessions the tool-execution layer intermittently returns stale / replayed output, results bound to the wrong call, and "Cancelled: parallel tool call X errored" on calls that did not themselves fail. Once it starts, in-harness verification (Read, Grep, build exit codes, git show) can no longer be trusted, which leads an agent to make and commit edits it cannot verify. The behavior is intermittent — it does not reproduce on a clean minimal case — so the exact trigger is not yet isolated.

Honesty note: some "facts" I first gathered while diagnosing were themselves corrupted results (a phantom 5.77 MB ~/.claude.json, a phantom version string, an invariant "41388 tokens" message). The HARD EVIDENCE below is separated from HYPOTHESIS for that reason.

Root Cause

  1. Sibling cancellation. A Write call returned "Cancelled: parallel tool call Bash(...) errored" — it was aborted because a different call in the same assistant turn exited non-zero (a find that returned exit 1). The Write itself was valid.

Fix Action

Fix / Workaround

Workarounds (until fixed)

RAW_BUFFERClick to expand / collapse
<!-- Created: 2026.05.31 23:24 | Agent: Claude Opus 4.8 (1M context) -->

Bug report: intermittent tool-result corruption (stale/replayed output + sibling cancellation)

Summary

In long sessions the tool-execution layer intermittently returns stale / replayed output, results bound to the wrong call, and "Cancelled: parallel tool call X errored" on calls that did not themselves fail. Once it starts, in-harness verification (Read, Grep, build exit codes, git show) can no longer be trusted, which leads an agent to make and commit edits it cannot verify. The behavior is intermittent — it does not reproduce on a clean minimal case — so the exact trigger is not yet isolated.

Honesty note: some "facts" I first gathered while diagnosing were themselves corrupted results (a phantom 5.77 MB ~/.claude.json, a phantom version string, an invariant "41388 tokens" message). The HARD EVIDENCE below is separated from HYPOTHESIS for that reason.

Environment

  • Claude Code version: 2.1.143 (@anthropic-ai/[email protected], confirmed via npm ls -g)
  • Surface: VS Code native extension (Claude Agent SDK runtime)
  • Model: Claude Opus 4.8 (1M context); OS: Linux 6.12 (Debian 13), bash
  • User reports symptoms began after the most recent update.

Hard evidence (clean, repeated within the session)

  1. Invariant "output limit" string contaminating unrelated results. A constant message — "...exceeded the max output limit of 30000 tokens. The result was over the limit by 41388 tokens" — was appended to several small, unrelated tool results:

    • appeared on a cat file | head -c 400 call (output physically capped at 400 bytes — cannot exceed any token limit), and on a sibling npm ls;
    • appeared glued to the real ~30-byte output of ls -la ~/.claude.json | awk '{print $5,$9}'. The 41388 figure is identical across three unrelated calls, which is only possible if it is a cached/replayed fragment, not a fresh measurement.
  2. Sibling cancellation. A Write call returned "Cancelled: parallel tool call Bash(...) errored" — it was aborted because a different call in the same assistant turn exited non-zero (a find that returned exit 1). The Write itself was valid.

  3. Phantom file-content corruption. A Read once showed a Svelte file's script block garbled (dead code after return, duplicated const declarations); a git show HEAD:<file> | grep claimed thousands of duplicate lines. But wc -l reported a stable 1404 lines for working tree, HEAD, and HEAD~1, and a later clean Read showed the file intact. So the "corruption" was largely in the RESULTS layer, not on disk.

  4. Replayed reads / phantom sentinels. Read with offset/limit sometimes returned a different offset's content or replayed an earlier read; "Wasted call — file unchanged" sentinels appeared for files not recently read.

What did NOT reproduce

  • A minimal 2-call batch — false (exit 1) + echo SIBLING_SURVIVED — ran cleanly: the erroring call did not cancel its sibling. So sibling cancellation is not a deterministic "any non-zero exit kills the batch" rule; it appears timing/concurrency-dependent.

Hypothesis (NOT proven — for the Claude Code team to verify)

The symptoms are consistent with a per-session tool-result buffer that is not strictly keyed by tool_use_id, combined with a race in how parallel calls are scheduled/cancelled and how the oversized-output replacement is applied:

  • one call's result (including the "output too large" replacement) bleeds into sibling slots;
  • a call that errors or a batch that collectively exceeds the output cap can cancel in-flight siblings;
  • the corrupted buffer state then persists and pollutes later calls.

Correlates with large parallel tool-call batches (10–14 calls/turn) and the presence of at least one very large result (full-file reads of >1000-line files, broad grep -rn). Likely introduced/exposed by a recent release (2.1.x) given the user's "started after the update" report.

Impact

High. Reads, greps, build output, and git show all become unreliable, so an agent cannot verify its own edits — risking garbled in-place edits and commits/pushes of unverifiable state. It defeats the standard "verify before done" safety gate.

Workarounds (until fixed)

  • Restart the session to clear the corrupted buffer (it does not self-recover).
  • Prefer small batches (1–3 tool calls/turn); avoid 10+ parallel calls.
  • Avoid huge tool outputs: bounded-range Reads, pipe grep through head, never cat large files.
  • Once the invariant <N> tokens string or a phantom "Wasted call" appears, treat ALL in-harness verification as suspect and re-verify in a real terminal or a fresh session.

How to file

  • Run /bug inside Claude Code to submit with session context, or open an issue at https://github.com/anthropics/claude-code/issues and paste this file. Include version 2.1.143 and the "Hard evidence" section above — especially the invariant 41388 tokens contamination, which is the clearest proof of result-buffer bleed-through.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING