claude-code - 💡(How to fix) Fix [Bug] Prompt cache unexpectedly collapses mid-session on Opus 4.x [3 comments, 3 participants]

GitHub stats
anthropics/claude-code#54563 · Fetched 2026-04-30 06:42:13
Comments: 3 · Participants: 3 · Timeline: 10 · Reactions: 1
Timeline (top): labeled ×5, commented ×3, cross-referenced ×1, subscribed ×1


Bug Description

Cache bug #34629: recurrence with measurement

Date: 2026-04-28
Plan: Max 20x
Severity: 50% of 5h Opus quota burned in ~30 min on what felt like normal interactive work across three sessions. Second occurrence today, similar shape this morning.

Summary

Anthropic stated this was fixed. It's not.

cache_read collapses from a stable 130K–185K-token cached prefix down to 0 or ~8.6K tokens between consecutive turns, mid-session, with no observable change in prompt structure on my side. The next turn pays full cache_creation (~130K–185K tokens) to rebuild what should still have been hot. This recurs multiple times in the same session, on Opus 4.x. I quantified this same pattern on 2026-04-01 against bug #34629, and today's transcripts match the same signature exactly.

Today's measurement

Two affected sessions caught in the same 60-min window:

  1. (Opus 4.7 1M)
  • 155 turns, 6 cache-collapse events
  • Healthy comparison (same model, more turns, more user work): 170 turns, $36.82, 0 anomalies
  • Affected session: 155 turns, $66.77, 6 anomalies
  • Cost inflation attributable to bug: ~$30 (~80% over expected)
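The inflation figure can be checked from the two session costs quoted above. This is a quick sketch that treats the healthy 170-turn session as the expected baseline, which is an assumption (that session ran more turns, so the true excess may be somewhat higher):

```python
# Costs as reported in the issue, in USD.
healthy_cost = 36.82   # 170 turns, 0 anomalies
affected_cost = 66.77  # 155 turns, 6 anomalies

# Excess spend relative to the healthy session, absolute and as a percentage.
excess = affected_cost - healthy_cost
inflation_pct = 100 * excess / healthy_cost

print(f"excess ${excess:.2f}, {inflation_pct:.0f}% over baseline")
```

This gives roughly $30 of excess, a little over 80% above the baseline, which is consistent with the estimate in the list.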

turn 80: cache_read 132,264 -> 0 (paid cw=137,368)
turn 84: cache_read 137,593 -> 8,661 (paid cw=144,967)
turn 97: cache_read 147,609 -> 8,661 (paid cw=149,xxx)
turn 101: cache_read 153,628 -> 8,661 (paid cw=153,xxx)
turn 104: cache_read 155,148 -> 8,661 (paid cw=155,xxx)
turn 140: cache_read 185,554 -> 8,661 (paid cw=185,xxx)

Notice: 137,368 cw was paid at turn 80 (cache_read=0) and a near-identical cache was rebuilt
immediately; every later collapse (turns 84 through 140) then lands on the same fallback
floor of 8,661. That value is suspiciously stable: it looks like only the first cache
breakpoint survives and everything above it gets dropped.
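The stability of that floor can be checked directly against the logged events. This is a small sketch: the tuples are transcribed from the log above, and the variable names are illustrative rather than taken from the reporter's tooling.

```python
from collections import Counter

# (turn, cache_read before, cache_read after), transcribed from the log above.
events = [
    (80, 132_264, 0),
    (84, 137_593, 8_661),
    (97, 147_609, 8_661),
    (101, 153_628, 8_661),
    (104, 155_148, 8_661),
    (140, 185_554, 8_661),
]

# How often each post-collapse value occurs; a single dominant value
# suggests a fixed surviving prefix rather than random partial eviction.
floors = Counter(after for _, _, after in events)

# Total cached tokens that disappeared across all events.
dropped = sum(before - after for _, before, after in events)

print(floors.most_common())
print(f"cached tokens dropped across {len(events)} events: {dropped:,}")
```

If the floor really is a surviving first breakpoint, the same value should dominate in other affected sessions that share the same system prompt and tool definitions. That is a hypothesis to test, not something the issue confirms.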

  2. (Opus 4.7 1M)
  • 108 turns, 1 cache-collapse event
  • turn 28: cache_read 123,306 -> 0 (paid cw=129,058)
  • A second turn 22 seconds later paid an identical 129,058 cw with cache_read=0 again. Two full rebuilds of an identical 129K cache in 22 seconds.

What it does NOT correlate with:

  • Sessions are on the 1M-context tier (claude-opus-4-7[1m]), but stayed under the 200K threshold throughout — bug occurred outside the doubled-pricing zone, so it is not a 1M-tier-pricing artifact
  • Not idle expiration — drops happen turn-over-turn, sub-minute spacing in some cases
  • Not session length — punk-brain 28-turn session hit it; another 170-turn session did not
  • Not specific tool calls — drops appear after various Bash/Edit/Read sequences with no obvious common trigger
  • Not specific to one project / repo / hooks setup — affects multiple projects with different settings.json

This matches the prior characterization (3.7% session degradation rate, event-triggered, non-deterministic).

Cost shape

Across all sessions in the bad window (4 sessions, ~80 min), I burned ~14M Opus-equivalent tokens. By comparison, the same kind of work in a clean window normally runs roughly half that for me. The two affected sessions account for the excess.

Reproduction notes

I cannot reproduce on demand. Detection signature is reliable:

  • cache_read value drops by >=50% turn-over-turn
  • previous cache_read was >= ~50K (so cache was substantively warm)
  • new cache_read is < ~20K (so collapse, not partial invalidation)
  • usually multiple drops within the same session

What I built locally as a workaround

A PostToolUse hook that scans the live transcript every 3 tool calls, applies the rule above, and emits a system-reminder telling the agent to stop and ask me before continuing. It also writes a save-state file with the last user message and last assistant action so I can resume cleanly after /clear. This catches the bug at the second drop event instead of letting it bleed for tens of turns.
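The issue does not include the hook's source, so what follows is only a minimal sketch of how such a hook might be structured. It makes two assumptions. First, each transcript JSONL line is assumed to carry usage as `{"message": {"usage": {"cache_read_input_tokens": ...}}}`. Second, the output shape (`hookSpecificOutput.additionalContext`) follows my reading of the Claude Code hooks documentation and should be verified against the current docs. All function names are hypothetical, and the every-3-calls counter and save-state file are omitted.

```python
import json
from typing import Iterable, List, Optional

WARM_MIN = 50_000       # previous cache_read must be at least this warm
COLLAPSED_MAX = 20_000  # new cache_read must fall below this
MAX_RATIO = 0.5         # and be at most half of the previous value


def cache_reads(transcript_lines: Iterable[str]) -> List[int]:
    """Collect cache_read_input_tokens from lines that carry a usage block.

    ASSUMPTION: the transcript layout described in the lead-in.
    """
    reads = []
    for line in transcript_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate partial lines in a transcript still being written
        message = entry.get("message") if isinstance(entry, dict) else None
        usage = message.get("usage") if isinstance(message, dict) else None
        value = usage.get("cache_read_input_tokens") if isinstance(usage, dict) else None
        if isinstance(value, int):
            reads.append(value)
    return reads


def collapse_count(reads: List[int]) -> int:
    """Count turn-over-turn drops matching the detection signature."""
    return sum(
        1
        for prev, cur in zip(reads, reads[1:])
        if prev >= WARM_MIN and cur < COLLAPSED_MAX and cur <= prev * MAX_RATIO
    )


def hook_output(transcript_lines: Iterable[str]) -> Optional[dict]:
    """Build the JSON a PostToolUse hook would print, or None if healthy."""
    n = collapse_count(cache_reads(transcript_lines))
    if n == 0:
        return None
    return {
        "hookSpecificOutput": {
            "hookEventName": "PostToolUse",  # ASSUMPTION: documented output shape
            "additionalContext": (
                f"Prompt cache collapsed {n} time(s) this session. "
                "Stop and ask the user before continuing."
            ),
        }
    }

# Wiring, not executed here: read the hook payload from stdin, open the file at
# payload["transcript_path"], pass it to hook_output(), and print the result
# with json.dumps() when it is not None.
```

This version fires on the first matching drop. The hook described above appears to act on the second, which would only need a threshold on `collapse_count`.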

This shouldn't have to exist. Asking that this be looked at again.

What would help most

  1. Confirmation whether bug #34629 is still tracked / has any open investigation.
  2. If there's a way to opt into instrumented telemetry that captures the cache-key hash that gets dropped, I'd happily run it for a week and submit logs.
  3. For Anthropic to stop randomly resetting the 7-day quota. Anthropic arbitrarily reset it earlier, again.

Reference data

Local tooling (built 2026-04-01 against earlier instances of this bug):

  • tools/cache-health.py — post-hoc per-session analysis
  • tools/cache-monitor.py — repo-wide token excess accounting
  • activity/logs/cache-health-log.jsonl — accumulated per-session metrics

extent analysis

TL;DR

The issue identifies no fix. The only workaround it describes is to detect cache collapses early, as the user's PostToolUse hook does, and pause the session before further turns pay for full rebuilds, until the root cause is identified and addressed.

Guidance

  • Investigate the collapse by comparing cache_read values turn-over-turn and looking for patterns or triggers in the turns immediately before each drop.
  • Consider implementing a detection mechanism based on the signature in the reproduction notes, so that a collapse is caught within a few tool calls rather than after tens of turns.
  • The tools/cache-health.py and tools/cache-monitor.py scripts are listed as the reporter's local tooling and do not appear to be published with the issue, so equivalent post-hoc analysis would have to be rebuilt from the detection signature.
  • The reporter asks whether instrumented telemetry that captures the dropped cache-key hash is available; the issue does not confirm that any such option exists.

Example

The issue does not include the source of the user's PostToolUse hook. The function below is a sketch of the detection rule from the reproduction notes, not the user's actual implementation:

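# Detection rule from the reproduction notes, applied to the last two turns.
# This is only the check; a real PostToolUse hook would feed it values from the transcript.
def detect_cache_collapse(cache_read_values):
    """Return True if the latest cache_read matches the collapse signature."""
    if len(cache_read_values) < 2:
        return False
    previous, current = cache_read_values[-2], cache_read_values[-1]
    return (
        previous >= 50_000             # cache was substantively warm
        and current < 20_000           # collapse, not partial invalidation
        and current <= 0.5 * previous  # dropped by at least 50%
    )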

Notes

The issue lacks information about the underlying cause of the cache collapse, and the provided workaround is a detection mechanism rather than a fix. Further investigation is needed to identify the root cause and develop a permanent solution.

Recommendation

Apply the workaround by implementing a detection mechanism, such as the PostToolUse hook, to catch cache collapses early and limit the tokens spent on repeated rebuilds until the root cause is identified and addressed.
