claude-code - 💡(How to fix) Fix Massive token burn: 80% → 11% in ~5 minutes on a single session

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

A single Claude Code session consumed context from ~80% to ~11% of the session budget in roughly 5 minutes, despite no individual tool call returning a large payload. The burn appears to come from turn volume + repeated cache-miss re-sends, not a runaway tool result.

Error Message

Session transcript analysis (~/.claude/projects/.../8695a41b-*.jsonl):

Root Cause

A single Claude Code session consumed context from ~80% to ~11% of the session budget in roughly 5 minutes, despite no individual tool call returning a large payload. The burn appears to come from turn volume + repeated cache-miss re-sends, not a runaway tool result.

RAW_BUFFERClick to expand / collapse

Summary

A single Claude Code session consumed context from ~80% to ~11% of the session budget in roughly 5 minutes, despite no individual tool call returning a large payload. The burn appears to come from turn volume + repeated cache-miss re-sends, not a runaway tool result.

Environment

  • Claude Code CLI, macOS (Darwin 25.4.0)
  • Model: Claude Opus 4.7 (1M context)
  • Date: 2026-04-20
  • Session ID: 8695a41b-71c2-4a0e-b1bf-0ea6a0972c67
  • Account email: [email protected]

Observed behavior

Session transcript analysis (~/.claude/projects/.../8695a41b-*.jsonl):

  • 760 assistant turns / 556 user turns in one session
  • Tool-use counts: 225 Bash, 71 Edit, 55 Read, 49 TaskUpdate, 32 TaskCreate, 27 Grep, 22 WebFetch, 17 Write, 5 Agent
  • Largest single tool_use payloads: ExitPlanMode 25 KB, Write 24 KB
  • Tool results stayed small (avg 769 chars, max 25 KB, total ~400 KB)
  • No single "giant blob" — but every new turn re-sent the growing transcript

Expected behavior

Token consumption should be roughly proportional to new content per turn. Cache-miss amplification on a >1 MB transcript across 225 rapid Bash iterations should not drain ~70% of a session budget in 5 minutes.

Hypothesis

Once a session crosses a few hundred turns, cache-miss re-sends of the full transcript dominate token cost. There does not appear to be an in-CLI warning or automatic compaction threshold that fires before the user notices the drain.

Asks

  1. Can the CLI surface a proactive warning (or auto-compact suggestion) when per-turn token cost crosses a threshold?
  2. Is there a known issue with cache invalidation on long sessions that would cause repeated full re-sends?
  3. Billing clarity: for sessions that burn through context this fast, is there a path to review/refund?

Happy to share the .jsonl transcript privately if useful.

extent analysis

TL;DR

Implementing a proactive warning or auto-compact suggestion in the CLI when per-turn token cost crosses a threshold may help mitigate the issue of rapid context budget consumption.

Guidance

  • Investigate the cache invalidation mechanism to determine if it's causing repeated full re-sends of the transcript, leading to excessive token consumption.
  • Review the CLI's current threshold for warning users about high token costs and consider lowering it to prevent unexpected budget drains.
  • Analyze the transcript data to identify patterns or correlations between tool usage, turn volume, and token consumption that could inform optimization strategies.
  • Consider implementing a compaction or summarization mechanism for long transcripts to reduce the payload size and mitigate cache-miss re-sends.

Example

No code snippet is provided as the issue does not imply a specific code-related solution.

Notes

The issue seems to be related to the interaction between the CLI, cache invalidation, and token consumption. Without further information about the underlying implementation, it's challenging to provide a definitive solution. However, the suggested guidance points should help in identifying and mitigating the issue.

Recommendation

Apply workaround: Implement a proactive warning or auto-compact suggestion in the CLI to alert users when per-turn token cost crosses a threshold, allowing them to take corrective action and prevent unexpected budget drains. This approach addresses the immediate issue and provides a stopgap measure until a more comprehensive solution can be developed.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

Token consumption should be roughly proportional to new content per turn. Cache-miss amplification on a >1 MB transcript across 225 rapid Bash iterations should not drain ~70% of a session budget in 5 minutes.

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix Massive token burn: 80% → 11% in ~5 minutes on a single session