claude-code - 💡(How to fix) Fix Usage cap: single ~30k-token prompt (no subagents) burned 38% of 5h Max 20x limit on Opus 4.7

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

A single Claude Code prompt on Opus 4.7 consumed ~38% of my 5-hour Max 20x usage cap in under 20 minutes. The prompt produced a written plan of roughly 30,000 tokens, with no subagents dispatched and no parallel tool fan-out.

Root Cause

A single Claude Code prompt on Opus 4.7 consumed ~38% of my 5-hour Max 20x usage cap in under 20 minutes. The prompt produced a written plan of roughly 30,000 tokens, with no subagents dispatched and no parallel tool fan-out.

Fix Action

Fix / Workaround

A single Claude Code prompt on Opus 4.7 consumed ~38% of my 5-hour Max 20x usage cap in under 20 minutes. The prompt produced a written plan of roughly 30,000 tokens, with no subagents dispatched and no parallel tool fan-out.

~38% of the Max 20x cap gone on one linear prompt. No subagent dispatches, no Agent tool calls, no parallel work.

RAW_BUFFERClick to expand / collapse

Summary

A single Claude Code prompt on Opus 4.7 consumed ~38% of my 5-hour Max 20x usage cap in under 20 minutes. The prompt produced a written plan of roughly 30,000 tokens, with no subagents dispatched and no parallel tool fan-out.

Environment

  • Plan: Max 20x ($200/mo)
  • Model: Opus 4.7 (claude-opus-4-7, 1M context)
  • Client: Claude Code CLI
  • Date: 2026-05-25
  • Duration: under 20 minutes
  • Cap burn: ~38% of 5-hour rolling window

What I expected

30k tokens of output (~a medium-length design doc) consuming a roughly proportional slice of the cap; on the order of single-digit percent, not ~38%.

What happened

~38% of the Max 20x cap gone on one linear prompt. No subagent dispatches, no Agent tool calls, no parallel work.

What I can't see

There's no user-facing breakdown of where the budget went. Possible hidden consumers:

  • Extended thinking tokens at a different multiplier
  • Cached read tokens being re-billed
  • Tool-result echoes counted against the cap
  • Something else entirely

Happy to share my Claude Code session ID privately via support if it helps reproduce.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING