claude-code: [BUG] Per-prompt cap on claude-opus-4-7 silently dropped ~2x around Apr 27, 2026 [1 comment, 2 participants]

GitHub stats
anthropics/claude-code#54127 · Fetched 2026-04-28 06:38:31
Comments: 1 · Participants: 2 · Timeline events: 8 · Reactions: 0
Timeline (top): labeled ×6, commented ×1, cross-referenced ×1

Error Message

API response (synthetic assistant message in the session JSONL with isApiErrorMessage: true):

"Prompt is too long"                                                                                                
error: "invalid_request"                                                 
model: <synthetic>                                                                                                  
stop_reason: "stop_sequence"

CC UI: red "Context limit reached" banner.

Evidence across 7 consecutive sessions, same account, same model
(claude-opus-4-7), CC v2.1.107 → v2.1.121:

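| Session ID | Date   | Max cache_read_input_tokens | "Prompt is too long" errors |
|------------|--------|-----------------------------|-----------------------------|
| f4955755   | Apr 23 |  87k                        |  0                          |
| 0ec73e4b   | Apr 23 | 689k                        |  0                          |
| 6173cc94   | Apr 24 | 692k                        |  0                          |
| 346c895b   | Apr 25 | 636k                        |  0                          |
| 6a7ed1d9   | Apr 26 | 752k                        |  0                          |
| e696482b   | Apr 27 | 441k                        | 11                          |
| 1d546a39   | Apr 28 | 363k                        |  1                          |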

JSONL transcripts available on request. Pulled from
~/.claude/projects/<slug>/<session-id>.jsonl using:

grep -oE '"cache_read_input_tokens":[0-9]+' <file> | sort -t: -k2 -rn | head                                        
grep -c '"isApiErrorMessage":true' <file>

Last rejected call this morning: cache_read_input_tokens=363036, the prompt
was rejected with "Prompt is too long".
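
The two grep commands above can also be approximated in Python, which may be easier to run across many session files. This is a minimal sketch, not the reporter's tooling: it assumes only that each line of the session file is one JSON object, and it searches every nesting level for the keys `cache_read_input_tokens` and `isApiErrorMessage` because the exact record layout is not shown in the issue. The names `find_values` and `summarize_session` are hypothetical.

```python
import json
from pathlib import Path


def find_values(node, key):
    """Yield every value stored under `key` anywhere in a nested JSON value."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


def summarize_session(path):
    """Return (max cache_read_input_tokens, number of API-error records)."""
    max_tokens = 0
    error_records = 0
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip truncated or partially written lines
            for value in find_values(record, "cache_read_input_tokens"):
                # bool is a subclass of int in Python, so exclude it explicitly
                if isinstance(value, int) and not isinstance(value, bool):
                    max_tokens = max(max_tokens, value)
            # Count the record once if the flag is true anywhere inside it,
            # mirroring grep -c, which counts matching lines.
            if any(v is True for v in find_values(record, "isApiErrorMessage")):
                error_records += 1
    return max_tokens, error_records
```

Calling `summarize_session(path)` on one `<session-id>.jsonl` file should yield the two numbers that make up a row of the evidence table, assuming the field names match those in the grep patterns.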


Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report (please file separate reports for different bugs)
  • I am using the latest version of Claude Code

What's Wrong?

The per-prompt size cap on claude-opus-4-7 appears to have silently dropped by roughly 2x between Apr 26 and Apr 27, 2026. Sessions that previously held 600-750k
cached tokens with zero rejections are now rejected at 360-440k. Same
model, same account, no behavior change on my end.

Visible symptom: the red "Context limit reached" banner appears in Claude
Code at ~36-44% of the displayed 1M context_window. CC's context display
shows plenty of headroom; the API rejects anyway.
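
The percentages and the "~2x" figure can be checked with a few lines of arithmetic. The sketch below uses the rounded per-session peaks from the evidence table and treats the peaks of the two rejecting sessions as approximate rejection points, which is an assumption. Session f4955755 (87k) is left out because it never approached either ceiling. Because the earlier sessions were never rejected, their peaks are only a lower bound on the old cap, so the computed ratios are lower bounds on the drop.

```python
NOMINAL_WINDOW = 1_000_000  # the "1M" context window shown by the client

# Peak cache_read_input_tokens per session, rounded to thousands as reported.
accepted_peaks = {
    "0ec73e4b": 689_000,
    "6173cc94": 692_000,
    "346c895b": 636_000,
    "6a7ed1d9": 752_000,
}
rejecting_peaks = {"e696482b": 441_000, "1d546a39": 363_000}


def percent_of_window(tokens, window=NOMINAL_WINDOW):
    """Share of the nominal window that `tokens` occupies, in percent."""
    return round(100 * tokens / window, 1)


def drop_ratio(before, after):
    """How many times larger `before` is than `after`."""
    return round(before / after, 2)


for session_id, tokens in rejecting_peaks.items():
    print(session_id, percent_of_window(tokens))
# e696482b 44.1
# 1d546a39 36.3

highest_accepted = max(accepted_peaks.values())
print(drop_ratio(highest_accepted, max(rejecting_peaks.values())))  # 1.71
print(drop_ratio(highest_accepted, min(rejecting_peaks.values())))  # 2.07
```

On these numbers the drop is at least 1.7x to 2.1x, which is consistent with the "~2x" estimate.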

What Should Happen?

Either:

  • The per-prompt cap should remain where it was — Opus 4.7 used to clear
    600-750k tokens cleanly, and that's the implicit contract of the "1M context window" framing.

  • If the cap was intentionally lowered, the new limit should be documented
    as a separate number from context_window_size and exposed in the API
    response so clients can show users where the operational ceiling actually
    is. Today the user sees context_window.used_percentage: 38 right up until
    the wedge.
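
To illustrate the second option, here is a hypothetical sketch of how a client could report usage against both numbers if an operational ceiling were exposed. Nothing here reflects an existing API field: `usage_report` and `operational_ceiling` are invented for illustration, and the ceiling value used in the example (363,036) is simply the token count from the last rejected call in this report, used as a placeholder.

```python
def usage_report(tokens_used, nominal_window, operational_ceiling):
    """Report usage against the advertised window and a (hypothetical) ceiling."""
    return {
        "used_percentage_of_window": round(100 * tokens_used / nominal_window, 1),
        "used_percentage_of_ceiling": round(100 * tokens_used / operational_ceiling, 1),
        "tokens_until_ceiling": max(0, operational_ceiling - tokens_used),
    }


# Placeholder values: a 1M nominal window and an assumed ceiling of 363,036.
report = usage_report(
    tokens_used=350_000,
    nominal_window=1_000_000,
    operational_ceiling=363_036,
)
print(report)
# {'used_percentage_of_window': 35.0, 'used_percentage_of_ceiling': 96.4, 'tokens_until_ceiling': 13036}
```

With both figures visible, a session at 35% of the nominal window would show as roughly 96% of the assumed ceiling, which is the warning the current display does not give.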

Error Messages/Logs

API response (synthetic assistant message in the session JSONL with
isApiErrorMessage: true):

    "Prompt is too long"
    error: "invalid_request"
    model: <synthetic>
    stop_reason: "stop_sequence"

CC UI: red "Context limit reached" banner.

Evidence across 7 consecutive sessions, same account, same model
(claude-opus-4-7), CC v2.1.107 → v2.1.121:

| Session ID | Date   | Max cache_read_input_tokens | "Prompt is too long" errors |
|------------|--------|-----------------------------|-----------------------------|
| f4955755   | Apr 23 |  87k                        |  0                          |
| 0ec73e4b   | Apr 23 | 689k                        |  0                          |
| 6173cc94   | Apr 24 | 692k                        |  0                          |
| 346c895b   | Apr 25 | 636k                        |  0                          |
| 6a7ed1d9   | Apr 26 | 752k                        |  0                          |
| e696482b   | Apr 27 | 441k                        | 11                          |
| 1d546a39   | Apr 28 | 363k                        |  1                          |

JSONL transcripts available on request. Pulled from
~/.claude/projects/<slug>/<session-id>.jsonl using:

    grep -oE '"cache_read_input_tokens":[0-9]+' <file> | sort -t: -k2 -rn | head
    grep -c '"isApiErrorMessage":true' <file>

Last rejected call this morning: cache_read_input_tokens=363036, the prompt
was rejected with "Prompt is too long".

Steps to Reproduce

This is a state condition, not a deterministic code path. Best approximation:

  1. Open a Claude Code session on claude-opus-4-7.

  2. Use a project with a substantial CLAUDE.md (~17k tokens in my case) and
    normal auto-memory loading.

  3. Work normally for a session — file reads, bash greps, back-and-forth. Avoid manual /compact.

  4. As cache_read_input_tokens climbs past ~360-400k, the next API call is
    rejected with "Prompt is too long" even though
    context_window.used_percentage shows ~36-44%.

  5. Verify in the session JSONL:

    grep '"isApiErrorMessage":true' <session.jsonl>

    and cross-reference the cache_read_input_tokens on the immediately
    preceding successful turn.

The exact rejection point varies (363k, 382k, 441k observed across sessions)
so a deterministic repro may not be possible — but the cross-session ceiling drop is unambiguous: pre-Apr 27 sessions cleared 600-750k cleanly,
post-Apr 27 sessions reject at 360-440k. Manual /compact recovers
immediately.
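
The cross-reference in step 5 can be sketched in Python as follows. This is an illustration under stated assumptions rather than a verified tool: it assumes one JSON object per line, looks for `isApiErrorMessage` and `cache_read_input_tokens` at any nesting depth because the record layout is not shown here, and ignores any token count carried by the error record itself. The names `iter_key` and `tokens_before_errors` are hypothetical.

```python
import json


def iter_key(node, key):
    """Yield every value stored under `key` in a nested JSON structure."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from iter_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from iter_key(item, key)


def tokens_before_errors(path):
    """For each API-error record, return the cache_read_input_tokens of the most
    recent earlier non-error record that reported one (None if there was none)."""
    results = []
    last_tokens = None
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # blank or truncated line
            if any(v is True for v in iter_key(record, "isApiErrorMessage")):
                results.append(last_tokens)
                continue  # the synthetic error record must not update last_tokens
            counts = [
                v for v in iter_key(record, "cache_read_input_tokens")
                if isinstance(v, int) and not isinstance(v, bool)
            ]
            if counts:
                last_tokens = max(counts)
    return results
```

Each entry in the returned list is an estimate of the prompt size at which a rejection occurred, so collecting these lists across sessions would show how much the rejection point varies.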

Claude Model

Opus

Is this a regression?

Yes, this worked in a previous version

Last Working Version

No response

Claude Code Version

2.1.121

Platform

Anthropic API

Operating System

macOS

Terminal/Shell

Terminal.app (macOS)

Additional Information

No response

extent analysis

TL;DR

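The per-prompt size cap on claude-opus-4-7 appears to have silently dropped between Apr 26 and Apr 27, 2026: sessions that previously reached 600-750k cached tokens without rejection are now rejected at 360-440k. Either the previous cap should be restored or the new limit should be documented; /compact is the interim workaround.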

Guidance

  • Verify the current limit by locating each "Prompt is too long" record in the session JSONL (isApiErrorMessage: true) and reading cache_read_input_tokens from the immediately preceding successful turn.
  • Compare the context_window.used_percentage value with the actual token count to identify any discrepancies.
  • Test the limit with different session sizes and token counts to determine if the issue is consistent.
  • Consider using the /compact command to recover from the rejection and continue working.

Example

The issue contains no application code because it concerns Anthropic API and Claude Code behavior; the only commands involved are the grep invocations used to inspect the session JSONL.

Notes

The exact cause of the limit drop is unknown, and it is unclear if this is a permanent change or a temporary issue. The workaround of using /compact may not be suitable for all use cases.

Recommendation

Apply the workaround: run /compact when a rejection occurs. The reporter observed that this recovers the session immediately. Keep using it until the new limit is documented or the issue is resolved.
