claude-code - ✅(Solved) Fix Silent 429 on context-1m-2025-08-07 beta after model switch — no rate-limit headers, no retry-after [1 pull requests, 1 comments, 2 participants]

GitHub stats
anthropics/claude-code#50830 · Fetched 2026-04-20 12:11:52
Comments: 1 · Participants: 2 · Timeline: 6 · Reactions: 0
Timeline (top): labeled ×4, commented ×1, cross-referenced ×1

Switching from claude-sonnet-4-6 to claude-opus-4-7 with the context-1m-2025-08-07 beta (/model claude-opus-4-7[1m]) in a session with an existing Sonnet prompt cache causes the first request to trigger a persistent HTTP 429 that:

  1. Returns no anthropic-ratelimit-* headers
  2. Returns no retry-after header
  3. Body is only {"type":"error","error":{"type":"rate_limit_error","message":"Error"}}
  4. Persists for ~71 minutes (matches ephemeral_1h TTL), regardless of client behavior
  5. Blocks Opus/Sonnet; Haiku keeps working
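
The three symptoms above (no anthropic-ratelimit-* headers, no retry-after, generic body) can be checked mechanically on the client. The following is a minimal sketch, not Claude Code's actual logic; the function name `is_headerless_429` is hypothetical, and the header names are the ones listed in this report.

```python
def is_headerless_429(status: int, headers: dict[str, str]) -> bool:
    """Return True for a 429 that carries no recovery metadata at all."""
    if status != 429:
        return False
    # HTTP header names are case-insensitive, so compare in lowercase.
    names = {name.lower() for name in headers}
    has_retry_after = "retry-after" in names
    has_ratelimit = any(n.startswith("anthropic-ratelimit-") for n in names)
    return not (has_retry_after or has_ratelimit)


# The response described in this issue: status 429 with empty headers.
print(is_headerless_429(429, {}))
# A conventional 429 that tells the client how long to wait.
print(is_headerless_429(429, {"Retry-After": "30"}))
```

A client could log or surface this case separately from ordinary rate limits, since no wait time can be derived from the response.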

Error Message

HTTP 429, headers: {} (empty), body: {"type":"error","error":{"type":"rate_limit_error","message":"Error"},"request_id":"req_011CaDT7QKiJjeuzkvXd2JNG"}

Root Cause

The model switch forced 209,698 tokens of fresh ephemeral_1h cache creation because the Sonnet cache cannot be read by Opus. 60 consecutive 429s followed over 71 minutes, all with empty headers and the generic "Error" message. The lockout released on its own at about T+71 min, which the reporter attributes to the ephemeral_1h TTL.

Fix Action

Fixed

PR fix notes

PR #74: receipt: add PT-2026-04-19 — headerless 429 from hidden enforcer blocks correct retry/failover policy (claude-code#50830)

Description (problem / solution / changelog)

Summary

Adds a new Pitstop Truth receipt capturing a failure mode where a rate-limit response (HTTP 429) is emitted without any recovery metadata, despite the server enforcing a deterministic lockout window.

Source: https://github.com/anthropics/claude-code/issues/50830


What happened

A model switch (Sonnet → Opus) in a session with an existing prompt cache triggers a large cross-model cache miss, forcing ~200k tokens of fresh ephemeral_1h cache creation.

The provider responds with:

  • HTTP 429
  • no Retry-After
  • no anthropic-ratelimit-* headers
  • generic error body: type "rate_limit_error", message "Error"

The 429 persists for ~71 minutes and then clears automatically — matching the ephemeral_1h TTL.


Why this matters

This is not just a poor error message. It is a decision-layer failure.

The system knows:

  • the constraint (ephemeral cache capacity)
  • the recovery window (~1 hour)

But it does not expose that information to the client.

As a result:

  • Retry logic cannot determine WAIT vs STOP
  • Failover logic cannot engage
  • Users see a generic error instead of a bounded delay
  • The system behaves as if the failure is unpredictable, when it is actually deterministic

Core invariant

A 429 without recovery metadata is not just less helpful — it prevents correct execution policy.


Pattern captured

Headerless 429 from hidden enforcement layer

  • Enforcement exists
  • Recovery window exists
  • Metadata is suppressed
  • Downstream systems are forced into blind behavior

This is a stronger form of prior 429 patterns:

  • not incorrect Retry-After
  • not ignored Retry-After
  • but missing recovery signal entirely

Why this belongs in Pitstop Truth

This receipt reinforces the broader thesis:

Systems fail when the component enforcing the constraint does not provide the information required for the decision layer to act correctly.

Here:

  • enforcement = ephemeral cache quota layer
  • decision layer = retry / routing / UX
  • failure = signal not propagated

Implications

Systems encountering this pattern must:

  • treat headerless 429s as ambiguous WAIT/STOP states
  • avoid blind retry loops
  • consider fallback / alternate routing paths
  • surface degraded state explicitly to users
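
The client-side implications above can be sketched as a small decision function. This is an illustration under stated assumptions, not the policy of any real SDK: the `Action` enum, the `decide` function, the cap of two blind retries, and the backoff constants are all hypothetical.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    WAIT = "wait"
    FAILOVER = "failover"
    STOP = "stop"


def decide(headers: dict[str, str], headerless_streak: int,
           fallback_available: bool,
           max_blind_retries: int = 2) -> tuple[Action, Optional[float]]:
    """Choose an action for a 429; the second value is a delay in seconds."""
    lowered = {k.lower(): v for k, v in headers.items()}
    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            # Bounded delay: the server said how long to wait.
            return Action.WAIT, float(retry_after)
        except ValueError:
            pass  # Non-numeric value (e.g. an HTTP date): treat as ambiguous.
    # No usable recovery signal: allow a few capped retries, never a blind loop.
    if headerless_streak < max_blind_retries:
        return Action.WAIT, min(60.0, 5.0 * 2 ** headerless_streak)
    if fallback_available:
        return Action.FAILOVER, None
    # Nothing left to try: stop and surface the degraded state to the user.
    return Action.STOP, None


print(decide({}, headerless_streak=0, fallback_available=True))
print(decide({}, headerless_streak=2, fallback_available=True))
```

In the incident reported here, Haiku kept working while Opus and Sonnet were blocked, so a failover target of that kind is one possible reading of "alternate routing paths".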

Providers must:

  • propagate recovery metadata (e.g. Retry-After, quota context)
  • avoid emitting generic 429s from hidden enforcement layers
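
On the provider side, propagating recovery metadata could look roughly like the sketch below. It is hypothetical: `build_429` is not a real API, the response is modeled as a plain (status, headers, body) tuple, and the message wording is the one suggested in the linked issue.

```python
import json


def build_429(retry_after_seconds: int) -> tuple[int, dict[str, str], str]:
    """Build a 429 that tells the client why it was limited and when to retry."""
    headers = {
        "content-type": "application/json",
        # Standard HTTP header: delay in seconds before the next attempt.
        "retry-after": str(retry_after_seconds),
    }
    body = {
        "type": "error",
        "error": {
            "type": "rate_limit_error",
            "message": (
                "1h prompt cache quota exceeded, "
                f"retry in {retry_after_seconds} seconds"
            ),
        },
    }
    return 429, headers, json.dumps(body)


status, headers, body = build_429(120)
print(status, headers["retry-after"])
print(body)
```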

Files added

  • receipts/2026/04/PT-2026-04-19-github-claude-code-50830-headerless-429-hidden-enforcer/receipt.json
  • index.json updated

Validation

  • validate_receipts.py passes
  • JSON schema valid
  • indexed and searchable

Closing thought

This is a clean example of a system that knows the answer but refuses to say it.

Pitstop exists to make that answer explicit.

Changed files

  • index.json (modified, +22/-1)
  • receipts/2026/04/PT-2026-04-19-github-claude-code-50830-headerless-429-hidden-enforcer/receipt.json (added, +93/-0)


Summary

Switching from claude-sonnet-4-6 to claude-opus-4-7 with the context-1m-2025-08-07 beta (/model claude-opus-4-7[1m]) in a session with an existing Sonnet prompt cache causes the first request to trigger a persistent HTTP 429 that:

  1. Returns no anthropic-ratelimit-* headers
  2. Returns no retry-after header
  3. Body is only {"type":"error","error":{"type":"rate_limit_error","message":"Error"}}
  4. Persists for ~71 minutes (matches ephemeral_1h TTL), regardless of client behavior
  5. Blocks Opus/Sonnet; Haiku keeps working

Environment

  • Claude Code CLI 2.1.114
  • Auth: OAuth (Claude Max)
  • Date: 2026-04-19, 15:26–16:37 UTC

Reproduction

  1. Session with claude-sonnet-4-6, ~100k tokens of prompt cache built up
  2. /model claude-opus-4-7[1m]
  3. Next prompt → 429

Evidence

Last successful Sonnet request (15:23:57 UTC):

{"model":"claude-sonnet-4-6","usage":{"cache_read_input_tokens":106561,"ephemeral_1h_input_tokens":258}}

Model switch 15:24:16 UTC → first 429 at 15:26:00 UTC:

HTTP 429, headers: {} (empty), body: {"type":"error","error":{"type":"rate_limit_error","message":"Error"},"request_id":"req_011CaDT7QKiJjeuzkvXd2JNG"}

First successful Opus request (15:27:26 UTC) reveals the trigger:

{"model":"claude-opus-4-7","usage":{"cache_creation_input_tokens":209698,"cache_read_input_tokens":0,"ephemeral_1h_input_tokens":209698}}

→ 209,698 tokens of fresh ephemeral_1h cache creation because the Sonnet cache cannot be read by Opus. 60 consecutive 429s over 71 min, all empty headers, all generic "Error". Recovery auto-releases exactly at T+71min (ephemeral_1h TTL).

Request IDs (first/last): req_011CaDT7QKiJjeuzkvXd2JNG / req_011CaDTWPsKMpq3gpyGcSS5o
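
The cross-model cache miss is visible in the usage payloads quoted above. As a rough sketch, the two usage objects can be compared like this; the helper name `describe_cache_use` and the keys of its return value are hypothetical, while the field names come from the logged responses.

```python
import json


def describe_cache_use(usage: dict) -> dict:
    """Summarize how a request used the prompt cache."""
    read = usage.get("cache_read_input_tokens", 0)
    created = usage.get("cache_creation_input_tokens", 0)
    return {
        # A full miss: nothing was read and new cache entries were written.
        "full_miss": read == 0 and created > 0,
        "created": created,
        "one_hour_tokens": usage.get("ephemeral_1h_input_tokens", 0),
    }


sonnet = json.loads(
    '{"model":"claude-sonnet-4-6","usage":{"cache_read_input_tokens":106561,'
    '"ephemeral_1h_input_tokens":258}}'
)
opus = json.loads(
    '{"model":"claude-opus-4-7","usage":{"cache_creation_input_tokens":209698,'
    '"cache_read_input_tokens":0,"ephemeral_1h_input_tokens":209698}}'
)

print(describe_cache_use(sonnet["usage"]))
print(describe_cache_use(opus["usage"]))
```

The Sonnet request reads from the cache and writes little; the Opus request reads nothing and writes the whole context as ephemeral_1h entries.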

Hypothesis

The context-1m-2025-08-07 beta has its own enforcement layer that omits standard rate-limit metadata and releases only on ephemeral_1h slot expiry, not on volume drop.

Impact

  • Undebuggable (generic "Error")
  • Breaks SDK retry logic (no retry-after)
  • 70+ min lockout from one oversized request
  • Looks like a partial outage to users (Haiku works, Opus/Sonnet don't)

Suggested fixes

API:

  • Include anthropic-ratelimit-* + retry-after on beta-enforcer 429s
  • Specific error message: "1h prompt cache quota exceeded, retry in N seconds"

Client (Claude Code):

  • On /model switch mid-session, either warn about cache-miss cost or auto-downgrade first request's ephemeral_1h to ephemeral_5m
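
The client-side suggestion (downgrading the first request after a model switch from ephemeral_1h to ephemeral_5m) might be sketched as follows. This assumes cache breakpoints in the request payload are marked as `cache_control` objects with a `ttl` of "1h" or "5m"; that payload shape is an assumption of this example, as is the helper name `downgrade_cache_ttl`.

```python
import copy


def downgrade_cache_ttl(payload: dict, new_ttl: str = "5m") -> dict:
    """Return a copy of the payload with every 1h cache marker rewritten."""
    result = copy.deepcopy(payload)  # Leave the caller's payload untouched.

    def walk(node):
        if isinstance(node, dict):
            marker = node.get("cache_control")
            if isinstance(marker, dict) and marker.get("ttl") == "1h":
                marker["ttl"] = new_ttl
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(result)
    return result


# Assumed request shape, for illustration only.
request = {
    "model": "claude-opus-4-7",
    "system": [
        {
            "type": "text",
            "text": "long shared context",
            "cache_control": {"type": "ephemeral", "ttl": "1h"},
        }
    ],
}
first_after_switch = downgrade_cache_ttl(request)
print(first_after_switch["system"][0]["cache_control"])
```

Only the first request after the switch would go through this rewrite; later requests could return to the longer TTL once the new model's cache exists.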

extent analysis

TL;DR

Switching to claude-opus-4-7[1m] in a session with an existing Sonnet prompt cache triggers a persistent, headerless HTTP 429, which the reporter attributes to an enforcement layer in the context-1m-2025-08-07 beta. The proposed remedies are for the API to return rate-limit metadata with the 429 and for the client to handle the cross-model cache miss.

Guidance

  • The reporter's hypothesis is that the context-1m-2025-08-07 beta has its own enforcement layer, which omits standard rate-limit metadata and releases only on ephemeral_1h slot expiry.
  • To verify, check that the 429 response has no anthropic-ratelimit-* or retry-after headers, and that the error clears after roughly 71 minutes.
  • On the API side, the proposed fix is to return anthropic-ratelimit-* and retry-after headers with the 429. On the client side, it is to warn about the cache-miss cost or to auto-downgrade the first request's ephemeral_1h to ephemeral_5m when switching models mid-session.
  • Monitor the ephemeral_1h_input_tokens usage to anticipate and prevent cache quota exceeded errors.
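
One way to monitor ephemeral_1h_input_tokens is a sliding one-hour total, sketched below. The class name `OneHourCacheMonitor` and the warning threshold are hypothetical; the source does not state the actual quota, so any threshold is a guess to be tuned.

```python
from collections import deque


class OneHourCacheMonitor:
    """Track ephemeral_1h_input_tokens written within a sliding window."""

    def __init__(self, warn_threshold: int, window_seconds: int = 3600):
        self.warn_threshold = warn_threshold
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, tokens) pairs, oldest first

    def record(self, timestamp: float, usage: dict) -> bool:
        """Record one response's usage; return True if the total warrants a warning."""
        tokens = usage.get("ephemeral_1h_input_tokens", 0)
        self._events.append((timestamp, tokens))
        # Drop entries that have aged out of the window.
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            self._events.popleft()
        return sum(t for _, t in self._events) >= self.warn_threshold


monitor = OneHourCacheMonitor(warn_threshold=200_000)  # threshold is a guess
print(monitor.record(0.0, {"ephemeral_1h_input_tokens": 258}))
print(monitor.record(19.0, {"ephemeral_1h_input_tokens": 209698}))
```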

Example

No code snippet is provided as the issue is related to the API and client behavior, and not a specific code implementation.

Notes

The provided hypothesis suggests that the context-1m-2025-08-07 beta has its own enforcement layer, which may not be compatible with the existing rate-limiting logic. The suggested fixes aim to improve the error response and client behavior to handle this scenario.

Recommendation

Apply a workaround by adjusting the client to handle cache misses, such as warning about cache-miss costs or auto-downgrading the first request's ephemeral_1h to ephemeral_5m when switching models mid-session, to prevent the persistent HTTP 429 error.
