claude-code - 💡(How to fix) Fix [Feature Request] Graceful model fallback on overloaded_error — stop stalling sessions during peak load [1 comments, 2 participants]

GitHub stats
anthropics/claude-code#49013 (fetched 2026-04-17 08:53:17)
Comments: 1
Participants: 2
Timeline events: 6 (labeled ×4, commented ×1, renamed ×1)
Reactions: 0

Error Message

{"type":"error","error":{"type":"overloaded_error","message":"Overloaded"},"request_id":"req_011Ca71E6bFqZYFY6SF15Boz"}


Problem

When the Anthropic API returns overloaded_error, Claude Code retries the same model with backoff (up to 10 attempts). During peak load this can stall a session for several minutes with no progress, even though other models (e.g. Sonnet/Haiku) may have capacity.

Example output seen during a session:

{"type":"error","error":{"type":"overloaded_error","message":"Overloaded"},"request_id":"req_011Ca71E6bFqZYFY6SF15Boz"}
Retrying in 33 seconds… (attempt 8/10)

This is particularly painful on Opus, which appears more sensitive to overload events than Sonnet/Haiku.
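
Any client-side handling has to recognise this condition first. The following is a minimal Python sketch, assuming the error arrives as the JSON body shown above. The helper name is_overloaded is hypothetical and is not part of Claude Code or any SDK.

```python
import json


def is_overloaded(raw_body: str) -> bool:
    """Return True if raw_body is an error payload of type overloaded_error."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        return False  # not JSON, so not the payload shape shown above
    if not isinstance(payload, dict) or payload.get("type") != "error":
        return False
    error = payload.get("error")
    return isinstance(error, dict) and error.get("type") == "overloaded_error"


# The payload quoted in the issue.
sample = (
    '{"type":"error","error":{"type":"overloaded_error","message":"Overloaded"},'
    '"request_id":"req_011Ca71E6bFqZYFY6SF15Boz"}'
)
print(is_overloaded(sample))  # True
```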

Proposal

Add an opt-in settings.json flag that falls back to a configured alternate model after N overload retries. For example:

{
  "model": "claude-opus-4-6",
  "fallbackModel": "claude-sonnet-4-6",
  "fallbackAfterRetries": 2
}

Behaviour:

  • On overloaded_error, retry the primary model up to fallbackAfterRetries times
  • If still overloaded, switch to fallbackModel for the remainder of the current turn (or session, configurable)
  • Surface a clear one-line notice to the user when the fallback triggers
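
The behaviour listed above could be sketched roughly as follows. This is an illustration in Python, not Claude Code's actual retry code: the send callable, the OverloadedError class, and the function name are all hypothetical, backoff delays are left out, and "retry up to fallbackAfterRetries times" is read here as one initial attempt plus that many retries.

```python
from typing import Callable


class OverloadedError(Exception):
    """Hypothetical stand-in for an overloaded_error API response."""


def send_with_fallback(
    send: Callable[[str], str],
    model: str,
    fallback_model: str,
    fallback_after_retries: int,
    notify: Callable[[str], None] = print,
) -> tuple[str, str]:
    """Try `model`, retrying on overload, then switch to `fallback_model`.

    Returns (model_used, response). Backoff sleeps are omitted for brevity.
    """
    # One initial attempt plus up to fallback_after_retries retries.
    for _ in range(fallback_after_retries + 1):
        try:
            return model, send(model)
        except OverloadedError:
            continue
    # One-line notice so the user knows the model changed.
    notify(f"{model} is overloaded; falling back to {fallback_model}")
    # An OverloadedError raised by the fallback propagates to the caller.
    return fallback_model, send(fallback_model)
```

In this sketch the notice is emitted exactly once, at the moment of the switch, and an overloaded fallback model is not retried further; a real implementation would need to decide how to handle that case.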

Why

  • Opus is disproportionately affected by overload during peak hours
  • Manual /model switching requires the user to be watching the terminal, which defeats long-running or background work
  • A Sonnet response now is almost always more useful than an Opus response in five minutes
  • Hooks currently have no event for API errors, so this cannot be scripted by users

Alternatives considered

  • /fast mode: sometimes helps, but it still uses the same model, which can still be overloaded
  • Manual /model: works but requires active attention
  • A hook event for API errors: more flexible but also more work for users; a built-in flag covers the common case

extent analysis

TL;DR

Implement a fallback model mechanism that switches to an alternate model after a specified number of overload retries to mitigate session stalls during peak load.

Guidance

  • Introduce a fallbackModel and fallbackAfterRetries setting in settings.json to allow users to configure an alternate model and the number of retries before switching.
  • Modify the retry logic to switch to the fallback model after the specified number of retries, and surface a clear notice to the user when the fallback triggers.
  • Consider making the fallback behavior configurable to switch for the remainder of the current turn or the entire session.
  • Test the fallback mechanism with different models, such as Opus and Sonnet, to ensure it effectively mitigates overload events.
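
The turn-versus-session choice in the guidance above comes down to when the fallback is cleared. Here is a small Python sketch of that state, assuming a hypothetical scope setting with the values "turn" and "session". Neither the setting nor the class exists in Claude Code.

```python
from dataclasses import dataclass


@dataclass
class FallbackState:
    """Tracks which model to use; names and scope values are hypothetical."""

    primary: str
    fallback: str
    scope: str = "turn"  # "turn" or "session"
    active: bool = False  # True while the fallback model is in use

    def current_model(self) -> str:
        return self.fallback if self.active else self.primary

    def trigger(self) -> None:
        """Called once overload retries on the primary are exhausted."""
        self.active = True

    def end_turn(self) -> None:
        """Turn-scoped fallback returns to the primary; session-scoped stays."""
        if self.scope == "turn":
            self.active = False


state = FallbackState("claude-opus-4-6", "claude-sonnet-4-6", scope="turn")
state.trigger()
print(state.current_model())  # claude-sonnet-4-6
state.end_turn()
print(state.current_model())  # claude-opus-4-6
```

With scope="session", end_turn leaves the fallback active, so every later turn keeps using the fallback model until the session ends.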

Example

{
  "model": "claude-opus-4-6",
  "fallbackModel": "claude-sonnet-4-6",
  "fallbackAfterRetries": 2
}
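
A reader for these proposed keys might look like the Python sketch below. The key names come from the issue; the function and type names, the default of 2 retries, and the validation rule are assumptions made for illustration. Because the proposal is opt-in, fallback stays disabled when fallbackModel is absent.

```python
import json
from typing import NamedTuple, Optional


class FallbackConfig(NamedTuple):
    model: str
    fallback_model: Optional[str]  # None means fallback is disabled (opt-in)
    fallback_after_retries: int


def parse_fallback_config(text: str) -> FallbackConfig:
    """Parse the proposed settings keys from a JSON document."""
    data = json.loads(text)
    retries = data.get("fallbackAfterRetries", 2)  # default of 2 is an assumption
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(retries, bool) or not isinstance(retries, int) or retries < 0:
        raise ValueError("fallbackAfterRetries must be a non-negative integer")
    return FallbackConfig(data["model"], data.get("fallbackModel"), retries)


example = """
{
  "model": "claude-opus-4-6",
  "fallbackModel": "claude-sonnet-4-6",
  "fallbackAfterRetries": 2
}
"""
print(parse_fallback_config(example))
```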

Notes

The proposed solution assumes that the Anthropic API's overloaded_error response is the primary cause of session stalls. The effectiveness of the fallback mechanism may depend on the specific models and workload.

Recommendation

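Adopt the proposal by adding the fallbackModel and fallbackAfterRetries settings: together they give users a configurable way to mitigate overload events and reduce session stalls. Note that these settings are requested in the issue rather than an existing workaround; until something like them ships, the only option the issue describes is switching models manually with /model.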
