claude-code - 💡(How to fix) Fix [Feature Request] Graceful model fallback on overloaded_error — stop stalling sessions during peak load [1 comments, 2 participants]

claude-code · 2026-04-16 06:53:39


GitHub stats

anthropics/claude-code#49013•Fetched 2026-04-17 08:53:17


Author

n0t-4m17h

Participants

github-actions[bot]

n0t-4m17h

Timeline (top)

labeled ×4, commented ×1, renamed ×1

Error Message

{"type":"error","error":{"type":"overloaded_error","message":"Overloaded"},"request_id":"req_011Ca71E6bFqZYFY6SF15Boz"}
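A client that wants to treat overload differently from other failures first has to recognise it. The following is a minimal sketch, assuming the error body arrives as a JSON string shaped like the one above; the helper name `is_overloaded` is hypothetical and not part of any SDK.

```python
import json


def is_overloaded(body: str) -> bool:
    """Return True if a raw API error body describes an overloaded_error."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False  # not JSON, so not a structured API error
    if not isinstance(payload, dict):
        return False
    error = payload.get("error")
    return (
        payload.get("type") == "error"
        and isinstance(error, dict)
        and error.get("type") == "overloaded_error"
    )


sample = (
    '{"type":"error","error":{"type":"overloaded_error","message":"Overloaded"},'
    '"request_id":"req_011Ca71E6bFqZYFY6SF15Boz"}'
)
print(is_overloaded(sample))
```

Matching on `error.type` rather than on the message text keeps the check stable if the human-readable message changes.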

Code Example

{"type":"error","error":{"type":"overloaded_error","message":"Overloaded"},"request_id":"req_011Ca71E6bFqZYFY6SF15Boz"}
Retrying in 33 seconds… (attempt 8/10)

---

{
  "model": "claude-opus-4-6",
  "fallbackModel": "claude-sonnet-4-6",
  "fallbackAfterRetries": 2
}


Problem

When the Anthropic API returns overloaded_error, Claude Code retries the same model with backoff (up to 10 attempts). During peak load this can stall a session for several minutes with no progress, even though other models (e.g. Sonnet/Haiku) may have capacity.

Example output seen during a session:

{"type":"error","error":{"type":"overloaded_error","message":"Overloaded"},"request_id":"req_011Ca71E6bFqZYFY6SF15Boz"}
Retrying in 33 seconds… (attempt 8/10)

This is particularly painful on Opus, which appears more sensitive to overload events than Sonnet/Haiku.
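To put a rough number on the stall described above, here is a small sketch of a capped exponential backoff schedule. The base delay, the cap, and the absence of jitter are assumptions chosen for illustration; the issue does not state the actual schedule Claude Code uses, and this sketch does not reproduce the 33-second delay seen in the log.

```python
def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Delay before each retry: base * 2**i seconds, capped at `cap`.

    Hypothetical schedule for illustration only.
    """
    return [min(cap, base * 2 ** i) for i in range(attempts)]


all_ten = backoff_delays(10)
print("total wait over 10 retries (s):", sum(all_ten))
print("wait before fallback with 2 retries (s):", sum(all_ten[:2]))
```

Under these assumed parameters, ten retries add up to roughly five minutes of waiting, while switching models after two retries would cost only a few seconds.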

Proposal

Add an opt-in settings.json flag that falls back to a configured alternate model after N overload retries. For example:

{
  "model": "claude-opus-4-6",
  "fallbackModel": "claude-sonnet-4-6",
  "fallbackAfterRetries": 2
}

Behaviour:

On overloaded_error, retry the primary model up to fallbackAfterRetries times
If still overloaded, switch to fallbackModel for the remainder of the current turn (or session, configurable)
Surface a clear one-line notice to the user when the fallback triggers
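The behaviour above can be sketched as a small retry loop. This is an illustration under stated assumptions, not Claude Code's implementation: `OverloadedError`, `send_with_fallback`, and the `send` callable are hypothetical, and "up to fallbackAfterRetries times" is read here as one initial attempt plus that many retries. Backoff sleeps are omitted to keep the sketch short.

```python
from typing import Callable


class OverloadedError(Exception):
    """Hypothetical stand-in for an API overloaded_error."""


def send_with_fallback(
    send: Callable[[str], str],
    model: str,
    fallback_model: str,
    fallback_after_retries: int,
    notify: Callable[[str], None] = print,
) -> tuple[str, str]:
    """Return (model_used, response), switching models on repeated overload."""
    # One initial attempt plus `fallback_after_retries` retries on the primary.
    for _ in range(fallback_after_retries + 1):
        try:
            return model, send(model)
        except OverloadedError:
            pass  # a real client would sleep with backoff here
    # One-line notice so the user knows which model answered.
    notify(
        f"{model} still overloaded after {fallback_after_retries} retries; "
        f"falling back to {fallback_model}"
    )
    # If the fallback is overloaded too, the error propagates to the caller.
    return fallback_model, send(fallback_model)


def fake_send(model: str) -> str:
    """Simulated API: the primary model is always overloaded."""
    if model == "claude-opus-4-6":
        raise OverloadedError("Overloaded")
    return f"response from {model}"


print(send_with_fallback(fake_send, "claude-opus-4-6", "claude-sonnet-4-6", 2))
```

Returning the model that actually answered lets the caller decide whether to keep using it for the rest of the turn or the session.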

Why

Opus is disproportionately affected by overload during peak hours
Manual /model switching requires the user to be watching the terminal, which defeats long-running or background work
A Sonnet response now is almost always more useful than an Opus response in five minutes
Hooks currently have no event for API errors, so this cannot be scripted by users

Alternatives considered

/fast mode: helps sometimes, but it still uses the same model, which can still be overloaded
Manual /model: works but requires active attention
A hook event for API errors: more flexible but also more work for users; a built-in flag covers the common case

extent analysis

TL;DR

Implement a fallback model mechanism that switches to an alternate model after a specified number of overload retries to mitigate session stalls during peak load.

Guidance

Introduce a fallbackModel and fallbackAfterRetries setting in settings.json to allow users to configure an alternate model and the number of retries before switching.
Modify the retry logic to switch to the fallback model after the specified number of retries, and surface a clear notice to the user when the fallback triggers.
Consider making the fallback behavior configurable to switch for the remainder of the current turn or the entire session.
Test the fallback mechanism with different models, such as Opus and Sonnet, to ensure it effectively mitigates overload events.
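One way to picture the guidance above is a loader for the proposed keys plus a small state holder for the turn-versus-session choice. Everything here is a sketch under assumptions: `fallbackScope` is a hypothetical key name (the issue only says the scope should be configurable), the default of 2 retries is invented for illustration, and the `FallbackSettings` and `Session` classes do not correspond to anything in Claude Code.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class FallbackSettings:
    model: str
    fallback_model: Optional[str] = None  # None keeps the feature opt-in
    fallback_after_retries: int = 2       # assumed default
    fallback_scope: str = "turn"          # hypothetical key: "turn" or "session"


def load_settings(text: str) -> FallbackSettings:
    """Parse and validate the proposed keys from a settings.json string."""
    raw = json.loads(text)
    settings = FallbackSettings(
        model=raw["model"],
        fallback_model=raw.get("fallbackModel"),
        fallback_after_retries=raw.get("fallbackAfterRetries", 2),
        fallback_scope=raw.get("fallbackScope", "turn"),
    )
    if settings.fallback_after_retries < 0:
        raise ValueError("fallbackAfterRetries must be >= 0")
    if settings.fallback_scope not in ("turn", "session"):
        raise ValueError("fallbackScope must be 'turn' or 'session'")
    return settings


class Session:
    """Tracks which model is active across turns."""

    def __init__(self, settings: FallbackSettings) -> None:
        self.settings = settings
        self.active_model = settings.model

    def trigger_fallback(self) -> None:
        if self.settings.fallback_model:
            self.active_model = self.settings.fallback_model

    def end_turn(self) -> None:
        # With "turn" scope, the next turn starts on the primary model again.
        if self.settings.fallback_scope == "turn":
            self.active_model = self.settings.model


settings = load_settings(
    '{"model": "claude-opus-4-6", "fallbackModel": "claude-sonnet-4-6", "fallbackAfterRetries": 2}'
)
session = Session(settings)
session.trigger_fallback()
print(session.active_model)
session.end_turn()
print(session.active_model)
```

Resetting at the end of a turn favours getting back to the primary model quickly; session scope avoids repeated fallbacks while the overload persists.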

Example

{
  "model": "claude-opus-4-6",
  "fallbackModel": "claude-sonnet-4-6",
  "fallbackAfterRetries": 2
}

Notes

The proposed solution assumes that the Anthropic API's overloaded_error response is the primary cause of session stalls. The effectiveness of the fallback mechanism may depend on the specific models and workload.

Recommendation

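Adopt the proposal: add the fallbackModel and fallbackAfterRetries settings. As a built-in, user-configurable option, this would mitigate overload events and reduce session stalls. Because this is a feature request, these settings are proposed rather than available; they are not a workaround users can apply today.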


#api #GPU compatibility #latency issue #model loading #dependency error
