Shared-capacity throttles are transient and not caused by the user exceeding their plan. The client should: 1. Detect the shared-capacity error (distinct from quota exhaustion). 2. Retry with exponential backoff for a bounded window (e.g. up to 60s) before surfacing the failure. 3. Surface a visible "waiting on capacity" indicator during the retry window so users know it's recovering, not hung.

claude-code - 💡(How to fix) Fix Auto-retry with backoff on shared-capacity rate limits instead of failing the turn [2 comments, 3 participants]

claude-code · 2026-04-19 18:32:15

GitHub stats

anthropics/claude-code#50841 • Fetched 2026-04-20 12:11:33

Timeline (top): commented ×2, labeled ×2

Problem

When the API returns "Server is temporarily limiting requests (not your usage limit) · Rate limited", Claude Code terminates the turn. The user returns to a dead-end conversation with no progress and no automatic recovery, even though usage quota remains.

This is particularly disruptive during long autonomous runs (e.g. /loop, multi-agent orchestration, do-work skill) where losing a turn mid-task can lose significant state.

Expected behavior

Shared-capacity throttles are transient and not caused by the user exceeding their plan. The client should:

1. Detect the shared-capacity error (distinct from quota exhaustion).
2. Retry with exponential backoff for a bounded window (e.g. up to 60s) before surfacing the failure.
3. Surface a visible "waiting on capacity" indicator during the retry window so users know it's recovering, not hung.

Why this matters

Users are effectively throttled twice: once by the API, once by losing the ability to continue a session they're paying for.
The lack of a reliable Retry-After header doesn't preclude client-side backoff: a schedule of 250ms → 500ms → 1s → 2s → 4s, capped at 60s, would catch the vast majority of these transient throttles.
The "state goes stale" argument against retry doesn't hold for short windows — prompt cache TTL is 5 minutes.
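
To make the arithmetic behind that schedule concrete, here is a minimal sketch that generates the delays and totals them. It assumes one reading of the schedule above: the delay doubles from 250ms, is held at 4s once reached, and retries stop when the cumulative wait would exceed 60s. The function name and parameters are hypothetical, not part of Claude Code.

```python
def backoff_schedule(initial=0.25, cap=4.0, window=60.0):
    """Return the delays (in seconds) whose running total fits inside `window`.

    Assumption: the delay doubles each time and is held at `cap` once reached.
    """
    delays = []
    delay, total = initial, 0.0
    while total + delay <= window:
        delays.append(delay)
        total += delay
        delay = min(delay * 2, cap)
    return delays


schedule = backoff_schedule()
print(schedule[:5])                  # [0.25, 0.5, 1.0, 2.0, 4.0]
print(len(schedule), sum(schedule))  # 18 59.75
```

Under these assumptions the client makes up to 18 retries in just under 60 seconds of waiting, well inside the 5-minute prompt cache TTL mentioned above.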

Suggested implementation notes

Only auto-retry for the shared-capacity / 529-class errors, not 429 quota errors.
Make max retry window configurable (env var or setting).
Skip auto-retry if the last tool call was destructive and its response was dropped (avoid double-fire on writes).
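
The three notes above could be combined into a single retry decision, roughly as sketched below. Everything here is illustrative: the `FailedRequest` fields, the `EXAMPLE_RETRY_WINDOW_SECONDS` variable name, and the idea that the client can see a numeric status code are assumptions, not Claude Code's actual internals or settings.

```python
import os
from dataclasses import dataclass

SHARED_CAPACITY_STATUS = 529  # assumed status for shared-capacity throttles
QUOTA_STATUS = 429            # assumed status for quota exhaustion


@dataclass
class FailedRequest:
    """Hypothetical summary of a failed API call."""
    status: int
    last_tool_was_destructive: bool = False
    tool_response_dropped: bool = False


def max_retry_window(env=None, default=60.0):
    """Read the retry window (seconds) from a hypothetical environment variable."""
    env = os.environ if env is None else env
    raw = env.get("EXAMPLE_RETRY_WINDOW_SECONDS")
    if raw is None:
        return default
    try:
        return max(0.0, float(raw))
    except ValueError:
        return default  # ignore unparsable values


def should_auto_retry(failure):
    """Retry only shared-capacity errors, and never after a dropped destructive call."""
    if failure.status != SHARED_CAPACITY_STATUS:
        return False  # 429 quota errors and everything else surface immediately
    if failure.last_tool_was_destructive and failure.tool_response_dropped:
        return False  # avoid double-firing a write whose result was lost
    return True
```

Keeping the decision in one pure function makes the policy easy to test separately from the retry loop itself.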

extent analysis

TL;DR

Implement exponential backoff with retry for shared-capacity errors to handle transient API throttling without surfacing failure to the user immediately.

Guidance

Detect shared-capacity errors (distinct from quota exhaustion) and differentiate them from 429 quota errors to apply retry logic only when appropriate.
Implement exponential backoff with a bounded window (e.g., up to 60s) to retry requests that fail due to shared-capacity throttling, starting with a short delay (e.g., 250ms) and doubling it until the maximum window is reached.
Display a "waiting on capacity" indicator to users during the retry window to prevent the perception of the system being hung.
Consider making the maximum retry window configurable via an environment variable or setting to allow for flexibility in handling different scenarios.

Example

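import random
import time

class SharedCapacityError(Exception):
    """Placeholder for the transient shared-capacity (529-class) error."""

def retry_shared_capacity_error(api_call, max_retries=5, initial_delay=0.25, max_delay=60):
    """Call api_call(), retrying with exponential backoff on SharedCapacityError."""
    delay = initial_delay
    for attempt in range(max_retries):
        try:
            return api_call()  # return the API result on success
        except SharedCapacityError:
            if attempt == max_retries - 1:
                break  # no point sleeping after the final attempt
            # Add up to 10% random jitter so clients do not retry in lockstep.
            wait = delay + random.uniform(0, delay * 0.1)
            print("Shared capacity error, retrying in {:.2f} seconds".format(wait))
            time.sleep(wait)
            delay = min(delay * 2, max_delay)
    raise RuntimeError("Failed after {} attempts".format(max_retries))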
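
The example above bounds retries by attempt count, while the guidance asks for a bounded time window and a visible indicator. A variant along those lines might look like the sketch below. The `on_wait` callback stands in for whatever UI hook would show "waiting on capacity", and `SharedCapacityError` is again a hypothetical placeholder; the `sleep` and `clock` parameters exist only so the loop can be tested without real delays.

```python
import random
import time


class SharedCapacityError(Exception):
    """Hypothetical stand-in for the transient shared-capacity error."""


def retry_within_window(call, window=60.0, initial_delay=0.25, max_delay=4.0,
                        on_wait=print, sleep=time.sleep, clock=time.monotonic):
    """Retry call() on SharedCapacityError until `window` seconds have elapsed."""
    deadline = clock() + window
    delay = initial_delay
    attempt = 1
    while True:
        try:
            return call()
        except SharedCapacityError:
            remaining = deadline - clock()
            if remaining <= 0:
                raise  # window exhausted: surface the original failure
            # Jitter between half and the full delay, never past the deadline.
            wait = min(random.uniform(delay / 2, delay), remaining)
            on_wait("Waiting on capacity (attempt {}), retrying in {:.2f}s"
                    .format(attempt, wait))
            sleep(wait)
            delay = min(delay * 2, max_delay)
            attempt += 1
```

Bounding by deadline rather than attempt count keeps the worst-case stall predictable for the user regardless of how the delays are tuned.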

Notes

This approach assumes that the API's shared-capacity throttling is indeed transient and that retrying with exponential backoff will eventually succeed. It's crucial to monitor the effectiveness of this strategy and adjust parameters as needed to balance between retrying sufficiently and not overwhelming the API with repeated requests.
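
One lightweight way to monitor whether the strategy is working is to record, per throttled request, whether the retries eventually succeeded and how long they waited. The sketch below is a hypothetical in-memory tracker. A real client would more likely send these numbers to its existing telemetry.

```python
from dataclasses import dataclass, field


@dataclass
class RetryStats:
    """Hypothetical in-memory record of retry outcomes."""
    recovered: int = 0   # throttled requests that succeeded after retrying
    exhausted: int = 0   # throttled requests that failed the whole window
    waits: list = field(default_factory=list)  # total seconds waited per request

    def record(self, succeeded, total_wait):
        if succeeded:
            self.recovered += 1
        else:
            self.exhausted += 1
        self.waits.append(total_wait)

    def recovery_rate(self):
        total = self.recovered + self.exhausted
        return self.recovered / total if total else 0.0

    def mean_wait(self):
        return sum(self.waits) / len(self.waits) if self.waits else 0.0
```

A low recovery rate would suggest the throttles are not as transient as assumed, and a high mean wait would suggest the window or delays need tuning.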

Recommendation

Apply the workaround: implement exponential backoff with retry for shared-capacity errors. This directly addresses transient throttling without requiring changes to the API itself.

#api #chain error #conversation history #tool integration #LLM response
