openclaw - ✅(Solved) Fix Failover ladder retries sibling Gemini models on the same billing account when the cap is account-scoped, multiplying RESOURCE_EXHAUSTED noise [1 pull requests, 1 comments, 2 participants]

openclaw 2026-05-10 23:25:50

GitHub stats

openclaw/openclaw#80451 • Fetched 2026-05-11 03:14:30

Author

ScientificProgrammer

Participants

clawsweeper[bot]

ScientificProgrammer

Timeline (top)

commented ×1, cross-referenced ×1

When google/gemini-3.1-pro-preview returns 429 RESOURCE_EXHAUSTED due to the monthly billing-account spending cap, OpenClaw's failover ladder retries google/gemini-2.5-flash and google/gemini-2.5-pro even though all three Gemini models share the same Google billing account and therefore the same cap. Each sibling model's attempt produces additional RESOURCE_EXHAUSTED lines for a billing condition that is structurally guaranteed not to recover by switching siblings.

In a real-world cascade I observed across a 7-day window, this produced 15 RESOURCE_EXHAUSTED lines per failed run, repeated across 26 distinct runs over ~10 hours, for a total of ~390 cascade-amplified events from a single billing condition. The actual primary failover target — OpenAI — would have had a chance to handle the conversation if it weren't for a separate replay bug (filed separately).

Fix Action

Linked PR (open, not merged at fetch time): feat: Provider circuit breaker for quota exhaustion (https://github.com/openclaw/openclaw/pull/64127)

PR fix notes

PR #64127: feat: Provider circuit breaker for quota exhaustion

Repository: openclaw/openclaw
Author: dodge1218
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/64127

Description (problem / solution / changelog)

Resolves #64085

This PR introduces proper handling for daily/weekly/monthly quota exhaustion errors:

Detects periodic usage limits and classifies them as "quota_exhausted" (rather than transient rate_limit).
Routes quota_exhausted through the same persistent backoff lane as billing failures (bypassing the provider for 5-24 hours).
Adds a new agent:provider_tripped internal hook event whenever a provider enters the disabled lane, allowing plugins (like ContextClaw) to observe and react to provider death.

Tested via local inspection; handles the Gemini 429 loops by correctly stepping back for the day.
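
To make the PR description concrete, here is a minimal sketch of a provider circuit breaker of the kind described. It is not the PR's actual code. The class name, method names, and hook payload shape are assumptions made for illustration. Only the quota_exhausted and billing reason labels, the agent:provider_tripped event name, and the 5-24 hour backoff range come from the description above.

```typescript
// Hypothetical payload shape; only the event name is taken from the PR text.
interface ProviderTrippedEvent {
  type: "agent:provider_tripped";
  provider: string;
  disabledUntil: number; // epoch milliseconds
}

type TrippedListener = (event: ProviderTrippedEvent) => void;

class ProviderBreaker {
  private backoffMs: number;
  private disabledUntil = new Map<string, number>();
  private listeners: TrippedListener[] = [];

  constructor(backoffMs: number) {
    this.backoffMs = backoffMs;
  }

  onTripped(listener: TrippedListener): void {
    this.listeners.push(listener);
  }

  // Only persistent failure classes trip the breaker; a transient
  // rate_limit is left to the normal retry path.
  recordFailure(provider: string, reason: string, now: number): void {
    if (reason !== "quota_exhausted" && reason !== "billing") return;
    const until = now + this.backoffMs;
    this.disabledUntil.set(provider, until);
    for (const listener of this.listeners) {
      listener({ type: "agent:provider_tripped", provider, disabledUntil: until });
    }
  }

  isAvailable(provider: string, now: number): boolean {
    const until = this.disabledUntil.get(provider);
    return until === undefined || now >= until;
  }
}
```

In this sketch a plugin would call onTripped once at startup and the failover code would call isAvailable before each attempt. The backoff is a single fixed value here, whereas the PR describes a 5-24 hour range.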

Changed files

apps/macos/Sources/OpenClawProtocol/GatewayModels.swift (modified, +22/-0)
apps/shared/OpenClawKit/Sources/OpenClawProtocol/GatewayModels.swift (modified, +22/-0)
dist/protocol.schema.json (added, +12738/-0)
scripts/e2e/lib/doctor-install-switch/scenario.sh (modified, +1/-1)
src/agents/auth-profiles/state-observation.ts (modified, +18/-1)
src/agents/auth-profiles/types.ts (modified, +1/-0)
src/agents/auth-profiles/usage.test.ts (modified, +18/-0)
src/agents/auth-profiles/usage.ts (modified, +14/-2)
src/agents/failover-error.test.ts (modified, +9/-3)
src/agents/failover-error.ts (modified, +1/-0)
src/agents/failover-policy.ts (modified, +1/-0)
src/agents/model-fallback.probe.test.ts (modified, +55/-1)
src/agents/model-fallback.ts (modified, +13/-0)
src/agents/pi-embedded-helpers.isbillingerrormessage.test.ts (modified, +7/-5)
src/agents/pi-embedded-helpers/errors.ts (modified, +4/-1)
src/agents/pi-embedded-helpers/types.ts (modified, +1/-0)
src/agents/runtime-plan/types.ts (modified, +1/-0)
src/hooks/internal-hooks.ts (modified, +25/-0)

Summary

Reproduction

Configure two or more Gemini models from the same billing account in the failover chain. Example:

model_stack:
  - google/gemini-3.1-pro-preview
  - openai/gpt-5.4
  - google/gemini-2.5-flash
  - google/gemini-2.5-pro

Trigger sustained traffic that pushes the Google billing account past its monthly spending cap.
Observe failover ladder behavior in the gateway journal.

Observed

Embedded agent failed before reply: All models failed (4):
  google/gemini-3.1-pro-preview: 429 RESOURCE_EXHAUSTED [monthly spending cap] (rate_limit) |
  openai/gpt-5.4:                <separate format error, see related issue> |
  google/gemini-2.5-flash:       429 RESOURCE_EXHAUSTED [monthly spending cap] (rate_limit) |
  google/gemini-2.5-pro:         429 RESOURCE_EXHAUSTED [monthly spending cap] (rate_limit)

Failover decisions for all three Gemini siblings carry reason=rate_limit even though the underlying error is RESOURCE_EXHAUSTED with a "monthly spending cap" message. That is a separate classification issue worth tracking on its own; see PR #74120 for in-flight billing classification work that targets LiteLLM-style budget errors but does not yet cover Google's native shape.
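
The classification gap can be illustrated with a small sketch. It is a hypothetical classifier, not OpenClaw's or PR #74120's implementation. The ProviderError shape, the function name, and the marker list are assumptions. The only string taken from the logs above is "monthly spending cap".

```typescript
// Hypothetical reason labels; "rate_limit" and "billing" mirror the wording used in this issue.
type FailoverReason = "rate_limit" | "billing" | "unknown";

// Assumed minimal error shape for illustration.
interface ProviderError {
  status: number;   // HTTP status, e.g. 429
  code?: string;    // provider status string, e.g. "RESOURCE_EXHAUSTED"
  message: string;  // human-readable error text
}

// Assumed marker phrases. "spending cap" matches the log lines above;
// a real classifier would need the provider's full set of messages.
const BILLING_MARKERS = ["spending cap"];

function classifyFailover(err: ProviderError): FailoverReason {
  const exhausted = err.status === 429 || err.code === "RESOURCE_EXHAUSTED";
  if (!exhausted) return "unknown";
  const text = err.message.toLowerCase();
  // A billing-account cap will not clear on a short retry, so it is
  // reported as billing rather than as a transient rate limit.
  if (BILLING_MARKERS.some((marker) => text.includes(marker))) return "billing";
  return "rate_limit";
}
```

With this split, the three Gemini lines in the observed output would carry a billing reason, which a failover policy can then treat as persistent.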

Expected

The failover ladder should be aware of billing-account scope: when one model on a billing account hits a cap, sibling models on the same billing account should be considered "same failure class" and skipped without contacting the API. Possible shapes:

A billingAccount field on each provider entry (default = inferred from provider+account-id), with a "skip same-billing-account siblings on RESOURCE_EXHAUSTED / billing-class errors" policy.
A simpler heuristic: when a model returns a billing-class error (after PR #74120 ships native Google classification), set a short-lived provider+account cooldown that suppresses sibling attempts for the rest of the run and a host-level cooldown of N minutes (overlaps PR #64127's circuit breaker work).
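
The first proposed shape can be sketched as follows. This is a minimal illustration under stated assumptions, not a patch against the real failover code. ModelEntry, AttemptResult, runLadder, and the account labels are hypothetical names, and the attempt callback is synchronous only to keep the sketch short.

```typescript
interface ModelEntry {
  id: string;             // e.g. "google/gemini-2.5-flash"
  billingAccount: string; // hypothetical field, as proposed in this issue
}

type AttemptResult =
  | { ok: true; reply: string }
  | { ok: false; reason: "billing" | "rate_limit" | "other" };

interface LadderOutcome {
  reply?: string;
  attempted: string[]; // models that were actually called
  skipped: string[];   // models skipped because their billing account is capped
}

function runLadder(
  stack: ModelEntry[],
  attempt: (model: ModelEntry) => AttemptResult,
): LadderOutcome {
  const cappedAccounts = new Set<string>();
  const outcome: LadderOutcome = { attempted: [], skipped: [] };
  for (const model of stack) {
    // Siblings on a capped account are skipped without contacting the API.
    if (cappedAccounts.has(model.billingAccount)) {
      outcome.skipped.push(model.id);
      continue;
    }
    outcome.attempted.push(model.id);
    const result = attempt(model);
    if (result.ok) {
      outcome.reply = result.reply;
      return outcome;
    }
    // Remember the account for the rest of this run on billing-class errors only.
    if (result.reason === "billing") cappedAccounts.add(model.billingAccount);
  }
  return outcome;
}
```

Applied to the model_stack from the reproduction, with all Google models sharing one account, this makes two API calls (google/gemini-3.1-pro-preview and openai/gpt-5.4) instead of four, and records the two remaining Gemini siblings as skipped.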

Why filing

This is a defect-class bug: the failover ladder makes attempts that are guaranteed to fail, and their only effects are journal noise and (transiently, until the cap kicks in) extra API hits. It is adjacent to but distinct from:

PR #64127 (provider circuit breaker for quota exhaustion) — host-level circuit breaker; would suppress runs but not the within-run sibling attempts.
PR #78086 (state-aware failover and lane suspension) — session-level suspension; same scope concern.
PR #74120 (classify budget-exceeded as billing) — fixes classification of LiteLLM proxy errors; doesn't address sibling scope.

None of these address the structural "siblings share billing account" insight, hence this issue.
