openclaw - 💡(How to fix) Fix [Bug]: AbortController error escapes runWithModelFallback — 429 abort silently drops session without fallback [1 comments, 2 participants]

GitHub stats
openclaw/openclaw#75856 (fetched 2026-05-02 05:28:52)
Comments: 1 · Participants: 2 · Timeline: 2 · Reactions: 2
Timeline (top): closed ×1, commented ×1

Summary

When a provider returns a 429 and the embedded runner's AbortController fires shortly after (observed: ~31s), the resulting AbortError propagates to the outer catch block outside runWithModelFallback. The session terminates with openclaw:prompt-error "This operation was aborted" and the fallback chain is never entered.

Version

openclaw 2026.4.26 (be8c246)

Observed behavior

  1. Cerebras returns HTTP 429 at T+12s.
  2. shouldBypassLongSdkRetry() returns true (bare 429, no Retry-After header) — x-should-retry: false.
  3. At T+43s (31s later), openclaw:prompt-error "This operation was aborted" is appended to the session JSONL.
  4. Session terminates. No failover to any configured fallback provider. No model_change event. No auth-state update.

From selection-D9uTvvsw.js (line 7622):

promptError = err; promptErrorSource = "prompt"  // set in catch block OUTSIDE runWithModelFallback

The AbortError is caught after runWithModelFallback has already returned control — the fallback wrapper never sees the error.
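
One way an error can bypass a fallback wrapper is for the wrapper to resolve before the failing work actually runs. The sketch below is a hypothetical model of that shape, not openclaw's code: `runWithFallback`, `makeAbortError`, and `demo` are illustrative names, and the deferred `consume` step is an assumption about how the abort could end up outside the wrapper.

```javascript
// Hypothetical model of the control flow; these names are not from openclaw.
function makeAbortError() {
  const err = new Error("This operation was aborted");
  err.name = "AbortError";
  return err;
}

// A fallback wrapper can only react to errors thrown while it is still awaiting.
async function runWithFallback(candidates, attempt) {
  let lastError;
  for (const candidate of candidates) {
    try {
      return await attempt(candidate);
    } catch (err) {
      lastError = err; // seen by the wrapper, so the next candidate is tried
    }
  }
  throw lastError;
}

async function demo() {
  const tried = [];
  let promptError = null;
  try {
    // The attempt resolves with a deferred step, so the wrapper returns
    // before any failure has happened.
    const consume = await runWithFallback(["primary", "fallback"], async (candidate) => {
      tried.push(candidate);
      return async () => {
        throw makeAbortError();
      };
    });
    await consume(); // rejects after the wrapper has already returned
  } catch (err) {
    promptError = err; // outer catch: the wrapper never sees this error
  }
  return { tried, promptError };
}

demo().then(({ tried, promptError }) => {
  console.log(tried, promptError.name);
});
```

In this model only `"primary"` is ever tried, and the AbortError lands in the outer catch, which matches the symptom described in the report.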

auth-state.json evidence

After the failed session, auth-state.json shows:

"cerebras:default": {
  "errorCount": 0,
  "lastFailureAt": <previous date, unchanged>
}

The failure accounting layer is never reached. errorCount stays 0.
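
To check whether the accounting layer ran for a given session, the profile's entry can be compared before and after the failure. This is a minimal sketch, assuming snapshots shaped like the excerpt above (`errorCount`, `lastFailureAt`); `failureWasRecorded` is a hypothetical helper, not part of openclaw, and the date is a placeholder value.

```javascript
// Hypothetical helper: compares two auth-state snapshots for one profile.
function failureWasRecorded(before, after, profileId) {
  const empty = { errorCount: 0, lastFailureAt: null };
  const prev = before[profileId] ?? empty;
  const next = after[profileId] ?? empty;
  // Accounting ran if the counter grew or the failure timestamp moved.
  return next.errorCount > prev.errorCount || next.lastFailureAt !== prev.lastFailureAt;
}

// Placeholder date; the report only says the value was unchanged.
const snapshotBefore = { "cerebras:default": { errorCount: 0, lastFailureAt: "2026-04-01T00:00:00.000Z" } };
const snapshotAfter = { "cerebras:default": { errorCount: 0, lastFailureAt: "2026-04-01T00:00:00.000Z" } };

console.log(failureWasRecorded(snapshotBefore, snapshotAfter, "cerebras:default")); // false
```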

Config

"auth": {
  "order": { "cerebras": ["cerebras:default", "cerebras:key2", "cerebras:key3"] },
  "cooldowns": { "rateLimitedProfileRotations": 0 }
},
"agents": {
  "defaults": {
    "model": {
      "primary": "cerebras/qwen-3-235b-a22b-instruct-2507",
      "fallbacks": ["mistral/mistral-large-latest", "groq/llama-3.3-70b-versatile", "google/gemini-2.5-flash"]
    }
  }
}

Expected behavior

An AbortError triggered by a 429 → retry-bypass → abort sequence should:

  1. Update auth-state (increment errorCount, set lastFailureAt)
  2. Trigger the model fallback chain (model_change event, route to next provider)
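
The two expectations above can be sketched as a wrapper that awaits the whole attempt inside its `try`, records the failure, and then moves to the next candidate. Everything here is hypothetical and not taken from openclaw: the function names, the `context.rateLimited` flag an attempt would set when it sees a 429, and the shape of the `model_change` event are all assumptions made for illustration.

```javascript
// Hypothetical sketch; none of these names or shapes come from openclaw.
function recordFailure(authState, profileId, now = new Date()) {
  const entry = authState[profileId] ?? { errorCount: 0, lastFailureAt: null };
  authState[profileId] = {
    ...entry,
    errorCount: entry.errorCount + 1,
    lastFailureAt: now.toISOString(),
  };
}

async function runWithFallbackAndAccounting(candidates, attempt, authState, events) {
  let lastError;
  for (let i = 0; i < candidates.length; i++) {
    const { model, profileId } = candidates[i];
    const context = { rateLimited: false }; // the attempt sets this when it sees a 429
    try {
      // The full attempt is awaited here, so a late abort is still caught below.
      return await attempt(model, context);
    } catch (err) {
      // An abort with no preceding 429 is treated as a deliberate cancel and rethrown.
      if (err?.name === "AbortError" && !context.rateLimited) throw err;
      lastError = err;
      recordFailure(authState, profileId);
      const next = candidates[i + 1];
      if (next) events.push({ type: "model_change", from: model, to: next.model });
    }
  }
  throw lastError;
}
```

Separating a rate-limit abort from a user cancel is a design choice in this sketch; the report does not say how openclaw tells the two apart.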

Workaround

Swap primary/fallback order so the stable provider is primary. The abort escapes regardless of rateLimitedProfileRotations setting.
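
As a sketch of the swap, the helper below promotes one of the configured fallbacks to primary and moves the old primary to the front of the fallback list. `promoteToPrimary` is a hypothetical helper, and choosing `mistral/mistral-large-latest` as the stable provider is an assumption made only for the example; the report does not say which provider is stable.

```javascript
// Hypothetical helper: returns a new model config with `stableModel` as primary.
function promoteToPrimary(modelConfig, stableModel) {
  if (!modelConfig.fallbacks.includes(stableModel)) {
    throw new Error(`${stableModel} is not in the fallback list`);
  }
  return {
    primary: stableModel,
    // The old primary goes first; the remaining fallbacks keep their order.
    fallbacks: [modelConfig.primary, ...modelConfig.fallbacks.filter((m) => m !== stableModel)],
  };
}

const currentModelConfig = {
  primary: "cerebras/qwen-3-235b-a22b-instruct-2507",
  fallbacks: ["mistral/mistral-large-latest", "groq/llama-3.3-70b-versatile", "google/gemini-2.5-flash"],
};

// Assumption for illustration: Mistral is the stable provider.
const swappedModelConfig = promoteToPrimary(currentModelConfig, "mistral/mistral-large-latest");
console.log(JSON.stringify(swappedModelConfig, null, 2));
```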

Related

  • shouldBypassLongSdkRetry() in transport-stream-shared-B2gu_mC8.js — returns true immediately for bare 429 (no Retry-After)
  • runWithModelFallback wrapper in selection-D9uTvvsw.js — the abort escapes this wrapper
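
For readers unfamiliar with the term, a "bare 429" here means a 429 status with no Retry-After header. The check below illustrates only that condition; `isBare429` is a hypothetical name, and this is not the body of shouldBypassLongSdkRetry().

```javascript
// Hypothetical illustration of the "bare 429" condition described in the report.
// `headers` is a plain object of response headers; names are matched case-insensitively.
function isBare429(status, headers) {
  const names = Object.keys(headers).map((name) => name.toLowerCase());
  return status === 429 && !names.includes("retry-after");
}

console.log(isBare429(429, {}));                      // true
console.log(isBare429(429, { "Retry-After": "30" })); // false
```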

extent analysis

TL;DR

An AbortError that follows a 429 response is caught outside the runWithModelFallback wrapper, so the session terminates without entering the fallback chain and without updating auth-state.

Guidance

  • The report attributes the failure to where the AbortError is caught (the outer catch block, outside runWithModelFallback). The ~31s gap between the 429 and the abort is an observation, not an established cause.
  • To verify, compare the timestamps of the 429 and the openclaw:prompt-error entry in the session JSONL, then confirm that no model_change event was written and that errorCount in auth-state.json is unchanged.
  • Changing shouldBypassLongSdkRetry() to return false for bare 429 responses would alter retry behavior rather than where the abort is caught. The report gives no evidence that this alone would route the error into the fallback chain.
  • Review the runWithModelFallback wrapper so that an abort following a 429 is caught inside it, updates auth-state, and advances to the next configured provider.

Example

No code snippet is provided as the issue is more related to the logic and timing of the AbortController and the runWithModelFallback wrapper.

Notes

The workaround of swapping the primary and fallback order only changes which provider is tried first; it does not change how the abort is handled, so the same failure can recur whenever the primary hits this sequence. A robust fix has to address the root cause, which is the AbortError being caught outside the runWithModelFallback wrapper.

Recommendation

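Apply the workaround as a stopgap: make the stable provider primary so that the rate-limited provider is no longer the first one tried. This does not make the fallback chain run on an AbortError; per the report, the abort escapes regardless of the rateLimitedProfileRotations setting. It is a temporary measure until the abort is handled inside runWithModelFallback.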
