openclaw: [Bug]: Model fallback not triggered on HTTP 500 errors, leading to system unresponsiveness [1 comment, 2 participants]

GitHub stats
openclaw/openclaw#81266 · Fetched 2026-05-14 03:33:52
Comments: 1 · Participants: 2 · Timeline: 6 · Reactions: 2
Timeline (top): labeled ×2, closed ×1, commented ×1, mentioned ×1


Bug type

Crash (process/app exits or hangs)

Beta release blocker

No

Summary

When the primary model returned an HTTP 500 Internal Server Error, the system became completely unresponsive even though a model ladder (fallbacks) was configured. The gateway was expected to fail over to the secondary models (e.g., Ollama); instead, the request failed without returning any response.

Steps to reproduce

  1. Configure a model ladder with a primary provider (e.g., Google Gemini) and fallback providers (e.g., Ollama).
  2. Trigger a scenario where the primary provider returns an HTTP 500 Internal Server Error.
  3. Observe that the system does not switch to the fallback model and remains unresponsive.
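One way to force step 2 without waiting for a real outage is a local stub that always answers with HTTP 500. The sketch below is only an illustration: `failureBodies` and `createFailingProvider` are hypothetical names, the JSON and HTML payloads are made up rather than copied from any real provider, and how to point the primary provider at such a stub is not covered in this issue.

```typescript
import { createServer, Server } from "node:http";

// Two illustrative failure payloads: an error wrapped in a JSON body and a
// bare HTML error page. Both shapes are assumptions made for this sketch.
const failureBodies = {
  json: {
    contentType: "application/json",
    body: JSON.stringify({ error: { code: 500, message: "Internal error" } }),
  },
  html: {
    contentType: "text/html",
    body: "<html><body><h1>500 Internal Server Error</h1></body></html>",
  },
} as const;

// Builds (but does not start) an HTTP server that answers every request
// with status 500 and the chosen payload.
function createFailingProvider(kind: keyof typeof failureBodies): Server {
  const { contentType, body } = failureBodies[kind];
  return createServer((_req, res) => {
    res.writeHead(500, { "Content-Type": contentType });
    res.end(body);
  });
}
```

Calling `createFailingProvider("json").listen(8787)` (the port is arbitrary) starts the stub. Running the reproduction once per payload kind would show whether the body format changes the outcome.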

Expected behavior

Any HTTP 5xx error (specifically 500, 502, 504) should reliably trigger the classifyFailoverReason logic and initiate a switch to the next model in the fallback ladder to ensure high availability.
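The expected behavior can be sketched as a loop over the ladder. This is a minimal illustration under stated assumptions, not OpenClaw's implementation: `LadderRung`, `statusOf`, `isFailoverEligible`, and `runWithFallback` are hypothetical names, and errors are assumed to expose a numeric `status` property.

```typescript
// Hypothetical shape of one entry in the model ladder.
interface LadderRung {
  id: string;
  call: (prompt: string) => Promise<string>;
}

// Reads a numeric `status` property from an unknown error, if present.
function statusOf(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null && "status" in err) {
    const status = (err as { status: unknown }).status;
    return typeof status === "number" ? status : undefined;
  }
  return undefined;
}

// Assumption for this sketch: every 5xx status is eligible for failover.
function isFailoverEligible(err: unknown): boolean {
  const status = statusOf(err);
  return status !== undefined && status >= 500 && status <= 599;
}

// Tries each rung in order. Moves on only for failover-eligible errors and
// always ends by returning a result or throwing, so the caller never waits
// on a request that silently went nowhere.
async function runWithFallback(
  ladder: LadderRung[],
  prompt: string,
): Promise<{ model: string; output: string }> {
  let lastError: unknown = new Error("model ladder is empty");
  for (const rung of ladder) {
    try {
      return { model: rung.id, output: await rung.call(prompt) };
    } catch (err) {
      lastError = err;
      if (!isFailoverEligible(err)) throw err;
      console.warn(`model ${rung.id} failed with a 5xx status; trying next rung`);
    }
  }
  throw lastError;
}
```

When every rung fails, the last error is rethrown rather than swallowed, which is the property the report says was missing in practice.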

Actual behavior

The system became completely unresponsive when the primary model returned an HTTP 500 Internal Server Error, despite having a configured model ladder (fallbacks).

OpenClaw version

v2026.5.7

Operating system

Ubuntu 24.04

Install method

npm install -g openclaw@latest

Model

google/gemma-4-31b-it

Provider / routing chain

openclaw -> google

Additional provider/model setup details

Primary Model: google/gemma-4-31b-it
Fallbacks: ollama/gemma4:31b-cloud

Logs, screenshots, and evidence

Impact and severity

The system became completely unresponsive when the primary model returned an HTTP 500 Internal Server Error.

Additional information

Upon inspecting the compiled distribution code, it appears that the logic to handle 500 errors is already present but may not be executing as expected in certain scenarios (e.g., when the error is wrapped in a JSON body or returned as a non-standard HTML page).
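If the status code is buried in a JSON body or an HTML page, a status-based check only works after the code has been recovered from the error. The sketch below shows one way that could be done. It is an assumption-heavy illustration: `extractHttpStatus`, the property names it probes, and the text heuristic are all hypothetical, and nothing here is taken from OpenClaw's source.

```typescript
// Property names probed for a status code or for a nested error payload.
// These lists are guesses at common error shapes, not a known schema.
const STATUS_KEYS = ["status", "statusCode", "code"] as const;
const NESTED_KEYS = ["response", "error", "cause", "body", "message"] as const;

// Accepts numbers or numeric strings in the valid HTTP status range.
function asValidStatus(value: unknown): number | undefined {
  const n = typeof value === "string" ? Number(value) : value;
  return typeof n === "number" && Number.isInteger(n) && n >= 100 && n <= 599
    ? n
    : undefined;
}

// Tries to read the text as JSON first; otherwise falls back to a heuristic
// scan for a standalone 5xx number, which can produce false positives.
function statusFromText(text: string, depth: number): number | undefined {
  try {
    const parsed: unknown = JSON.parse(text);
    const fromJson = extractHttpStatus(parsed, depth + 1);
    if (fromJson !== undefined) return fromJson;
  } catch {
    // Not JSON: fall through to the plain-text heuristic.
  }
  const match = /\b(5\d{2})\b/.exec(text);
  return match ? Number(match[1]) : undefined;
}

// Walks an unknown error value, with a depth limit to guard against cycles.
function extractHttpStatus(err: unknown, depth = 0): number | undefined {
  if (depth > 4) return undefined;
  if (typeof err === "string") return statusFromText(err, depth);
  if (typeof err !== "object" || err === null) return undefined;
  const record = err as Record<string, unknown>;
  for (const key of STATUS_KEYS) {
    const status = asValidStatus(record[key]);
    if (status !== undefined) return status;
  }
  for (const key of NESTED_KEYS) {
    const nested = extractHttpStatus(record[key], depth + 1);
    if (nested !== undefined) return nested;
  }
  return undefined;
}
```

A helper like this would sit in front of the status comparison, so that wrapped and unwrapped errors reach the same classification branch.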

In dist/errors-BqFqz2qx.js (which corresponds to src/errors.ts), the following logic exists:

if (status === 500 || status === 502 || status === 504) return toReasonClassification("timeout");

This indicates that HTTP 500, 502, and 504 errors are intended to be classified as timeout, which should normally trigger the failover mechanism. However, in practice, this classification is either not being reached or the subsequent failover process is failing silently.
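To tell the two hypotheses apart (classification never reached versus failover failing afterwards), it may help to record every classification decision. The following sketch assumes a simplified classifier. `classifyWithTrace`, `ClassificationTrace`, and the `Reason` type are hypothetical stand-ins for the real `classifyFailoverReason` and `toReasonClassification`, whose signatures are not shown in the issue. The sketch also widens the check to the whole 5xx range, since the quoted line names only 500, 502, and 504 and the issue does not show whether other 5xx codes are handled elsewhere.

```typescript
// Hypothetical stand-in for the real classification result.
type Reason = "timeout";

interface ClassificationTrace {
  status: number | undefined;
  reason: Reason | null;
  note: string;
}

// Classifies a status code and reports the decision through `log`, so a
// missing or unexpected status leaves a visible record instead of nothing.
function classifyWithTrace(
  status: number | undefined,
  log: (trace: ClassificationTrace) => void = (trace) =>
    console.debug(JSON.stringify(trace)),
): Reason | null {
  let reason: Reason | null = null;
  let note = "status is not failover-eligible";
  if (status === undefined) {
    note = "no HTTP status found on error; status-based classification skipped";
  } else if (status >= 500 && status <= 599) {
    reason = "timeout";
    note = "5xx status treated as timeout";
  }
  log({ status, reason, note });
  return reason;
}
```

If the trace shows `reason: "timeout"` for the failing request and the fallback model is still never called, the problem lies after classification. If no trace entry appears at all, the classifier is not being reached.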

