openclaw: fix(model-fallback): distinguish empty_response / no_error_details / unclassified instead of collapsing to "unknown"

GitHub stats
openclaw/openclaw#71922 · Fetched 2026-04-27 05:37:24
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0


fix(model-fallback): distinguish empty_response / no_error_details / unclassified instead of collapsing to "unknown"

Summary

Currently, three distinct failure scenarios all produce reason: "unknown" in model fallback events. Each has enough information to be labelled more specifically, but the label is discarded before surfacing to the user.

The three collapsed paths (in errors-CJULmF31.js / classifyProviderRuntimeFailureKind)

1. Empty response — no message, no HTTP status

if (!message && typeof status !== "number") return "unknown";

The error object arrived with no body and no status code. This typically means the provider socket closed unexpectedly (Ollama worker OOM, process crash, CUDA error). Suggested reason: "empty_response"

2. Provider explicitly returned no details

if (isExactUnknownNoDetailsError(raw)) return toReasonClassification("unknown");
// isExactUnknownNoDetailsError matches: "unknown error (no error details in response)"

The provider returned a well-formed response, but the error payload was explicitly empty. This differs from a socket crash: the request completed, but the response carried no useful error. Suggested reason: "no_error_details"

3. Catchall — fell through all classifiers

return "unknown"; // end of classifyProviderRuntimeFailureKind

The error message didn't match any known pattern (rate_limit, timeout, overload, billing, auth, context overflow, DNS, etc.). The raw message exists but no classifier claimed it. Suggested reason: "unclassified"

Why this matters

  • Debugging: reason: unknown gives no actionable signal. reason: empty_response immediately points to a local model crash or network drop. reason: unclassified tells you to look at errorPreview for the raw text.
  • Retry logic: shouldAllowCooldownProbeForReason and shouldUseTransientCooldownProbeSlot already treat "unknown" as transient — splitting it lets each sub-type get the right cooldown policy independently.
  • Observability: errorPreview is sometimes empty when reason is unknown (especially for empty_response). Users see a fallback notice with no reason and no preview — zero debug signal.
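The retry-logic point can be read as a per-reason policy table. The sketch below is only an illustration of that idea: `COOLDOWN_POLICY`, `cooldownPolicyFor`, and the millisecond values are hypothetical and not taken from the openclaw codebase, and the real shouldAllowCooldownProbeForReason / shouldUseTransientCooldownProbeSlot may be structured quite differently.

```javascript
// Hypothetical per-reason cooldown policies (illustration only).
// The durations are placeholders, not values from openclaw.
const COOLDOWN_POLICY = new Map([
  ["empty_response", { transient: true, cooldownMs: 5000 }],
  ["no_error_details", { transient: true, cooldownMs: 30000 }],
  ["unclassified", { transient: true, cooldownMs: 30000 }],
  ["unknown", { transient: true, cooldownMs: 30000 }], // legacy label
]);

// Look up the policy for a reason. Reasons absent from this sketch's table
// fall back to a non-transient default.
function cooldownPolicyFor(reason) {
  return COOLDOWN_POLICY.get(reason) ?? { transient: false, cooldownMs: 0 };
}
```

With a table like this, a local worker crash (empty_response) could be probed again sooner than an unrecognised provider error without touching the other sub-types.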

Suggested fix

// Path 1: no message, no status
if (!message && typeof status !== "number") return "empty_response";

// Path 2: explicit no-details response  
if (isExactUnknownNoDetailsError(raw)) return toReasonClassification("no_error_details");

// Path 3: catchall — at minimum preserve the raw error in errorPreview
return "unclassified";

For path 3, even if the reason label stays "unknown", ensuring errorPreview is always populated with the raw message would be a meaningful improvement on its own.
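A self-contained sketch of the three paths, returning the reason together with a preview, might look like the following. It is written under several assumptions: `classifyFallbackFailure`, `MAX_PREVIEW_CHARS`, the `{ message, status }` input shape, and the placeholder preview text are all hypothetical, and the `isExactUnknownNoDetailsError` below is a stand-in that only matches the string quoted in this issue, not the real helper.

```javascript
// Stand-in for the real helper; it only matches the string quoted in the issue.
function isExactUnknownNoDetailsError(raw) {
  return String(raw ?? "").trim().toLowerCase() === "unknown error (no error details in response)";
}

// Hypothetical cap on the preview length.
const MAX_PREVIEW_CHARS = 200;

// Returns the reason label plus a non-empty errorPreview for every path.
function classifyFallbackFailure(err) {
  const message = typeof err?.message === "string" ? err.message.trim() : "";
  const status = err?.status;

  // Path 1: no message and no HTTP status (e.g. the socket closed).
  if (!message && typeof status !== "number") {
    return { reason: "empty_response", errorPreview: "(no message, no status)" };
  }

  // Path 2: the provider explicitly reported that it has no details.
  if (isExactUnknownNoDetailsError(message)) {
    return { reason: "no_error_details", errorPreview: message };
  }

  // Known-pattern classifiers (rate_limit, timeout, ...) would run here.

  // Path 3: nothing claimed the error, so keep the raw text for debugging.
  const preview = (message || `HTTP ${status}`).slice(0, MAX_PREVIEW_CHARS);
  return { reason: "unclassified", errorPreview: preview };
}
```

Returning an object rather than a bare string keeps the preview attached to the label, so a fallback event never surfaces with neither.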

Affected surfaces

  • Model fallback notice shown to users: ⚠️ Model Fallback: X (selected Y; unknown)
  • cron runs output: "reason": "unknown"
  • Gateway log: model_fallback_decision events with reason: "unknown"
  • Cooldown probe logic in shouldAllowCooldownProbeForReason / shouldUseTransientCooldownProbeSlot
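For the first surface in the list above, the user-facing notice could carry both the more specific label and the preview. The function below is a hypothetical sketch: `formatFallbackNotice` and its parameter names are assumptions, and the output format simply follows the notice quoted in this issue, with X read as the fallback model and Y as the originally selected one.

```javascript
// Hypothetical formatter; the layout mirrors the notice quoted in the issue.
function formatFallbackNotice({ fallbackModel, selectedModel, reason, errorPreview }) {
  // Fall back to the legacy label if no reason was supplied.
  const label = reason || "unknown";
  const base = `⚠️ Model Fallback: ${fallbackModel} (selected ${selectedModel}; ${label})`;
  // Append the raw error text only when there is something to show.
  return errorPreview ? `${base}: ${errorPreview}` : base;
}

console.log(
  formatFallbackNotice({
    fallbackModel: "X",
    selectedModel: "Y",
    reason: "unclassified",
    errorPreview: "model runner exited unexpectedly",
  })
);
```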

Notes

  • "unknown" is currently in TRANSIENT_FALLBACK_REASONS — the three new labels should inherit the same transient treatment unless there's a reason to differentiate
  • This came up debugging Ollama (granite4:3b-h) worker crashes where the socket closes with no body — currently indistinguishable from a malformed provider response or a genuine catchall

extent analysis

TL;DR

Update the classifyProviderRuntimeFailureKind function to return more specific reason labels ("empty_response", "no_error_details", and "unclassified") instead of collapsing to "unknown".

Guidance

  • Review the errors-CJULmF31.js file and update the classifyProviderRuntimeFailureKind function to include the suggested reason labels.
  • Verify that the errorPreview field is populated with the raw error message, especially for the "unclassified" reason.
  • Test the updated function with different error scenarios to ensure the correct reason labels are returned.
  • Consider updating the TRANSIENT_FALLBACK_REASONS list to include the new reason labels, unless there's a reason to differentiate their transient treatment.
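The testing step above could start from a small table of cases, one per path. In this sketch the input shapes, the status codes, the "unrecognised" message text, and the names `CASES` and `failedCases` are all assumptions; the checker takes the classifier as an argument so it can be pointed at the real function.

```javascript
// Hypothetical test cases, one per collapsed path.
const CASES = [
  { name: "socket closed, nothing returned", input: {}, expected: "empty_response" },
  {
    name: "explicit no-details message",
    input: { message: "unknown error (no error details in response)", status: 500 },
    expected: "no_error_details",
  },
  {
    name: "message no classifier recognises",
    input: { message: "model runner exited unexpectedly", status: 500 },
    expected: "unclassified",
  },
];

// Runs every case against `classify` (a function returning a reason string)
// and returns the names of the cases whose result did not match.
function failedCases(classify) {
  return CASES.filter(({ input, expected }) => classify(input) !== expected).map(({ name }) => name);
}

// A classifier that still collapses everything to "unknown" fails all three.
const legacyClassifier = () => "unknown";
console.log(failedCases(legacyClassifier).length); // 3
```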

Example

if (!message && typeof status !== "number") return "empty_response";
if (isExactUnknownNoDetailsError(raw)) return toReasonClassification("no_error_details");
return "unclassified";

Notes

The suggested fix assumes that the isExactUnknownNoDetailsError function is correctly implemented and that the toReasonClassification function is available. Additionally, the update may require changes to downstream logic that relies on the "unknown" reason label.
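If some downstream consumers cannot be updated at the same time, one possible stopgap is to map the new labels back to "unknown" at the boundary of those consumers. This is only a sketch of that idea; `UNKNOWN_SUBTYPES` and `toLegacyReason` are hypothetical names, not part of openclaw.

```javascript
// The three labels proposed in this issue, all of which used to be "unknown".
const UNKNOWN_SUBTYPES = new Set(["empty_response", "no_error_details", "unclassified"]);

// Hypothetical compatibility shim: consumers that only understand the old
// label see "unknown"; every other reason passes through unchanged.
function toLegacyReason(reason) {
  return UNKNOWN_SUBTYPES.has(reason) ? "unknown" : reason;
}
```

Logs and user-facing notices would keep the specific label, while older comparisons against "unknown" continue to match.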

Recommendation

Apply the suggested fix to update the classifyProviderRuntimeFailureKind function, as it provides more specific and actionable reason labels for debugging and retry logic.
