openclaw - 💡(How to fix) Fix [Bug]: Model fallback logs reason=unknown for OpenAI-compatible providers, silently escalates to primary model [1 comments, 1 participants]

GitHub stats
openclaw/openclaw#59250 · Fetched 2026-04-08 02:26:55
Comments: 1 · Participants: 1 · Timeline events: 3 (closed ×1, commented ×1, locked ×1) · Reactions: 0


Bug Description

When using OpenAI-compatible providers, sub-agent model requests that fail are logged with reason=unknown in fallback decisions. The actual upstream error is never surfaced, making debugging impossible.

Worse, the fallback chain escalates from cheap models (Haiku/Sonnet) to the expensive primary model (Opus), which defeats the purpose of using lighter models for sub-agents.

Reproduction

  1. Configure an OpenAI-compatible provider with multiple Anthropic models routed through it
  2. Spawn sub-agents with a non-primary model like anthropic-/anthropic--claude-4.5-haiku
  3. If the proxy returns any non-200 response, observe fallback behavior

Observed Behavior

Gateway error log shows:

[model-fallback/decision] model fallback decision: decision=candidate_failed
  requested=anthropic-/anthropic--claude-4.5-haiku
  candidate=anthropic-/anthropic--claude-4.5-haiku
  reason=unknown
  next=anthropic-/anthropic--claude-4.6-opus
  • 11 occurrences in 2 days (7× Haiku→Opus, 4× Sonnet→Opus)
  • The reason=unknown indicates the error classifier cannot parse the upstream error format
  • The actual error (likely a non-standard HTTP error response from the proxy) is swallowed entirely
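
To reproduce a tally like the one above from a gateway log, a small script can group the fallback decisions by candidate, next model, and reason. This is a minimal sketch, assuming the log layout matches the excerpt (a `[model-fallback/decision]` header line followed by indented `key=value` lines); `parse_decisions` and `tally_fallbacks` are hypothetical helper names, not part of openclaw.

```python
import re
from collections import Counter

# Matches key=value fields such as "candidate=..." or "reason=unknown".
FIELD_RE = re.compile(r"(\w+)=(\S+)")


def parse_decisions(log_text):
    """Return one dict of fields per fallback decision entry in the log text."""
    entries = []
    current = None
    for line in log_text.splitlines():
        if "[model-fallback/decision]" in line:
            # Header line starts a new entry; it may already carry fields.
            current = dict(FIELD_RE.findall(line))
            entries.append(current)
        elif current is not None and line[:1].isspace():
            # Indented continuation line (assumed layout) extends the entry.
            current.update(FIELD_RE.findall(line))
        else:
            # Any other line ends the current entry.
            current = None
    return entries


def tally_fallbacks(log_text):
    """Count (candidate, next, reason) triples for failed candidates."""
    return Counter(
        (e.get("candidate"), e.get("next"), e.get("reason"))
        for e in parse_decisions(log_text)
        if e.get("decision") == "candidate_failed"
    )


if __name__ == "__main__":
    sample = (
        "[model-fallback/decision] model fallback decision: decision=candidate_failed\n"
        "  requested=anthropic-/anthropic--claude-4.5-haiku\n"
        "  candidate=anthropic-/anthropic--claude-4.5-haiku\n"
        "  reason=unknown\n"
        "  next=anthropic-/anthropic--claude-4.6-opus\n"
    )
    for (candidate, nxt, reason), n in tally_fallbacks(sample).items():
        print(f"{n}x {candidate} -> {nxt} (reason={reason})")
```

If the real gateway writes each decision on a single line, the same regular expression still picks up the fields from the header line.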

Expected Behavior

  1. The upstream error response body/status should be logged (at least at debug level) so users can diagnose proxy issues
  2. reason=unknown should include the raw HTTP status code at minimum
  3. Fallback should prefer models of similar tier first (Haiku→Sonnet→Opus) rather than jumping directly to the most expensive model
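
Points 1 and 2 could look roughly like the following: a classifier that still returns a known reason when it recognizes the status, but otherwise embeds the raw status and a trimmed body in the `unknown` label. This is a sketch under stated assumptions, not openclaw's actual classifier; `UpstreamFailure`, `classify_failure`, and the reason labels in `KNOWN_REASONS` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UpstreamFailure:
    status: Optional[int]  # HTTP status returned by the proxy, if any
    body: str              # raw response body, which may not be JSON


# Hypothetical reason labels; the project's real classifier may use others.
KNOWN_REASONS = {401: "auth", 403: "auth", 408: "timeout", 429: "rate_limit"}


def classify_failure(failure: UpstreamFailure, max_body: int = 200) -> str:
    """Return a reason string that never hides the upstream status."""
    if failure.status in KNOWN_REASONS:
        return KNOWN_REASONS[failure.status]
    status = failure.status if failure.status is not None else "none"
    # Collapse whitespace and truncate so the log line stays readable.
    snippet = " ".join(failure.body.split())[:max_body]
    return f"unknown(status={status}, body={snippet!r})"


if __name__ == "__main__":
    print(classify_failure(UpstreamFailure(429, "")))
    print(classify_failure(UpstreamFailure(502, "<html>Bad Gateway</html>")))
```

Truncating the body keeps log lines bounded; the full body could still be written at debug level, as the first point asks.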

Additional Context

Related but distinct from #59213 (which covers infinite 429 loops from session model reconciliation). This issue is about the error classification gap and upward-escalation fallback direction for OpenAI-compatible provider errors.

extent analysis

TL;DR

Modify the error logging and classification to include the raw HTTP status code and response body for upstream errors, and adjust the fallback chain to prefer models of similar tier.

Guidance

  • Review the error classifier to handle non-standard HTTP error responses from the proxy and log the raw HTTP status code and response body at least at debug level.
  • Update the fallback logic to prefer models of similar tier (e.g., Haiku→Sonnet→Opus) instead of escalating directly to the most expensive model.
  • Verify that the modified logging and fallback behavior work as expected by reproducing the error scenario and checking the logs and fallback chain.
  • Consider adding logging or monitoring that detects and helps diagnose the proxy issues behind the upstream errors.
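
The tier-preference guidance above can be sketched as a sort over the fallback candidates. This is a minimal illustration, assuming tiers can be inferred from a substring of the model id; `TIER_RANK`, `tier_of`, and `order_fallbacks` are hypothetical names, and a real implementation would more likely read tiers from provider configuration.

```python
# Hypothetical tier ranks (lower = cheaper), keyed by a substring of the model id.
TIER_RANK = {"haiku": 0, "sonnet": 1, "opus": 2}


def tier_of(model_id):
    """Return the tier rank for a model id; unrecognized models rank last."""
    for name, rank in TIER_RANK.items():
        if name in model_id:
            return rank
    return len(TIER_RANK)


def order_fallbacks(requested, candidates):
    """Order fallback candidates by closeness to the requested model's tier."""
    base = tier_of(requested)
    others = [m for m in candidates if m != requested]
    # Nearest tier first; on a tie, prefer the cheaper model.
    return sorted(others, key=lambda m: (abs(tier_of(m) - base), tier_of(m)))


if __name__ == "__main__":
    haiku = "anthropic-/anthropic--claude-4.5-haiku"
    opus = "anthropic-/anthropic--claude-4.6-opus"
    sonnet = "example-sonnet-model"  # placeholder id, not taken from the issue
    print(order_fallbacks(haiku, [opus, sonnet, haiku]))
```

With this ordering a failed Haiku request would try the Sonnet-tier model before Opus, regardless of the order in which the candidates were configured.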

Example

# Pseudo-code example of modified error logging (Python-style; assumes the
# upstream response object exposes status_code and text)
import logging

logger = logging.getLogger("model-fallback")

def log_error(upstream_response):
    # Log the raw HTTP status code and response body of the upstream error
    logger.debug("Upstream error: %s %s", upstream_response.status_code, upstream_response.text)

Notes

The exact implementation details may vary depending on the specific programming language and framework used. The example provided is a pseudo-code illustration of the modified error logging.

Recommendation

Apply workaround: Modify the error logging and classification to include the raw HTTP status code and response body, and adjust the fallback chain to prefer models of similar tier. This will allow for better debugging and diagnosis of proxy issues and prevent unnecessary escalation to expensive models.
