openclaw - 💡(How to fix) Fix model.fallbacks chain does not engage on reasoning-only / empty-visible-reply failures

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

The gateway's reasoning-model retry path correctly classifies an empty-visible / reasoning-only turn (see dist/result-fallback-classifier-CrVa7J1V.js returning code: "reasoning_only_result" / "empty_result"). That classifier is also correctly passed as the classifyResult callback into runWithModelFallback (see dist/agent-command-CYAAR35V.js ~line 711). So at the architectural level, the wiring exists for reasoning-only outcomes to traverse the configured model.fallbacks chain.

But in practice, on our host, reasoning-only failures never traverse the fallback chain. The primary model retries twice with visible-answer continuation instructions, then surfaces incomplete-turn error to the user. The fallback chain (pace-openrouter/anthropic/claude-haiku-4-5, pace-openrouter/google/gemini-2.5-pro) is never attempted. We have ~24 hours of reasoning-only retries exhausted log entries for one provider with the same outcome at the top of every hour: error surfaced, no fallback step recorded.

Our hypothesis (without runtime trace logs) is that one of the early guards in classifyEmbeddedPiRunResultForModelFallback short-circuits to null on this code path — most likely hasOutboundDeliveryEvidence(params.result) returning true based on side-channel evidence (e.g., a typing indicator or a non-content delivery event) even though no real visible reply landed. We can't pinpoint exactly which guard tripped without source-side logging.

Error Message

But in practice, on our host, reasoning-only failures never traverse the fallback chain. The primary model retries twice with visible-answer continuation instructions, then surfaces incomplete-turn error to the user. The fallback chain (pace-openrouter/anthropic/claude-haiku-4-5, pace-openrouter/google/gemini-2.5-pro) is never attempted. We have ~24 hours of reasoning-only retries exhausted log entries for one provider with the same outcome at the top of every hour: error surfaced, no fallback step recorded. 5. Actual: runWithModelFallback surfaces the incomplete-turn error and exits without trying any fallback model. No model.fallback_step event is emitted. 2026-05-15T16:34:46 [agent/embedded] reasoning-only retries exhausted: runId=b1a75342... provider=openai-codex/gpt-5.4 attempts=2/2 — surfacing incomplete-turn error

  1. Audit the line 35-39 guards in classifyEmbeddedPiRunResultForModelFallback for whether they fire spuriously when the only payload in the result is the synthetic STRICT_AGENTIC_BLOCKED_TEXT error payload returned from the exhausted-retries branch in pi-embedded. In particular, hasOutboundDeliveryEvidence and hasDirectlySentBlockReply are good first suspects — both are concepts that an exhausted-retries run could plausibly trip without actually having delivered a real reply.

Root Cause

The gateway's reasoning-model retry path correctly classifies an empty-visible / reasoning-only turn (see dist/result-fallback-classifier-CrVa7J1V.js returning code: "reasoning_only_result" / "empty_result"). That classifier is also correctly passed as the classifyResult callback into runWithModelFallback (see dist/agent-command-CYAAR35V.js ~line 711). So at the architectural level, the wiring exists for reasoning-only outcomes to traverse the configured model.fallbacks chain.

But in practice, on our host, reasoning-only failures never traverse the fallback chain. The primary model retries twice with visible-answer continuation instructions, then surfaces incomplete-turn error to the user. The fallback chain (pace-openrouter/anthropic/claude-haiku-4-5, pace-openrouter/google/gemini-2.5-pro) is never attempted. We have ~24 hours of reasoning-only retries exhausted log entries for one provider with the same outcome at the top of every hour: error surfaced, no fallback step recorded.

Our hypothesis (without runtime trace logs) is that one of the early guards in classifyEmbeddedPiRunResultForModelFallback short-circuits to null on this code path — most likely hasOutboundDeliveryEvidence(params.result) returning true based on side-channel evidence (e.g., a typing indicator or a non-content delivery event) even though no real visible reply landed. We can't pinpoint exactly which guard tripped without source-side logging.

Fix Action

Fix / Workaround

We don't know enough about your guard semantics to propose a one-line patch, but a couple of directions:

Our Workaround

  • KI-2026-05-15-01 in our local known-issues registry (~/.openclaw/knowledge-base/processes/known-issues.md).
  • Related to the same architectural constraint as KI-2026-05-14-02 (minified npm dist; userspace workarounds are durable but the upstream patch is the real cure).

Code Example

2026-05-15T16:34:39 [agent/embedded] reasoning-only assistant turn detected: runId=b1a75342... provider=openai-codex/gpt-5.4 — retrying 1/2 with visible-answer continuation
2026-05-15T16:34:43 [agent/embedded] reasoning-only assistant turn detected: runId=b1a75342... provider=openai-codex/gpt-5.4 — retrying 2/2 with visible-answer continuation
2026-05-15T16:34:46 [agent/embedded] reasoning-only retries exhausted: runId=b1a75342... provider=openai-codex/gpt-5.4 attempts=2/2 — surfacing incomplete-turn error

---

[agent/embedded] fallback-classifier: provider=A/M result=null reason=hasOutboundDeliveryEvidence
[agent/embedded] fallback-classifier: provider=A/M result=reasoning_only_result -> attempting fallback A2/M2
RAW_BUFFERClick to expand / collapse

OpenClaw version observed: First observed on 2026.5.7; re-verified still present in 2026.5.18 (installed at ~/.npm-global/lib/node_modules/openclaw). On 2026.5.18 the symbols are at dist/result-fallback-classifier-Czg3Eu15.js (classifyEmbeddedPiRunResultForModelFallback), dist/agent-command-CbVp9Pf5.js (runWithModelFallback wire-in), and dist/pi-embedded-ydXicp1O.js (reasoningOnlyRetriesExhausted). Filed by: SKP Team (Shawn Powers, skpteam.casa) Date observed: First reasoning-only retries-exhausted in our gateway.err.log at 2026-05-04T06:03:51-05:00 (provider pace-openrouter/google/gemini-2.5-flash); continued daily; full agent outage 2026-05-14 18:00 → 2026-05-15 17:24 CT (~24h) on openai-codex/gpt-5.4. Severity (from user perspective): High — the agent's configured model.fallbacks chain exists exactly so the agent stays available when the primary fails. When the primary fails by emitting reasoning-only or empty-visible turns, the fallback chain is silently skipped and the user-visible agent is unreachable until a human notices.


Summary

The gateway's reasoning-model retry path correctly classifies an empty-visible / reasoning-only turn (see dist/result-fallback-classifier-CrVa7J1V.js returning code: "reasoning_only_result" / "empty_result"). That classifier is also correctly passed as the classifyResult callback into runWithModelFallback (see dist/agent-command-CYAAR35V.js ~line 711). So at the architectural level, the wiring exists for reasoning-only outcomes to traverse the configured model.fallbacks chain.

But in practice, on our host, reasoning-only failures never traverse the fallback chain. The primary model retries twice with visible-answer continuation instructions, then surfaces incomplete-turn error to the user. The fallback chain (pace-openrouter/anthropic/claude-haiku-4-5, pace-openrouter/google/gemini-2.5-pro) is never attempted. We have ~24 hours of reasoning-only retries exhausted log entries for one provider with the same outcome at the top of every hour: error surfaced, no fallback step recorded.

Our hypothesis (without runtime trace logs) is that one of the early guards in classifyEmbeddedPiRunResultForModelFallback short-circuits to null on this code path — most likely hasOutboundDeliveryEvidence(params.result) returning true based on side-channel evidence (e.g., a typing indicator or a non-content delivery event) even though no real visible reply landed. We can't pinpoint exactly which guard tripped without source-side logging.

Reproduction

Easily observed against any reasoning model that occasionally emits a hidden-reasoning-only response. Our crossing of the failure was:

  1. Configure an agent with a reasoning model as model.primary and a non-reasoning model as the first entry of model.fallbacks.
  2. Send a prompt that the primary handles by producing reasoning content but no visible content (which the gateway then classifies internally as "reasoning-only" and retries with nextReasoningOnlyRetryInstruction).
  3. Continue until both retry attempts also return reasoning-only.
  4. Expected: After reasoningOnlyRetriesExhausted fires, the run result is classified as reasoning_only_result, runWithModelFallback advances to the next provider/model in model.fallbacks, and the agent's reply is produced by the fallback.
  5. Actual: runWithModelFallback surfaces the incomplete-turn error and exits without trying any fallback model. No model.fallback_step event is emitted.

Two independent providers exhibited this on our host:

  • pace-openrouter/google/gemini-2.5-flash (first observed 2026-05-04, sporadic)
  • openai-codex/gpt-5.4 via the built-in openai-codex auth provider (continuous from 2026-05-14 18:00 to 2026-05-15 17:24 CT — every hourly agent turn for ~24h)

In both cases, the agent (Pace) was effectively unreachable until we hand-swapped model.primary to a non-reasoning provider. The fallback chain in model.fallbacks was configured the whole time and never engaged.

Affected Source

Grep across ~/.npm-global/lib/node_modules/openclaw/dist/:

  • dist/pi-embedded-Bcz04p2i.js ~line 3159 — increments reasoningOnlyRetryAttempts, retries with nextReasoningOnlyRetryInstruction.
  • Same file ~line 3165 — sets reasoningOnlyRetriesExhausted = true when reasoningOnlyRetryAttempts >= maxReasoningOnlyRetryAttempts.
  • Same file ~line 3223 — on reasoningOnlyRetriesExhausted && !finalAssistantVisibleText, returns a result with a single {isError: true, text: STRICT_AGENTIC_BLOCKED_TEXT} payload, marking the attempt terminal.
  • dist/result-fallback-classifier-CrVa7J1V.js line 33 (classifyEmbeddedPiRunResultForModelFallback):
    • Line 35-39: early returns null if params.result.meta.aborted, hasDirectlySentBlockReply, hasBlockReplyPipelineOutput, or hasVisibleAgentPayload(..., includeReasoningPayloads:false).
    • Line 39: also early returns null if hasOutboundDeliveryEvidence(params.result) is true.
    • Line 40-45: harness classification path returns reasoning_only_result / empty_result correctly when reached.
  • dist/agent-command-CYAAR35V.js line 711 — wires the above classifier into runWithModelFallback({ classifyResult: ... }).

The wiring is correct. The behavior we observe is consistent with one of the guards at lines 35-39 returning truthy when it shouldn't, given a reasoning-only-exhausted run.

Sample Log Lines

From our ~/.openclaw/logs/gateway.err.log:

2026-05-15T16:34:39 [agent/embedded] reasoning-only assistant turn detected: runId=b1a75342... provider=openai-codex/gpt-5.4 — retrying 1/2 with visible-answer continuation
2026-05-15T16:34:43 [agent/embedded] reasoning-only assistant turn detected: runId=b1a75342... provider=openai-codex/gpt-5.4 — retrying 2/2 with visible-answer continuation
2026-05-15T16:34:46 [agent/embedded] reasoning-only retries exhausted: runId=b1a75342... provider=openai-codex/gpt-5.4 attempts=2/2 — surfacing incomplete-turn error

Same shape, every hour, for ~24 hours. No model.fallback_step event in the transcript or trajectory recorder for any of these runs.

What Helpful Trace Output Would Look Like

To self-diagnose without source modifications, it would help if the gateway emitted (at debug level) the actual return value of classifyEmbeddedPiRunResultForModelFallback per fallback evaluation, including which guard short-circuited it to null. Right now the run just ends and there's no way to tell why the chain didn't advance.

Ideally:

[agent/embedded] fallback-classifier: provider=A/M result=null reason=hasOutboundDeliveryEvidence
[agent/embedded] fallback-classifier: provider=A/M result=reasoning_only_result -> attempting fallback A2/M2

Proposed Fix Direction

We don't know enough about your guard semantics to propose a one-line patch, but a couple of directions:

  1. Audit the line 35-39 guards in classifyEmbeddedPiRunResultForModelFallback for whether they fire spuriously when the only payload in the result is the synthetic STRICT_AGENTIC_BLOCKED_TEXT error payload returned from the exhausted-retries branch in pi-embedded. In particular, hasOutboundDeliveryEvidence and hasDirectlySentBlockReply are good first suspects — both are concepts that an exhausted-retries run could plausibly trip without actually having delivered a real reply.

  2. Or, on the call-site side in pi-embedded-Bcz04p2i.js ~line 3223, when returning the exhausted-retries result, ensure meta.agentHarnessResultClassification is explicitly set to "reasoning-only" and that no spurious outbound-evidence / block-reply flags are set on the result. The classifier's harness-classification path (line 14-31) would then catch this case unambiguously and return reasoning_only_result, which runWithModelFallback should honor.

  3. Add the debug trace described above so operators can self-diagnose future fallback-not-firing situations.

Our Workaround

  1. Hand-swap model.primary in openclaw.json to a non-reasoning provider, kickstart the gateway. (We're currently running Pace on pace-openrouter/anthropic/claude-sonnet-4-6 with pace-openrouter/google/gemini-2.5-pro as the fallback.)
  2. Added a userspace monitor that tails gateway.err.log for reasoning-only retries exhausted / incomplete turn detected.*payloads=0 / context overflow detected.*attempt [2-9], extracts the provider, and posts a 60-min-cooldown alert to a designated Discord channel. We were running blind on this failure mode for ~24 hours before noticing via downstream symptoms; the monitor surfaces it within minutes now.

We'd switch back to relying on model.fallbacks as soon as the fallback path engages on reasoning-only / empty-visible failures.

Cross-Reference

  • KI-2026-05-15-01 in our local known-issues registry (~/.openclaw/knowledge-base/processes/known-issues.md).
  • Related to the same architectural constraint as KI-2026-05-14-02 (minified npm dist; userspace workarounds are durable but the upstream patch is the real cure).

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - 💡(How to fix) Fix model.fallbacks chain does not engage on reasoning-only / empty-visible-reply failures