openclaw: [Bug] FailoverError: CLI timeout after 600s with no output, failover never triggers, TUI/webchat unresponsive

GitHub stats
openclaw/openclaw#59024 · Fetched 2026-04-08 02:29:46
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0

When the primary model fails or hangs, the agent produces FailoverError: CLI produced no output for 600s and was terminated instead of failing over to the configured fallback model. The session goes into error state and becomes completely unresponsive. The user cannot communicate with any model through TUI or webchat.

Error Message

run error: FailoverError: CLI produced no output for 600s and was terminated.
connected | error

Expected behavior

If the primary model produces no output for an extended period, the failover model should be tried automatically and transparently. The session should recover, not error out.
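The expected behavior can be illustrated with a minimal sketch. It is not openclaw's implementation: `run_with_failover`, `call_model`, and `fake_call` are hypothetical names introduced here, and the short timeout stands in for the 600s limit. The idea is only that a timeout on one model moves on to the next model in the chain, and an error is raised only when every model has failed.

```python
import concurrent.futures as cf
import time
from typing import Callable, Sequence


class AllModelsFailed(Exception):
    """Raised only after every model in the chain has failed or timed out."""


def run_with_failover(
    models: Sequence[str],
    call_model: Callable[[str, str], str],  # hypothetical: (model_name, prompt) -> reply
    prompt: str,
    timeout_s: float,
) -> tuple[str, str]:
    """Try each model in order; return (model_name, reply) from the first that answers in time."""
    errors: list[str] = []
    for name in models:
        pool = cf.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(call_model, name, prompt)
        try:
            return name, future.result(timeout=timeout_s)
        except cf.TimeoutError:
            # Timed out: record it and fall through to the next model.
            errors.append(f"{name}: no reply within {timeout_s}s")
        except Exception as exc:
            # Hard failure: also fall through instead of ending the session.
            errors.append(f"{name}: {exc!r}")
        finally:
            # Do not block on a hung call. A thread cannot be killed, so a real
            # agent would terminate the underlying CLI process instead.
            pool.shutdown(wait=False, cancel_futures=True)
    raise AllModelsFailed("; ".join(errors))


def fake_call(name: str, prompt: str) -> str:
    """Stand-in for a model call: the primary hangs, the fallback answers."""
    if name == "minimax/MiniMax-M2.5":
        time.sleep(1.0)  # simulates the unreachable primary
        return "too late"
    return f"{name} answered: {prompt}"
```

Calling `run_with_failover(["minimax/MiniMax-M2.5", "kimi-coding/k2p5"], fake_call, "ping", timeout_s=0.1)` returns the fallback's reply after the primary times out, which is the transparent recovery the issue asks for.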

Actual behavior

  • Session enters error state
  • No failover to configured fallback model
  • TUI and webchat both unresponsive
  • User must manually exit, start new session, and reconfigure model
  • With 5+ broken crons also firing, the error compounds across all sessions simultaneously
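The compounding effect across cron sessions suggests one possible mitigation: share the knowledge that a model is down, so that only the first session pays the full timeout. The sketch below is a simple circuit breaker under that assumption. `ModelCircuitBreaker` and its methods are hypothetical names, and nothing in the issue indicates that openclaw has or lacks such a component.

```python
import threading
import time
from typing import Callable, Optional, Sequence


class ModelCircuitBreaker:
    """Remembers recent failures per model so other sessions can skip a dead model."""

    def __init__(self, cooldown_s: float, clock: Callable[[], float] = time.monotonic):
        self._cooldown_s = cooldown_s
        self._clock = clock
        self._lock = threading.Lock()  # sessions may report failures concurrently
        self._skip_until: dict[str, float] = {}

    def record_failure(self, model: str) -> None:
        """Mark the model as unavailable for the cooldown period."""
        with self._lock:
            self._skip_until[model] = self._clock() + self._cooldown_s

    def record_success(self, model: str) -> None:
        """Clear any cooldown once the model answers again."""
        with self._lock:
            self._skip_until.pop(model, None)

    def is_available(self, model: str) -> bool:
        with self._lock:
            return self._clock() >= self._skip_until.get(model, float("-inf"))

    def pick(self, models: Sequence[str]) -> Optional[str]:
        """Return the first model in the chain that is not cooling down, if any."""
        for model in models:
            if self.is_available(model):
                return model
        return None
```

With a breaker like this, a session that times out on the primary calls `record_failure`, and every other session that calls `pick` during the cooldown goes straight to the fallback instead of waiting out its own timeout.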

Context

This is part of a cascade of failures introduced by 2026.4.1. MiniMax (primary model) became unreachable after the update. Despite a fallback model being configured, no failover occurred. The 600s timeout means users wait 10 minutes before discovering the session is dead.

Related issues from 2026.4.1

#57437, #58881, #58885, #59003, #59006, #59008, #59010, #59014, #59017, #59018

Environment

  • Version: 2026.4.1
  • Platform: macOS
  • Primary model: minimax/MiniMax-M2.5 (unreachable)
  • Fallback: kimi-coding/k2p5
  • Sessions affected: webchat and TUI

extent analysis

TL;DR

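After the 2026.4.1 update the primary model, minimax/MiniMax-M2.5, became unreachable, and the agent waited out the full 600s no-output timeout instead of switching to the configured fallback, kimi-coding/k2p5. Until the failover path itself is fixed, the available mitigations are to shorten the timeout (if it is configurable) and to investigate why the primary model is unreachable.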

Guidance

  • Investigate why the primary model, minimax/MiniMax-M2.5, became unreachable after the 2026.4.1 update. This outage is what triggers the problem; the missing failover is a separate defect that turns the outage into a dead session.
  • Review the failover configuration to confirm that the fallback model, kimi-coding/k2p5, is set up to take over when the primary model times out or fails.
  • If the 600s timeout is configurable, consider lowering it temporarily so that users wait less before a session is declared dead and any failover attempt starts sooner.
  • Examine the related issues (#57437, #58881, #58885, #59003, #59006, #59008, #59010, #59014, #59017, #59018) for any insights or fixes that might apply to this problem.
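The error text suggests a watchdog that kills the CLI after a period with no output. The following is a minimal sketch of such a watchdog with a configurable idle limit, so that a shorter value can be tried. It is an illustration, not openclaw's code: `run_with_idle_timeout` and `NoOutputTimeout` are hypothetical names, and "idle" here means that no complete line has arrived on stdout or stderr.

```python
import queue
import subprocess
import sys
import threading
from typing import Optional


class NoOutputTimeout(Exception):
    """The child process produced no output line within the idle limit."""


def run_with_idle_timeout(cmd: list[str], idle_timeout_s: float) -> str:
    """Run cmd and return its combined output; kill it if it stays silent too long."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    lines: "queue.Queue[Optional[str]]" = queue.Queue()

    def pump() -> None:
        # Forward each output line to the queue; None marks end of stream.
        assert proc.stdout is not None
        for line in proc.stdout:
            lines.put(line)
        lines.put(None)

    threading.Thread(target=pump, daemon=True).start()

    collected: list[str] = []
    while True:
        try:
            # The timer restarts on every line, so only silence triggers it.
            item = lines.get(timeout=idle_timeout_s)
        except queue.Empty:
            proc.kill()
            proc.wait()
            raise NoOutputTimeout(f"no output for {idle_timeout_s}s")
        if item is None:
            proc.wait()
            return "".join(collected)
        collected.append(item)
```

A caller that catches `NoOutputTimeout` can then move on to the fallback model instead of putting the session into an error state. Because the timer resets on each line, a slow but active process is not killed; only a silent one is.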

Example

The issue includes neither the configuration nor the relevant code, so no project-specific snippet is given here. The model's connection settings and the failover logic are the places to start.

Notes

The exact solution depends on the specifics of the model integration and the failover mechanism's implementation, which are not fully detailed in the issue. The primary model's unreachability after the update suggests a compatibility or configuration issue that needs addressing.

Recommendation

The issue does not mention a release that fixes the problem, so treat the following as a workaround: lower the timeout where possible and investigate why the primary model is unreachable.
