openclaw - ✅(Solved) Fix [Bug]: TUI stops displaying output after fallback succeeds [4 pull requests, 1 comment, 2 participants]

jssda · 2026-04-02T09:00:32Z

When configured with fallback and multiple API key rotation, TUI stops displaying output after receiving an error event, even when the background model fallback is still executing and eventually succeeds. User has to manually refresh the page to see the result.



GitHub stats

openclaw/openclaw#59570 • Fetched 2026-04-08 02:43:05


Author

jssda

Participants

joelnishanth

jssda

Timeline (top)

cross-referenced ×4, labeled ×2, referenced ×2, closed ×1

Error Message

When configured with fallback and multiple API key rotation, the TUI stops displaying output after receiving an error event, even when the background model fallback is still executing and eventually succeeds. The user has to manually refresh the page to see the result.

Observed in the TUI: run error: API rate limit reached

The TUI shows the error and immediately stops output. Gateway logs show that the fallback actually succeeded in the background, but the TUI does not display the result. The user can only see the latest execution result by clicking refresh on the web page.

Routing chain: openclaw -> custom provider -> minimax2.5 (error) -> custom provider -> qwen3-235b

When the TUI receives an error event, it calls terminateRun() -> streamAssembler.drop(runId), which drops the run's state:

if (evt.state === "error") {
  terminateRun({ status: "error" });  // ← drops state
}

Suggested fix: modify the TUI so it does not immediately terminate the run on error. Instead, mark the run as "waiting for fallback" and continue displaying output when the fallback succeeds.

Root Cause

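This is a behavior bug, not a crash. When the TUI receives an error event, it calls terminateRun() -> streamAssembler.drop(runId), which discards all state for that run. Even though model fallback continues executing in the background, the TUI no longer displays output, so the user must manually refresh to see results.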

Fix Action

Fixed

Fixed by PR: fix(tui): keep active run alive across fallback error events (https://github.com/openclaw/openclaw/pull/59582)
Fixed by PR: fix(tui): preserve pending sends and busy-state visibility (https://github.com/openclaw/openclaw/pull/59800)
Fixed by PR: fix(tui): preserve fallback display after error events (https://github.com/openclaw/openclaw/pull/59819)
Fixed by PR: fix: don't broadcast state:error on per-attempt lifecycle errors (https://github.com/openclaw/openclaw/pull/60043)

PR fix notes

PR #59582: fix(tui): keep active run alive across fallback error events

Repository: openclaw/openclaw
Author: andyliu
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/59582

Description (problem / solution / changelog)

Summary

keep the active TUI run open when a chat error event arrives mid-run
continue rendering the eventual final output when fallback succeeds
add a regression test covering error-then-final fallback behavior

Problem

The TUI currently terminates the active run when a chat error event arrives, which drops stream state before model fallback finishes. When fallback succeeds later, the final output is no longer displayed.

Fix

only terminate non-active errored runs
keep the active run state intact and show a waiting-for-fallback system note
preserve the later event so the fallback result can render normally
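
The three fix bullets above can be sketched as a small state machine. This is a minimal illustration, not the code in `src/tui/tui-event-handlers.ts`: `RunTracker`, `ChatEvent`, and `transcript` are hypothetical names, and the event shape (`runId`, `state`) is assumed from the snippets quoted in the issue.

```typescript
// Hypothetical sketch; none of these names come from the OpenClaw source.
type ChatEvent =
  | { runId: string; state: "delta"; text: string }
  | { runId: string; state: "final"; text: string }
  | { runId: string; state: "error"; message: string };

class RunTracker {
  activeRunId: string | null = null;
  readonly transcript: string[] = []; // lines the TUI would render
  private readonly streams = new Map<string, string>(); // partial text per run

  start(runId: string): void {
    this.activeRunId = runId;
    this.streams.set(runId, "");
  }

  handle(evt: ChatEvent): void {
    switch (evt.state) {
      case "delta":
        this.streams.set(evt.runId, (this.streams.get(evt.runId) ?? "") + evt.text);
        break;
      case "error":
        if (evt.runId === this.activeRunId) {
          // Active run: keep its state and tell the user a fallback may follow.
          this.transcript.push(`[system] ${evt.message}; waiting for fallback...`);
        } else {
          // Non-active run: nothing else will arrive for it, so drop it.
          this.streams.delete(evt.runId);
        }
        break;
      case "final":
        // The later final event still renders because the run was never dropped.
        this.transcript.push(`[assistant] ${evt.text}`);
        this.streams.delete(evt.runId);
        if (evt.runId === this.activeRunId) this.activeRunId = null;
        break;
    }
  }
}

// Error-then-final sequence, as in the regression test the PR describes.
const tracker = new RunTracker();
tracker.start("run-1");
tracker.handle({ runId: "run-1", state: "error", message: "API rate limit reached" });
tracker.handle({ runId: "run-1", state: "final", text: "answer from fallback model" });
```

After the two events, `tracker.transcript` holds the system note followed by the fallback answer, and the run is closed only by the final event.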

Testing

> openclaw@2026.3.24 test /Users/plaud/workspace/openclaw-59570
> node scripts/test-parallel.mjs -- src/tui/tui-event-handlers.test.ts

[test-parallel] start unit workers=2 filters=1

RUN v4.1.0 /Users/plaud/workspace/openclaw-59570

Test Files 1 passed (1)
Tests 22 passed (22)
Start at 17:25:40
Duration 2.29s (transform 786ms, setup 2.22s, import 5ms, tests 8ms, environment 0ms)

[test-parallel] done unit code=0 elapsed=2.8s

Closes #59570

Changed files

src/tui/tui-event-handlers.test.ts (modified, +35/-0)
src/tui/tui-event-handlers.ts (modified, +9/-3)

PR #59800: fix(tui): preserve pending sends and busy-state visibility

Repository: openclaw/openclaw
Author: vincentkoc
State: closed | merged: True
Link: https://github.com/openclaw/openclaw/pull/59800

Description (problem / solution / changelog)

Summary

Problem: the TUI could lose track of optimistic local sends during history reload/reconnect paths, show confusing busy/error state during fallback/terminal-error transitions, and waste horizontal width on long links and paths.
Why it matters: users could see prompts disappear and later reappear, get stuck in unclear run state, and struggle to read or copy long terminal output.
What changed: pending local user turns are preserved and reconciled through transcript rebuilds, active-run/error cleanup is more coherent, Esc/editor handling is covered more directly, and chat rendering reclaims width for long links and paths.
What did NOT change (scope boundary): this PR does not add full Pi-style runtime-owned steer/follow-up queues or a new pending queue panel; it stays focused on stabilizing the existing TUI state model.

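Change Type (select all)

- [x] Bug fix
- [ ] Feature
- [ ] Refactor required for the fix
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra

Scope (select all touched areas)

- [ ] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [ ] Integrations
- [ ] API / contracts
- [x] UI / DX
- [ ] CI/CD / infra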

Linked Issue/PR

Related #59014
Related #59627
Related #59570
Related #55300
[x] This PR fixes a bug or regression

Root Cause / Regression History (if applicable)

Root cause: the TUI relied on optimistic local-send state that was not durably reconciled with history rebuilds and run lifecycle transitions, so reload/error paths could desynchronize visible user turns from real run state.
Missing detection / guardrail: there was no focused coverage for pending-send reconciliation across history rebuilds and not enough direct tests around the TUI error/final cleanup paths.
Prior context (git blame, prior PR, issue, or refactor if known): issue reports in #59014, #59627, #59570, and #55300 all point at state-coherence failures in the TUI.
Why this regressed now: the optimistic-send path assumed fast run attribution and simple finalization, which breaks under reconnect/history replay and fallback/error timing.
If unknown, what was ruled out: not just markdown rendering; the disappearing-send behavior came from TUI transcript/state handling.

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- [x] Unit test
- [ ] Seam / integration test
- [ ] End-to-end test
- [ ] Existing coverage already sufficient
Target test or file: src/tui/components/chat-log.test.ts, src/tui/tui-command-handlers.test.ts, src/tui/tui-event-handlers.test.ts, src/tui/tui-session-actions.test.ts, src/tui/tui.test.ts, src/tui/components/custom-editor.test.ts
Scenario the test should lock in: pending local sends survive history rebuilds until a matching run is anchored or dropped, run/error cleanup returns the TUI to a coherent state, and the editor key handling stays stable.
Why this is the smallest reliable guardrail: these are TUI-local state-machine bugs, so focused unit coverage hits the failure path directly without network flake.
Existing test that already covers this (if any): N/A
If no new test is added, why not: N/A

User-visible / Behavior Changes

Pending local sends no longer disappear during history reload/reconnect windows.
Busy/error state is less likely to get stuck or look idle at the wrong time.
Long links and paths get more usable width in chat rendering.
TUI editor key handling now has direct regression coverage.

Diagram (if applicable)

Before:
[user send] -> [optimistic local state] -> [history reload or error transition] -> [message/status can disappear or desync]

After:
[user send] -> [tracked pending local state] -> [history rebuild reconciles it] -> [run anchors or drops explicitly] -> [status stays coherent]
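
The "After" flow can be sketched roughly as below. This is an illustration under assumptions, not the contents of `src/tui/components/pending-messages.ts`: `PendingSends`, `HistoryTurn`, and matching a pending send to history by `runId` are all hypothetical choices made for the example.

```typescript
// Hypothetical sketch; types and method names are assumptions for illustration.
interface HistoryTurn { role: "user" | "assistant"; text: string; runId?: string }
interface PendingSend { localId: string; text: string; runId?: string }

class PendingSends {
  private readonly pending = new Map<string, PendingSend>();

  add(localId: string, text: string): void {
    this.pending.set(localId, { localId, text });
  }

  // Anchor a pending send to the run the gateway attributed to it.
  anchor(localId: string, runId: string): void {
    const entry = this.pending.get(localId);
    if (entry) entry.runId = runId;
  }

  // Explicitly drop a send that will never be delivered.
  drop(localId: string): void {
    this.pending.delete(localId);
  }

  // Rebuild the visible transcript: server history first, then any pending
  // sends that the history does not contain yet.
  rebuild(turns: HistoryTurn[]): string[] {
    const lines = turns.map((t) => `${t.role}: ${t.text}`);
    this.pending.forEach((send, id) => {
      const inHistory = turns.some(
        (t) => t.role === "user" && send.runId !== undefined && t.runId === send.runId,
      );
      if (inHistory) {
        this.pending.delete(id); // reconciled; the history copy wins
      } else {
        lines.push(`user (pending): ${send.text}`);
      }
    });
    return lines;
  }
}

const sends = new PendingSends();
sends.add("local-1", "hello");
// A history reload arrives before the run is attributed: the send stays visible.
const beforeAnchor = sends.rebuild([]);
sends.anchor("local-1", "run-1");
// Once history contains the anchored run, the pending copy is retired.
const afterAnchor = sends.rebuild([{ role: "user", text: "hello", runId: "run-1" }]);
```

The point of the design is that a history rebuild never silently discards a local send: it is either matched to a run, or removed by an explicit `drop`.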

Security Impact (required)

New permissions/capabilities? (Yes/No): No
Secrets/tokens handling changed? (Yes/No): No
New/changed network calls? (Yes/No): No
Command/tool execution surface changed? (Yes/No): No
Data access scope changed? (Yes/No): No
If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

OS: macOS
Runtime/container: Node 22 / pnpm worktree
Model/provider: N/A
Integration/channel (if any): TUI + gateway chat client
Relevant config (redacted): default TUI config

Steps

Send a local prompt, then trigger a session reload/reconnect or history rebuild path.
Exercise error/final cleanup paths for the active run.
Render long links/paths in the TUI transcript.

Expected

Pending local sends remain visible and reconcile cleanly.
Busy/error state returns to a sensible status.
Long terminal-style text keeps more horizontal width.

Actual

Before this change, pending sends could disappear and later reappear, busy/error handling could become misleading, and transcript padding wasted horizontal space.

Evidence

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Human Verification (required)

Verified scenarios: focused TUI tests passed, pnpm build passed, and the rebased branch was run locally for interactive TUI validation.
Edge cases checked: pending-send reconciliation on history rebuild, no-active-run Esc/abort handling, and error/final cleanup paths.
What you did not verify: full Pi-style queued steer/follow-up runtime semantics; that remains follow-up work.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

Backward compatible? (Yes/No): Yes
Config/env changes? (Yes/No): No
Migration needed? (Yes/No): No
If yes, exact upgrade steps: N/A

Risks and Mitigations

Risk: TUI state handling remains easy to regress because optimistic sends, history replay, and run lifecycle events are loosely coupled.
- Mitigation: added focused tests around chat-log reconciliation, editor handling, and event/session cleanup.
Risk: full Pi-style queue/runtime semantics are still not present.
- Mitigation: call that scope boundary out explicitly rather than implying this PR solves the larger parity effort.

Changed files

CHANGELOG.md (modified, +1/-0)
src/tui/components/assistant-message.ts (modified, +1/-1)
src/tui/components/chat-log.test.ts (modified, +57/-0)
src/tui/components/chat-log.ts (modified, +104/-1)
src/tui/components/custom-editor.test.ts (added, +32/-0)
src/tui/components/custom-editor.ts (modified, +5/-0)
src/tui/components/markdown-message.ts (modified, +5/-4)
src/tui/components/pending-messages.test.ts (added, +25/-0)
src/tui/components/pending-messages.ts (added, +35/-0)
src/tui/tui-types.ts (modified, +9/-0)

PR #59819: fix(tui): preserve fallback display after error events

Repository: openclaw/openclaw
Author: joelnishanth
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/59819

Description (problem / solution / changelog)

Summary

Fix: TUI/webchat stops displaying output after a model fallback error event, even when the fallback succeeds in the background (fixes #59570)
Root cause: The handleChatEvent error handler called terminateRun() + maybeRefreshHistoryForRun() which dropped stream assembler state, forgot the local-run flag, and triggered a disruptive loadHistory() call -- preventing fallback delta/final events from rendering
Change: Soft-terminate on error events: drop stale stream assembler state and clear the active-run claim so fallback or new-run events can re-activate the run, but keep the runId in sessionRuns and skip the history refresh

Detailed Analysis

When the gateway retries a model request via runWithModelFallback, it reuses the same runId. The embedded Pi runner emits a lifecycle "error" event when the primary model fails (e.g., 429 rate limit), which the gateway broadcasts as a state: "error" chat event to the TUI.
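
To make the event ordering concrete, here is a minimal sketch of a fallback loop that reuses one runId across attempts. It is not the real `runWithModelFallback`; `runWithFallback`, `LifecycleEvent`, and the synchronous `ModelCall` type are hypothetical simplifications.

```typescript
// Hypothetical sketch; this is not the real runWithModelFallback.
type LifecycleEvent =
  | { runId: string; phase: "error"; model: string; reason: string }
  | { runId: string; phase: "end"; model: string; text: string };

type ModelCall = (prompt: string) => string; // throws on failure

function runWithFallback(
  runId: string,
  prompt: string,
  models: Array<{ name: string; call: ModelCall }>,
  emit: (evt: LifecycleEvent) => void,
): string {
  let lastError: unknown = new Error("no models configured");
  for (const model of models) {
    try {
      const text = model.call(prompt);
      emit({ runId, phase: "end", model: model.name, text });
      return text;
    } catch (err) {
      lastError = err;
      const reason = err instanceof Error ? err.message : String(err);
      // Per-attempt failure: same runId, and more attempts may still follow.
      emit({ runId, phase: "error", model: model.name, reason });
    }
  }
  throw lastError; // every attempt failed
}

const lifecycleEvents: LifecycleEvent[] = [];
const fallbackResult = runWithFallback(
  "run-1",
  "hi",
  [
    { name: "primary", call: () => { throw new Error("429 rate limit"); } },
    { name: "fallback", call: (p) => `echo: ${p}` },
  ],
  (evt) => lifecycleEvents.push(evt),
);
```

A consumer of `lifecycleEvents` sees an error followed by a success under the same runId, which is exactly the sequence that a handler treating every error as terminal gets wrong.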

Previously, the TUI error handler:

Called terminateRun() which dropped stream assembler state, removed the runId from sessionRuns, and cleared activeChatRunId
Called maybeRefreshHistoryForRun() which called forgetLocalRunId() (breaking local-run tracking) and triggered loadHistory() (replacing the chat display with error-state history)

When the fallback model succeeded and emitted new delta/final events with the same runId, the display flow was disrupted.

Now the error handler:

Drops stream assembler state (the primary attempt partial text is irrelevant to the fallback)
Clears activeChatRunId (so fallback events can claim it)
Keeps the runId in sessionRuns (preserving agent event routing)
Skips maybeRefreshHistoryForRun (no disruptive history reload, no local-run flag loss)
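
The four points above can be sketched as follows. The field names `activeChatRunId` and `sessionRuns` echo the PR text, but the class, the event shape, and all logic here are assumptions for illustration and do not reproduce `tui-event-handlers.ts`.

```typescript
// Hypothetical sketch of the soft-terminate idea described above.
type TuiChatEvent =
  | { runId: string; state: "delta"; text: string }
  | { runId: string; state: "final"; text: string }
  | { runId: string; state: "error"; message: string };

class TuiRunState {
  activeChatRunId: string | null = null;
  readonly sessionRuns = new Set<string>();
  readonly rendered: string[] = [];
  private readonly partial = new Map<string, string>();

  register(runId: string): void {
    this.sessionRuns.add(runId);
    this.activeChatRunId = runId;
  }

  streamingText(runId: string): string {
    return this.partial.get(runId) ?? "";
  }

  handleChatEvent(evt: TuiChatEvent): void {
    if (!this.sessionRuns.has(evt.runId)) return; // not one of this session's runs
    if (evt.state === "error") {
      this.partial.delete(evt.runId); // the failed attempt's partial text is stale
      this.activeChatRunId = null;    // let a fallback attempt re-claim the run
      this.rendered.push(`[error] ${evt.message}`);
      return; // runId stays in sessionRuns, and no history reload is triggered
    }
    if (this.activeChatRunId === null) this.activeChatRunId = evt.runId; // re-claim
    if (evt.state === "delta") {
      this.partial.set(evt.runId, this.streamingText(evt.runId) + evt.text);
      return;
    }
    // final: render the result and fully retire the run
    this.rendered.push(`[assistant] ${evt.text}`);
    this.partial.delete(evt.runId);
    this.sessionRuns.delete(evt.runId);
    this.activeChatRunId = null;
  }
}

const tui = new TuiRunState();
tui.register("run-1");
tui.handleChatEvent({ runId: "run-1", state: "delta", text: "partial from primary" });
tui.handleChatEvent({ runId: "run-1", state: "error", message: "API rate limit reached" });
const textAfterError = tui.streamingText("run-1"); // stale partial was dropped
tui.handleChatEvent({ runId: "run-1", state: "delta", text: "Hello" });
tui.handleChatEvent({ runId: "run-1", state: "final", text: "Hello" });
```

Compared with keeping the run fully active, this variant discards the primary attempt's partial text, so the fallback output does not get appended to it.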

Test Plan

New test: "displays fallback output after a chat error event" -- verifies error then delta then final sequence renders correctly
New test: "does not reload history when error event fires for a local run" -- verifies local-run flag is preserved and history is not triggered
All 17 TUI test files pass (159 tests)
pnpm check passes

Joel Nishanth | offlyn.AI

Changed files

src/tui/tui-event-handlers.test.ts (modified, +73/-0)
src/tui/tui-event-handlers.ts (modified, +11/-2)

PR #60043: fix: don't broadcast state:error on per-attempt lifecycle errors

Repository: openclaw/openclaw
Author: jwchmodx
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/60043

Description (problem / solution / changelog)

Summary

When the embedded PI model fails (stopReason === "error"), handleAgentEnd emits a lifecycle phase: "error" per attempt — before any fallback model is tried
server-chat.ts was translating this into a state: "error" chat event immediately, causing the TUI to call terminateRun + maybeRefreshHistoryForRun → chatLog.clearAll()
If a fallback model then succeeded, the TUI had already wiped the chat log via the async history reload, so the fallback's output was lost

Fix: on lifecycle phase: "error", only reset the streaming buffer so the next attempt starts clean. Don't broadcast state: "error" and don't clear the run context. The terminal state: "error" is already sent by broadcastChatError in chat.ts when the entire command fails — that's the correct signal for the TUI to terminate the run.

The run context is kept alive across per-attempt errors so fallback attempts can still resolve the session key. clearAgentRunContext is deferred to agent-command.ts's finally block, which runs after all attempts complete.
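
A gateway-side sketch of the same idea follows. It is not the code in `src/gateway/server-chat.ts`: `ChatBridge`, `onLifecycle`, and `onCommandFailed` are hypothetical names, and broadcasting cumulative text in each delta is an assumption made to keep the example short.

```typescript
// Hypothetical sketch; not the code in src/gateway/server-chat.ts.
type AgentLifecycle =
  | { runId: string; phase: "delta"; text: string }
  | { runId: string; phase: "error"; reason: string } // one failed attempt
  | { runId: string; phase: "end" };                  // an attempt succeeded

type ChatBroadcast =
  | { runId: string; state: "delta"; text: string }
  | { runId: string; state: "final"; text: string }
  | { runId: string; state: "error"; message: string };

class ChatBridge {
  private readonly buffers = new Map<string, string>();
  private readonly broadcast: (msg: ChatBroadcast) => void;

  constructor(broadcast: (msg: ChatBroadcast) => void) {
    this.broadcast = broadcast;
  }

  onLifecycle(evt: AgentLifecycle): void {
    switch (evt.phase) {
      case "delta": {
        const text = (this.buffers.get(evt.runId) ?? "") + evt.text;
        this.buffers.set(evt.runId, text);
        this.broadcast({ runId: evt.runId, state: "delta", text });
        break;
      }
      case "error":
        // Per-attempt failure: reset the buffer so the next attempt starts
        // clean, but do not broadcast state "error".
        this.buffers.set(evt.runId, "");
        break;
      case "end":
        this.broadcast({
          runId: evt.runId,
          state: "final",
          text: this.buffers.get(evt.runId) ?? "",
        });
        this.buffers.delete(evt.runId);
        break;
    }
  }

  // Called once, only when the whole command has failed after all attempts.
  onCommandFailed(runId: string, message: string): void {
    this.buffers.delete(runId);
    this.broadcast({ runId, state: "error", message });
  }
}

const sent: ChatBroadcast[] = [];
const bridge = new ChatBridge((msg) => sent.push(msg));
bridge.onLifecycle({ runId: "run-1", phase: "delta", text: "par" });       // primary starts
bridge.onLifecycle({ runId: "run-1", phase: "error", reason: "429" });     // primary fails
bridge.onLifecycle({ runId: "run-1", phase: "delta", text: "fallback " }); // fallback streams
bridge.onLifecycle({ runId: "run-1", phase: "delta", text: "text" });
bridge.onLifecycle({ runId: "run-1", phase: "end" });
```

With this split, clients only ever see a terminal error from `onCommandFailed`, so a client that terminates the run on error remains correct without knowing about fallback.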

Fixes #59570

Test plan

New test: does not emit state:error on per-attempt lifecycle error — fallback can still succeed — verifies primary error + fallback success produces only delta/final, no error event, and the final payload carries the fallback's text
New test: preserves run context across per-attempt errors so fallback events resolve session key — verifies the run context survives a per-attempt error
All existing server-chat.agent-events.test.ts tests pass (31 total)
All TUI tests pass (217 total)

🤖 Generated with Claude Code

Changed files

src/gateway/chat-sanitize.test.ts (modified, +26/-0)
src/gateway/chat-sanitize.ts (modified, +4/-3)
src/gateway/server-chat.agent-events.test.ts (modified, +98/-0)
src/gateway/server-chat.ts (modified, +23/-4)
src/shared/text/assistant-visible-text.ts (modified, +1/-1)
ui/src/ui/chat/message-extract.test.ts (modified, +21/-0)
ui/src/ui/chat/message-extract.ts (modified, +2/-1)



Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

Summary

Steps to reproduce

1. Configure multiple API keys (with rotation enabled)
2. Configure fallback models
3. Send a message that triggers rate limit (API returns 429)
4. Observe TUI displays run error: API rate limit reached
5. Wait for fallback to complete successfully
6. Result: TUI does not display the fallback output; manual page refresh required

Expected behavior

TUI should continue displaying the fallback output, or at least show a message like "Trying fallback model..." while waiting.

Actual behavior

TUI shows error and immediately stops output. Gateway logs show fallback actually succeeded in the background, but TUI does not display the result. User can only see the latest execution result by clicking refresh on the web page.

OpenClaw version

2026.4.2

Operating system

WSL2 (Linux 6.6.87.2-microsoft-standard-WSL2) on Windows 11

Install method

npm global

Model

minimax (primary), qwen3-235b and glm-4.7 (fallbacks)

Provider / routing chain

openclaw -> custom provider -> minimax2.5 (error) -> custom provider -> qwen3-235b

Additional provider/model setup details

Configured multiple API keys for rotation
Configured fallback models: qwen3-235b, glm-4.7
Using webchat channel
Primary model: custom-provider/minimax-m25

Logs, screenshots, and evidence

Impact and severity

Affected: Users using TUI/webchat with fallback or multiple API keys configured
Severity: Medium (impacts user experience, requires manual refresh)
Frequency: Occurs every time rate limit is encountered
Consequence: User cannot see fallback success result, needs additional operation

Additional information

This is a behavior bug, not a crash. Root cause analysis:

When TUI receives error event, it calls terminateRun() -> streamAssembler.drop(runId)
This discards all state for that run
Even though model fallback continues executing in background, TUI no longer displays output
User must manually refresh to see results

Suggested fix: Modify TUI to not immediately terminate the run on error. Instead, mark it as "waiting for fallback" and continue displaying output when fallback succeeds.

Code location: src/tui/tui-event-handlers.ts line ~315

// Current behavior - terminates immediately
if (evt.state === "error") {
  terminateRun({ status: "error" });  // ← drops state
}

// Suggested fix - wait for fallback
if (evt.state === "error") {
  // Don't terminate
  // Show friendly message
  chatLog.addSystem("API overloaded, trying fallback model...");
  // ← continue waiting for fallback result
}

extent analysis

TL;DR

Modify the TUI event handler to not terminate the run immediately on error, instead marking it as "waiting for fallback" and continuing to display output when the fallback succeeds.

Guidance

Review the src/tui/tui-event-handlers.ts file, specifically around line 315, to understand the current termination behavior on error events.
Consider implementing a "waiting for fallback" state to handle error events without dropping the run state, allowing the TUI to continue displaying output when the fallback model succeeds.
Verify the fix by reproducing the steps to trigger the rate limit error and checking if the TUI now displays the fallback output without requiring a manual page refresh.
Ensure that the fallback models (qwen3-235b, glm-4.7) are correctly configured and that the API key rotation is properly set up to test the fix thoroughly.

Example

if (evt.state === "error") {
  chatLog.addSystem("API overloaded, trying fallback model...");
  // Implement logic to wait for fallback result and display output
  // without terminating the run
}

Notes

The suggested fix assumes that the fallback models are correctly configured and that the API key rotation is properly set up. It's essential to verify that these conditions are met to ensure the fix works as expected.

Recommendation

Apply the workaround by modifying the TUI event handler to wait for fallback results instead of terminating the run immediately on error, as this approach directly addresses the reported behavior bug and improves the user experience.


