openclaw - ✅(Solved) Fix [Bug]: TUI/webchat can stay in pondering after codex/gpt-5.4 has already finished the turn [2 pull requests, 1 comment, 2 participants]

GitHub stats

  • Issue: openclaw/openclaw#66470 (fetched 2026-04-15 06:26:06)
  • Comments: 1
  • Participants: 2
  • Timeline: 6
  • Reactions: 0
  • Timeline (top): labeled ×2, commented ×1, cross-referenced ×1, mentioned ×1

With codex/gpt-5.4 on OpenClaw 2026.4.12, the backend can finish and write the assistant reply, but TUI/webchat can remain in pondering for tens of seconds before rendering the final response.

Error Message

Observed evidence from local testing:

  • Gateway startup log: startup model warmup failed for codex/gpt-5.4: Error: Unknown model: codex/gpt-5.4
  • In a live Codex TUI test, the backend had already written the assistant reply to the session file, but the frontend still showed pondering for about 39 seconds.

Relevant local session file:

  • ~/.openclaw/agents/main/sessions/sessionID.jsonl

Relevant local gateway journal:

  • journalctl --user -u openclaw-gateway

Fix Action

Fix / Workaround

  • Affected: TUI/webchat users running codex/gpt-5.4 on OpenClaw 2026.4.12
  • Severity: High
  • Frequency: Reproduced across multiple fresh Codex sessions before local patching
  • Consequence: Interactive Codex sessions appear hung or much slower than the backend completion time

A local runtime patch that registered the started agent run ID in chat.send removed the delayed terminal event behavior in my install. A separate local patch that stopped skipping provider runtime hooks removed the Unknown model: codex/gpt-5.4 warmup log on startup.

PR fix notes

PR #66594: fix(gateway): register chat run on agent start and enable provider hooks during model warmup

Description (problem / solution / changelog)

Summary

  • Problem: With codex/gpt-5.4, the TUI/webchat remains in pondering for ~39s after the backend has already written the assistant reply. Additionally, gateway startup logs Unknown model: codex/gpt-5.4 during warmup.
  • Why it matters: Interactive Codex sessions appear hung or much slower than actual completion time.
  • What changed:
    1. chat.send now calls context.addChatRun() inside onAgentRunStart; agent.ts already did this, chat.send did not.
    2. server-startup-post-attach.ts sets skipProviderRuntimeHooks: false so provider-registered models (like codex) resolve correctly at boot.
    3. Test mock in chat.directive-tags.test.ts updated to expose addChatRun: vi.fn() — without it, onAgentRunStart threw a TypeError before reaching registerToolEventRecipient and the audio dispatch loop.
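The test failure described in item 3 can be reduced to a few lines. This is a minimal sketch, not the project's test: `addChatRun` and `registerToolEventRecipient` are named in the PR, but the handler body, the `Ctx` type, and the hand-rolled mocks (used here in place of `vi.fn()`) are assumptions made for illustration.

```typescript
// Hypothetical reduction of the mock problem. The context shape and the
// handler body are assumptions; only the two method names come from the PR.
type Ctx = {
  addChatRun: (runId: string, entry: { sessionKey: string; clientRunId: string }) => void;
  registerToolEventRecipient: (runId: string) => void;
};

const calls: string[] = [];

function onAgentRunStart(context: Ctx, runId: string): void {
  context.addChatRun(runId, { sessionKey: "main", clientRunId: runId });
  context.registerToolEventRecipient(runId);
}

// Old mock: addChatRun is absent, as it was before the test update.
const oldMock = {
  registerToolEventRecipient: (runId: string) => {
    calls.push(`old:${runId}`);
  },
} as unknown as Ctx;

let thrown: unknown;
try {
  onAgentRunStart(oldMock, "run-1");
} catch (err) {
  thrown = err; // TypeError: context.addChatRun is not a function
}

// Updated mock: both functions exist, so the handler runs to completion.
const newMock: Ctx = {
  addChatRun: () => {
    calls.push("addChatRun");
  },
  registerToolEventRecipient: (runId) => {
    calls.push(`new:${runId}`);
  },
};
onAgentRunStart(newMock, "run-1");
```

With the old mock the handler throws before `registerToolEventRecipient` is ever reached, which matches the failure mode the PR describes.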

Closes #66470 Supersedes #66337

Root Cause

  1. chat.send never called addChatRun() to register the agent run ID. Without this registration, finalizeLifecycleEvent cannot map the agent's internal run ID back to the client-facing idempotency key → missing chat.final events → TUI hangs until timeout (~39s).
  2. prewarmConfiguredPrimaryModel passed skipProviderRuntimeHooks: true, skipping hooks that register dynamic model IDs → Unknown model warning at startup.
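The first root cause can be illustrated with a small sketch of the run-ID mapping. Only the names `addChatRun`, `finalizeLifecycleEvent`, `sessionKey`, `clientRunId`, and `chat.final` come from the PR text; the `ChatRunRegistry` class, the event shape, and the method signatures are hypothetical and the real gateway code may look quite different.

```typescript
// Hypothetical, simplified model of the run-ID registration described above.
interface ChatRunEntry {
  sessionKey: string;
  clientRunId: string; // client-facing idempotency key
}

interface ChatFinalEvent {
  type: "chat.final";
  sessionKey: string;
  clientRunId: string;
}

class ChatRunRegistry {
  private runs = new Map<string, ChatRunEntry>();

  // Register the agent's run ID so lifecycle events can be mapped back later.
  addChatRun(agentRunId: string, entry: ChatRunEntry): void {
    this.runs.set(agentRunId, entry);
  }

  // Called when the agent reports that a run has ended. Returns the event to
  // deliver to the client, or undefined when the run was never registered.
  finalizeLifecycleEvent(agentRunId: string): ChatFinalEvent | undefined {
    const entry = this.runs.get(agentRunId);
    if (entry === undefined) {
      return undefined; // no mapping, so no chat.final reaches the client
    }
    this.runs.delete(agentRunId);
    return { type: "chat.final", ...entry };
  }
}

// Before the fix: chat.send starts the run without registering it.
const before = new ChatRunRegistry();
const missedEvent = before.finalizeLifecycleEvent("run-1");

// After the fix: onAgentRunStart registers the run first.
const after = new ChatRunRegistry();
after.addChatRun("run-1", { sessionKey: "main", clientRunId: "run-1" });
const deliveredEvent = after.finalizeLifecycleEvent("run-1");
```

In this model the unregistered run yields no event at all, which is why a client waiting on `chat.final` would sit idle until its own timeout fires.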

Changes

| File | Line | Change |
| --- | --- | --- |
| chat.ts | 2153 | Added context.addChatRun(runId, { sessionKey, clientRunId: runId }) in onAgentRunStart |
| server-startup-post-attach.ts | | Set skipProviderRuntimeHooks: false |
| chat.directive-tags.test.ts | 286, 300 | Added addChatRun to Pick<> type and addChatRun: vi.fn() to the mock |

Diagram

Before:
[chat.send] → [agent writes reply] → [TUI waits for final event] → [timeout ~39s] → [render]

After:
[chat.send] → [onAgentRunStart: addChatRun()] → [agent writes reply] → [TUI receives final event] → [render immediately]

Security Impact

None. No new permissions, no secrets handling changes, no new network calls, no command execution surface changes.

Tests

49/49 passing in chat.directive-tags.test.ts. 6/6 startup tests passing.

Changed files

  • src/gateway/server-methods/chat.directive-tags.test.ts (modified, +2/-0)
  • src/gateway/server-methods/chat.ts (modified, +4/-0)
  • src/gateway/server-startup-post-attach.ts (modified, +1/-1)
  • src/gateway/server-startup.test.ts (modified, +1/-1)

PR #66337: fix(gateway): use provider runtime hooks in startup warmup

Description (problem / solution / changelog)

Summary

  • Problem: gateway startup warmup forced resolveModel(..., { skipProviderRuntimeHooks: true }), so provider-owned dynamic models such as codex/gpt-5.4 were misreported as Unknown model during startup.
  • Why it matters: the warning suggests the model is unsupported even when the normal runtime path resolves and uses it successfully.
  • What changed: startup warmup now uses the normal resolveModel(...) path, removes the stale static-only error text, and adds regression coverage for provider-runtime-resolved Codex models.
  • What did NOT change (scope boundary): this PR does not claim to fix broader model catalog/listing or harness-activation issues tracked elsewhere.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #64938
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: prewarmConfiguredPrimaryModel() in src/gateway/server-startup-post-attach.ts forced the startup warmup through resolveModel(..., { skipProviderRuntimeHooks: true }), which swaps in STATIC_PROVIDER_RUNTIME_HOOKS and disables runProviderDynamicModel. Provider-owned dynamic models like codex/* depend on that hook to resolve.
  • Missing detection / guardrail: src/gateway/server-startup.test.ts asserted the static-only warmup behavior instead of locking in parity with the normal runtime resolution path.
  • Contributing context (if known): the old error text explicitly documented warmup as “static model resolution” only, which baked the degraded behavior into startup diagnostics.
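The root cause above can be sketched as a resolver that either consults provider runtime hooks or falls back to a static-only hook set. The names `resolveModel`, `skipProviderRuntimeHooks`, `STATIC_PROVIDER_RUNTIME_HOOKS`, and `runProviderDynamicModel` come from the PR; their shapes here, along with `STATIC_CATALOG`, `DEFAULT_PROVIDER_RUNTIME_HOOKS`, `tryResolve`, and the placeholder model `example/static-model`, are assumptions for illustration only.

```typescript
// Hypothetical sketch of static-only versus runtime-hook model resolution.
interface ResolvedModel {
  id: string;
  source: "static" | "dynamic";
}

interface ProviderRuntimeHooks {
  // Lets a provider resolve model IDs that are not in the static catalog.
  runProviderDynamicModel?: (modelId: string) => ResolvedModel | undefined;
}

// Placeholder catalog entry; the real static catalog is not shown in the PR.
const STATIC_CATALOG = new Set<string>(["example/static-model"]);

// Static hooks omit dynamic resolution entirely.
const STATIC_PROVIDER_RUNTIME_HOOKS: ProviderRuntimeHooks = {};

const DEFAULT_PROVIDER_RUNTIME_HOOKS: ProviderRuntimeHooks = {
  runProviderDynamicModel: (modelId) =>
    modelId.startsWith("codex/") ? { id: modelId, source: "dynamic" } : undefined,
};

function resolveModel(
  modelId: string,
  options: { skipProviderRuntimeHooks?: boolean } = {},
): ResolvedModel {
  if (STATIC_CATALOG.has(modelId)) {
    return { id: modelId, source: "static" };
  }
  const hooks = options.skipProviderRuntimeHooks
    ? STATIC_PROVIDER_RUNTIME_HOOKS
    : DEFAULT_PROVIDER_RUNTIME_HOOKS;
  const dynamic = hooks.runProviderDynamicModel?.(modelId);
  if (dynamic === undefined) {
    throw new Error(`Unknown model: ${modelId}`);
  }
  return dynamic;
}

// Returns the resolution source, or the error message if resolution fails.
function tryResolve(modelId: string, skip: boolean): string {
  try {
    return resolveModel(modelId, { skipProviderRuntimeHooks: skip }).source;
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}

const oldWarmup = tryResolve("codex/gpt-5.4", true); // static-only path
const newWarmup = tryResolve("codex/gpt-5.4", false); // normal runtime path
```

Under these assumptions the static-only path reports the model as unknown while the normal path resolves it, mirroring the false warning seen at startup.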

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/gateway/server-startup.test.ts
  • Scenario the test should lock in: startup warmup should not warn when a configured primary model is resolved via provider runtime hooks, using codex/gpt-5.4 as the regression case.
  • Why this is the smallest reliable guardrail: the bug is isolated to the startup warmup call-site and can be reproduced by observing whether it passes the skip flag into resolveModel.
  • Existing test that already covers this (if any): none
  • If no new test is added, why not: N/A
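The scenario in this plan could be checked roughly as follows. This is a sketch under stated assumptions, not the test in src/gateway/server-startup.test.ts: it injects `resolveModel` and a `warn` callback into `prewarmConfiguredPrimaryModel` (the real function may not accept them) and uses a hand-rolled spy named `spyResolve` rather than the project's test framework.

```typescript
// Hypothetical regression check for the warmup call site.
type ResolveOptions = { skipProviderRuntimeHooks?: boolean };
type ResolveFn = (modelId: string, options?: ResolveOptions) => { id: string };

function prewarmConfiguredPrimaryModel(
  modelId: string,
  resolveModel: ResolveFn,
  warn: (message: string) => void,
): void {
  try {
    resolveModel(modelId); // normal path: no skip flag is passed
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    warn(`startup model warmup failed for ${modelId}: Error: ${reason}`);
  }
}

const seenOptions: Array<ResolveOptions | undefined> = [];
const warnings: string[] = [];

// Spy resolver: fails for codex/* only when runtime hooks are skipped.
const spyResolve: ResolveFn = (modelId, options) => {
  seenOptions.push(options);
  if (modelId.startsWith("codex/") && options?.skipProviderRuntimeHooks) {
    throw new Error(`Unknown model: ${modelId}`);
  }
  return { id: modelId };
};

prewarmConfiguredPrimaryModel("codex/gpt-5.4", spyResolve, (message) => {
  warnings.push(message);
});
```

The two observations worth asserting are that no warning was recorded and that the recorded options never set `skipProviderRuntimeHooks` to true.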

User-visible / Behavior Changes

  • Gateway startup no longer emits a false Unknown model: codex/gpt-5.4 warning when the model is provider-runtime-resolved.

Diagram (if applicable)

Before:
[startup warmup] -> resolveModel(skipProviderRuntimeHooks=true) -> static-only resolution -> false Unknown model warning

After:
[startup warmup] -> resolveModel() -> provider runtime resolution -> no false warning

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: Linux
  • Runtime/container: Node 22 / pnpm workspace checkout
  • Model/provider: codex/gpt-5.4
  • Integration/channel (if any): Gateway startup warmup
  • Relevant config (redacted): agents.defaults.model.primary = "codex/gpt-5.4"

Steps

  1. Configure agents.defaults.model.primary to codex/gpt-5.4.
  2. Start the gateway and let startup warmup run.
  3. Observe whether startup emits startup model warmup failed for codex/gpt-5.4: Error: Unknown model: codex/gpt-5.4.
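For step 1, the dotted key implies a nested configuration shape along these lines. The PR does not state the on-disk config format, so the object below is only an illustration, and `getByPath` is a hypothetical helper used to show how the dotted path maps onto the nesting.

```typescript
// Nested shape implied by the dotted key agents.defaults.model.primary.
// The actual config file format is an assumption here.
const config = {
  agents: { defaults: { model: { primary: "codex/gpt-5.4" } } },
};

// Hypothetical helper that reads a dotted path from a nested object.
function getByPath(root: unknown, path: string): unknown {
  let current: unknown = root;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) {
      return undefined;
    }
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

const primaryModel = getByPath(config, "agents.defaults.model.primary");
```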

Expected

  • Startup warmup should resolve the configured primary model the same way the normal runtime path does and should not emit a false Unknown model warning for provider-runtime-resolved Codex models.

Actual

  • Before this patch, startup warmup forced static-only resolution and emitted the false warning.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios:
    • pnpm test src/gateway/server-startup.test.ts
    • pnpm test src/gateway/server-startup-post-attach.test.ts
    • pnpm test extensions/codex/provider.test.ts -t "resolves arbitrary Codex app-server model ids through the codex provider"
    • pnpm build
  • Edge cases checked:
    • explicit openai-codex/gpt-5.4 warmup still prewarms successfully
    • provider-runtime-resolved codex/gpt-5.4 no longer warns in the startup unit test path
  • What you did not verify:
    • live gateway startup logs on a real Codex-configured install in this branch
    • repo-wide pnpm check is currently red on unrelated pre-existing tsgo failures outside the touched surface

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps:

Risks and Mitigations

  • Risk: startup warmup now allows provider runtime hooks for the configured primary model.
    • Mitigation: the warmup path still uses synchronous resolveModel() (not resolveModelAsync()), so this change re-enables provider runtime model resolution without introducing prepareProviderDynamicModel() async prefetch behavior; regression coverage locks the intended call path.

Changed files

  • src/gateway/server-startup-post-attach.ts (modified, +2/-7)
  • src/gateway/server-startup.test.ts (modified, +50/-4)

Raw issue body

Bug type

Crash (process/app exits or hangs)

Beta release blocker

No

Summary

With codex/gpt-5.4 on OpenClaw 2026.4.12, the backend can finish and write the assistant reply, but TUI/webchat can remain in pondering for tens of seconds before rendering the final response.

Steps to reproduce

  1. Start OpenClaw 2026.4.12 with codex/gpt-5.4 as the default model.
  2. Open a fresh TUI or webchat session.
  3. Send a trivial prompt such as testing, wait for the first reply, then send a second trivial prompt.
  4. Observe that the UI can remain in pondering for about 39 seconds after the backend has already appended the assistant reply to the session file.

Expected behavior

When the backend finishes a Codex turn, the TUI/webchat should render the final reply immediately instead of continuing to show pondering.

Actual behavior

In live PTY testing, the assistant reply was already written to the session file, but the TUI continued showing pondering for about 39 seconds before I interrupted it. On the same install, gateway startup also logged startup model warmup failed for codex/gpt-5.4: Error: Unknown model: codex/gpt-5.4.

OpenClaw version

2026.4.12

Operating system

Ubuntu 24.04.4 LTS on WSL

Install method

npm global

Model

codex/gpt-5.4

Provider / routing chain

openclaw -> codex app-server -> codex/gpt-5.4

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Observed evidence from local testing:

- Gateway startup log: `startup model warmup failed for codex/gpt-5.4: Error: Unknown model: codex/gpt-5.4`
- In a live Codex TUI test, the backend had already written the assistant reply to the session file, but the frontend still showed `pondering` for about 39 seconds.

Relevant local session file:
- `~/.openclaw/agents/main/sessions/sessionID.jsonl`

Relevant local gateway journal:
- `journalctl --user -u openclaw-gateway`

Impact and severity

  • Affected: TUI/webchat users running codex/gpt-5.4 on OpenClaw 2026.4.12
  • Severity: High
  • Frequency: Reproduced across multiple fresh Codex sessions before local patching
  • Consequence: Interactive Codex sessions appear hung or much slower than the backend completion time

Additional information

A local runtime patch that registered the started agent run ID in chat.send removed the delayed terminal event behavior in my install. A separate local patch that stopped skipping provider runtime hooks removed the Unknown model: codex/gpt-5.4 warmup log on startup.

extent analysis

TL;DR

The issue can likely be fixed by registering the started agent run ID in the chat.send method and ensuring provider runtime hooks are not skipped.

Guidance

  • Verify that the chat.send method is correctly registering the started agent run ID to prevent delayed terminal events.
  • Check the provider runtime hooks to ensure they are not being skipped, which may cause the Unknown model: codex/gpt-5.4 warmup log error.
  • Review the local runtime patches applied to remove the delayed terminal event behavior and the Unknown model log error to understand the changes made.
  • Investigate the session file (~/.openclaw/agents/main/sessions/sessionID.jsonl) and gateway journal (journalctl --user -u openclaw-gateway) for any relevant logs or errors.

Example

No code snippet is provided as the issue does not contain sufficient information to create a specific example.

Notes

The issue seems to be related to the interaction between the OpenClaw backend and the Codex model, specifically with the codex/gpt-5.4 model on OpenClaw 2026.4.12. The local runtime patches applied suggest that registering the started agent run ID and ensuring provider runtime hooks are not skipped may resolve the issue.

Recommendation

Apply the workaround by registering the started agent run ID in the chat.send method and ensuring provider runtime hooks are not skipped, as this has been shown to remove the delayed terminal event behavior and the Unknown model log error in local testing.
