openclaw - 💡(How to fix) Fix In-place update can leave Gateway/UI/Codex in mixed-version state and make recovery difficult

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

After running an OpenClaw update from the Control UI on macOS, the local install entered a mixed-version / partially restarted state. Recovery was difficult because several independent-looking failures appeared together:

  • update.run restart was coalesced as already in-flight, with no useful visible UI feedback.
  • Gateway shutdown hit a missing built asset from the global install path: ERR_MODULE_NOT_FOUND: .../dist/hook-runner-global-BaH8wNFP.js.
  • The browser Control UI repeatedly failed the Gateway handshake with protocol mismatch, first as vcontrol-ui, then as stale v2026.5.5.
  • Codex-backed sessions failed with Requested agent harness "codex" is not registered.
  • Telegram ingress exited during the same recovery window.

Some individual symptoms match issues that were later fixed or tracked separately, but the recovery problem was the combination: the updater left the operator with a running but inconsistent system, where UI, Gateway, installed files, and Codex plugin/runtime state did not agree.

Error Message

2026-05-15T18:18:13.344+03:00 [gateway] shutdown error: Error [ERR_MODULE_NOT_FOUND]: Cannot find module '/opt/homebrew/lib/node_modules/openclaw/dist/hook-runner-global-BaH8wNFP.js' imported from /opt/homebrew/lib/node_modules/openclaw/dist/server.impl-CxmCq_pF.js

Root Cause

After running an OpenClaw update from the Control UI on macOS, the local install entered a mixed-version / partially restarted state. Recovery was difficult because several independent-looking failures appeared together:

Code Example

2026-05-15T18:16:39.737+03:00 [restart] request coalesced (already in-flight) reason=update.run actor=openclaw-control-ui
2026-05-15T18:16:39.738+03:00 [gateway] update.run restart coalesced actor=openclaw-control-ui ... delayMs=0

---

2026-05-15T18:18:13.344+03:00 [gateway] shutdown error: Error [ERR_MODULE_NOT_FOUND]: Cannot find module '/opt/homebrew/lib/node_modules/openclaw/dist/hook-runner-global-BaH8wNFP.js' imported from /opt/homebrew/lib/node_modules/openclaw/dist/server.impl-CxmCq_pF.js

---

2026-05-15T18:18:16.657+03:00 [ws] protocol mismatch ... client=openclaw-control-ui webchat vcontrol-ui
2026-05-15T18:18:16.659+03:00 [ws] closed before connect ... code=1002 reason=protocol mismatch
...
2026-05-15T18:22:20.204+03:00 [ws] protocol mismatch ... client=openclaw-control-ui webchat v2026.5.5
2026-05-15T18:22:20.210+03:00 [ws] closed before connect ... code=1002 reason=protocol mismatch

---

2026-05-15T18:18:43.634+03:00 [diagnostic] lane task error: lane=main durationMs=129 error="Error: Requested agent harness "codex" is not registered."
2026-05-15T18:18:43.642+03:00 Embedded agent failed before reply: Requested agent harness "codex" is not registered.
2026-05-15T18:18:43.660+03:00 [telegram] [default] channel exited: Telegram ingress worker exited with code 1

---

2026-05-15T18:18:43.521+03:00 [plugins] plugins.allow is empty; discovered non-bundled plugins may auto-load: brave (/Users/.../.openclaw/npm/node_modules/@openclaw/brave-plugin/dist/index.js). Set plugins.allow to explicit trusted ids.

---

[plugins] plugins.allow is empty; discovered non-bundled plugins may auto-load: brave (...), codex (...). Set plugins.allow to explicit trusted ids.
RAW_BUFFERClick to expand / collapse

In-place update can leave Gateway/UI/Codex in mixed-version state and make recovery difficult

Summary

After running an OpenClaw update from the Control UI on macOS, the local install entered a mixed-version / partially restarted state. Recovery was difficult because several independent-looking failures appeared together:

  • update.run restart was coalesced as already in-flight, with no useful visible UI feedback.
  • Gateway shutdown hit a missing built asset from the global install path: ERR_MODULE_NOT_FOUND: .../dist/hook-runner-global-BaH8wNFP.js.
  • The browser Control UI repeatedly failed the Gateway handshake with protocol mismatch, first as vcontrol-ui, then as stale v2026.5.5.
  • Codex-backed sessions failed with Requested agent harness "codex" is not registered.
  • Telegram ingress exited during the same recovery window.

Some individual symptoms match issues that were later fixed or tracked separately, but the recovery problem was the combination: the updater left the operator with a running but inconsistent system, where UI, Gateway, installed files, and Codex plugin/runtime state did not agree.

Environment

  • OS: macOS 26.3 arm64
  • Install: global CLI under Homebrew path
  • Gateway: macOS LaunchAgent, loopback 127.0.0.1:18789
  • Current recovered version: OpenClaw 2026.5.18 (50a2481)
  • Current Node after recovery: v25.9.0
  • Evidence window: 2026-05-15 18:16-18:23 +03:00
  • Existing plugin versions after recovery:

Timeline / evidence

The first sign was an already in-flight update restart:

2026-05-15T18:16:39.737+03:00 [restart] request coalesced (already in-flight) reason=update.run actor=openclaw-control-ui
2026-05-15T18:16:39.738+03:00 [gateway] update.run restart coalesced actor=openclaw-control-ui ... delayMs=0

During shutdown/restart, the gateway referenced a built file that was not present in the global install:

2026-05-15T18:18:13.344+03:00 [gateway] shutdown error: Error [ERR_MODULE_NOT_FOUND]: Cannot find module '/opt/homebrew/lib/node_modules/openclaw/dist/hook-runner-global-BaH8wNFP.js' imported from /opt/homebrew/lib/node_modules/openclaw/dist/server.impl-CxmCq_pF.js

After that, the browser Control UI could not connect because the client/server protocol versions were out of sync:

2026-05-15T18:18:16.657+03:00 [ws] protocol mismatch ... client=openclaw-control-ui webchat vcontrol-ui
2026-05-15T18:18:16.659+03:00 [ws] closed before connect ... code=1002 reason=protocol mismatch
...
2026-05-15T18:22:20.204+03:00 [ws] protocol mismatch ... client=openclaw-control-ui webchat v2026.5.5
2026-05-15T18:22:20.210+03:00 [ws] closed before connect ... code=1002 reason=protocol mismatch

At the same time, Codex-backed agent work failed because the codex harness was unavailable:

2026-05-15T18:18:43.634+03:00 [diagnostic] lane task error: lane=main durationMs=129 error="Error: Requested agent harness "codex" is not registered."
2026-05-15T18:18:43.642+03:00 Embedded agent failed before reply: Requested agent harness "codex" is not registered.
2026-05-15T18:18:43.660+03:00 [telegram] [default] channel exited: Telegram ingress worker exited with code 1

The Gateway also reported only Brave as discovered at first, not Codex:

2026-05-15T18:18:43.521+03:00 [plugins] plugins.allow is empty; discovered non-bundled plugins may auto-load: brave (/Users/.../.openclaw/npm/node_modules/@openclaw/brave-plugin/dist/index.js). Set plugins.allow to explicit trusted ids.

After later recovery, current logs show both Brave and Codex discovered:

[plugins] plugins.allow is empty; discovered non-bundled plugins may auto-load: brave (...), codex (...). Set plugins.allow to explicit trusted ids.

Why recovery was hard

The symptoms pointed in different directions:

  1. update.run said a restart was already in-flight, but the operator UI did not clearly explain whether to wait, refresh, retry, or run a CLI repair. This overlaps with #78481.
  2. The missing dist/hook-runner-global-*.js error looked like a partially replaced global package or stale process importing chunks from a different build.
  3. The protocol mismatch made the Control UI unusable, removing the obvious recovery surface. This resembles the class fixed in #82882, but here it happened during an update/recovery window with stale browser/UI assets.
  4. The Codex harness error made agent sessions fail even when the Gateway process was alive. This overlaps with the Codex plugin/runtime migration class fixed in #82368/#82502.
  5. Because these failures appeared together, it was unclear whether the right fix was reload UI cache, restart LaunchAgent, reinstall/update OpenClaw, run doctor --fix, repair plugins, or edit config.

Expected behavior

The update path should avoid leaving the system in a mixed-version state where:

  • the live process imports old chunk names from a partly replaced package,
  • browser UI assets advertise a protocol unsupported by the running Gateway,
  • existing/new sessions request a harness whose owner plugin is not registered,
  • and the only visible update state is restart coalesced.

If an update is already in-flight or a restart is pending, the UI/CLI should provide a clear recovery instruction and health check, e.g. "restart pending; wait/reload" vs "update failed; run doctor/reinstall".

Actual behavior

The system became partially usable but not operator-recoverable from the UI:

  • Control UI websocket handshakes were rejected with protocol mismatch.
  • Codex-backed turns failed before reply.
  • Telegram ingress exited.
  • Logs contained a missing built module from the installed package path.
  • Manual log inspection and later repair/update were needed to get back to a clean state.

Related issues

This report is intentionally about the combined update/recovery failure mode, not a request to reopen every individual symptom:

  • #78481: Update button silently no-ops when update.run is coalesced
  • #82368 / #82502: Codex plugin/harness not registered after update/config migration
  • #82882 / #82883: Control UI protocol mismatch after Gateway protocol bump
  • #78539: Prior startup-loop recovery issue after update with unavailable provider plugin

Suggested fixes

  1. Make the update/install step more atomic for global installs, or ensure the old process never imports newly missing hashed dist/* chunks during shutdown.
  2. Surface restart.coalesced in Control UI as an actionable state, with a timeout/failure path.
  3. After update, run a post-restart compatibility probe that checks:
    • Gateway protocol vs served UI protocol,
    • required harness plugins for configured/session runtime routes,
    • plugin package versions vs host version,
    • missing built artifacts referenced by the running process.
  4. Add an explicit "update recovery" command or doctor mode that validates mixed-version state and suggests the narrowest repair.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

The update path should avoid leaving the system in a mixed-version state where:

  • the live process imports old chunk names from a partly replaced package,
  • browser UI assets advertise a protocol unsupported by the running Gateway,
  • existing/new sessions request a harness whose owner plugin is not registered,
  • and the only visible update state is restart coalesced.

If an update is already in-flight or a restart is pending, the UI/CLI should provide a clear recovery instruction and health check, e.g. "restart pending; wait/reload" vs "update failed; run doctor/reinstall".

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - 💡(How to fix) Fix In-place update can leave Gateway/UI/Codex in mixed-version state and make recovery difficult