openclaw - 💡(How to fix) update.run can report success after package swap even when gateway restart is ignored

GitHub stats
openclaw/openclaw#78110 · Fetched 2026-05-06 06:16:50
Comments: 1 · Participants: 2 · Timeline: 6 · Reactions: 2
Timeline (top): mentioned ×2, subscribed ×2, closed ×1, commented ×1

Summary

Control UI / Gateway update.run can report success after replacing the installed OpenClaw package, but fail to restart the running gateway. This leaves the old Node process alive while dist/ files on disk have been replaced by the new package, causing runtime ERR_MODULE_NOT_FOUND / plugin SDK incompatibility errors until the operator manually restarts the service.

This appears to be a core update lifecycle issue: package swap and restart acknowledgment are not atomic, and update.run status does not reflect whether the new gateway process actually came up.

Observed behavior

Environment, sanitized:

  • Install kind: global/package install under $HOME/.npm-global/lib/node_modules/openclaw
  • Service: user systemd openclaw-gateway.service
  • Gateway invoked as node .../openclaw/dist/index.js gateway --port <port>
  • Core before: 2026.5.3-1
  • Core target/after: 2026.5.4
  • External Discord plugin was also updated to 2026.5.4
  • Control UI client: openclaw-control-ui

Timeline, sanitized:

17:22:27 [gateway] request handler failed:
  Error: Cannot find module '$OPENCLAW_ROOT/dist/runtime-provider-<hash>.js'
  imported from $OPENCLAW_ROOT/dist/extensions/memory-core/index.js
  code=ERR_MODULE_NOT_FOUND

17:22:48 [ws] webchat connected ... client=openclaw-control-ui

17:23:00 [gateway] update.run completed actor=openclaw-control-ui ... restartReason=update.run status=ok
17:23:00 [ws] res ✓ update.run 174080ms ...
17:23:00 [gateway] signal SIGUSR1 received
17:23:00 [gateway] SIGUSR1 restart ignored (not authorized; commands.restart=false or use gateway tool).

17:25:37 [gateway] request handler failed:
  Error: Cannot find module '$OPENCLAW_ROOT/dist/runtime-provider-<hash>.js'
  imported from $OPENCLAW_ROOT/dist/extensions/memory-core/index.js
  code=ERR_MODULE_NOT_FOUND

17:27:04 [discord] message run failed:
  SyntaxError: The requested module 'openclaw/plugin-sdk/channel-streaming'
  does not provide an export named 'formatChannelProgressDraftLineForEntry'

17:27:35 systemd: Stopping openclaw-gateway.service - OpenClaw Gateway (v2026.5.3-1)...
17:27:35 [gateway] signal SIGTERM received
17:27:35 [gateway] shutdown error:
  Error [ERR_MODULE_NOT_FOUND]: Cannot find module '$OPENCLAW_ROOT/dist/server-close-<hash>.js'
  imported from $OPENCLAW_ROOT/dist/server.impl-<hash>.js
17:27:35 systemd: Started openclaw-gateway.service - OpenClaw Gateway (v2026.5.3-1).
17:27:40 [gateway] http server listening (... plugins ...)

After a manual CLI/service restart, the gateway came up on 2026.5.4 and plugin errors cleared.

Expected behavior

For package-based updates, update.run should not report a completed/successful upgrade unless the restart handoff is guaranteed and/or verified.

At minimum:

  1. If package files were replaced, the old process must not continue serving requests from the replaced install tree.
  2. If restart cannot be authorized/executed, update.run should return a failure or degraded state such as restart_failed, not ok.
  3. Control UI should surface “package updated but restart failed; manual restart required” instead of appearing to complete normally.
  4. The updater should avoid leaving an old process importing from a mutated dist/ directory.

Why this is dangerous

The failure mode is worse than a normal failed update:

  • The package on disk is partially/fully changed.
  • The old process continues running.
  • Lazy/dynamic imports start resolving against the new package tree.
  • Core extensions and plugins can fail unpredictably.
  • Channel handlers may fail, so the user may not receive the assistant's status/error response.

In the observed incident, Discord replies failed until the operator manually ran the CLI update/restart.
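
The lazy-import failure described above can be reproduced in isolation. The following is a minimal sketch, not OpenClaw code: the temp directory and the hashed chunk names are made up for illustration. It models a process that knows the name of a hashed chunk it has not loaded yet, then has that chunk replaced on disk before the first import.

```javascript
// Demo only: simulates a package swap underneath a running process.
// The chunk names below are hypothetical stand-ins for hashed dist/ files.
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { pathToFileURL } = require('node:url');

async function simulateSwap() {
  const dist = fs.mkdtempSync(path.join(os.tmpdir(), 'swap-demo-'));
  const oldChunk = path.join(dist, 'runtime-provider-aaa111.mjs');
  const newChunk = path.join(dist, 'runtime-provider-bbb222.mjs');
  fs.writeFileSync(oldChunk, 'export const version = "old";\n');

  // The "old process" holds a reference to the chunk name from its own build,
  // but has not imported it yet.
  const lazyLoad = () => import(pathToFileURL(oldChunk).href);

  // Package swap: the old hashed chunk is gone, a differently named one appears.
  fs.rmSync(oldChunk);
  fs.writeFileSync(newChunk, 'export const version = "new";\n');

  try {
    await lazyLoad();
    return 'loaded';
  } catch (err) {
    // Report the error code so callers can compare it with the gateway logs.
    return err.code;
  } finally {
    fs.rmSync(dist, { recursive: true, force: true });
  }
}

simulateSwap().then((outcome) => console.log('lazy import outcome:', outcome));
```

Modules that were already imported before the swap stay cached in memory, which is why the old process can appear healthy until a request reaches a code path that was never loaded.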

Suspected cause

update.run appears to:

  1. Run the package update successfully.
  2. Write/update restart sentinel state.
  3. Schedule/emit a SIGUSR1 restart.
  4. Return status=ok based on package update success.

But the runtime then logged:

SIGUSR1 restart ignored (not authorized; commands.restart=false or use gateway tool).

So the restart request was not actually accepted by the running gateway. The result returned to Control UI did not reflect this restart failure, and the old process stayed alive against the new package files.

Suggested fix

1. Make package update + restart a verified state machine

For package-swap updates, separate statuses explicitly:

  • package_update_failed
  • package_updated_restart_scheduled
  • package_updated_restart_failed
  • package_updated_restart_verified

Control UI should treat only package_updated_restart_verified as a complete, successful upgrade.
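
As a rough illustration of how a client could consume these statuses, the sketch below maps each one to a display decision. The status names are the ones listed above; the function name, severity labels, and message strings are assumptions made for this example and do not come from the Control UI source.

```javascript
// Hypothetical client-side mapping; only the status keys come from the proposal.
const UPDATE_STATUS_VIEW = new Map([
  ['package_update_failed',
    { complete: false, severity: 'error', message: 'Package update failed.' }],
  ['package_updated_restart_scheduled',
    { complete: false, severity: 'info', message: 'Package updated; waiting for gateway restart.' }],
  ['package_updated_restart_failed',
    { complete: false, severity: 'error', message: 'Package updated but restart failed; manual restart required.' }],
  ['package_updated_restart_verified',
    { complete: true, severity: 'success', message: 'Upgrade complete.' }],
]);

function describeUpdateStatus(status) {
  // Anything not in the table, including a bare "ok", is shown as incomplete.
  return UPDATE_STATUS_VIEW.get(status) ?? {
    complete: false,
    severity: 'error',
    message: `Unrecognized update status: ${status}`,
  };
}

console.log(describeUpdateStatus('package_updated_restart_failed').message);
```

Defaulting unknown values to "incomplete" is a deliberate choice in this sketch: a response from an older gateway that still returns a plain ok would then not be rendered as a finished upgrade.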

2. Use an explicit restart intent for in-process update restarts

Before or immediately after package replacement, write a short-lived restart intent file / token that the SIGUSR1 handler will accept even if normal external SIGUSR1 is disabled.

The SIGUSR1 handler already has an intent path concept; update.run should use that same mechanism or another equivalent internal authorization path so its own restart cannot be rejected as an external/unauthorized SIGUSR1.
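
One way to picture such an authorization path is a single-use, short-lived intent that the signal handler consumes. This is a sketch under assumptions: it keeps the intent in memory rather than in a file, and createRestartIntentStore, decideOnSigusr1, and the 30-second default are hypothetical names and values, not the gateway's existing intent mechanism.

```javascript
// Hypothetical in-memory restart intent; the real gateway may use a file or token.
function createRestartIntentStore(ttlMs = 30000) {
  let intent = null;
  return {
    // Called by the updater just before it signals its own process.
    issue(reason, now = Date.now()) {
      intent = { reason, expiresAt: now + ttlMs };
    },
    // Called by the signal handler. Single use: the intent is cleared on read.
    consume(now = Date.now()) {
      const current = intent;
      intent = null;
      if (current === null || now > current.expiresAt) return null;
      return current.reason;
    },
  };
}

// Decision logic for a SIGUSR1 handler, kept pure so it can be unit tested.
function decideOnSigusr1(store, externalRestartAllowed, now = Date.now()) {
  const reason = store.consume(now);
  if (reason !== null) return { action: 'restart', reason };
  if (externalRestartAllowed) return { action: 'restart', reason: 'external' };
  return { action: 'ignore', reason: 'not authorized' };
}

const store = createRestartIntentStore();
store.issue('update.run');
console.log(decideOnSigusr1(store, false));
```

In a running process the decision function would be called from a process.on('SIGUSR1', ...) listener. Because the intent expires and is consumed on first use, a later external SIGUSR1 is still rejected when external restarts are disabled.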

3. Verify the new process after restart

For package updates, after scheduling restart:

  • wait for old gateway connection to close, or return a “restart pending” response that Control UI polls;
  • poll the gateway /health/RPC status until reachable;
  • verify version === targetVersion or version !== beforeVersion;
  • if verification times out, show a hard warning with exact manual recovery instructions.
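
The polling step could look roughly like the sketch below. It assumes the caller supplies a fetchVersion function that queries whatever health or status endpoint the gateway exposes; that function, verifyRestart itself, and the default timings are hypothetical and not taken from the OpenClaw codebase.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Polls until the gateway reports the expected version or the timeout elapses.
// fetchVersion is caller-supplied (hypothetical); it should resolve to a
// version string and may reject while the gateway is down.
async function verifyRestart({
  fetchVersion,
  beforeVersion,
  targetVersion,
  timeoutMs = 60000,
  intervalMs = 1000,
}) {
  const deadline = Date.now() + timeoutMs;
  let lastSeen = null;
  while (Date.now() < deadline) {
    try {
      lastSeen = await fetchVersion();
      // Prefer an exact target match; fall back to "anything but the old version".
      const upgraded = targetVersion
        ? lastSeen === targetVersion
        : lastSeen !== beforeVersion;
      if (upgraded) return { verified: true, version: lastSeen };
    } catch {
      // An unreachable gateway is expected while the restart is in flight.
    }
    await sleep(intervalMs);
  }
  return {
    verified: false,
    version: lastSeen,
    hint: 'Restart verification timed out; restart the gateway service manually.',
  };
}
```

A caller would pass the versions recorded before the update, for example beforeVersion '2026.5.3-1' and targetVersion '2026.5.4' from the incident above, and treat verified: false as the trigger for the hard warning.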

4. Do not continue serving after package swap if restart fails

If the package update completed but restart authorization fails, the safest behavior is probably to stop accepting normal requests and surface a maintenance/error state, because the process is now running against an install tree that may no longer match loaded code.
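
A maintenance state of this kind can be sketched as a gate wrapped around request handlers. Everything here is illustrative: createMaintenanceGate, the response shape, and the isExempt hook (for status or health requests that should keep working) are assumptions, not part of the gateway.

```javascript
// Hypothetical gate: once entered, wrapped handlers stop doing normal work.
function createMaintenanceGate({ isExempt = () => false } = {}) {
  let reason = null;
  return {
    enter(why) {
      reason = why;
    },
    isActive() {
      return reason !== null;
    },
    // Wraps a handler so it short-circuits while maintenance is active.
    wrap(handler) {
      return async (request) => {
        if (reason !== null && !isExempt(request)) {
          return { ok: false, code: 'maintenance', message: reason };
        }
        return handler(request);
      };
    },
  };
}

const gate = createMaintenanceGate({ isExempt: (req) => req.method === 'status' });
const handle = gate.wrap(async (req) => ({ ok: true, echoed: req.method }));
gate.enter('package updated but restart failed; manual restart required');
handle({ method: 'chat.send' }).then((res) => console.log(res.code));
```

Rejecting early with an explicit code keeps the failure visible to clients, instead of letting requests run into missing modules deep inside a handler.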

5. Add regression coverage

Suggested tests:

  • update.run package mode with restart disabled or unauthorized SIGUSR1 should not return ok as a completed upgrade.
  • update.run should include restart scheduling/verification status in its response.
  • If SIGUSR1 is ignored, the update sentinel/control-plane response should show restart failure/manual restart required.
  • Control UI should display incomplete update state when package update succeeds but restart verification fails.

Additional note

This can combine badly with external plugin updates. If an external plugin is updated to the new release before core has restarted, the old process can load plugin code that expects newer openclaw/plugin-sdk exports, which is consistent with the missing formatChannelProgressDraftLineForEntry export in the timeline above. The safer flow is to treat the core update, bundled/external plugin sync, and gateway restart as one coordinated operation, with runtime verification at the end.

extent analysis

TL;DR

The most likely fix involves modifying the update.run process to ensure a verified state machine for package updates and restarts, preventing the old process from continuing to serve requests after a package swap.

Guidance

  • Implement a verified state machine for package updates, with explicit statuses for package_update_failed, package_updated_restart_scheduled, package_updated_restart_failed, and package_updated_restart_verified.
  • Use an explicit restart intent for in-process update restarts, utilizing the SIGUSR1 handler's intent path concept to prevent restart rejection.
  • Verify the new process after restart by polling the gateway's /health/RPC status until reachable and ensuring the version matches the target version.
  • If restart authorization fails, stop accepting normal requests and surface a maintenance/error state to prevent serving against a mutated install tree.

Example

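// Pseudocode sketch of a verified state machine. updatePackage,
// createRestartIntent, scheduleRestart, and verifyNewProcess are hypothetical
// helpers, not confirmed OpenClaw APIs.
const updateStates = {
  PACKAGE_UPDATE_FAILED: 'package_update_failed',
  PACKAGE_UPDATED_RESTART_SCHEDULED: 'package_updated_restart_scheduled',
  PACKAGE_UPDATED_RESTART_FAILED: 'package_updated_restart_failed',
  PACKAGE_UPDATED_RESTART_VERIFIED: 'package_updated_restart_verified',
};

// Update run function with verified state machine
async function updateRun() {
  // Tracks whether files on disk were already replaced when an error occurs.
  let packageSwapped = false;
  try {
    // Package update logic
    const packageUpdateResult = await updatePackage();
    if (!packageUpdateResult.success) {
      return { status: updateStates.PACKAGE_UPDATE_FAILED };
    }
    packageSwapped = true;

    // Authorize and schedule the restart
    const restartIntent = createRestartIntent();
    const restartResult = await scheduleRestart(restartIntent);
    if (!restartResult.success) {
      return { status: updateStates.PACKAGE_UPDATED_RESTART_FAILED };
    }

    // Only a verified new process counts as a completed upgrade
    const verificationResult = await verifyNewProcess();
    return {
      status: verificationResult.success
        ? updateStates.PACKAGE_UPDATED_RESTART_VERIFIED
        : updateStates.PACKAGE_UPDATED_RESTART_FAILED,
    };
  } catch (error) {
    // An exception after the swap is a restart failure, not an update failure
    return {
      status: packageSwapped
        ? updateStates.PACKAGE_UPDATED_RESTART_FAILED
        : updateStates.PACKAGE_UPDATE_FAILED,
      error: String(error),
    };
  }
}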
