openclaw - ✅(Solved) Fix update.run: bundled runtime deps staging timeout + shutdown race condition causes gateway crash on cold cache [1 pull requests, 2 comments, 3 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#72114Fetched 2026-04-27 05:34:40
View on GitHub
Comments
2
Participants
3
Timeline
7
Reactions
6
Timeline (top)
cross-referenced ×3commented ×2closed ×1subscribed ×1

When running OpenClaw auto-update (update.run or equivalent), the gateway may crash and fail to restart properly due to two combined issues with bundled runtime dependency installation during the update process.

Error Message

08:41:18 [plugins] browser staging bundled runtime deps (7 missing, 32 install specs) 08:41:30 [plugins] browser failed to stage bundled runtime deps after 12116ms 08:42:11 [plugins] qqbot failed during register: Error: npm install failed

Root Cause

Two overlapping issues:

Fix Action

Workaround

Manually install npm dependencies separately, then copy the complete node_modules to ~/.openclaw/plugin-runtime-deps/<variant>/node_modules/ before restarting. This bypasses the cold cache and race condition since the deps are already available.

PR fix notes

PR #72084: postinstall: reinstall stale bundled runtime deps after npm updates

Description (problem / solution / changelog)

Summary

  • treat stale bundled runtime dep versions as missing during postinstall eager installs
  • evaluate installed runtime dep versions with the same semver contract used by bundled runtime repair instead of the earlier exact-or-caret-only check
  • restore the optional native-child regression path by writing a real @snazzah/davey version into the test fixture
  • keep the minimal pnpm-lock.yaml importer sync needed for the current extensions/diagnostics-prometheus workspace graph so frozen-lockfile CI reflects the checked-in package graph

Testing

  • pnpm test -- test/scripts/postinstall-bundled-plugins.test.ts -t "installed version is stale|optional native children are missing|installed version satisfies a semver range|sentinel package already exists|installs only missing bundled plugin runtime deps"
  • pnpm exec oxlint scripts/postinstall-bundled-plugins.mjs test/scripts/postinstall-bundled-plugins.test.ts && pnpm exec oxfmt --check scripts/postinstall-bundled-plugins.mjs test/scripts/postinstall-bundled-plugins.test.ts
  • manual stale-version repro through runBundledPluginPostinstall confirmed the reinstall path for an outdated installed dependency version
  • pnpm check currently stops in pre-existing unrelated agents/typebox baseline failures in this worktree

Linked Issue

Closes #72058

Changed files

  • pnpm-lock.yaml (modified, +6/-0)
  • scripts/postinstall-bundled-plugins.mjs (modified, +24/-0)
  • test/scripts/postinstall-bundled-plugins.test.ts (modified, +61/-2)

Code Example

08:41:18 [plugins] browser staging bundled runtime deps (7 missing, 32 install specs)
08:41:30 [plugins] browser failed to stage bundled runtime deps after 12116ms
08:42:11 [plugins] qqbot failed during register: Error: npm install failed

---

08:46:45 [plugins] browser installed bundled runtime deps in 7183ms
08:48:10 [gateway] ready (10 plugins: acpx, bonjour, browser, device-pair, lossless-claw, memory-lancedb-pro, phone-control, qqbot, talk-voice, telegram; 199.0s)
RAW_BUFFERClick to expand / collapse

Description

When running OpenClaw auto-update (update.run or equivalent), the gateway may crash and fail to restart properly due to two combined issues with bundled runtime dependency installation during the update process.

Environment

  • OpenClaw v2026.4.24 (upgrading from v2026.4.23)
  • Node v22.22.0, npm 11.12.1
  • OS: Linux (Tencent Cloud OpenCloudOS 9)
  • Registry: https://registry.npmjs.org/

Symptoms

  1. After update triggers gateway restart, the new gateway process fails to stage bundled runtime deps for the browser plugin with "failed to stage after ~12s"
  2. Gateway crashes, agents lose connection
  3. openclaw doctor cannot resolve the issue
  4. Workaround: manually download latest npm dependencies and copy node_modules into the runtime-deps directory

Root Cause

Two overlapping issues:

Issue A: Staging timeout too aggressive on cold cache (~12s)

When the browser plugin stages its bundled runtime deps for the first time with a cold npm cache, 7 packages need to be downloaded:

The internal staging timeout (~12 seconds) is insufficient when the npm cache is cold. After the cache is warmed, the same install completes in ~7 seconds.

Issue B: Shutdown race condition during update

The update.run process sends SIGTERM to the old gateway process while it is still staging bundled runtime deps (npm install is running). After 30s, systemd sends SIGKILL which cascades to the npm install subprocess, leaving the runtime-deps directory in an inconsistent state.

Timeline

TimeEvent
08:41:18Browser staging starts (7 missing deps)
08:41:30❌ Browser failed to stage after 12,116ms (cold cache)
08:42:11Old process receives SIGTERM
08:42:42systemd SIGKILL → kills npm install @gr process
08:44:50New gateway starts (PID 919826)
08:46:45✅ Browser installed in 7.2s (warm cache, second attempt)
08:48:10✅ Gateway ready (10 plugins)

Log Evidence

From journalctl (first failed attempt, PID 899978):

08:41:18 [plugins] browser staging bundled runtime deps (7 missing, 32 install specs)
08:41:30 [plugins] browser failed to stage bundled runtime deps after 12116ms
08:42:11 [plugins] qqbot failed during register: Error: npm install failed

Second attempt (PID 919826, warm cache):

08:46:45 [plugins] browser installed bundled runtime deps in 7183ms
08:48:10 [gateway] ready (10 plugins: acpx, bonjour, browser, device-pair, lossless-claw, memory-lancedb-pro, phone-control, qqbot, talk-voice, telegram; 199.0s)

Workaround

Manually install npm dependencies separately, then copy the complete node_modules to ~/.openclaw/plugin-runtime-deps/<variant>/node_modules/ before restarting. This bypasses the cold cache and race condition since the deps are already available.

Suggested Fixes

  1. Dynamic staging timeout: Scale with number of missing packages, or add a cold-cache grace period (e.g. 30s instead of 12s for first-boot)
  2. Improve shutdown sequence: Wait for pending npm install to complete before SIGTERM, or increase the SIGTERM-to-SIGKILL grace period

extent analysis

TL;DR

Increase the staging timeout or implement a dynamic timeout to accommodate cold cache scenarios and prevent the gateway from crashing during the update process.

Guidance

  • Verify the current staging timeout value and consider increasing it to at least 30 seconds to account for cold cache scenarios.
  • Implement a dynamic staging timeout that scales with the number of missing packages to prevent timeouts during the update process.
  • Review the shutdown sequence to ensure that pending npm installs are completed before sending SIGTERM to the old gateway process.
  • Consider adding a cold-cache grace period to the staging timeout to improve the update process's reliability.

Example

No code snippet is provided as the issue is related to configuration and process management rather than code.

Notes

The suggested fixes require modifications to the update process and shutdown sequence, which may involve changes to the OpenClaw configuration or code. It is essential to test these changes thoroughly to ensure they do not introduce new issues.

Recommendation

Apply the workaround of manually installing npm dependencies and copying the complete node_modules to the ~/.openclaw/plugin-runtime-deps/<variant>/node_modules/ directory before restarting, as this bypasses the cold cache and race condition issues. This workaround provides a temporary solution until a more permanent fix can be implemented.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - ✅(Solved) Fix update.run: bundled runtime deps staging timeout + shutdown race condition causes gateway crash on cold cache [1 pull requests, 2 comments, 3 participants]