openclaw - ✅(Solved) Fix Update button no-op on systemd-supervised installs: handoff helper killed by unit cgroup before it runs [1 pull request, 1 comment, 2 participants]

GitHub stats
openclaw/openclaw#84068, fetched 2026-05-20 03:44:30
Comments: 1 · Participants: 2 · Timeline events: 15 · Reactions: 1
Timeline (top): referenced ×8, labeled ×5, commented ×1, cross-referenced ×1

In systemd-supervised installs (openclaw-gateway.service running under systemctl --user), clicking Update now in the Control UI silently fails on every attempt. The gateway restarts on the same version, the UI surfaces Update skipped: managed-service-handoff-started, and npm install -g openclaw@latest is never actually invoked.

The handoff helper script (/tmp/openclaw-update-run-handoff-XXXXXX/handoff.cjs) gets created but never executes — its handoff.log never appears on disk, while handoff.cjs is still sitting in /tmp.

Reproduced cleanly upgrading 2026.5.16-beta.7 → 2026.5.18 on Linux/systemd.

PR fix notes

PR #84151: fix(gateway): escape systemd cgroup for update handoff helper

Description (problem / solution / changelog)

Summary

The update button does nothing on systemd-supervised installs because the handoff helper is killed by KillMode=control-group before it can execute.

Root Cause

update-managed-service-handoff.ts:324 spawns the helper with { detached: true }, which creates a new process group but does not escape the systemd unit's cgroup. When the gateway exits for restart, systemd kills all processes in the cgroup — including the helper.

Fix

When the supervisor is systemd, spawn the helper via systemd-run --user --scope to place it in a transient scope unit outside the gateway's cgroup. Non-systemd paths (launchd, Windows, unsupervised) are unchanged. KillMode=control-group is intentionally preserved for normal child process cleanup.

If systemd-run fails (ENOENT, no user bus), the handoff throws immediately rather than falling back to a doomed detached spawn.
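
The supervisor-dependent launch described above can be sketched as a pure function that only builds the command line. This is a minimal illustration and not the code from PR #84151: the names `Supervisor`, `HelperSpawnSpec` and `buildHelperSpawnSpec` are hypothetical, and the unit name simply follows the pattern proposed in the issue.

```typescript
// Hypothetical sketch: choose how to launch the update handoff helper.
// Supervisor, HelperSpawnSpec and buildHelperSpawnSpec are illustrative
// names and are not taken from the OpenClaw source.
type Supervisor = "systemd" | "launchd" | "windows" | "none";

interface HelperSpawnSpec {
  command: string;
  args: string[];
}

function buildHelperSpawnSpec(
  supervisor: Supervisor,
  nodePath: string,
  helperPath: string,
  paramsPath: string,
  handoffId: string,
): HelperSpawnSpec {
  if (supervisor === "systemd") {
    // systemd-run --user --scope runs the command in a transient scope unit,
    // which has its own cgroup outside the gateway service's cgroup.
    return {
      command: "systemd-run",
      args: [
        "--user", "--scope",
        "--unit", `openclaw-update-${handoffId}.scope`,
        "--collect",
        nodePath, helperPath, paramsPath,
      ],
    };
  }
  // Every other supervisor keeps the plain spawn of the helper script.
  return { command: nodePath, args: [helperPath, paramsPath] };
}

const exampleSpec = buildHelperSpawnSpec(
  "systemd", "/usr/bin/node", "/tmp/handoff.cjs", "/tmp/handoff.json", "abc123",
);
console.log(exampleSpec.command, exampleSpec.args.join(" "));
```

A caller would presumably pass the result to `child_process.spawn` with `{ detached: true, stdio: "ignore" }` and treat an `error` event on the child (for example ENOENT when `systemd-run` is missing) as a hard failure, in line with the no-fallback behaviour the PR describes.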

Tests

3 new tests: systemd path uses systemd-run, non-systemd uses plain detached spawn, systemd-run failure throws. All existing update tests pass.

Fixes #84068

Changed files

  • src/gateway/server-methods/update-managed-service-handoff.test.ts (modified, +147/-0)
  • src/gateway/server-methods/update-managed-service-handoff.ts (modified, +86/-2)
  • src/gateway/server-methods/update.test.ts (modified, +1/-0)
  • src/gateway/server-methods/update.ts (modified, +1/-0)

Repro

Environment:

  • OpenClaw 2026.5.16-beta.7 (npm-global install)
  • Node 22.22.2, Linux 6.8.0-111-generic
  • Gateway managed by systemctl --user (openclaw-gateway.service)
  • Cloudflare Tunnel in front of Control UI; no special auth involvement

Steps:

  1. The version published under the latest tag is newer than the installed binary (2026.5.18 > 2026.5.16-beta.7).
  2. Control UI shows "Update available: v2026.5.18". Click Update now.
  3. Gateway writes:
    • /tmp/openclaw-update-run-handoff-XXXXXX/handoff.cjs (executable)
    • /tmp/openclaw-update-run-handoff-XXXXXX/handoff.json
    • /tmp/openclaw-update-run-handoff-XXXXXX/sentinel-meta.json
    • ~/.openclaw/gateway-supervisor-restart-handoff.json
    • ~/.openclaw/restart-sentinel.json with reason: "managed-service-handoff-started", before.version == after.version
  4. Systemd restarts the unit (per the supervisor-restart-handoff).
  5. New gateway starts on the same version. UI reads the stale sentinel and shows "Update skipped: managed-service-handoff-started".

Observed after the failed click:

$ ls -la /tmp/openclaw-update-run-handoff-Aka0m3/
-rwx------ 1 user user 6150 May 19 08:59 handoff.cjs
-rw------- 1 user user  878 May 19 08:59 handoff.json
-rw------- 1 user user  170 May 19 08:59 sentinel-meta.json
# handoff.log is MISSING — helper never ran appendLog() even once.

$ openclaw --version
OpenClaw 2026.5.16-beta.7   # unchanged

Manual workaround that succeeds in seconds:

npm install -g openclaw@latest && systemctl --user restart openclaw-gateway

Root cause (my read)

Looking at dist/server-methods-CfPYovlX.js (the update-managed-service-handoff region) and handoff.cjs:

  • The gateway uses child_process.spawn with {detached: true} to launch handoff.cjs, intending the helper to outlive the parent.
  • It then triggers a supervisor restart via gateway-supervisor-restart-handoff.json.
  • Under systemctl --user, the gateway and its spawned children all live in the same unit cgroup (openclaw-gateway.service). When systemd restarts the unit, it sends SIGTERM to the entire cgroup — including the freshly-spawned handoff helper, before the helper has reached its parent-exit-wait loop, much less the npm install step.
  • {detached: true} only moves the child into a new process group; it does not escape the systemd cgroup. To survive a unit restart, the helper has to be launched outside the unit's scope — e.g. via systemd-run --user --scope --unit=... or systemctl --user start openclaw-update@<id>.service, not a plain spawn.
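
One way to confirm the shared cgroup is to compare the contents of /proc/<pid>/cgroup for the gateway and for a detached child. The sketch below parses the cgroup v2 line of that file. The function names and the sample paths are illustrative assumptions, and the actual slice layout may differ from system to system.

```typescript
// Extract the cgroup v2 path from the text of /proc/<pid>/cgroup.
// On the unified hierarchy the relevant line has the form "0::<path>".
function cgroupV2Path(procCgroupText: string): string | null {
  for (const line of procCgroupText.split("\n")) {
    if (line.slice(0, 3) === "0::") {
      return line.slice(3).trim();
    }
  }
  return null;
}

// Return the last path component, which is usually the unit name.
function cgroupLeafUnit(procCgroupText: string): string | null {
  const cgroupPath = cgroupV2Path(procCgroupText);
  if (cgroupPath === null) return null;
  const parts = cgroupPath.split("/").filter((part) => part.length > 0);
  const leaf = parts.pop();
  return leaf === undefined ? null : leaf;
}

// Sample file contents (illustrative paths, not captured from a real host).
const gatewayCgroup =
  "0::/user.slice/user-1000.slice/user@1000.service/app.slice/openclaw-gateway.service\n";
// A child spawned with { detached: true } keeps the parent's cgroup.
const detachedChildCgroup = gatewayCgroup;
const scopedHelperCgroup =
  "0::/user.slice/user-1000.slice/user@1000.service/app.slice/openclaw-update-abc123.scope\n";

console.log(cgroupLeafUnit(gatewayCgroup) === cgroupLeafUnit(detachedChildCgroup)); // true
console.log(cgroupLeafUnit(scopedHelperCgroup)); // openclaw-update-abc123.scope
```

If the helper's leaf unit matches the gateway's service unit, a restart of that unit will take the helper down with it.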

Evidence the helper never starts work:

  • handoff.log is never created (the helper's very first action on any code path is appendLog(...)).
  • handoff.cjs is left behind in /tmp (it deletes its own sensitivePaths on completion or failure).
  • Final sentinel shows before.version === after.version === 2026.5.16-beta.7 with reason: "managed-service-handoff-started" — i.e. the gateway wrote the "in-flight" sentinel and was killed before the helper could overwrite it with either a success payload or a real failure reason like managed-service-handoff-spawn-failed/managed-service-handoff-parent-timeout/managed-service-handoff-failed.

This means every systemd-supervised install hits the same race. Non-systemd installs (PM2, raw node, foreground) probably work because the parent doesn't drag the helper down on exit.

Suggested fix

Make the helper escape the systemd unit before it's launched:

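// sketch for the supervisor=systemd path; handoffId, helperPath and
// paramsPath are assumed to come from the existing handoff setup
import { spawn } from "node:child_process";

spawn("systemd-run", [
  "--user", "--scope",                             // transient scope unit in the user manager
  "--unit", `openclaw-update-${handoffId}.scope`,
  "--collect",                                     // unload the unit afterwards, even if it failed
  process.execPath, helperPath, paramsPath,
], { detached: true, stdio: "ignore" }).unref();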

Plus a sanity check: if handoffId exists in the sentinel but handoff.log was never written within N seconds after gateway start, surface a more diagnostic error like managed-service-handoff-helper-never-ran instead of leaving the in-flight "handoff-started" reason as the final state.
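
The sanity check could look roughly like the sketch below. Only the reason strings come from this issue. The `RestartSentinel` shape, the `resolveSentinelReason` name and the grace period are assumptions made for illustration, and the real sentinel file may carry different fields.

```typescript
// Hypothetical sketch of the proposed sanity check. The sentinel shape and
// the grace period are assumptions; only the reason strings are from the issue.
interface RestartSentinel {
  reason: string;
  handoffId?: string;
}

function resolveSentinelReason(
  sentinel: RestartSentinel,
  handoffLogExists: boolean,
  msSinceGatewayStart: number,
  graceMs: number,
): string {
  const inFlight = sentinel.reason === "managed-service-handoff-started";
  if (!inFlight || sentinel.handoffId === undefined) {
    return sentinel.reason; // a final reason is already recorded, or no handoff exists
  }
  if (handoffLogExists) {
    return sentinel.reason; // the helper started and may still overwrite the sentinel
  }
  if (msSinceGatewayStart < graceMs) {
    return sentinel.reason; // still inside the grace period
  }
  // In-flight sentinel, no log, grace period elapsed: the helper never ran.
  return "managed-service-handoff-helper-never-ran";
}

const staleSentinel: RestartSentinel = {
  reason: "managed-service-handoff-started",
  handoffId: "example-id",
};
console.log(resolveSentinelReason(staleSentinel, false, 30000, 10000));
// prints "managed-service-handoff-helper-never-ran"
```

The caller would supply `handoffLogExists` from a check such as `fs.existsSync` on the handoff directory's handoff.log, which keeps the decision logic itself free of I/O and easy to test.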

Impact

  • All systemctl --user-supervised installs (the recommended Linux pattern in the docs) are stuck on whatever version was first installed; the in-product Update button is a no-op.
  • The error message in the UI tells the user nothing actionable — doctorHint: "Run: openclaw doctor --non-interactive" doesn't detect the race.

Happy to provide more traces if useful. Manual upgrade path is fine for now, so this isn't urgent for me, but it's broken for everyone who finds it.
