openclaw - Upgrade leaves stale user-level systemd unit, dueling services kill each other on Linux

Summary

After upgrading from an older OpenClaw release that installed a user-level systemd unit (~/.config/systemd/user/openclaw-gateway.service) to a newer release that installs a system-level unit (/etc/systemd/system/openclaw-gateway.service), both units remain registered and try to manage the gateway. They bind to the same port and trigger each other's stale-process detection, producing an endless restart cascade.

Environment

  • Platform: WSL2 (Ubuntu 24.04) on Windows 11
  • Old version: 2026.4.25 (installed user-level unit)
  • New version: 2026.5.7 (installed system-level unit, with gemini.conf drop-in and User=azfar)
  • Install method: npm install -g openclaw followed by openclaw doctor / setup at upgrade time

Repro

  1. Install a release that creates ~/.config/systemd/user/openclaw-gateway.service.
  2. Upgrade to a newer release that switches to a system-level unit at /etc/systemd/system/openclaw-gateway.service (needed for User=azfar and env drop-ins like GEMINI_CLI_TRUST_WORKSPACE).
  3. Both units now exist. The user-level unit may sit in failed (start-limit-hit) and look harmless.
  4. Trigger any restart (e.g. systemctl --user restart openclaw-gateway) — this resets the user unit's failure state and starts it.
  5. Both units now run a node process bound to port 18789. Each new instance sees the other as a "stale gateway process" and SIGTERMs it. The killed instance exits cleanly (status 0), Restart=always brings it back, the cycle continues.
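The precondition in steps 1 to 3 can be checked before any restart is triggered. The following is a minimal sketch, not part of OpenClaw: the function name `has_dueling_units` and its two directory parameters are hypothetical, and it only tests for the two unit files at the paths named in this report.

```shell
# Hypothetical helper: succeed (exit 0) only when BOTH unit files exist,
# which is the state reached after step 3 of the repro.
# $1: home directory to inspect (default: $HOME)
# $2: system unit directory (default: /etc/systemd/system)
has_dueling_units() (
    home_dir=${1:-$HOME}
    system_dir=${2:-/etc/systemd/system}
    unit=openclaw-gateway.service
    user_unit="$home_dir/.config/systemd/user/$unit"
    system_unit="$system_dir/$unit"
    [ -f "$user_unit" ] && [ -f "$system_unit" ]
)

# Example use:
#   if has_dueling_units; then
#       echo "both user-level and system-level units are installed" >&2
#   fi
```

The check looks only at files on disk, so it also flags a user unit that is currently sitting in the failed state and looks harmless.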

Symptoms

  • Telegram bot stops responding.
  • journalctl shows this sequence repeating:
    [restart] killing 1 stale gateway process(es) before restart: <pid>
    [gateway] signal SIGTERM received
    [gateway] received SIGTERM; shutting down
    [gateway] shutdown completed cleanly in <ms>ms
  • New PID every ~6–90 seconds.
  • Telegram API hits 429 (setMyCommands/deleteMyCommands rate limited) within a few cycles, which initially looks like the cause but is actually a symptom.
  • systemctl --user list-units shows the legacy user unit; systemctl list-units shows the new system unit. Neither listing shows the other unit, because the per-user and system service managers track their units separately.
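To confirm the restart cascade rather than chase the Telegram 429s, it can help to count how often the stale-process kill appears in the journal. A small sketch, assuming the log line format shown above; the function name `count_stale_kills` is hypothetical.

```shell
# Hypothetical helper: count "stale gateway process" kills in log text on stdin.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function's exit status clean while still printing "0".
count_stale_kills() {
    grep -c 'killing [0-9][0-9]* stale gateway process' || true
}

# Example use, checking both service managers since each keeps its own unit:
#   journalctl -u openclaw-gateway --since "10 minutes ago" --no-pager | count_stale_kills
#   journalctl --user -u openclaw-gateway --since "10 minutes ago" --no-pager | count_stale_kills
```

A count that keeps growing in both journals is consistent with the two units killing each other in turn.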

Workaround

systemctl --user stop openclaw-gateway
systemctl --user disable openclaw-gateway
# optional: remove the legacy unit file, then reload the user manager
# rm ~/.config/systemd/user/openclaw-gateway.service
# systemctl --user daemon-reload
sudo systemctl restart openclaw-gateway

Suggested fix

During setup/upgrade, when promoting from a user-level unit to a system-level unit:

  1. Detect a pre-existing ~/.config/systemd/user/openclaw-gateway.service.
  2. Stop and disable it (systemctl --user stop openclaw-gateway, then systemctl --user disable openclaw-gateway).
  3. Optionally remove the unit file or leave a backup with a clear note.
  4. Surface what happened in the setup output so the user knows why their old unit was retired.
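The four steps above could look roughly like the sketch below. This is an illustration under stated assumptions, not OpenClaw's actual setup code: the function name `retire_legacy_user_unit`, the `SYSTEMCTL` override variable, and the `.retired` backup suffix are all hypothetical.

```shell
# Hypothetical sketch of the proposed upgrade step.
# $1: home directory to inspect (default: $HOME)
# SYSTEMCTL may be overridden (e.g. SYSTEMCTL="echo systemctl") to preview
# the commands; it is left unquoted below so such a value splits into words.
retire_legacy_user_unit() (
    home_dir=${1:-$HOME}
    unit=openclaw-gateway.service
    unit_file="$home_dir/.config/systemd/user/$unit"
    systemctl_cmd=${SYSTEMCTL:-systemctl}

    # Step 1: detect the pre-existing user-level unit.
    if [ ! -f "$unit_file" ]; then
        echo "no legacy user unit at $unit_file; nothing to do"
        return 0
    fi

    # Step 2: stop and disable it in the user manager.
    $systemctl_cmd --user stop "$unit" || echo "warning: could not stop $unit" >&2
    $systemctl_cmd --user disable "$unit" || echo "warning: could not disable $unit" >&2

    # Step 3: keep a backup (assumed suffix) instead of deleting outright.
    mv "$unit_file" "$unit_file.retired" || return 1
    $systemctl_cmd --user daemon-reload || true

    # Step 4: surface what happened.
    echo "retired legacy user unit: $unit_file -> $unit_file.retired"
)
```

Renaming rather than deleting leaves the old unit recoverable, and the final message gives the user a concrete path to look at.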

Bonus: the stale-process detection during startup is what makes this loop catastrophic rather than just noisy. Consider gating "kill stale process on this port" behind a check that the existing process isn't a sibling systemd-managed instance, or fail fast with a clear error instead of silently SIGTERMing it.
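One way to approximate the "sibling systemd-managed instance" check is to look at the cgroup of the process holding the port. The sketch below is only an illustration: both function names are hypothetical, and it assumes the cgroup path of a systemd-managed service ends in the unit name (for example `0::/system.slice/openclaw-gateway.service`), which may not hold on every setup.

```shell
# Hypothetical helper: print the systemd service unit that owns a process,
# judged by the last "*.service" component of its cgroup path.
# $1: PID; $2: proc root (default: /proc), overridable for testing.
owning_service() (
    pid=$1
    proc_root=${2:-/proc}
    sed -n 's|.*/\([^/]*\.service\)$|\1|p' "$proc_root/$pid/cgroup" 2>/dev/null | head -n 1
)

# Hypothetical gate: print "refuse" when the process on the port is itself a
# systemd-managed gateway, and "kill" otherwise (including unknown PIDs).
stale_process_action() (
    owner=$(owning_service "$1" "${2:-/proc}")
    if [ "$owner" = "openclaw-gateway.service" ]; then
        echo refuse
    else
        echo kill
    fi
)
```

On "refuse", the starting instance would exit with a clear error naming the other unit instead of sending SIGTERM, which breaks the loop at its first iteration.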
