openclaw - 💡(How to fix) Fix Mid-flight pnpm upgrade leaves gateway dead with no launchd respawn (macOS); plus plaintext secrets in plist [3 comments, 2 participants]

GitHub stats
openclaw/openclaw#72996 · Fetched 2026-04-28 06:28:57
Comments: 3 · Participants: 2 · Timeline events: 9 · Reactions: 0
Timeline (top): labeled ×4, commented ×3, closed ×1, cross-referenced ×1

When the npm-global package is replaced in place by pnpm while a 2026.4.24 gateway is running, the running process eventually SIGTERMs itself due to a config validator that lacks the recovery branch its sibling has — and then the LaunchAgent never comes back, because the install path that rewrites the plist does not run launchctl bootstrap. Result: 3-hour silent outage of every channel (Discord, Telegram, Slack…) until manual openclaw gateway install --force.

Error Message

Invalid config at /Users/Gandalf/.openclaw/openclaw.json:

  • plugins.entries.feishu: plugin feishu: plugin requires OpenClaw >=2026.4.25, but this host is 2026.4.24; skipping load
  • plugins.entries.whatsapp: plugin whatsapp: plugin requires OpenClaw >=2026.4.25, but this host is 2026.4.24; skipping load

[gateway] shutdown error: Error: Invalid config at /Users/Gandalf/.openclaw/openclaw.json: …

Mid-flight pnpm upgrade leaves gateway dead with no launchd respawn (macOS)

Versions: running 2026.4.24 → upgrade target 2026.4.25 (latest)
Platform: macOS 26.4.1 arm64, node 24.13.0, pnpm
Service mode: LaunchAgent (ai.openclaw.gateway, KeepAlive=true)

Summary

When the npm-global package is replaced in place by pnpm while a 2026.4.24 gateway is running, the running process eventually SIGTERMs itself due to a config validator that lacks the recovery branch its sibling has — and then the LaunchAgent never comes back, because the install path that rewrites the plist does not run launchctl bootstrap. Result: 3-hour silent outage of every channel (Discord, Telegram, Slack…) until manual openclaw gateway install --force.

Reproduction (observed timeline)

  1. 08:47:39 — homebase profile gateway crashes with ERR_MODULE_NOT_FOUND while pnpm is rewriting ~/.npm-global/lib/node_modules/openclaw/dist/ in place. Sidecar tries to import a hashed bundle that the upgrade has already deleted: Cannot find module '.../dist/internal-hooks-CVZWEVD3.js' imported from .../server.impl-CtLS1ywt.js.

  2. 08:47:42 — pnpm finishes; on-disk binary is now 2026.4.25, both running gateway processes are still 2026.4.24 in memory.

  3. 08:47:45 — main gateway hits a config reload. New 2026.4.25 release added plugins.entries.feishu and plugins.entries.whatsapp to openclaw.json; both declare requires OpenClaw >=2026.4.25. The reload path gracefully skips: [reload] config reload recovery skipped after invalid-config: invalidity is scoped to plugin entries.

  4. 08:49:13.307 — gateway receives SIGTERM (presumably from the upgrade finalizer).

  5. 08:49:13.471 — shutdown path strictly re-validates config and re-throws on the same plugin version constraint:

    Invalid config at /Users/Gandalf/.openclaw/openclaw.json:
    - plugins.entries.feishu: plugin feishu: plugin requires OpenClaw >=2026.4.25, but this host is 2026.4.24; skipping load
    - plugins.entries.whatsapp: plugin whatsapp: plugin requires OpenClaw >=2026.4.25, but this host is 2026.4.24; skipping load
    [gateway] shutdown error: Error: Invalid config at /Users/Gandalf/.openclaw/openclaw.json: …

  6. 08:49:35 — ~/Library/LaunchAgents/ai.openclaw.gateway.plist is rewritten with Comment: OpenClaw Gateway (v2026.4.25). launchctl bootstrap is never invoked, so the unit silently disappears from the GUI domain.

  7. 08:49 → 11:49 — KeepAlive cannot restart a unit launchd no longer knows about. Three-hour outage. launchctl print gui/$UID/ai.openclaw.gateway returns 503: Could not find service.

  8. 11:49 — manual openclaw gateway install --force re-bootstraps the unit and the gateway returns. Channels reconnect (with the staggered Discord delays).
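The condition in step 7 can be checked mechanically. The following is a minimal sketch, assuming that launchctl print exits with a non-zero status when the service target is unknown (consistent with the "Could not find service" failure above). The helper names are hypothetical; only the launchctl print gui/<uid>/<label> command comes from the report.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the process exit status.
Runner = Callable[[Sequence[str]], int]


def service_target(uid: int, label: str) -> str:
    """Build the launchd service target for a per-user (GUI domain) agent."""
    return f"gui/{uid}/{label}"


def run_quiet(cmd: Sequence[str]) -> int:
    """Run a command, discard its output, and return its exit status."""
    return subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    ).returncode


def is_loaded(uid: int, label: str, run: Runner = run_quiet) -> bool:
    """Return True if launchd still knows the unit.

    Assumption: `launchctl print` exits non-zero for an unknown service.
    """
    return run(["launchctl", "print", service_target(uid, label)]) == 0
```

On the affected host, is_loaded(os.getuid(), "ai.openclaw.gateway") would be expected to return False for the whole 08:49 → 11:49 window. The runner is injectable so the check can be exercised without launchd.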

Root cause(s)

  1. The reload path has a recovery branch for invalidity that is "scoped to plugin entries"; the shutdown path does not. The same version-floor error is non-fatal in one path and fatal in the other. The shutdown path should at minimum match reload's tolerance, since by definition a 2026.4.24 process can never satisfy a >=2026.4.25 floor.
  2. The install/upgrade orchestration rewrites the LaunchAgent plist but never runs launchctl bootstrap gui/$UID … (and there's no bootout of the previous state either). KeepAlive only protects against process crashes, not against the unit being unloaded.
  3. In-place node_modules replacement during the running process's lifetime lets sidecars see partial trees mid-upgrade, producing ERR_MODULE_NOT_FOUND on hashed bundles. The upgrade should stop the gateway before mutating node_modules or openclaw.json, not concurrently.
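Root cause 1 amounts to two callers applying different policies to the same validation result. Below is a minimal sketch of a single shared policy, assuming each validation issue carries a dotted config path such as plugins.entries.feishu (as the error output suggests). ConfigIssue, is_scoped_to_plugin_entries and should_abort are hypothetical names for illustration, not OpenClaw's actual code.

```python
from dataclasses import dataclass
from typing import Iterable

PLUGIN_ENTRY_PREFIX = "plugins.entries."


@dataclass(frozen=True)
class ConfigIssue:
    path: str     # dotted config path, e.g. "plugins.entries.feishu"
    message: str  # human-readable reason


def is_scoped_to_plugin_entries(issues: Iterable[ConfigIssue]) -> bool:
    """True when there is at least one issue and all of them sit under plugins.entries.*"""
    found = list(issues)
    return bool(found) and all(i.path.startswith(PLUGIN_ENTRY_PREFIX) for i in found)


def should_abort(issues: Iterable[ConfigIssue]) -> bool:
    """One policy intended for both the reload and the shutdown path.

    Abort only when some issue falls outside the plugin entries; issues
    confined to plugin entries mean "skip those plugins", not "fail".
    """
    found = list(issues)
    return bool(found) and not is_scoped_to_plugin_entries(found)
```

If both paths consulted the same predicate, the version-floor errors from the timeline would be skipped at shutdown exactly as they were at reload.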

Suggested fixes

  • Add the same [reload] recovery branch to the shutdown-time config validator (or downgrade plugin-entry version-floor errors to warnings everywhere).
  • Make the upgrade hook explicitly run launchctl bootout gui/$UID/ai.openclaw.<label>; launchctl bootstrap gui/$UID <plist> (and equivalents on systemd / Windows Task Scheduler). Treat plist replacement and bootstrap as one atomic step.
  • Sequence the upgrade: bootout → replace node_modules → write new config → bootstrap. Don't replace node_modules while the old process is alive.
  • For belt-and-braces: a watchdog (separate plist) that periodically checks launchctl print and re-bootstraps if missing.
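The proposed ordering (bootout → replace node_modules → write new config → bootstrap) can be sketched as follows. This is an illustration under stated assumptions, not the project's upgrade hook: sequenced_upgrade and the two step callbacks are hypothetical, and only the launchctl bootout and launchctl bootstrap invocations come from the suggestions above.

```python
import subprocess
from typing import Callable, List, Sequence

Step = Callable[[], None]
Runner = Callable[[Sequence[str]], int]


def bootout_cmd(uid: int, label: str) -> List[str]:
    """argv that unloads the agent from the user's GUI domain."""
    return ["launchctl", "bootout", f"gui/{uid}/{label}"]


def bootstrap_cmd(uid: int, plist_path: str) -> List[str]:
    """argv that loads the agent described by plist_path into the GUI domain."""
    return ["launchctl", "bootstrap", f"gui/{uid}", plist_path]


def sequenced_upgrade(
    uid: int,
    label: str,
    plist_path: str,
    replace_package: Step,
    write_config_and_plist: Step,
    run: Runner = subprocess.call,
) -> None:
    # 1. Stop the old process first so it never sees a half-replaced tree.
    #    The exit status is ignored because the unit may already be unloaded.
    run(bootout_cmd(uid, label))
    try:
        # 2. Mutate files only while nothing is running from them.
        replace_package()
        write_config_and_plist()
    finally:
        # 3. Always re-register the unit, even if a step above raised,
        #    so a failed upgrade does not leave launchd without the service.
        run(bootstrap_cmd(uid, plist_path))
```

Putting the bootstrap in a finally block is a design choice: it trades the risk of starting a partially upgraded gateway for the guarantee that the unit stays registered and KeepAlive can act on it.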

Plaintext secrets embedded in LaunchAgent plist (mode 644)

Filing as a separate concern; happy to split into its own issue.

openclaw gateway install --force on macOS writes 30 env vars directly into ~/Library/LaunchAgents/ai.openclaw.gateway.plist, including: 6 Discord bot tokens, 6 LLM/search API keys (XAI/Brave/MiniMax/Gemini/OpenRouter/GoogleMaps), 5 Xero financial keys, Etsy keys (×2), Notion, Resend, the gateway token, and GOG_KEYRING_PASSWORD. The installer enumerates them via OPENCLAW_SERVICE_MANAGED_ENV_KEYS, so this is intentional, but the plist ships at mode 644, which makes it readable by every other user on a multi-user host and by any process running under a different uid. The openclaw doctor warning specifically calls out OPENCLAW_GATEWAY_TOKEN but quietly tolerates everything else.

Concretely surprising: a keyring password (GOG_KEYRING_PASSWORD) ends up in plaintext on disk, which defeats the purpose of having a keyring.
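Both conditions described here (a permissive file mode and secret-shaped variables in the plist) can be audited with the standard library. A minimal sketch follows, assuming the variables live under the standard launchd EnvironmentVariables key. The pattern list is illustrative: *_KEYRING_PASSWORD and *_MASTER_KEY are named in this report, while *_TOKEN and *_API_KEY are assumptions about what counts as secret-shaped.

```python
import os
import plistlib
import stat
from fnmatch import fnmatchcase
from typing import List

# Illustrative patterns only; extend or narrow for a real deployment.
SECRET_PATTERNS = ("*_KEYRING_PASSWORD", "*_MASTER_KEY", "*_TOKEN", "*_API_KEY")


def audit_plist(path: str) -> List[str]:
    """Return human-readable findings for a LaunchAgent plist."""
    findings = []

    # Any group/other permission bit means someone besides the owner has access.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        findings.append(f"mode {mode:03o} grants access to group or others")

    with open(path, "rb") as fh:
        env = plistlib.load(fh).get("EnvironmentVariables", {})

    # Report variable names only, never their values.
    for name in sorted(env):
        if any(fnmatchcase(name, pattern) for pattern in SECRET_PATTERNS):
            findings.append(f"secret-shaped variable embedded: {name}")
    return findings
```

Run against the plist described above, a check like this would be expected to flag the 644 mode and names such as GOG_KEYRING_PASSWORD and OPENCLAW_GATEWAY_TOKEN.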

Suggestions:

  • Default plist mode to 600 on macOS (and equivalent ACL on systemd / Windows).
  • Provide a SecretRef-aware launchd writer: write secret values to ~/.openclaw/service-env (mode 600), have the plist call a launcher shim that sources it. Alternatively, use the macOS user keychain via security add-generic-password and a tiny wrapper that looks them up at start.
  • openclaw gateway install --force should warn loudly when it finds keyring-shaped names (*_KEYRING_PASSWORD, *_MASTER_KEY, etc.) about to be embedded.
  • The doctor message "Gateway service embeds OPENCLAW_GATEWAY_TOKEN and should be reinstalled" implies the fix is install --force, but install --force does not strip the embedding; it just rewrites the same plist. Either the message should be updated or the install path should actually strip.
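The service-env suggestion can be illustrated with a short writer. This is a sketch under assumptions: write_service_env is a hypothetical helper, the file is assumed to be sourced by a POSIX shell shim (hence the export lines and shell quoting), and variable names are assumed to be valid shell identifiers.

```python
import os
import shlex
from typing import Mapping


def write_service_env(path: str, env: Mapping[str, str]) -> None:
    """Write `export KEY=value` lines to path with owner-only permissions."""
    body = "".join(
        f"export {key}={shlex.quote(value)}\n" for key, value in sorted(env.items())
    )
    # The mode passed to os.open only applies when the file is created,
    # so the permissions are also tightened explicitly for an existing file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        os.fchmod(fh.fileno(), 0o600)
        fh.write(body)
```

With the secrets in a mode-600 file, the plist itself would only need to reference the launcher shim, so its own permissions would matter much less.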

extent analysis

TL;DR

To prevent the gateway from dying after a pnpm upgrade, run launchctl bootstrap after rewriting the LaunchAgent plist and ensure the shutdown path has the same recovery branch as the reload path.

Guidance

  • Add a recovery branch to the shutdown-time config validator to match the reload path's tolerance for version-floor errors.
  • Modify the upgrade hook to run launchctl bootout and launchctl bootstrap after rewriting the LaunchAgent plist to ensure the unit is reloaded.
  • Sequence the upgrade to stop the gateway before replacing node_modules, writing a new config, and then bootstrapping to prevent ERR_MODULE_NOT_FOUND errors.
  • Consider adding a watchdog to periodically check the unit's status and re-bootstrap if missing.

Example

No code snippet is provided as the issue is more related to the upgrade and launch process rather than a specific code fix.

Notes

The provided suggestions focus on the primary issue of the gateway dying after a pnpm upgrade. The secondary concern about plaintext secrets in the LaunchAgent plist is acknowledged but not addressed in the guidance, as it requires a separate solution.

Recommendation

Apply the suggested fixes to the upgrade and launch process to prevent the gateway from dying after a pnpm upgrade. This includes adding a recovery branch to the shutdown-time config validator, modifying the upgrade hook, and sequencing the upgrade to prevent errors.
