openclaw: respawnGatewayProcessForUpdate falsely reports mode=supervised on macOS when XPC_SERVICE_NAME is inherited from a launchd-managed parent

Summary

On macOS, respawnGatewayProcessForUpdate() (and restartGatewayProcessWithFreshPid()) trusts detectRespawnSupervisor() to decide whether launchd will restart the gateway. The detector returns "launchd" if any of LAUNCH_JOB_LABEL, LAUNCH_JOB_NAME, XPC_SERVICE_NAME, or OPENCLAW_LAUNCHD_LABEL is set.

But XPC_SERVICE_NAME is inherited by any child process of a launchd-managed parent. When OpenClaw's GUI app (ai.openclaw.mac) spawns the gateway as a child — or any custom supervisor inherits launchd env from its own parent — the gateway misidentifies itself as launchd-supervised.

Result: gateway writes a gateway-supervisor-restart-handoff.json with supervisorMode: "launchd" and exits cleanly, expecting launchd to restart it. But launchd has no ai.openclaw.gateway service registered (only ai.openclaw.mac for the parent). The gateway never comes back.
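
The inheritance described above is ordinary environment propagation, which a small Node sketch can make visible. The value ai.openclaw.mac is set by hand here to stand in for what launchd would hand to the GUI app, and the helper childSeesEnv is a hypothetical name used only for illustration; it is not part of OpenClaw.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper: spawn a child Node process with the given environment
// and return the value the child sees for one variable.
function childSeesEnv(name, env) {
  const script = `process.stdout.write(process.env[${JSON.stringify(name)}] ?? "")`;
  const result = spawnSync(process.execPath, ["-e", script], { env, encoding: "utf8" });
  return result.stdout;
}

// Simulate a launchd-managed parent. XPC_SERVICE_NAME names the parent's
// service, yet the child receives the same value unchanged.
const parentEnv = { ...process.env, XPC_SERVICE_NAME: "ai.openclaw.mac" };
const seen = childSeesEnv("XPC_SERVICE_NAME", parentEnv);
console.log(seen);
```

The child has no launchd registration of its own, so the presence of the variable says nothing about whether launchd will restart it.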

Environment

  • OpenClaw: confirmed in 2026.5.19 (where I first hit it) and verified still present in 2026.5.20 (currently installed) by reading dist/supervisor-markers-B5EgETF5.js and dist/cli/gateway-lifecycle.runtime.js.
  • Node: 25.x
  • OS: macOS 15 (Darwin 25.4)
  • Trigger: any user running the OpenClaw GUI app whose ai.openclaw.gateway LaunchAgent has been unloaded (e.g. by a prior mode=reload restart script that did launchctl bootout + a failed launchctl bootstrap, or by doctor's legacy-service cleanup). Trigger event: any SIGUSR1 / update.run restart, even a dry-run status=skipped one.

Code-level trace (against 2026.5.20)

dist/supervisor-markers-B5EgETF5.js:

const SUPERVISOR_HINTS = {
  launchd: ["LAUNCH_JOB_LABEL", "LAUNCH_JOB_NAME", "XPC_SERVICE_NAME", "OPENCLAW_LAUNCHD_LABEL"],
  // ...
};
function detectRespawnSupervisor(env = process.env, platform = process.platform) {
  if (platform === "darwin") return hasAnyHint(env, SUPERVISOR_HINTS.launchd) ? "launchd" : null;
  // ...
}

dist/cli/gateway-lifecycle.runtime.js:

function respawnGatewayProcessForUpdate(opts = {}) {
  if (isTruthy(process.env.OPENCLAW_NO_RESPAWN)) return { mode: "disabled", detail: "OPENCLAW_NO_RESPAWN" };
  const supervisor = detectRespawnSupervisor(process.env);
  if (supervisor) {
    if (supervisor === "schtasks") { /* ... */ }
    return { mode: "supervised" };  // ← false positive on darwin
  }
  // fallback: spawnDetachedGatewayProcess(...)
}
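
The false positive can be reproduced in isolation with a minimal sketch of the darwin branch. The body of hasAnyHint is an assumption (the issue does not show it), and the non-darwin branches, elided in the excerpts above, simply return null here.

```javascript
// Hint list copied from the excerpt above (launchd entry only).
const SUPERVISOR_HINTS = {
  launchd: ["LAUNCH_JOB_LABEL", "LAUNCH_JOB_NAME", "XPC_SERVICE_NAME", "OPENCLAW_LAUNCHD_LABEL"],
};

// Assumed implementation: true when any listed variable is set to a non-blank value.
function hasAnyHint(env, keys) {
  return keys.some((key) => Boolean(env[key]?.trim()));
}

// Darwin branch as quoted in the issue; other platforms are not modeled.
function detectRespawnSupervisor(env = process.env, platform = process.platform) {
  if (platform === "darwin") return hasAnyHint(env, SUPERVISOR_HINTS.launchd) ? "launchd" : null;
  return null;
}

// Environment of a gateway spawned by the GUI app: only the inherited
// XPC_SERVICE_NAME is present, and it names the parent's service.
const inheritedEnv = { XPC_SERVICE_NAME: "ai.openclaw.mac" };
const detected = detectRespawnSupervisor(inheritedEnv, "darwin");
console.log(detected);
```

With only the inherited variable set, the detector still answers "launchd", which is what sends respawnGatewayProcessForUpdate() down the mode: "supervised" path.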

Observed sequence

09:32:30  update.run dry-run: current 2026.5.19 → target 2026.5.20, status=skipped
09:32:47  gateway PID 84633 receives SIGUSR1
09:33:17  drain timeout (2 tasks + 1 embedded run still active)
09:33:18  shutdown completed cleanly; "restart mode: update process respawn (supervisor restart)"
          → writes handoff.json with supervisorMode=launchd, sleeps 1500ms, exit(0)
[gap]     launchd does NOT restart anything (no ai.openclaw.gateway service registered)
09:33:21  OpenClaw GUI app fallback-spawns a new gateway child (PPID = GUI app, XPC inherited)
09:33:24  new gateway calls cleanStaleGatewayProcessesSync, kills PID 15647 (leftover on :18789)
09:33:57  another banner — yet another spawn, but hits the same bug, exits
[after]   no further gateway log entries; gateway is gone for ~2.5h until I manually
          `launchctl bootstrap`ed ai.openclaw.gateway.plist

Why this matters

Catastrophic and silent: the user's chat bots, agents, and integrations all go offline with no error visible to the gateway operator. Recovery requires CLI/launchctl knowledge to discover the service is unloaded.

Proposed fix

In src/infra/supervisor-markers.ts, narrow darwin detection to OpenClaw's own explicit marker so inherited generic launchd env vars don't trigger a false positive:

function detectRespawnSupervisor(env = process.env, platform = process.platform) {
  if (platform === "darwin") {
    // Only trust the OpenClaw-specific marker. XPC_SERVICE_NAME and friends
    // are inherited by any child of a launchd-managed process and do not
    // mean *this* process is registered as a launchd service.
    return env.OPENCLAW_LAUNCHD_LABEL?.trim() ? "launchd" : null;
  }
  // ...
}
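
A quick way to sanity-check the narrowed detector is to run it against the environments that matter. This is a sketch, not the project's test suite: the case names are illustrative, and the non-darwin branches (elided above) return null.

```javascript
// Narrowed darwin detection as proposed above; other platforms are not modeled.
function detectRespawnSupervisor(env = process.env, platform = process.platform) {
  if (platform === "darwin") {
    return env.OPENCLAW_LAUNCHD_LABEL?.trim() ? "launchd" : null;
  }
  return null;
}

const cases = [
  { name: "inherited XPC_SERVICE_NAME only", env: { XPC_SERVICE_NAME: "ai.openclaw.mac" } },
  { name: "explicit OpenClaw marker", env: { OPENCLAW_LAUNCHD_LABEL: "ai.openclaw.gateway" } },
  { name: "blank OpenClaw marker", env: { OPENCLAW_LAUNCHD_LABEL: "   " } },
];

// Evaluate every case on darwin and keep the result next to its name.
const results = cases.map((c) => ({ name: c.name, actual: detectRespawnSupervisor(c.env, "darwin") }));
for (const r of results) console.log(`${r.name}: ${r.actual}`);
```

Only the explicit marker case should report "launchd"; the inherited-variable case that caused the outage now falls through to the detached-respawn fallback.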

As a belt-and-suspenders measure, before returning "launchd" the detector could also verify that the service is actually registered, for example with launchctl print "gui/$(id -u)/$LABEL".
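
The registration check could look roughly like the sketch below. It assumes launchctl print exits with a nonzero status when the service target is not found. The function names isLaunchdServiceRegistered and detectVerifiedLaunchd are hypothetical, and the command runner is injectable so the logic can be exercised on machines without launchctl.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper: true when `launchctl print gui/<uid>/<label>` exits 0.
// A spawn failure (e.g. launchctl missing) yields status null, treated as "not registered".
function isLaunchdServiceRegistered(label, uid = process.getuid(), run = spawnSync) {
  const result = run("launchctl", ["print", `gui/${uid}/${label}`], { stdio: "ignore" });
  return result.status === 0;
}

// Hypothetical wrapper: trust the OpenClaw marker only if launchd confirms the service.
function detectVerifiedLaunchd(env = process.env, isRegistered = isLaunchdServiceRegistered) {
  const label = env.OPENCLAW_LAUNCHD_LABEL?.trim();
  if (!label) return null;
  return isRegistered(label) ? "launchd" : null;
}

// Example with a stubbed check standing in for an unloaded LaunchAgent.
const unloaded = detectVerifiedLaunchd({ OPENCLAW_LAUNCHD_LABEL: "ai.openclaw.gateway" }, () => false);
console.log(unloaded);
```

The extra process spawn costs a little on every restart decision, but it covers the case where the marker is set and the LaunchAgent has since been booted out.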

Operators who run the gateway under launchd should ensure that ai.openclaw.gateway.plist sets OPENCLAW_LAUNCHD_LABEL=ai.openclaw.gateway in its EnvironmentVariables. It would also be worth adding this to the bundled plist generator, so the marker is set by default.
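
For the generator side, a minimal sketch of emitting the marker is shown below. The bundled plist generator's real structure is not shown in this issue, so renderEnvironmentVariables and escapeXml are hypothetical helpers; only the EnvironmentVariables key and the variable name come from the text above.

```javascript
// Escape the characters that are special inside XML text nodes.
function escapeXml(value) {
  return String(value).replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical helper: render a launchd EnvironmentVariables dictionary as a plist XML fragment.
function renderEnvironmentVariables(vars) {
  const entries = Object.entries(vars)
    .map(([key, value]) => `    <key>${escapeXml(key)}</key>\n    <string>${escapeXml(value)}</string>`)
    .join("\n");
  return `  <key>EnvironmentVariables</key>\n  <dict>\n${entries}\n  </dict>`;
}

// Fragment to place inside the top-level <dict> of ai.openclaw.gateway.plist.
const fragment = renderEnvironmentVariables({ OPENCLAW_LAUNCHD_LABEL: "ai.openclaw.gateway" });
console.log(fragment);
```

Setting the marker in the plist means only a gateway that launchd itself started carries it, unless a parent process deliberately passes it on.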

Related (not duplicates)

  • #52313 (open) — covers the unmanaged / spawned path: pnpm versioned realpaths becoming unstable after self-update. This issue is the opposite direction: the supervised path falsely chosen.
  • #65668 (closed) — covers detectRespawnSupervisor returning null under a custom supervisor, leading to orphan + EADDRINUSE. This issue is detectRespawnSupervisor returning "launchd" when it shouldn't.

Workaround

Set OPENCLAW_NO_RESPAWN=1 to force in-process restart (loses the fresh-module-graph benefit on real update.run upgrades, but survives spurious dry-run restarts).

Or: make sure ai.openclaw.gateway is bootstrapped into launchd before relying on update-triggered restarts.
