openclaw - ✅ (Solved) Fix [Bug]: macOS LaunchAgent can be removed and left not loaded after failed `openclaw gateway start` [1 pull request, 1 comment, 2 participants]

GitHub stats: openclaw/openclaw#52208, fetched 2026-04-08 01:14:16
Comments: 1 · Participants: 2 · Timeline events: 6 · Reactions: 0
Timeline (top): cross-referenced ×2, labeled ×2, commented ×1, referenced ×1

I hit a failure mode where the OpenClaw Gateway stopped and did not automatically recover.

From logs and source inspection, this looks like a bug in the macOS LaunchAgent management / recovery path rather than an OS-level issue.

The key symptom is:

  • OpenClaw attempted a gateway service start/restart flow
  • launchctl kickstart -k gui/501/ai.openclaw.gateway failed
  • launchd then marked the service inactive and removed it
  • the LaunchAgent remained not loaded
  • the gateway did not auto-recover until manually re-enabled later

This leaves the service in a broken unmanaged state.

Error Message

At around 2026-03-22 13:29 +08:00, the gateway exited and did not come back automatically.

Root Cause

Impact

This is a serious reliability issue because it can turn a recoverable restart/start error into a prolonged outage, with the service left unmanaged by launchd.

Fix Action

Fixed

PR fix notes

PR #52245: fix(launchd): re-bootstrap LaunchAgent when kickstart failure leaves service unloaded

Description (problem / solution / changelog)

Summary

Fixes #52208

When launchctl kickstart -k fails during openclaw gateway start on macOS, launchd can mark the service as inactive and remove it. Previously, the error was thrown without checking whether the service was still loaded, leaving the LaunchAgent in an unmanaged state with no automatic recovery — the gateway stayed down until manual intervention.

Changes

src/daemon/launchd.ts

  • Added ensureLaunchAgentLoadedAfterFailure() helper that probes the service with launchctl print after a kickstart failure, and attempts to re-bootstrap if the service was removed/unloaded
  • Applied this guard in both error paths of restartLaunchAgent():
    1. When kickstart fails with a non-"not loaded" error (the primary bug path)
    2. When the retry kickstart after bootstrap also fails
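As a rough illustration of the guard described above, the following is a minimal sketch and not the code merged in the PR. The function name, the parameter names (`domain`, `label`, `plistPath`) and the return values are hypothetical. The `{ code, stdout, stderr }` result shape is assumed from the snippets quoted in the issue, and the launchctl runner is injected so the logic can be exercised without touching launchd.

```typescript
// Result shape assumed from the issue's snippets (code, stdout, stderr).
type LaunchctlResult = { code: number; stdout: string; stderr: string };
type ExecLaunchctl = (args: string[]) => Promise<LaunchctlResult>;

// Hypothetical sketch of a post-failure guard; not the PR's implementation.
async function ensureLoadedAfterFailure(
  exec: ExecLaunchctl,
  domain: string, // e.g. "gui/501"
  label: string, // e.g. "ai.openclaw.gateway"
  plistPath: string, // path to the LaunchAgent plist
): Promise<"still-loaded" | "re-bootstrapped" | "bootstrap-failed"> {
  const target = `${domain}/${label}`;
  // `launchctl print` exits non-zero when the service is not in the domain.
  const probe = await exec(["print", target]);
  if (probe.code === 0) {
    return "still-loaded";
  }
  // Best effort: enable the service, then load the plist into the domain again.
  // `bootstrap` takes the domain target and the plist path, not the service target.
  await exec(["enable", target]);
  const boot = await exec(["bootstrap", domain, plistPath]);
  return boot.code === 0 ? "re-bootstrapped" : "bootstrap-failed";
}
```

A caller would run this after a failed kickstart and then still throw the original kickstart error, which matches the behavior the PR describes.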

src/daemon/launchd.test.ts

  • Added test: "re-bootstraps when kickstart failure leaves the service unloaded (#52208)" — verifies that enable + bootstrap are called when launchctl print shows the service was removed after a failed kickstart
  • Added test: "skips re-bootstrap when kickstart fails but service is still loaded (#52208)" — verifies no unnecessary re-bootstrap when the service remains loaded despite the kickstart failure
  • Added printNotLoadedRemaining mock state to simulate launchd removing the service
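To make the mock state more concrete, here is one way such a fake could look. This is a hypothetical sketch: only the name `printNotLoadedRemaining` is borrowed from the PR notes, and the real test file may structure its mock differently. The exit code 113 and the stderr text are illustrative placeholders for "service not found".

```typescript
// Result shape assumed from the issue's snippets (code, stdout, stderr).
type LaunchctlResult = { code: number; stdout: string; stderr: string };

// Hypothetical fake launchctl: kickstart always fails, and the next
// `printNotLoadedRemaining` calls to `print` report the service as missing.
function makeFakeLaunchctl(printNotLoadedRemaining: number) {
  const calls: string[][] = [];
  let remaining = printNotLoadedRemaining;
  const exec = async (args: string[]): Promise<LaunchctlResult> => {
    calls.push(args);
    if (args[0] === "kickstart") {
      return { code: 1, stdout: "", stderr: "kickstart failed" };
    }
    if (args[0] === "print" && remaining > 0) {
      remaining -= 1;
      return { code: 113, stdout: "", stderr: "Could not find service" };
    }
    // Everything else (enable, bootstrap, later prints) succeeds.
    return { code: 0, stdout: "", stderr: "" };
  };
  return { exec, calls };
}
```

With a counter of 1 the fake reproduces the "service removed after failed kickstart" scenario, and with 0 it reproduces the "service still loaded" scenario.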

Behavior

Scenario | Before | After
kickstart fails, service removed by launchd | Service left unloaded, gateway down until manual fix | Re-bootstraps service, then throws error (service stays managed)
kickstart fails, service still loaded | Throws error (correct) | Throws error, no unnecessary re-bootstrap (correct)
Re-bootstrap also fails | N/A | Best-effort: still throws original kickstart error

All 27 tests pass.

Changed files

  • src/daemon/launchd.test.ts (modified, +46/-0)
  • src/daemon/launchd.ts (modified, +38/-0)


Bug type

Regression (worked before, now fails)

Summary

Environment

  • OpenClaw version: 2026.3.13 (61d171a)
  • OS: macOS
  • Service manager: launchd / LaunchAgent
  • Service plist: ~/Library/LaunchAgents/ai.openclaw.gateway.plist

Summary

I hit a failure mode where the OpenClaw Gateway stopped and did not automatically recover.

From logs and source inspection, this looks like a bug in the macOS LaunchAgent management / recovery path rather than an OS-level issue.

The key symptom is:

  • OpenClaw attempted a gateway service start/restart flow
  • launchctl kickstart -k gui/501/ai.openclaw.gateway failed
  • launchd then marked the service inactive and removed it
  • the LaunchAgent remained not loaded
  • the gateway did not auto-recover until manually re-enabled later

This leaves the service in a broken unmanaged state.

Observed behavior

At around 2026-03-22 13:29 +08:00, the gateway exited and did not come back automatically.

Gateway/app log evidence

From /tmp/openclaw/openclaw-2026-03-22.log:

2026-03-22T13:29:31.279+08:00 Gateway start failed: Error: launchctl kickstart failed: Command failed: launchctl kickstart -k gui/501/ai.openclaw.gateway
2026-03-22T13:29:33.501+08:00 [gateway] shutdown timed out; exiting without full cleanup

launchd / unified log evidence

From macOS unified logs around the same time:

2026-03-22 13:29:28.487 launchctl print
2026-03-22 13:29:28.490 launchctl bootout
2026-03-22 13:29:28.493 launchctl print
2026-03-22 13:29:31.274 launchctl kickstart
2026-03-22 13:29:33.535 launchd: service inactive: ai.openclaw.gateway
2026-03-22 13:29:33.535 launchd: removing service: ai.openclaw.gateway
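The timing between these events may matter for the race/ordering question, so a small sketch for computing the gaps is given here. It only parses three lines copied from the excerpt above. The helper name `parseEvent` is illustrative, and the timestamps are treated as UTC purely because only the differences are of interest.

```typescript
// Lines copied from the unified log excerpt above.
const events = [
  "2026-03-22 13:29:28.490 launchctl bootout",
  "2026-03-22 13:29:31.274 launchctl kickstart",
  "2026-03-22 13:29:33.535 launchd: removing service: ai.openclaw.gateway",
];

// Split "YYYY-MM-DD HH:MM:SS.mmm message" into epoch milliseconds and message.
function parseEvent(line: string): { ms: number; message: string } {
  const stamp = line.slice(0, 23);
  const message = line.slice(24);
  // Treat the stamp as UTC; only the differences are used below.
  return { ms: Date.parse(stamp.replace(" ", "T") + "Z"), message };
}

const [bootout, kickstart, removal] = events.map(parseEvent);
const bootoutToKickstartMs = kickstart.ms - bootout.ms; // 2784
const kickstartToRemovalMs = removal.ms - kickstart.ms; // 2261
```

So roughly 2.8 seconds passed between bootout and kickstart, and launchd removed the service about 2.3 seconds after the kickstart call. This does not prove a race on its own, but it is consistent with kickstart being issued against a service that had already been booted out.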

Later state

At around 16:57 +08:00, OpenClaw reported:

Service: LaunchAgent (not loaded)
Service not installed. Run: openclaw gateway install

So after the failed management sequence, the LaunchAgent was left not loaded and the gateway stayed down for hours until manually re-enabled.

Why this looks like an OpenClaw bug

This does not look like a generic macOS failure.

The evidence suggests OpenClaw entered a service management path on macOS, hit a launchctl kickstart failure, and then failed to recover safely.

A robust implementation should not leave the service in a removed/unloaded state after a failed restart/start attempt.

Even if launchctl returns an error, OpenClaw should either:

  1. successfully restore the LaunchAgent with bootstrap, or
  2. roll back to a still-managed state, or
  3. fail loudly but avoid leaving the service removed and unmanaged

Right now it seems possible for the sequence to end in:

  • bootout
  • failed kickstart
  • service removed
  • no successful re-bootstrap
  • no auto-recovery

Source inspection

From the installed daemon-cli.js, the failure text matches the macOS LaunchAgent restart path.

runServiceStart(...) emits:

fail(`${params.serviceNoun} start failed: ${String(err)}`, hints);

And macOS restartLaunchAgent(...) does:

const start = await execLaunchctl(["kickstart", "-k", serviceTarget]);
...
if (!isLaunchctlNotLoaded(start)) {
  throw new Error(`launchctl kickstart failed: ${start.stderr || start.stdout}`.trim());
}
...
const retry = await execLaunchctl(["kickstart", "-k", serviceTarget]);
if (retry.code !== 0) {
  throw new Error(`launchctl kickstart failed: ${retry.stderr || retry.stdout}`.trim());
}

This is consistent with the runtime error:

Gateway start failed: Error: launchctl kickstart failed ...

So the observed behavior appears to map directly to OpenClaw’s LaunchAgent management code path.

Expected behavior

If openclaw gateway start / restart logic hits a launchctl failure, OpenClaw should not leave the gateway service removed/unloaded.

Expected outcomes:

  • the LaunchAgent remains loaded and recoverable, or
  • OpenClaw explicitly re-bootstraps it successfully, or
  • the command fails with actionable output but does not orphan the service

Actual behavior

A failed start/restart path appears able to leave the service in this bad state:

  • LaunchAgent removed from launchd
  • service not loaded
  • gateway down
  • no automatic recovery

Suggested areas to inspect

  1. macOS launchd restart/start logic

    • especially bootout / kickstart / bootstrap sequencing
  2. Failure handling after kickstart error

    • avoid leaving the service removed/unmanaged
  3. Race/ordering issues

    • especially around stop/restart when the current gateway process is still shutting down
  4. Fallback logic

    • if kickstart -k fails, verify whether a bootstrap actually occurred and whether the job is loaded afterward
  5. State validation after service operations

    • explicitly confirm the LaunchAgent is loaded after restart/start; if not, repair immediately
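The last point (confirm the LaunchAgent is loaded, repair if not) could be sketched roughly as follows. This is only an illustration under stated assumptions: `isLoaded` and `repair` are hypothetical callbacks that a caller would wire to something like `launchctl print` and `launchctl bootstrap`, and the retry count and delay are arbitrary.

```typescript
type Probe = () => Promise<boolean>;
type Repair = () => Promise<void>;

// Check the service once; if it is not loaded, repair it a single time and
// then poll a few times to confirm the repair took effect.
async function confirmLoadedOrRepair(
  isLoaded: Probe,
  repair: Repair,
  attempts = 3,
  delayMs = 200,
): Promise<boolean> {
  if (await isLoaded()) {
    return true;
  }
  await repair();
  for (let i = 0; i < attempts; i++) {
    if (await isLoaded()) {
      return true;
    }
    // Give launchd a moment before probing again.
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}
```

Returning a boolean rather than throwing leaves it to the caller to decide how loudly to fail, which keeps the original start/restart error as the primary message.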

Repro hints

I do not yet have a minimal repro, but the failure sequence observed was:

  1. gateway service management path triggered on macOS
  2. launchctl bootout
  3. launchctl kickstart -k gui/501/ai.openclaw.gateway
  4. kickstart failure
  5. service becomes inactive and removed
  6. gateway stays down until manual intervention

Impact

This is a serious reliability issue because it can turn a recoverable restart/start error into a prolonged outage, with the service left unmanaged by launchd.

Steps to reproduce

Reproduction steps for OpenClaw macOS LaunchAgent failure

Attempted reproduction steps

I do not yet have a fully minimal deterministic repro, but the observed sequence can be approximated as follows on macOS:

  1. Install and run OpenClaw as a LaunchAgent:

    openclaw gateway install
    openclaw gateway start
  2. Ensure the gateway is running under launchd:

    openclaw gateway status
    launchctl print gui/$UID/ai.openclaw.gateway
  3. Trigger a gateway service management action while the gateway is active. Based on the observed behavior, this appears to involve the same code path used by:

    openclaw gateway start

    on an already-managed gateway service.

  4. During the failure window, the following launchctl sequence was observed:

    • launchctl print
    • launchctl bootout
    • launchctl print
    • launchctl kickstart -k gui/$UID/ai.openclaw.gateway
  5. If the kickstart step fails, OpenClaw reports:

    Gateway start failed: Error: launchctl kickstart failed: Command failed: launchctl kickstart -k gui/501/ai.openclaw.gateway
  6. After that, launchd may remove the service:

    service inactive: ai.openclaw.gateway
    removing service: ai.openclaw.gateway
  7. Verify the broken state:

    openclaw gateway status
    launchctl print gui/$UID/ai.openclaw.gateway

    Observed result:

    • LaunchAgent becomes not loaded
    • gateway is down
    • no automatic recovery occurs
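For scripting the status checks in steps 2 and 7, a sketch along these lines may help. It builds the `gui/<uid>/<label>` target and interprets the result of `launchctl print`. The function names are illustrative, the result shape is assumed from the snippets quoted in the issue, and the `state = ...` line is parsed only on a best-effort basis, since launchctl's print output is intended for humans and its format is not guaranteed to be stable.

```typescript
// Result shape assumed from the issue's snippets (code, stdout, stderr).
type LaunchctlResult = { code: number; stdout: string; stderr: string };

// Build the service target used in the steps above, e.g. gui/501/ai.openclaw.gateway.
function serviceTarget(uid: number, label: string): string {
  return `gui/${uid}/${label}`;
}

// Interpret a `launchctl print <service-target>` result.
// A non-zero exit code is treated as "not loaded".
function describePrint(result: LaunchctlResult): { loaded: boolean; state?: string } {
  if (result.code !== 0) {
    return { loaded: false };
  }
  // Best effort: pick up the first "state = ..." line if one is present.
  const match = /^\s*state = (.+)$/m.exec(result.stdout);
  return match ? { loaded: true, state: match[1].trim() } : { loaded: true };
}
```

In the broken state described above, the print call fails and `describePrint` reports `loaded: false`, which is the condition a recovery path would need to act on.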

Notes

  • I have not yet isolated the exact minimal trigger that makes kickstart fail.
  • However, the above sequence matches the real incident logs.
  • The bug is therefore not "gateway crashed and launchd forgot to restart it", but rather:
    • a service management flow was entered,
    • kickstart failed,
    • and the service was left removed/unmanaged.

Expected behavior

The service was expected to start normally

Actual behavior

But it did not actually start

OpenClaw version

2026.3.13 (61d171a)

Operating system

macOS 26.3.1 (25D2128)

Install method

No response

Model

gihub-copilot/gpt-5.4

Provider / routing chain

openclaw-gihub-copilot

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

No response

extent analysis

Fix Plan

To address the issue of OpenClaw Gateway not automatically recovering after a failed start/restart attempt on macOS, we need to modify the service management logic. The goal is to ensure that even if launchctl kickstart fails, the service remains in a managed state and can recover.

Step 1: Modify Failure Handling

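Update the restartLaunchAgent function to handle launchctl kickstart failures more robustly. If kickstart fails, probe the service and, if launchd has removed it, bootstrap it again so it remains loaded. Two details matter here: launchctl bootstrap takes the domain target (for example gui/501) and the plist path rather than the service target, and execLaunchctl, judging by the snippet quoted in the issue, returns a result with a code instead of throwing. The parameter names below (domain, label, plistPath) are illustrative.

const restartLaunchAgent = async (domain, label, plistPath) => {
  const serviceTarget = `${domain}/${label}`;
  // Existing kickstart logic
  const start = await execLaunchctl(["kickstart", "-k", serviceTarget]);
  if (start.code === 0) {
    return;
  }
  // kickstart failed: launchd may have removed the service, so probe it.
  const probe = await execLaunchctl(["print", serviceTarget]);
  if (probe.code !== 0) {
    // Service is gone: load the plist into the domain again (best effort).
    const boot = await execLaunchctl(["bootstrap", domain, plistPath]);
    if (boot.code !== 0) {
      console.error(`Failed to bootstrap ${serviceTarget} after kickstart failure: ${boot.stderr || boot.stdout}`);
    }
  }
  // Surface the original kickstart failure either way.
  throw new Error(`launchctl kickstart failed: ${start.stderr || start.stdout}`.trim());
};

Step 2: Validate Service State

After a start or restart operation, validate that the LaunchAgent is still loaded. If it is not, bootstrap it again. launchctl print exits with a non-zero code when the service is not loaded, so the check uses the exit code rather than matching output text.

const validateAndLoadService = async (domain, label, plistPath) => {
  const serviceTarget = `${domain}/${label}`;
  // A non-zero exit code from print means the service is not in the domain.
  const status = await execLaunchctl(["print", serviceTarget]);
  if (status.code !== 0) {
    // Use bootstrap (domain + plist path) instead of the legacy load subcommand.
    const boot = await execLaunchctl(["bootstrap", domain, plistPath]);
    if (boot.code !== 0) {
      console.error(`Failed to bootstrap ${serviceTarget}: ${boot.stderr || boot.stdout}`);
      return;
    }
    console.log(`Re-bootstrapped ${serviceTarget} because it was not loaded.`);
  }
};

Step 3: Integrate Validation into Service Management

Call validateAndLoadService after a start or restart operation, whether it succeeded or failed, to ensure the service remains in a loaded state.

// Example integration after starting the service
const startService = async (domain, label, plistPath) => {
  try {
    // Existing start logic
    await restartLaunchAgent(domain, label, plistPath);
  } finally {
    // Runs on success and on failure, so a failed start cannot leave the
    // agent unloaded; the original error still propagates to the caller.
    await validateAndLoadService(domain, label, plistPath);
  }
};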

Verification

To verify that the fix worked:

  1. Attempt to reproduce the failure by following the steps outlined in the issue description.
  2. After the kickstart failure, check the service status using launchctl print gui/$UID/ai.openclaw.gateway.
  3. The service should now remain loaded or become loaded after the failure, indicating successful recovery.

Extra Tips

  • Regularly review service management logs to catch any potential issues early.
  • Consider implementing additional logging or
