openclaw: macOS launchd plugin/config maintenance can leave Control UI in a broken partial-reload state; restart may require doctor --fix

On macOS launchd installs, doing plugin/config maintenance from a live session can leave Gateway in a partially reloaded state where:

  • Control UI / webchat opens but shows "unable to connect" / unknown error
  • sessions.resolve returns INVALID_REQUEST: No session found
  • openclaw gateway restart can fail to recover cleanly
  • openclaw doctor --fix is required to restore a healthy state

This looks like a product gap in restart/reload resilience, even though the triggering action was an operator mistake.

Environment

  • Host OS: macOS
  • OpenClaw: 2026.5.22
  • Service mode: launchd / LaunchAgent
  • Gateway port: 18789
  • Control UI client: local webchat / openclaw-control-ui

What happened

I was fixing config/plugin warnings from a live front-channel session and performed config changes plus plugin uninstall/install operations.

After that, Control UI became unreliable:

  • webchat opened but showed connection failure / unknown error
  • websocket/API activity showed repeated sessions.resolve failures
  • one manual openclaw gateway restart reportedly failed
  • openclaw doctor --fix was needed before restart/startup recovered reliably

Observed behavior

From logs on 2026-05-25:

  1. Gateway entered an intermediate startup with only 5 plugins loaded:
    • acpx, browser, codex, memory-core, openai
    • channel plugins were not yet fully back
  2. During that window, Control UI requests hit:
    • sessions.resolve -> INVALID_REQUEST: No session found
  3. Restart was deferred for an extended period while operations/runs were still active:
    • restart still deferred after 30092ms
    • restart still deferred after 60270ms
    • restart still deferred after 90315ms
    • restart still deferred after 120360ms
  4. A stale process had to be killed before restart:
    • killing 1 stale gateway process(es) before restart: 45762
  5. Only later did Gateway come back fully with 7 plugins:
    • acpx, browser, codex, feishu, memory-core, openai, openclaw-weixin

This left the user-facing Control UI in a bad state that surfaced as a generic connection failure instead of a clear "gateway is reloading / session index unavailable / retry shortly" state.
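The gap between the intermediate and the full startup can be read directly off the two "http server listening" lines. Below is a minimal Python sketch, assuming the log format quoted in this report stays stable. The helper name plugins_from_line is illustrative and not part of OpenClaw.

```python
from __future__ import annotations

import re

# Matches the plugin list in an "http server listening" line as quoted in this report.
LISTEN_RE = re.compile(r"http server listening \((\d+) plugins: ([^;]+);")


def plugins_from_line(line: str) -> set[str]:
    """Return the plugin names announced by one startup log line (hypothetical helper)."""
    match = LISTEN_RE.search(line)
    if match is None:
        raise ValueError("not an 'http server listening' line")
    announced = int(match.group(1))
    names = {name.strip() for name in match.group(2).split(",")}
    # Cross-check the parsed names against the count the Gateway printed.
    if len(names) != announced:
        raise ValueError(f"expected {announced} plugins, parsed {len(names)}")
    return names


partial = plugins_from_line(
    "2026-05-25T17:26:25.777+08:00 [gateway] http server listening "
    "(5 plugins: acpx, browser, codex, memory-core, openai; 2.1s)"
)
full = plugins_from_line(
    "2026-05-25T17:32:03.528+08:00 [gateway] http server listening "
    "(7 plugins: acpx, browser, codex, feishu, memory-core, openai, openclaw-weixin; 2.7s)"
)

# Plugins that were absent while the Gateway was already listening.
missing_during_partial_startup = sorted(full - partial)
print(missing_during_partial_startup)
```

For the two lines above, the difference is the two channel plugins (feishu and openclaw-weixin), which matches the observation that channel plugins were not yet back during the intermediate startup.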

Expected behavior

  • Plugin/config maintenance should not leave Control UI in a partially alive but broken state
  • If Gateway is mid-reload, Control UI should get a clear degraded-state response instead of a generic unknown error
  • openclaw gateway restart should recover from stale processes/port handoff more robustly
  • doctor --fix should not be the normal escape hatch for restart failures
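To make the "clear degraded-state response" idea concrete, here is a rough Python sketch of how a client could map a failed call to a user-facing state. Everything in it is an assumption for illustration: the gateway_reloading flag, the UiState type, and the state codes are hypothetical, and this report does not show that the Gateway exposes a reload signal today.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UiState:
    """Hypothetical user-facing state; not an OpenClaw type."""
    code: str
    message: str
    retryable: bool


def classify_response(
    error_code: Optional[str],
    error_message: Optional[str],
    gateway_reloading: bool,  # assumed signal; the report does not confirm one exists
) -> UiState:
    """Map a call result to a state the UI could show instead of 'unknown error'."""
    if error_code is None:
        return UiState("OK", "Connected", retryable=False)
    if gateway_reloading:
        # Any failure during a reload window is reported as transient.
        return UiState("GATEWAY_RELOADING", "Gateway is reloading; retry shortly", retryable=True)
    if error_code == "INVALID_REQUEST" and error_message == "No session found":
        return UiState("SESSION_NOT_FOUND", "No session found", retryable=False)
    return UiState("UNKNOWN_ERROR", f"{error_code}: {error_message}", retryable=False)


state = classify_response("INVALID_REQUEST", "No session found", gateway_reloading=True)
print(state.code, state.retryable)
```

The point of the sketch is only the ordering of the checks: a known reload window takes precedence over the literal error, so the same sessions.resolve failure is shown as transient rather than as a hard error.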

Why this seems like a product issue

Yes, the trigger here was operator error: maintenance was performed from a live user session and included disruptive plugin uninstall/install operations.

But the product behavior still seems wrong:

  • partial startup is externally visible before runtime is truly healthy
  • session APIs can fail with No session found during reload windows
  • stale gateway processes can survive long enough to poison restart
  • recovery path appears to depend on doctor --fix

Candidate areas to inspect

  • launchd restart / handoff flow on macOS
  • stale process cleanup before/after restart
  • readiness gating before exposing Control UI as healthy
  • behavior of session registry/index during partial reload
  • plugin reload sequencing when channel plugins are temporarily absent
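As one way to picture "readiness gating", the sketch below reports ready only once every expected plugin has loaded. It is a hypothetical design sketch in Python, not a description of how the Gateway is implemented. The ReadinessGate class and its methods are invented for illustration, and the expected plugin list is taken from the full-startup log line.

```python
from typing import Iterable, List


class ReadinessGate:
    """Hypothetical gate: ready only when all expected plugins have loaded."""

    def __init__(self, expected_plugins: Iterable[str]) -> None:
        self._expected = frozenset(expected_plugins)
        self._loaded = set()

    def mark_loaded(self, name: str) -> None:
        self._loaded.add(name)

    def missing(self) -> List[str]:
        # Sorted so the result is stable for logging and comparison.
        return sorted(self._expected - self._loaded)

    def is_ready(self) -> bool:
        return not self.missing()


gate = ReadinessGate(
    ["acpx", "browser", "codex", "feishu", "memory-core", "openai", "openclaw-weixin"]
)

# Intermediate startup from the logs: only five plugins are back.
for plugin in ["acpx", "browser", "codex", "memory-core", "openai"]:
    gate.mark_loaded(plugin)
ready_after_partial = gate.is_ready()
missing_after_partial = gate.missing()

# Channel plugins return later.
gate.mark_loaded("feishu")
gate.mark_loaded("openclaw-weixin")
ready_after_full = gate.is_ready()

print(ready_after_partial, missing_after_partial, ready_after_full)
```

With a gate like this, the HTTP listener could still start early, but health reporting to Control UI would stay "not ready" until missing() is empty.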

Relevant log snippets

2026-05-25T17:26:25.777+08:00 [gateway] http server listening (5 plugins: acpx, browser, codex, memory-core, openai; 2.1s)
2026-05-25T17:26:56.412+08:00 [ws] ⇄ res ✗ sessions.resolve 1ms errorCode=INVALID_REQUEST errorMessage=No session found
2026-05-25T17:29:27.999+08:00 [gateway/reload] restart still deferred after 30092ms with 2 operation(s), 1 embedded run(s) active
2026-05-25T17:30:58.267+08:00 [gateway/reload] restart still deferred after 120360ms with 2 operation(s), 1 embedded run(s) active
2026-05-25T17:27:28.245+08:00 killing 1 stale gateway process(es) before restart: 45762
2026-05-25T17:32:03.528+08:00 [gateway] http server listening (7 plugins: acpx, browser, codex, feishu, memory-core, openai, openclaw-weixin; 2.7s)
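
The snippets above are not listed in strict timestamp order, so it can help to sort them before reading the sequence. This is a small Python sketch that assumes each line starts with an ISO 8601 timestamp followed by a space, as in the excerpt. The variable and function names are illustrative.

```python
import re
from datetime import datetime

LOG_LINES = [
    "2026-05-25T17:26:25.777+08:00 [gateway] http server listening (5 plugins: acpx, browser, codex, memory-core, openai; 2.1s)",
    "2026-05-25T17:26:56.412+08:00 [ws] ⇄ res ✗ sessions.resolve 1ms errorCode=INVALID_REQUEST errorMessage=No session found",
    "2026-05-25T17:29:27.999+08:00 [gateway/reload] restart still deferred after 30092ms with 2 operation(s), 1 embedded run(s) active",
    "2026-05-25T17:30:58.267+08:00 [gateway/reload] restart still deferred after 120360ms with 2 operation(s), 1 embedded run(s) active",
    "2026-05-25T17:27:28.245+08:00 killing 1 stale gateway process(es) before restart: 45762",
    "2026-05-25T17:32:03.528+08:00 [gateway] http server listening (7 plugins: acpx, browser, codex, feishu, memory-core, openai, openclaw-weixin; 2.7s)",
]

DEFER_RE = re.compile(r"restart still deferred after (\d+)ms")


def timestamp_of(line: str) -> datetime:
    """Parse the leading ISO 8601 timestamp of a log line."""
    return datetime.fromisoformat(line.split(" ", 1)[0])


# Chronological view of the excerpt.
ordered = sorted(LOG_LINES, key=timestamp_of)

# Deferral durations reported by the reload logic, in milliseconds.
deferrals_ms = []
for line in ordered:
    match = DEFER_RE.search(line)
    if match:
        deferrals_ms.append(int(match.group(1)))

# Time between the partial startup and the full startup.
window = timestamp_of(ordered[-1]) - timestamp_of(ordered[0])

for line in ordered:
    print(line)
print(deferrals_ms, window.total_seconds())
```

Sorted this way, the stale-process kill at 17:27:28 lands before both quoted deferral messages, and the span from the 5-plugin startup to the 7-plugin startup is a little over five and a half minutes.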

Rough repro sketch

  1. Use a macOS launchd-managed Gateway
  2. From an active front-channel session, perform config changes plus plugin uninstall/install operations affecting loaded plugins/channels
  3. Open Control UI while Gateway is mid-reload
  4. Observe possible unknown-error connection state and sessions.resolve failures
  5. Try openclaw gateway restart
  6. In some cases, recovery may require openclaw doctor --fix
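
When working through steps 3 to 5, it may help to separate "nothing is listening on the Gateway port" from "something is listening but session calls fail". The Python sketch below only checks whether a TCP connection to the port from the Environment section (18789) is accepted. The host 127.0.0.1 and the function name are assumptions, and a successful connect says nothing about whether the Gateway is actually healthy.

```python
import socket

GATEWAY_HOST = "127.0.0.1"  # assumption: local Gateway, as with the local webchat client
GATEWAY_PORT = 18789        # from the Environment section of this report


def port_accepts_connections(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port is accepted within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    listening = port_accepts_connections(GATEWAY_HOST, GATEWAY_PORT)
    print(f"port {GATEWAY_PORT} accepting connections: {listening}")
```

If this prints True while Control UI still shows a connection error, that is consistent with the partially reloaded state described here: the listener is up, but the runtime behind it is not yet fully usable.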

Notes

I do not think the correct answer is "users should never do live maintenance"; that is good operational advice, but the runtime should still fail more clearly and recover more deterministically than this.
