openclaw - ✅(Solved) Fix [BUG] macOS LaunchAgent + configure wizard creates duplicate gateway process, causing 30+ hour Telegram 409 polling conflict [2 pull requests, 1 comments, 2 participants]

michael-ruffolo · 2026-03-12T03:11:31Z

Running `openclaw configure` (wizard) while the LaunchAgent is already managing a running gateway process leaves the original process alive. Both instances then attempt to long-poll Telegram simultaneously, producing continuous 409 Conflict errors that silently drop or misroute incoming messages for as long as both processes are alive.

# PR #43639: Gateway: prevent detached respawn when launchd already owns the process

- Repository: openclaw/openclaw
- Author: lynnzc
- State: closed | merged: False
- Link: https://github.com/openclaw/openclaw/pull/43639

## Description (problem / solution / changelog)

## Summary

Describe the problem and fix in 2–5 bullets:

- Problem: SIGUSR1 restart could fall back to detached respawn on macOS when launchd env hints were missing, even if the process was actually launchd-managed.
- Why it matters: detached fallback can leave an unmanaged old gateway process alive while launchd starts another one, causing duplicate Telegram long-polling and sustained 409 conflicts.
- What changed: launchd supervision detection now has a runtime fallback that validates the active launchd job PID (`launchctl print gui/ / `) against `process.pid` before deciding to detached-respawn.
- What did NOT change (scope boundary): no changes to launchd install/restart commands, channel polling logic, or gateway stop/start CLI semantics.
## Change Type (select all)

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra

## Scope (select all touched areas)

- [x] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [x] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra

## Linked Issue/PR

- Closes #43628
- Related #40932
- Related #41829

## User-visible / Behavior Changes

- macOS gateway restarts now correctly treat launchd-managed processes as supervised even when launchd hint env vars are absent, avoiding detached respawn fallback that can create duplicate gateway pollers.

## Security Impact (required)

- New permissions/capabilities? (`No`)
- Secrets/tokens handling changed? (`No`)
- New/changed network calls? (`No`)
- Command/tool execution surface changed? (`No`)
- Data access scope changed? (`No`)
- If any `Yes`, explain risk + mitigation:

## Repro + Verification

### Environment

- OS: macOS (launchd path verified via unit test mocking)
- Runtime/container: Node/Vitest
- Model/provider: N/A
- Integration/channel (if any): Telegram conflict scenario covered by restart-path fix
- Relevant config (redacted): LaunchAgent label `ai.openclaw.gateway`

### Steps

1. Start from a macOS gateway restart path where launchd env hints are missing.
2. Trigger `restartGatewayProcessWithFreshPid()`.
3. Observe supervision mode resolution.

### Expected

- Restart logic should classify the process as `supervised` when the loaded launchd runtime PID matches the current process.

### Actual

- Verified by unit test: fallback `launchctl print` PID match returns `supervised`; non-match still uses detached spawn fallback.
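The PID comparison described under Expected and Actual can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function names are hypothetical, and it assumes that `launchctl print` output contains a line of the form `pid = <number>`. The sample output below is canned for the example; a real caller would pass in the text printed by `launchctl print` for the LaunchAgent's service target.

```javascript
// Hypothetical helper names; not taken from the OpenClaw codebase.

/**
 * Extract the PID from `launchctl print` output.
 * Assumes the output has a line of the form "pid = 1234".
 */
function parseLaunchdPid(printOutput) {
  const match = /^\s*pid\s*=\s*(\d+)\s*$/m.exec(printOutput);
  return match ? Number(match[1]) : null;
}

/**
 * Decide the restart mode: "supervised" when launchd's job PID is this
 * process, otherwise "detached" (the fallback respawn path).
 */
function resolveSupervision(printOutput, currentPid) {
  const launchdPid = parseLaunchdPid(printOutput);
  return launchdPid !== null && launchdPid === currentPid ? 'supervised' : 'detached';
}

// Canned, illustrative output instead of a real launchctl call.
const sample = ['ai.openclaw.gateway = {', '\tstate = running', '\tpid = 39074', '}'].join('\n');
console.log(resolveSupervision(sample, 39074)); // supervised
console.log(resolveSupervision(sample, 12345)); // detached
```

Keeping the parsing separate from the decision makes both halves easy to unit test with mocked output, which matches the unit-test-only validation this PR reports.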
## Evidence

Attach at least one:

- [x] Failing test/log before + passing after
- [ ] Trace/log snippets
- [ ] Screenshot/recording
- [ ] Perf numbers (if relevant)

Validation commands and outcomes:

- `pnpm test src/infra/process-respawn.test.ts` (passed: 1 file, 15 tests)

## Human Verification (required)

What you personally verified (not just CI), and how:

- Verified scenarios:
  - Added/ran test for launchd runtime PID fallback when env hints are absent.
  - Added/ran test confirming detached spawn still occurs when runtime PID does not match current process.
- Edge cases checked:
  - Existing launchd env-hint behavior still returns `supervised`.
  - Non-darwin behavior unaffected in existing test suite.
- What you did **not** verify:
  - Live macOS LaunchAgent restart on a real host (this PR includes unit-test validation only).

## Review Conversations

- [x] I replied to or resolved every bot review conversation I addressed in this PR.
- [x] I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

## Compatibility / Migration

- Backward compatible? (`Yes`)
- Config/env changes? (`No`)
- Migration needed? (`No`)
- If yes, exact upgrade steps:

## Failure Recovery (if this breaks)

- How to disable/revert this change quickly: revert commit `5098effc0`.
- Files/config to restore: `src/infra/supervisor-markers.ts`, `src/infra/process-respawn.test.ts`, `CHANGELOG.md`.
- Known bad symptoms reviewers should watch for: gateway restart on macOS falling back to detached spawn while launchd-managed.

## Risks and Mitigations

List only real risks for this PR. Add/remove entries as neede



GitHub stats

openclaw/openclaw#43628 • Fetched 2026-04-08 00:16:40

Author: michael-ruffolo

Participants: michael-ruffolo, Ryce

Timeline (top): cross-referenced ×4, labeled ×2, commented ×1, referenced ×1

Running openclaw configure (wizard) while the LaunchAgent is already managing a running gateway process leaves the original process alive. Both instances then attempt to long-poll Telegram simultaneously, producing continuous 409 Conflict errors that silently drop or misroute incoming messages for as long as both processes are alive.


Bug type

Regression (worked before, now fails)

Summary

Steps to reproduce

1. Have OpenClaw running normally as a LaunchAgent (standard macOS setup).
2. Run `openclaw configure` (the setup wizard) while the gateway is live.
3. Wizard completes and triggers a gateway restart.
4. New gateway process starts, but the LaunchAgent-managed old process is NOT killed first.
5. Both processes begin polling Telegram simultaneously with the same bot tokens.
6. Result: continuous 409 Conflict errors begin ~20 minutes after wizard completes.

Expected behavior

When the wizard or any restart flow triggers a new gateway process, the existing LaunchAgent-managed process should be fully stopped (SIGTERM + wait for confirmation) before the new instance begins Telegram polling. The openclaw gateway stop command should be called and awaited before spawning the new process.
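The "SIGTERM + wait for confirmation" behavior could look roughly like the sketch below. The helper names, timeout, and poll interval are hypothetical and not taken from OpenClaw; the sketch only uses Node's `process.kill`, where signal `0` probes for existence without delivering a signal.

```javascript
// Hypothetical helpers: a sketch of "SIGTERM, then wait for confirmation".

/** True if a process with this PID still exists (signal 0 only probes). */
function isAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but belongs to another user.
    return err.code === 'EPERM';
  }
}

/**
 * Send SIGTERM and poll until the process is gone.
 * Resolves true once it has exited, false if it is still alive at the deadline.
 */
async function stopAndWait(pid, { timeoutMs = 10000, pollMs = 200 } = {}) {
  try {
    process.kill(pid, 'SIGTERM');
  } catch (err) {
    if (err.code === 'ESRCH') return true; // already gone
    throw err;
  }
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (!isAlive(pid)) return true;
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
  return !isAlive(pid);
}
```

A restart flow built on this would start the new poller only when `stopAndWait` resolves `true`, and would report a visible error otherwise instead of letting two pollers run side by side.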

Actual behavior

- `gateway.err.log` logged 10,996 `getUpdates` 409 Conflict errors over ~31 continuous hours
- ~400 errors/hour, every hour, with no gaps
- Error: getUpdates conflict: 409: Conflict: terminated by other getUpdates request; make sure that only one bot instance is running
- Timeline: started March 10 ~4:09 PM EDT, resolved March 11 ~10:48 PM EDT (old process finally died on its own)
- Trigger: `openclaw configure` ran at March 10 3:47 PM EDT (conflicts began ~22 minutes later)
- All incoming Telegram messages had a ~50% chance of being silently dropped for 31 hours
- No warning or error was surfaced to the user
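As a quick cross-check of the figures above, the duration and average error rate can be recomputed from the first and last timestamps and the totals quoted from `gateway.err.log` in this report. The average comes out somewhat below the rounded ~400/hour figure, and the per-day counts add up to the stated total.

```javascript
// Figures copied from the gateway.err.log summary in this report.
const first = new Date('2026-03-10T16:09:22-04:00');
const last = new Date('2026-03-11T22:48:19-04:00');
const totalErrors = 10996;
const byDay = { '2026-03-10': 2280, '2026-03-11': 8716 };

const durationHours = (last - first) / 3600000; // milliseconds to hours
const errorsPerHour = totalErrors / durationHours;
const dailySum = Object.values(byDay).reduce((sum, n) => sum + n, 0);

console.log(durationHours.toFixed(1)); // 30.6
console.log(Math.round(errorsPerHour)); // 359
console.log(dailySum === totalErrors); // true
```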

OpenClaw version

2026.3.8

Operating system

macOS 15.6 (arm64, Mac Mini)

Install method

npm global (pnpm, /opt/homebrew/lib/node_modules/openclaw), LaunchAgent managed

Model

anthropic/claude-sonnet-4-6

Provider / routing chain

Anthropic

Config file / key location

No response

Additional provider/model setup details

No response

Logs, screenshots, and evidence

From gateway.err.log:


Total 409 errors: 10,996
First: 2026-03-10T16:09:22-04:00
Last:  2026-03-11T22:48:19-04:00

By day:
  2026-03-10: 2,280 errors
  2026-03-11: 8,716 errors

Sample (representative):
2026-03-11T20:00:09-04:00 [telegram] getUpdates conflict: Call to 'getUpdates' failed! (409: Conflict: terminated by other getUpdates request; make sure that only one bot instance is running); retrying in 30s.


From gateway.log (trigger event):

2026-03-10T19:47:12Z  openclaw configure wizard last ran (from config meta)
2026-03-10T20:09:22Z  First 409 error logged (~22 min after wizard)


Only one process running at time of filing: PID 39074, started ~23:29 EDT March 11 (after old process finally died).

Impact and severity

High / P1 — All incoming Telegram messages silently dropped or misrouted for 31+ hours with no user-visible error. 5 bot accounts affected. User had no idea the system was broken until manually noticing missed messages.

Workaround: openclaw gateway stop && sleep 3 && openclaw gateway start

Related open issues: #40932, #41829

Additional information

No response

extent analysis

Fix Plan

To resolve the issue, we need to ensure that the existing LaunchAgent-managed process is fully stopped before spawning a new process. We can achieve this by calling openclaw gateway stop and awaiting its completion before starting the new process.

Code Changes

We need to modify the openclaw configure wizard to stop the existing gateway process before starting a new one. Here's an example code snippet:

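const { spawnSync } = require('child_process');

// ...

// Stop the existing gateway process; spawnSync blocks until the CLI command exits
const stop = spawnSync('openclaw', ['gateway', 'stop'], { stdio: 'inherit' });
if (stop.error || stop.status !== 0) {
  throw new Error('openclaw gateway stop failed; not starting a second gateway');
}

// Give the old process time to exit (the same 3-second pause as the workaround)
setTimeout(() => {
  // Start the new gateway process
  const start = spawnSync('openclaw', ['gateway', 'start'], { stdio: 'inherit' });
  if (start.error || start.status !== 0) {
    throw new Error('openclaw gateway start failed');
  }
}, 3000);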

Alternatively, you can use a more robust approach using child_process.execSync with a timeout:

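const { execSync } = require('child_process');

// ...

// Stop the existing gateway process; execSync throws if the command fails or exceeds the timeout
execSync('openclaw gateway stop', { stdio: 'inherit', timeout: 30000 });

// Start the new gateway only after the stop command has returned successfully
execSync('openclaw gateway start', { stdio: 'inherit', timeout: 30000 });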

Configuration Changes

No configuration changes are required for this fix.

Verification

To verify that the fix worked, you can:

1. Run `openclaw configure` while the gateway is live.
2. Check the `gateway.err.log` file for any 409 Conflict errors.
3. Verify that only one process is running at a time using `ps aux | grep openclaw`.

If the fix is successful, you should not see any 409 Conflict errors, and only one process should be running at a time.
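The log check can be automated with a small script along these lines. It is a sketch that assumes each log line begins with an ISO 8601 timestamp, as in the sample line quoted in this report; the function name is hypothetical, and the log file path is whatever your install writes, passed as the first command-line argument.

```javascript
const fs = require('node:fs');

/**
 * Count getUpdates 409 conflict lines per calendar day.
 * Assumes each log line starts with an ISO 8601 timestamp.
 */
function count409ByDay(logText) {
  const counts = {};
  for (const line of logText.split('\n')) {
    if (!line.includes('getUpdates conflict') || !line.includes('409')) continue;
    const day = line.slice(0, 10); // "YYYY-MM-DD" prefix of the timestamp
    counts[day] = (counts[day] || 0) + 1;
  }
  return counts;
}

// Usage: node count409.js /path/to/gateway.err.log
if (process.argv[2]) {
  console.log(count409ByDay(fs.readFileSync(process.argv[2], 'utf8')));
}
```

An empty result after running the wizard against a live gateway suggests the duplicate poller is gone; any new per-day count means two instances are still competing for the same token.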

Extra Tips

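To prevent similar issues in the future, keep the gateway under a single process supervisor. On macOS that is launchd, which this setup already uses through the LaunchAgent; pm2 or, on Linux, systemd play the same role elsewhere. Additionally, add logging and monitoring that detect and alert on sustained 409 conflicts, since this incident ran for about 31 hours without any user-visible warning.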
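One way to turn silent retries into an alert is a sliding-window counter, sketched below. The class name, threshold, and window size are illustrative assumptions, not OpenClaw settings. A window is used rather than a consecutive-failure count because two competing pollers may each still get occasional successful polls.

```javascript
/**
 * Hypothetical monitor: flags sustained 409 conflicts within a time window
 * instead of letting the poller retry quietly forever.
 */
class ConflictMonitor {
  constructor({ threshold = 20, windowMs = 10 * 60 * 1000 } = {}) {
    this.threshold = threshold;
    this.windowMs = windowMs;
    this.timestamps = [];
  }

  /** Record one 409 at time nowMs; returns true when the window holds >= threshold conflicts. */
  recordConflict(nowMs = Date.now()) {
    this.timestamps.push(nowMs);
    const cutoff = nowMs - this.windowMs;
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    return this.timestamps.length >= this.threshold;
  }
}

const monitor = new ConflictMonitor({ threshold: 3, windowMs: 60000 });
console.log(monitor.recordConflict(0));      // false
console.log(monitor.recordConflict(30000));  // false
console.log(monitor.recordConflict(59000));  // true  (three conflicts within 60 s)
console.log(monitor.recordConflict(200000)); // false (older entries have aged out)
```

When `recordConflict` returns `true`, the caller could log at error level or notify the operator, so a duplicate poller is noticed within minutes instead of after missed messages.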


PR fix notes

PR #43639: Gateway: prevent detached respawn when launchd already owns the process


PR #56324: fix(telegram): add per-token duplicate poller guard to prevent 409 conflicts
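Only the title of PR #56324 appears on this page, so its implementation is not known here. Purely as an illustration of the general idea of a per-token guard, and not a description of that PR, the sketch below uses an exclusive lock file keyed by a hash of the bot token. All names and the lock location are hypothetical.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

/** Lock file path derived from a hash, so the token itself never lands on disk. */
function lockPathFor(token, dir = os.tmpdir()) {
  const digest = crypto.createHash('sha256').update(token).digest('hex').slice(0, 16);
  return path.join(dir, `telegram-poller-${digest}.lock`);
}

/**
 * Try to become the only poller for this token.
 * Returns a release function on success, or null if another poller holds the lock.
 */
function acquirePollerLock(token, dir) {
  const lockPath = lockPathFor(token, dir);
  try {
    // 'wx' fails with EEXIST if the file already exists, so creation is exclusive.
    fs.writeFileSync(lockPath, String(process.pid), { flag: 'wx' });
  } catch (err) {
    if (err.code === 'EEXIST') return null;
    throw err;
  }
  return () => fs.rmSync(lockPath, { force: true });
}
```

A crashed process would leave a stale lock behind, so a real guard would also need to check whether the PID recorded in the file is still alive before refusing to poll.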
