openclaw - ✅(Solved) Fix claude-cli provider: session abort triggers gateway-wide SIGTERM (v2026.4.23) [1 pull requests, 1 participants]

GitHub stats
openclaw/openclaw#71662 · Fetched 2026-04-26 05:10:06
Comments: 0 · Participants: 1 · Timeline: 5 · Reactions: 0
Timeline (top): cross-referenced ×2, referenced ×2, closed ×1

When the main agent uses claude-cli/claude-opus-4-7 as its primary model, a session abort causes the entire gateway process to receive SIGTERM and restart. This was introduced in v2026.4.23.

Root Cause

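Per PR #71681: when the supervisor spawns the child with `detached: false` (service-managed runtime under launchd/systemd), the child shares the gateway's process group. On session abort, `killProcessTree` unconditionally issued `process.kill(-pid, 'SIGTERM')`, a group kill that signals the gateway parent along with the child. The regression was introduced in v2026.4.23.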

Fix Action

Workaround

Switching agents.defaults.model.primary to any non-claude-cli provider (e.g. tuzi-sub-claude/claude-sonnet-4-6) completely eliminates the SIGTERM. Gateway survives restarts and session aborts without issue.

PR fix notes

PR #71681: fix(process): skip kill-tree group kill when child wasn't detached (#71662)

Description (problem / solution / changelog)

Closes #71662.

Bug

When the supervisor spawns a child with `detached: false` (service-managed runtime under launchd/systemd), the child shares the gateway's process group. On session abort or SIGKILL, `killProcessTree` unconditionally issued `process.kill(-pid, 'SIGTERM')` — which targets the entire process group (negative pid is POSIX group-kill semantics) and therefore SIGTERMs the gateway parent along with the child.

The reporter saw this on a macOS LaunchAgent with KeepAlive=true: aborting a `claude-cli/claude-opus-4-7` session caused the gateway to receive SIGTERM and auto-restart, dropping all in-flight sessions. Switching the primary model to a non-cli provider eliminated it because non-cli paths don't go through this kill-tree call. It did not occur on the Linux VPS, where the gateway runs detached: there `useDetached === true` and the child gets its own process group.

The trigger is exactly:

  • `src/process/supervisor/adapters/child.ts:45` — `useDetached = !isServiceManagedRuntime()` → false on launchd/systemd
  • `src/process/supervisor/adapters/child.ts:290` — `killProcessTree(pid)` called on session abort
  • `src/process/kill-tree.ts:49` — `process.kill(-pid, 'SIGTERM')` group-kill

Fix

`killProcessTree` now accepts an optional `opts.detached?: boolean`. When `detached: false`, `killProcessTreeUnix` skips the `-pid` group-kill and goes straight to direct-pid SIGTERM/SIGKILL. Default (`detached: true`) is preserved so all existing callers behave exactly as before.

`supervisor/adapters/child.ts:286` now threads the spawn-time `useDetached` flag into `killProcessTree`, so the kill path matches the spawn-time detachment decision.
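The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the name `killProcessTreeSketch`, the injected `kill` callback, and the fallback from a failed group kill to a direct kill are all assumptions made so the example runs without signalling any real process. Real code would call `process.kill` instead of the injected function.

```typescript
// Hypothetical signature for the injected kill primitive (stands in for process.kill).
type KillFn = (pid: number, signal: "SIGTERM" | "SIGKILL") => void;

interface KillTreeOptions {
  // Whether the child was spawned detached (own process group). Defaults to true.
  detached?: boolean;
  // Injected for illustration and testing only.
  kill: KillFn;
}

function killProcessTreeSketch(pid: number, opts: KillTreeOptions): void {
  const detached = opts.detached ?? true;
  if (detached) {
    try {
      // Negative pid requests a process-group kill.
      opts.kill(-pid, "SIGTERM");
      return;
    } catch {
      // Assumed fallback: if the group kill fails, try the direct pid below.
    }
  }
  // Non-detached child: skip the group kill and signal only the child itself.
  opts.kill(pid, "SIGTERM");
}

// Record the calls instead of sending real signals.
const recorded: Array<[number, string]> = [];
const recordKill: KillFn = (pid, signal) => {
  recorded.push([pid, signal]);
};

killProcessTreeSketch(4242, { detached: false, kill: recordKill }); // direct pid only
killProcessTreeSketch(4242, { kill: recordKill }); // default: group kill
console.log(recorded);
```

Keeping `detached` optional with a default of `true` mirrors the PR's stated goal that existing callers see no change in behaviour.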

Tests

  • new: `detached: false` skips group kill and uses direct pid SIGTERM only.
  • new: default behaviour (`detached: true`) still uses group kill (regression guard so the existing test case isn't accidentally weakened).
  • existing 4 tests unchanged. 6/6 pass locally.

Lint clean: `pnpm oxlint` — 0 warnings, 0 errors.

Out of scope

Other `killProcessTree` callers (mcp-stdio-transport, bash-tools.process, daemon/schtasks, etc.) keep the default group-kill behaviour because those processes are typically detached from the gateway. Only the `supervisor/adapters/child.ts` path threads `detached` through, since it's the path that knows whether the child was actually spawned detached. If other call sites need the same fix, they can adopt `opts.detached` incrementally.

🤖 Generated with assistance from Claude Code. Co-authored-by: HCL

Changed files

  • src/process/kill-tree.test.ts (modified, +40/-0)
  • src/process/kill-tree.ts (modified, +28/-9)
  • src/process/supervisor/adapters/child.test.ts (modified, +37/-1)
  • src/process/supervisor/adapters/child.ts (modified, +10/-1)

Code Example

signal SIGTERM received
[claude-cli] live session close / AbortError

---

launchctl[11380] <- node[11370] <- node[10567] <- node[10566] <- bash[10542] <- zsh[10540]

Bug Report

  • Version: 2026.4.23 (a979721)
  • OS: macOS 25.4.0 (arm64)
  • Provider: claude-cli/claude-opus-4-7

Summary

When the main agent uses claude-cli/claude-opus-4-7 as its primary model, a session abort causes the entire gateway process to receive SIGTERM and restart. This was introduced in v2026.4.23.

Reproduction

  1. Set agents.defaults.model.primary to claude-cli/claude-opus-4-7
  2. Start a conversation via Discord or WebUI
  3. Trigger a session abort (e.g. context compaction, long response, or manual interrupt)
  4. Observe: gateway receives SIGTERM, LaunchAgent restarts it (KeepAlive=true), all active sessions are dropped

Evidence

From ~/.openclaw/logs/gateway.err.log:

signal SIGTERM received
[claude-cli] live session close / AbortError

Multiple SIGTERM events observed at: 00:40:34, 00:43:44, 00:45:14, 00:45:59, 00:46:51 (Asia/Shanghai)
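To see how closely the restarts followed each other, the gaps between the reported timestamps can be computed. The helper names below are hypothetical, and the sketch assumes all timestamps fall on the same day in ascending order.

```typescript
// Convert "HH:MM:SS" to seconds since midnight.
function toSeconds(hms: string): number {
  const parts = hms.split(":").map(Number);
  if (parts.length !== 3 || parts.some((n) => Number.isNaN(n))) {
    throw new Error(`invalid timestamp: ${hms}`);
  }
  // ((h * 60) + m) * 60 + s
  return parts.reduce((acc, n) => acc * 60 + n, 0);
}

// Gaps in seconds between consecutive timestamps.
function gapsInSeconds(times: string[]): number[] {
  const gaps: number[] = [];
  let prev: number | undefined;
  for (const t of times.map(toSeconds)) {
    if (prev !== undefined) gaps.push(t - prev);
    prev = t;
  }
  return gaps;
}

const sigtermTimes = ["00:40:34", "00:43:44", "00:45:14", "00:45:59", "00:46:51"];
console.log(gapsInSeconds(sigtermTimes)); // [190, 90, 45, 52]
```

The five events span a little over six minutes, with gaps between 45 and 190 seconds.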

macOS unified log shows the kill chain originates from within the node process tree:

launchctl[11380] <- node[11370] <- node[10567] <- node[10566] <- bash[10542] <- zsh[10540]

Workaround

Switching agents.defaults.model.primary to any non-claude-cli provider (e.g. tuzi-sub-claude/claude-sonnet-4-6) completely eliminates the SIGTERM. Gateway survives restarts and session aborts without issue.

Expected Behavior

A claude-cli session abort should be handled gracefully within the provider layer and must not propagate as a SIGTERM to the gateway process.

Additional Context

  • The issue is specific to the claude-cli provider backend, not the transport layer (Discord, WebUI, Telegram all exhibit the same behavior)
  • LaunchAgent KeepAlive=true masks the issue by auto-restarting, but all in-flight sessions are lost on each restart
  • Did not occur on the same config running on Linux VPS prior to migration

extent analysis

TL;DR

Switching to a non-claude-cli provider, such as tuzi-sub-claude/claude-sonnet-4-6, works around the SIGTERM issue caused by session aborts with the claude-cli/claude-opus-4-7 model.

Guidance

  • Verify that the issue is specific to the claude-cli provider backend by testing with different providers, as the problem does not occur with non-claude-cli providers.
  • Check the ~/.openclaw/logs/gateway.err.log for signal SIGTERM received and [claude-cli] live session close / AbortError messages to confirm the issue.
  • Consider testing on a Linux environment to see if the issue is macOS-specific, as it did not occur on a Linux VPS prior to migration.
  • Review the node process tree in the macOS unified log to understand the kill chain and potential causes.
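The log check in the guidance above can be sketched as a small pure function. The function and type names are hypothetical; only the two marker strings come from the report. Reading `~/.openclaw/logs/gateway.err.log` into a string is left to the caller.

```typescript
interface LogSignals {
  sigterms: number; // lines containing "signal SIGTERM received"
  cliAborts: number; // lines containing the claude-cli abort message
}

// Count occurrences of the two markers quoted in the bug report.
function scanGatewayLog(text: string): LogSignals {
  let sigterms = 0;
  let cliAborts = 0;
  for (const line of text.split(/\r?\n/)) {
    if (line.includes("signal SIGTERM received")) sigterms += 1;
    if (line.includes("[claude-cli] live session close / AbortError")) cliAborts += 1;
  }
  return { sigterms, cliAborts };
}

const sampleLog = [
  "signal SIGTERM received",
  "[claude-cli] live session close / AbortError",
].join("\n");
console.log(scanGatewayLog(sampleLog));
```

Non-zero counts for both markers around the same time are consistent with the reported symptom, though they do not by themselves prove which process sent the signal.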

Example

The issue report itself provides no code snippet. PR #71681 traces the cause to the unconditional group kill in `src/process/kill-tree.ts` rather than to the claude-cli provider configuration.

Notes

The workaround may not be suitable for all use cases, especially if the claude-cli/claude-opus-4-7 model is required. Further investigation into the claude-cli provider backend may be necessary to resolve the issue.

Recommendation

Apply the workaround by switching to a non-claude-cli provider, such as tuzi-sub-claude/claude-sonnet-4-6, as it completely eliminates the SIGTERM issue and allows the gateway to survive restarts and session aborts without losing in-flight sessions.
