claude-code: [BUG] claude rm/stop/reap SIGKILLs background session tree without SIGTERM grace, orphaning git index.lock and similar

Summary

When a background session is torn down (claude rm <id>, claude stop <id>, idle reap, Agents-view delete), the session's process tree receives SIGKILL immediately, with no prior SIGTERM and no grace period. Any cleanup that relies on a catchable signal — atexit, signal.SIGTERM/SIGHUP/SIGINT handlers, language runtime shutdown — does not run.

This was raised before in #37127 and auto-closed as a duplicate of #31646. #31646 is closed NOT_PLANNED and is actually about a different topic (MCP stdio servers reported as failed on exit). The auto-dedupe was wrong; the underlying SIGTERM-vs-SIGKILL behavior on session teardown is still live and unaddressed. #37127 is now locked, so I can't reopen it — filing fresh with a concrete repro and a real-world consequence.

Concrete repro / consequence: orphaned git index.lock

git is the canonical example because its locking model makes the bug user-visible:

  1. Inside a background session, start any git write that holds index.lock briefly — git rebase, git merge, git checkout on a large tree.
  2. Before it returns, claude rm (or claude stop) the session.
  3. Observe the resulting .git/index.lock (or .git/worktrees/<wt>/index.lock for worktrees) — the lock file persists with no owning process.
  4. The next git write fails with:

     error: Unable to create '<gitdir>/index.lock': File exists.
     Another git process seems to be running in this repository, e.g.
     an editor opened by 'git commit'. ... remove the file manually to continue.

Why git is the canary: git's lockfile machinery (tempfile.c) installs an atexit handler plus signal handlers for SIGTERM/SIGHUP/SIGINT/SIGQUIT/SIGPIPE that remove the lock on the way out. Under any catchable signal (kill <pid>, terminal close via SIGHUP, Ctrl-C) and on internal errors, git cleans up after itself and leaves no orphan. The orphan only appears when git is stopped by something it cannot catch: SIGKILL (including the OOM killer) or a host reboot. Empirically, every claude rm/stop/reap issued mid-write leaves a stale lock, while manually sending kill -TERM to the same pgid does not. That's a clean fingerprint that teardown is delivering SIGKILL, not SIGTERM.
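
The SIGTERM-versus-SIGKILL fingerprint described above can be reproduced without git or Claude Code. The following is a minimal sketch, not git's actual tempfile.c logic: a stand-in Python child creates a file named index.lock in a temporary directory and registers cleanup through atexit plus a SIGTERM handler, and a hypothetical helper, lock_survives, reports whether the file outlives a given signal. All names here are illustrative assumptions.

```python
import os
import signal
import subprocess
import sys
import tempfile

# Stand-in for a well-behaved lock owner (NOT git itself): it creates the lock,
# registers cleanup, then blocks as if it were in the middle of a write.
CHILD_SOURCE = r"""
import atexit, os, signal, sys, time

lock_path = sys.argv[1]

def remove_lock():
    try:
        os.unlink(lock_path)
    except FileNotFoundError:
        pass

atexit.register(remove_lock)
# Turn SIGTERM into a normal interpreter exit so the atexit callback runs.
signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(1))

open(lock_path, "w").close()
print("locked", flush=True)
time.sleep(60)
"""


def lock_survives(sig: int) -> bool:
    """Start the child, deliver `sig`, and report whether the lock file is left behind."""
    with tempfile.TemporaryDirectory() as tmp:
        lock_path = os.path.join(tmp, "index.lock")
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE, lock_path],
            stdout=subprocess.PIPE,
            text=True,
        )
        child.stdout.readline()  # block until the child reports the lock is held
        child.send_signal(sig)
        child.wait(timeout=10)
        child.stdout.close()
        return os.path.exists(lock_path)
```

Under these assumptions, lock_survives(signal.SIGTERM) should return False and lock_survives(signal.SIGKILL) should return True, which mirrors the stale-lock pattern reported for claude rm.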

Evidence (same as #37127, restated)

  • Python script in a bg session with atexit.register(...) and explicit handlers for SIGTERM/SIGHUP/SIGINT: none fire after claude rm. No log lines, no cleanup files written.
  • Same script, sent kill -TERM -<pgid> manually: handlers fire, atexit runs, cleanup completes.
  • Exit-code shape and timing match SIGKILL: instant termination, no chance to drain.

Expected behavior

Standard two-stage termination, the same shape Docker, Kubernetes, systemd, and tmux use:

  1. Send SIGTERM (or SIGHUP) to the session's process group.
  2. Wait a grace period.
  3. Send SIGKILL only if the process tree hasn't exited.

A configurable grace period (env var, settings.json, or per-command flag) would be ideal. A reasonable default like 5s would resolve nearly every real-world cleanup case I've seen, including git's, without making interactive teardown feel slow.
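
The three steps above can be sketched as a small supervisor routine. This illustrates the requested behavior under stated assumptions and is not Claude Code's teardown code: it assumes the session leader was started with start_new_session=True (so its pid doubles as the process group id), and terminate_group, group_alive, and grace_seconds are hypothetical names. The 5-second default simply echoes the figure suggested in this issue.

```python
import os
import signal
import subprocess
import time


def group_alive(pgid: int) -> bool:
    """Signal 0 delivers nothing; it only checks that the group still has members."""
    try:
        os.killpg(pgid, 0)
        return True
    except ProcessLookupError:
        return False


def terminate_group(leader: subprocess.Popen, grace_seconds: float = 5.0) -> str:
    """SIGTERM the leader's process group, wait, and SIGKILL only if needed.

    Returns "graceful" if the whole group exited within the grace period,
    otherwise "killed".
    """
    pgid = leader.pid  # assumption: leader was started with start_new_session=True
    try:
        os.killpg(pgid, signal.SIGTERM)  # stage 1: catchable request to exit
    except ProcessLookupError:
        return "graceful"  # the group is already gone

    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:  # stage 2: grace period
        leader.poll()  # reap the leader so a zombie does not count as a member
        if not group_alive(pgid):
            return "graceful"
        time.sleep(0.05)

    try:
        os.killpg(pgid, signal.SIGKILL)  # stage 3: escalate only if still alive
    except ProcessLookupError:
        leader.wait()
        return "graceful"  # the group exited right at the deadline
    leader.wait()
    return "killed"
```

A child that handles or accepts SIGTERM exits in stage 2 and never sees SIGKILL; only a child that ignores SIGTERM for the full grace period is escalated.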

Why this matters beyond git

This affects every long-lived background process spawned inside a Claude bg session that has any cleanup responsibility:

  • Database connections / connection pools that need to be returned cleanly.
  • WAL-style writers that flush on shutdown.
  • Tempfile/lockfile owners (git is one example; flock/fcntl writers in general).
  • Writers buffering to stdout/stderr — SIGKILL drops in-flight buffers.
  • Anything using defer (Go), try/finally (Python), Drop (Rust) that requires the runtime to unwind.
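
One nuance for the Python case in the list above: CPython's default SIGTERM disposition ends the process without running finally blocks or atexit callbacks, so a grace period only helps code that installs a handler. A minimal sketch of such a handler follows; flush_on_shutdown and exit_on_sigterm are hypothetical names, and the 128 + signum exit status just follows the common shell convention.

```python
import signal
import sys


def exit_on_sigterm(signum, frame):
    # Raising SystemExit unwinds the stack, so finally blocks and atexit callbacks run.
    sys.exit(128 + signum)


def flush_on_shutdown(work, cleanup):
    """Run work(); make sure cleanup() still runs if SIGTERM arrives part-way through.

    Must be called from the main thread, because signal.signal() only works there.
    """
    previous = signal.signal(signal.SIGTERM, exit_on_sigterm)
    try:
        work()
    finally:
        cleanup()
        signal.signal(signal.SIGTERM, previous)  # restore the earlier handler
```

None of this helps under SIGKILL, which cannot be caught: the process is removed before any handler, finally block, or atexit callback gets a chance to run.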

In our codebase the workaround burden has been real: we built a preflight lock-recovery guard plus a SessionEnd hook that proactively sweeps stale index.lock / HEAD.lock / ORIG_HEAD.lock / refs/**/*.lock before every git write, and we setsid-detach our cleanup grandchild so the group-targeted SIGKILL misses it. That's effective but ~150 lines of code purely to recover from the missing grace period. SIGTERM-with-grace would let us delete it.
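
For readers stuck with a workaround in the meantime, a stale-lock sweep might look roughly like the sketch below. It is not the ~150-line guard mentioned above; the function name sweep_stale_locks, the age threshold, and the dry_run flag are assumptions for illustration. File age is only a heuristic: an old lock can still belong to a live, slow git process, so a real guard should also confirm that no git process is working in the repository.

```python
import time
from pathlib import Path

# Lock names taken from the list in this issue; patterns are relative to the git dir.
LOCK_PATTERNS = ("index.lock", "HEAD.lock", "ORIG_HEAD.lock", "refs/**/*.lock")


def sweep_stale_locks(git_dir, min_age_seconds=30.0, dry_run=True):
    """Return lock files under git_dir older than min_age_seconds.

    With dry_run=False the returned files are also deleted.
    """
    git_dir = Path(git_dir)
    now = time.time()
    stale = []
    for pattern in LOCK_PATTERNS:
        for lock in git_dir.glob(pattern):
            try:
                age = now - lock.stat().st_mtime
            except FileNotFoundError:
                continue  # the lock was released between glob() and stat()
            if lock.is_file() and age >= min_age_seconds:
                stale.append(lock)
                if not dry_run:
                    lock.unlink(missing_ok=True)  # missing_ok needs Python 3.8+
    return sorted(stale)
```

Running it once with dry_run=True and inspecting the returned paths before deleting anything keeps the sweep from removing a lock that a live git process still owns.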

Related

  • #37127 — same issue, closed-as-duplicate against an unrelated MCP-server bug; now locked.
  • #31646 — the actually-unrelated MCP-server issue #37127 was merged into, closed NOT_PLANNED.
  • #33979 — meta tracker "Built-in Session Manager (claude sessions) — Consolidates 14+ Orphan/Process Issues". This bug fits there.
  • #54626 — "Scheduled tasks and background sub-agents leak processes/UI state without cleanup". Same root cause.
  • #50865 — "Orphaned background shells from prior sessions persist". Same root cause.

What I'd like

Either re-open #37127 (locked, so probably not possible), or treat this as the canonical bug for the SIGTERM-grace-SIGKILL behavior on session teardown. The fix is small and well-precedented (Docker/k8s/systemd all do it). Happy to test any beta build that ships it.

Environment

  • Claude Code CLI, recent build.
  • Linux x86_64.
  • Sessions launched via claude --bg, torn down via claude rm <short_id>.
