claude-code - 💡(How to fix) Fix macOS: zsh -c -l Bash-tool wrappers leak post-completion in run_init_scripts → execif, idle Ss with no CPU [1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
anthropics/claude-code#54993Fetched 2026-05-01 05:49:03
View on GitHub
Comments
0
Participants
1
Timeline
3
Reactions
0
Participants
Timeline (top)
labeled ×3

On macOS with a nix-managed zsh, the Bash-tool wrapper (zsh -c -l 'setopt … && eval <cmd> < /dev/null && pwd -P >| /tmp/claude-XXXX-cwd') occasionally fails to exit after the eval completes. The shell sits idle (Ss, ~1 MB RSS, 0 CPU) with the task-output file already written and closed, and the call stack stuck inside zsh's run_init_scripts. Across a single multi-hour session today I accumulated 61 of these orphans, ~25 of them parented to the still-live Claude PID (so this is not "session-end" leakage — the parent claude is fine, the children just won't exit).

Calling this a fourth distinct failure mode of the wrapper template, alongside #50191 (concurrent-dispatch accumulation), #33947 (session-end MCP/subagent orphans), and #50865 (re-triggering after session end). This one is post-completion non-exit on a single-session interactive workflow. No subagents involved, no concurrency cap needed, no respawn loop, no pipe break — the eval finished, the output landed, the shell just doesn't terminate.

Root Cause

The one environmental difference I can see between leaking and non-leaking calls is stdin: leaked shells have a unix socket on FD 0 (inherited from claude); my manual test has a TTY. I haven't confirmed this as causal because I can't easily reproduce a unix-socket stdin from outside the harness.

Fix Action

Fix / Workaround

Calling this a fourth distinct failure mode of the wrapper template, alongside #50191 (concurrent-dispatch accumulation), #33947 (session-end MCP/subagent orphans), and #50865 (re-triggering after session end). This one is post-completion non-exit on a single-session interactive workflow. No subagents involved, no concurrency cap needed, no respawn loop, no pipe break — the eval finished, the output landed, the shell just doesn't terminate.

$ sample 13389 1
Call graph:
    882 Thread_xxxx  DispatchQueue_1: com.apple.main-thread (serial)
      882 start  (in dyld) + 3056
        882 zsh_main  (in zsh) + 955
          882 run_init_scripts  (in zsh) + 250
            882 source  (in zsh) + 758
              882 loop  (in zsh) + 847
                882 execode  (in zsh) + 202
                  882 execlist  (in zsh) + 1999
                    882 ???
                      882 ???
                        882 ???
                          882 execif  (in zsh) + 430
                            882 execlist  (in zsh) + 1999
  • #50191 — concurrent-dispatch wrapper accumulation (during subagent storm). Same wrapper template; different lifecycle (active CPU consumption during legitimate work).
  • #33947 — macOS PPID=1 orphan accumulation, MCP/subagent processes on session end. Same OS, same orphan pattern; different fork source.
  • #50865 — orphaned background shells re-triggering commands in new sessions. Different failure (unintended re-execution); same wrapper persistence underneath.
  • #43944 — backgrounded run_in_background: true processes surviving session exit. Different scope (post-session intentional-bg vs mid-session synchronous).

Code Example

$ sample 13389 1
Call graph:
    882 Thread_xxxx  DispatchQueue_1: com.apple.main-thread (serial)
      882 start  (in dyld) + 3056
        882 zsh_main  (in zsh) + 955
          882 run_init_scripts  (in zsh) + 250
            882 source  (in zsh) + 758
              882 loop  (in zsh) + 847
                882 execode  (in zsh) + 202
                  882 execlist  (in zsh) + 1999
                    882 ???
                      882 ???
                        882 ???
                          882 execif  (in zsh) + 430
                            882 execlist  (in zsh) + 1999

---

$ lsof -p 13389 | grep -vE "REG.*nix-profile|libsystem"
zsh 13389 markus cwd  DIR    /Users/markus/Code/BYTEPOETS/bon26/bonelio26-angular
zsh 13389 markus  0u  unix   ->  (peer of claude.exe parent)
zsh 13389 markus  1w  REG    /private/tmp/claude-501/<session>/tasks/b8up4vpyh.output
zsh 13389 markus  2w  REG    /private/tmp/claude-501/<session>/tasks/b8up4vpyh.output
zsh 13389 markus 10r  REG    /private/etc/zprofile  ← still open
RAW_BUFFERClick to expand / collapse

Summary

On macOS with a nix-managed zsh, the Bash-tool wrapper (zsh -c -l 'setopt … && eval <cmd> < /dev/null && pwd -P >| /tmp/claude-XXXX-cwd') occasionally fails to exit after the eval completes. The shell sits idle (Ss, ~1 MB RSS, 0 CPU) with the task-output file already written and closed, and the call stack stuck inside zsh's run_init_scripts. Across a single multi-hour session today I accumulated 61 of these orphans, ~25 of them parented to the still-live Claude PID (so this is not "session-end" leakage — the parent claude is fine, the children just won't exit).

Calling this a fourth distinct failure mode of the wrapper template, alongside #50191 (concurrent-dispatch accumulation), #33947 (session-end MCP/subagent orphans), and #50865 (re-triggering after session end). This one is post-completion non-exit on a single-session interactive workflow. No subagents involved, no concurrency cap needed, no respawn loop, no pipe break — the eval finished, the output landed, the shell just doesn't terminate.

Environment

  • Claude Code version: 2.x (current, this session)
  • OS: macOS 15.7.5 (Darwin 24.6.0) on Intel iMac
  • Shell: zsh 5.9 from nix-profile (/nix/store/n99xi7hrb5wyqk70n6dvlryi8m6iyhhx-zsh-5.9/bin/zsh)
  • Login flag: harness uses zsh -c -l (sources /etc/zprofile, ~/.zshenv, ~/.zprofile)
  • ~/.zshenv is a symlink into the home-manager nix store; sources hm-session-vars.sh and sets ZDOTDIR
  • ~/.zshrc has compinit, starship init zsh, and direnv hook zsh evals — but none of these run for -c -l (non-interactive)
  • /etc/zprofile contains only the standard macOS path_helper eval

Reproduction signal (intermittent — couldn't capture deterministically)

I cannot trigger the leak on demand with a simple zsh -c -l 'true' from my own shell — that consistently exits in <1 s. Manual repro of the exact harness pattern (zsh -c -l "setopt … && eval 'echo HI' < /dev/null && pwd -P >| /tmp/test-leak-$$") also exits cleanly. The leak only happens when launched by Claude Code's Bash tool, and only intermittently — across a long session, ~30% of calls leak; the other ~70% complete and exit normally.

The one environmental difference I can see between leaking and non-leaking calls is stdin: leaked shells have a unix socket on FD 0 (inherited from claude); my manual test has a TTY. I haven't confirmed this as causal because I can't easily reproduce a unix-socket stdin from outside the harness.

Evidence — call stack of a leaked shell (post-completion, idle)

A leaked shell from this session, PID 13389, was launched ~3h13m before I captured this. Its task-output file has been written and closed (1 byte, the expected gh api result is in there). The shell has no children, no zombies, state is Ss, no wchan. Yet it never exits.

$ sample 13389 1
Call graph:
    882 Thread_xxxx  DispatchQueue_1: com.apple.main-thread (serial)
      882 start  (in dyld) + 3056
        882 zsh_main  (in zsh) + 955
          882 run_init_scripts  (in zsh) + 250
            882 source  (in zsh) + 758
              882 loop  (in zsh) + 847
                882 execode  (in zsh) + 202
                  882 execlist  (in zsh) + 1999
                    882 ???
                      882 ???
                        882 ???
                          882 execif  (in zsh) + 430
                            882 execlist  (in zsh) + 1999

zsh is paused inside execif (an if block) during init-script execution. All 882 sample frames have the identical stack — it's not making progress.

$ lsof -p 13389 | grep -vE "REG.*nix-profile|libsystem"
zsh 13389 markus cwd  DIR    /Users/markus/Code/BYTEPOETS/bon26/bonelio26-angular
zsh 13389 markus  0u  unix   ->  (peer of claude.exe parent)
zsh 13389 markus  1w  REG    /private/tmp/claude-501/<session>/tasks/b8up4vpyh.output
zsh 13389 markus  2w  REG    /private/tmp/claude-501/<session>/tasks/b8up4vpyh.output
zsh 13389 markus 10r  REG    /private/etc/zprofile  ← still open

FD 10 still holds /etc/zprofile open read-only — the rc-source loop never closed it. Stdin is a unix socket inherited from claude (the inner eval's < /dev/null only redirects the inner command's stdin, not the outer zsh's).

Verified: cleanup is safe

pkill -f 'claude-.*-cwd' reaps every leaked wrapper without affecting subsequent Bash-tool calls (each new call spawns a fresh shell). After a single pkill in a busy session today: leak count dropped from 61 → 4 (the 4 are currently-active calls including the one running the pkill). The harness immediately recovered.

Suggested fixes (in rough impact/complexity order)

  1. Drop -l from the Bash-tool wrapper. Use zsh -c '<cmd>' instead of zsh -c -l '<cmd>'. Most user environment lives in .zshenv (sourced regardless of -l); -l only adds /etc/zprofile, ~/.zprofile, and the corresponding logout hooks. On macOS the only thing in the system zprofile is path_helper, which is also typically already in the user's zshenv via the nix profile setup. Skipping -l would (a) sidestep the file/path most strongly correlated with the hang and (b) shave ~10–15 ms per call. Closes the most evidence-supported leak path with one flag change.
  2. Time-bound the wrapper. After the inner eval has completed and the cwd file has been written, the wrapper should exit within e.g. 100 ms. If not, the Bash tool should SIGTERM it (and SIGKILL after a short grace period). The output file can be read independently — the wrapper's job is done as soon as the output is flushed.
  3. < /dev/null on the outer shell, not just the inner eval. The current pattern's < /dev/null only redirects the inner eval's stdin; the outer zsh inherits whatever stdin claude passes (a unix socket, in our case). Redirecting the outer shell's stdin from /dev/null removes the inherited socket as a possible factor.
  4. Periodic harness-side cleanup. Even with the above, an out-of-band reaper (e.g., on every new Bash-tool call, kill any wrapper in this session that's >60s old and whose output file is closed) would prevent unbounded accumulation when an upstream bug recurs.

Related issues

  • #50191 — concurrent-dispatch wrapper accumulation (during subagent storm). Same wrapper template; different lifecycle (active CPU consumption during legitimate work).
  • #33947 — macOS PPID=1 orphan accumulation, MCP/subagent processes on session end. Same OS, same orphan pattern; different fork source.
  • #50865 — orphaned background shells re-triggering commands in new sessions. Different failure (unintended re-execution); same wrapper persistence underneath.
  • #43944 — backgrounded run_in_background: true processes surviving session exit. Different scope (post-session intentional-bg vs mid-session synchronous).

This report fills in a fourth distinct failure mode of the wrapper template — synchronous calls leaving orphans behind even when the parent claude is alive and well.

Happy to attach on request

  • Full ps -ax output of the 61-shell leak from today's session.
  • Full sample output of multiple leaked PIDs (call stacks identical across all I sampled — same execif frame).
  • lsof of representative leaked PIDs.
  • Reproduction transcript of the harness pattern manually exiting cleanly (i.e. negative repro on bare zsh).

extent analysis

TL;DR

The most likely fix for the issue of Bash-tool wrappers occasionally failing to exit after completion is to drop the -l flag from the zsh command.

Guidance

  • Dropping the -l flag from the Bash-tool wrapper (zsh -c '<cmd>' instead of zsh -c -l '<cmd>') may resolve the issue by sidestepping the file/path most strongly correlated with the hang.
  • Implementing a time-bound wrapper that exits within a certain time frame (e.g., 100 ms) after the inner eval has completed and the cwd file has been written can also help prevent the issue.
  • Redirecting the outer shell's stdin from /dev/null may remove the inherited socket as a possible factor contributing to the issue.
  • Periodic harness-side cleanup can prevent unbounded accumulation of leaked wrappers.

Example

No code snippet is provided as the issue is more related to the configuration and flags used with the zsh command.

Notes

The issue seems to be specific to the combination of macOS, zsh, and the Bash-tool wrapper. The suggested fixes are based on the analysis of the provided information and may not be applicable in all scenarios.

Recommendation

Apply the workaround of dropping the -l flag from the Bash-tool wrapper, as it is a simple and low-risk change that may resolve the issue. This change can help sidestep the file/path most strongly correlated with the hang and shave ~10-15 ms per call.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix macOS: zsh -c -l Bash-tool wrappers leak post-completion in run_init_scripts → execif, idle Ss with no CPU [1 participants]