claude-code: Background bash children survive session close; task-output files have no size cap (4.86 TB runaway on Windows) [2 comments, 2 participants]

GitHub stats: anthropics/claude-code#51760, fetched 2026-04-22 07:53:35
Comments: 2 · Participants: 2 · Timeline events: 7 (labeled ×4, commented ×2, cross-referenced ×1) · Reactions: 0


Background bash child processes survive session close; task-output files have no size cap (resulted in 4.86 TB runaway on Windows)

Summary

On Claude Code 2.1.116 (Windows), a background bash process spawned inside a session continued writing to its task-output file for 2 hours 40 minutes after the parent session cleanly ended. The file grew to 4.86 TB (5,215,881,408,512 bytes) before being caught manually. Machine disk was nearly filled. Full forensic writeup available on request.

Two separate harness gaps produced this incident. Either alone would have limited the damage.

Environment

  • Claude Code version: 2.1.116
  • OS: Windows 11 Home 10.0.26200
  • Shell: bash (git-bash) via CLAUDE_CODE_USE_POWERSHELL_TOOL=1 config (both Bash and PowerShell tools available)
  • Disk: 8 TB C: drive

What happened

  1. A Claude Code session opened in a project at 10:21 AM local.
  2. ~20 minutes in (10:41 AM), a task-output file bl8di8hm3.output was created in the session's tasks folder: %TEMP%\claude\<workspace>\<session-uuid>\tasks\.
  3. The file grew continuously at ~300 MB/sec sustained write — every line was a single y\n, consistent with a yes-style streaming loop.
  4. The session ended cleanly at 12:13 PM local (normal shutdown, assistant completed its last message, no crash).
  5. The file continued growing for 2 hours 40 minutes after the session closed, reaching 4.86 TB by 2:53 PM.
  6. Writes eventually stopped (likely disk pressure; unclear).
  7. The user noticed that ~4 TB of disk space had vanished and investigated.

Root cause 1: bash children are not killed on session end (Windows)

On Windows, Claude Code does not place child bash processes in a Job Object with JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE. When the parent CC session exits, child processes are orphaned to the OS and continue running until they finish naturally or are killed externally.

Evidence: 2h 40min of file growth after the only session that could have spawned the process had closed. No other running session had a reason to write to that path (the directory name embeds the source session's UUID).

Proposed fix

Windows: create a Job Object configured with JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE and assign every child started with CreateProcess to it (AssignProcessToJobObject). Children die when the last handle to the job is closed at session exit.

Unix (for parity; unclear if the same bug exists there): setpgid(0, 0) in the child, kill(-pgid, SIGTERM) then SIGKILL at session close if still alive.
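The Unix half of the proposal can be sketched as follows. This is a minimal illustration, not Claude Code's implementation: spawn_in_own_group and kill_group are hypothetical helper names, and the 10 ms polling interval is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: run `cmd` through /bin/sh in its own process group.
 * Returns the child's pid (also its pgid), or -1 if fork fails. */
pid_t spawn_in_own_group(const char *cmd)
{
    assert(cmd != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        setpgid(0, 0);                 /* child leads a new process group */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                    /* only reached if exec failed */
    }
    setpgid(pid, pid);                 /* repeat in the parent to close the race */
    return pid;
}

/* Hypothetical helper: SIGTERM the whole group, give the leader up to
 * grace_ms to exit, then SIGKILL the group. Returns 0 once the leader
 * has been reaped, -1 on error. */
int kill_group(pid_t pgid, int grace_ms)
{
    int status;
    struct timespec tick = {0, 10 * 1000 * 1000};   /* 10 ms */

    kill(-pgid, SIGTERM);
    for (int waited = 0; waited < grace_ms; waited += 10) {
        pid_t r = waitpid(pgid, &status, WNOHANG);
        if (r == pgid)
            return 0;
        if (r < 0)
            return -1;
        nanosleep(&tick, NULL);
    }
    kill(-pgid, SIGKILL);              /* grace period expired */
    return waitpid(pgid, &status, 0) == pgid ? 0 : -1;
}
```

A session-close handler would call kill_group for every pid returned by spawn_in_own_group. Signalling the negative pgid reaches grandchildren too, as long as they have not moved themselves into a different group.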

Root cause 2: task-output files have no upper bound

The task-output file (<temp>/claude/<workspace>/<session>/tasks/<task-id>.output) has no size cap. The harness tees child stdout to the file indefinitely. A single task was allowed to write 4.86 TB to one file.

Proposed fix

Configurable cap with a sane default (500 MB was the number that felt right during post-incident review, though that's a product call). When the cap is hit:

  • Truncate the file (or stop appending)
  • Terminate the child process
  • Return a tool_use error: "Output exceeded {cap}MB, bash process terminated. Consider bounding output or writing to a file."
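A capped copy loop along these lines would cover the "stop appending" variant. It is a sketch under assumptions: copy_with_cap, the copy_result enum, and DEFAULT_CAP_BYTES are hypothetical names, and the 500 MB value is only the default floated above.

```c
#include <assert.h>
#include <stdio.h>

/* 500 MB, the default suggested in this report (a product call, not a given). */
#define DEFAULT_CAP_BYTES ((size_t)500 * 1024 * 1024)

enum copy_result { COPY_EOF = 0, COPY_CAP_HIT = 1, COPY_IO_ERROR = 2 };

/* Hypothetical helper: copy bytes from `in` (child stdout) to `out` (the
 * task-output file), never writing more than cap_bytes. On COPY_CAP_HIT the
 * caller is expected to terminate the child and report the tool_use error. */
enum copy_result copy_with_cap(FILE *in, FILE *out, size_t cap_bytes,
                               size_t *written)
{
    char buf[4096];

    assert(in != NULL && out != NULL && written != NULL);
    *written = 0;
    for (;;) {
        size_t room = cap_bytes - *written;
        if (room == 0) {
            /* Cap reached: distinguish "exactly full" from "more pending". */
            if (fgetc(in) == EOF)
                return ferror(in) ? COPY_IO_ERROR : COPY_EOF;
            return COPY_CAP_HIT;
        }
        size_t want = room < sizeof buf ? room : sizeof buf;
        size_t n = fread(buf, 1, want, in);
        if (n == 0)
            return ferror(in) ? COPY_IO_ERROR : COPY_EOF;
        if (fwrite(buf, 1, n, out) != n)
            return COPY_IO_ERROR;
        *written += n;
    }
}
```

With a loop like this, the yes-style stream in this incident would have stopped at the cap instead of running until the disk filled, independent of whether the child itself was ever killed.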

Root cause 3 (observability, not user-facing but cripples triage)

The task ID bl8di8hm3 appears in zero session JSONLs — not the parent session's jsonl, not any subagent transcripts (which CC stores at <session>/subagents/agent-*.jsonl), not any other same-workspace sessions, not any other workspace's session across the machine.

A sibling task bqnml14w4 (34 KB, created a minute after session start) has the same property: its output file exists on disk, but no tool_use referencing it appears in any transcript.

To rule out "we just didn't search in enough places," a second forensic sweep was performed across 19 additional local channels: CC internal state (shell-snapshots, statsig, telemetry, logs, todos, tasks, sessions, session-env, transcripts, plans, paste-cache, handoffs, cache, backups, file-history, history.jsonl), Windows event logs (Application, System), shell histories (PSReadLine, bash_history), Windows Prefetch, MCP server logs, plugin hook logs, SQLite indexes, system-wide ./ files, WER crash dumps, registry UserAssist run records, and live process state. Every pre-incident search returned zero hits for bl8di8hm3. The only references anywhere on the machine are from post-22:36 local, which is this investigation itself. Full matrix in the post-mortem document, Appendix C.

Two additional side-findings from that sweep that strengthen this root cause:

  • ~/.claude/session-env/8f1c111e-.../ contains only sessionstart-hook-1.sh; 156 of 157 comparable session-env directories also have sessionstart-hook-2.sh. Something aborted between hook 1 and hook 2 for this specific session, OR hook 2 was never registered.
  • ~/.claude/tasks/8f1c111e/ does not exist; that state directory exists for 157 other sessions but not this one. The task state tracking subsystem never registered the runaway session.

Both suggest the session was partially registered by the CC harness. That's consistent with the untraceable spawn: whatever subsystem wrote to the tasks folder bypassed the channels the harness normally uses to log work.

This means some channel writes to <session>/tasks/<task-id>.output without the harness recording the spawn in the session transcript. Candidates include (unranked):

  • Internal harness retries on tool_use failure
  • Crash-recovery paths re-spawning tool calls
  • MCP server tools that delegate to bash
  • Plugin hooks that write into the tasks directory
  • Pre-flight commands (SessionStart, UserPromptSubmit) not captured in the transcript
  • A harness path that activates only when sessionstart-hook-2.sh is absent or session registration is partial

Could not determine which after exhaustive dual-session forensics. The inability to answer "what process wrote this file" for a 5 TB incident is itself a tier-one observability bug.

Proposed fix

Every subprocess whose output is captured in the tasks folder should produce a TaskSpawn entry in the session transcript with: task ID, spawn source (tool_use ID, hook name, harness-internal path), command line, and timestamp. Without this, incidents of this class will remain un-diagnosable.
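As an illustration of what such an entry could look like, the sketch below serializes one record as a JSON line. The field names, the "type" tag, and the function names are assumptions made for this example; they are not Claude Code's transcript schema.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record shape covering the four fields proposed above. */
struct task_spawn {
    const char *task_id;
    const char *spawn_source;   /* tool_use ID, hook name, or internal path */
    const char *command_line;
    const char *timestamp;
};

/* Append raw text to buf, keeping it NUL-terminated. Returns -1 if full. */
static int put_raw(char *buf, size_t cap, size_t *len, const char *s)
{
    size_t n = strlen(s);
    if (*len + n >= cap)
        return -1;
    memcpy(buf + *len, s, n);
    *len += n;
    buf[*len] = '\0';
    return 0;
}

/* Append s as a JSON string, escaping quotes, backslashes, control bytes. */
static int put_json_string(char *buf, size_t cap, size_t *len, const char *s)
{
    char esc[8];
    if (put_raw(buf, cap, len, "\"") != 0)
        return -1;
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (c == '"' || c == '\\')
            snprintf(esc, sizeof esc, "\\%c", c);
        else if (c < 0x20)
            snprintf(esc, sizeof esc, "\\u%04x", c);
        else
            snprintf(esc, sizeof esc, "%c", c);
        if (put_raw(buf, cap, len, esc) != 0)
            return -1;
    }
    return put_raw(buf, cap, len, "\"");
}

/* Format one JSONL line into buf. Returns its length, or -1 if it won't fit. */
int format_task_spawn(const struct task_spawn *t, char *buf, size_t cap)
{
    size_t len = 0;
    assert(t != NULL && buf != NULL && cap > 0);
    buf[0] = '\0';
    if (put_raw(buf, cap, &len, "{\"type\":\"TaskSpawn\",\"task_id\":") ||
        put_json_string(buf, cap, &len, t->task_id) ||
        put_raw(buf, cap, &len, ",\"spawn_source\":") ||
        put_json_string(buf, cap, &len, t->spawn_source) ||
        put_raw(buf, cap, &len, ",\"command_line\":") ||
        put_json_string(buf, cap, &len, t->command_line) ||
        put_raw(buf, cap, &len, ",\"timestamp\":") ||
        put_json_string(buf, cap, &len, t->timestamp) ||
        put_raw(buf, cap, &len, "}\n"))
        return -1;
    return (int)len;
}
```

Writing this line before the child is started, rather than after it exits, is what would have made a task like bl8di8hm3 traceable even if the session ended abnormally.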

Second, make session registration atomic — if it fails partway, the session should fail to start rather than silently proceed without state-tracking directories. The partial-registration pattern above is dangerous because it puts the session into a state where the harness can still spawn work but can no longer track it.
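One simple way to approximate this is all-or-nothing directory creation with rollback, sketched below for POSIX. register_session_dirs is a hypothetical name, and treating registration as "create a fixed list of state directories" is an assumption for illustration. It is not strictly atomic (another process can observe the intermediate state), but it never leaves a session half-registered after returning.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create every per-session state directory in `dirs`.
 * If any mkdir fails, remove the ones already created and return -1 with
 * errno preserved, so the caller can refuse to start the session. */
int register_session_dirs(const char *const dirs[], size_t count)
{
    assert(dirs != NULL);
    for (size_t i = 0; i < count; i++) {
        if (mkdir(dirs[i], 0700) != 0) {
            int saved = errno;
            while (i > 0)
                rmdir(dirs[--i]);      /* roll back in reverse order */
            errno = saved;
            return -1;
        }
    }
    return 0;
}
```

The caller would pass the session's entries under paths such as the session-env and tasks directories mentioned above and abort startup on -1, instead of proceeding with only some of them present.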

Steps to reproduce (not tested, but should suffice)

  1. In a Claude Code session on Windows, run a background bash command that streams to stdout indefinitely, e.g. yes with no redirection (yes > /dev/null will not reproduce it, since nothing reaches the task-output file)
  2. Close the session via normal exit
  3. Observe the task-output file at %TEMP%\claude\<workspace>\<session>\tasks\<task-id>.output
  4. Expected: file size capped and process terminated on session close
  5. Actual: file grows indefinitely; process continues after session exit

Impact

In this specific incident: ~4 TB of disk vanished, multiple downstream sessions degraded because of the full disk, ~1 hour of triage across parallel sessions to diagnose. No data loss, no permanent workload failure, but a close call for "full disk takes the machine down mid-session."

Scale concern: any Claude Code user running a session that happens to spawn a streaming-output process and then closes the session is subject to this. The 4.86 TB size is remarkable only because the user happened to have 8 TB of disk; on a typical 1 TB laptop, this would have filled the disk in ~30 minutes and the machine would have hung unrelated workloads.

What was already done locally

  • Post-mortem written
  • Compensating controls filed locally:
    • Pre/Post-tool guardrails hooks to block obvious streaming patterns and warn on oversized outputs
    • External watchdog in the session-indexer tool that scans %TEMP%/claude/*/tasks/ for orphan task-outputs and surfaces them
  • These are compensating controls, not fixes. The root cause fixes are the three items above.
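The watchdog idea can be sketched as a single-directory scan, shown here for POSIX. This is not the session-indexer tool referred to above: count_oversized_outputs is a hypothetical function, and it uses a size threshold as a stand-in for real orphan detection, which would also need to know whether the owning session is still alive.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical watchdog pass: report every regular "*.output" file in
 * tasks_dir that is larger than threshold_bytes. Returns the number of
 * hits, or -1 if the directory cannot be opened. */
int count_oversized_outputs(const char *tasks_dir, long long threshold_bytes)
{
    DIR *dir;
    struct dirent *entry;
    int hits = 0;

    assert(tasks_dir != NULL);
    dir = opendir(tasks_dir);
    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        char path[4096];
        struct stat st;
        size_t n = strlen(entry->d_name);

        /* Only consider names ending in ".output" (7 characters). */
        if (n < 7 || strcmp(entry->d_name + n - 7, ".output") != 0)
            continue;
        if (snprintf(path, sizeof path, "%s/%s", tasks_dir, entry->d_name)
                >= (int)sizeof path)
            continue;                  /* path too long, skip it */
        if (stat(path, &st) == 0 && S_ISREG(st.st_mode) &&
            (long long)st.st_size > threshold_bytes) {
            printf("oversized task output: %s (%lld bytes)\n",
                   path, (long long)st.st_size);
            hits++;
        }
    }
    closedir(dir);
    return hits;
}
```

Run periodically over each session's tasks directory, a check like this would have flagged the runaway file within one scan interval of it crossing the threshold.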

Extent analysis

TL;DR

To prevent bash child processes from surviving session close and task-output files from growing indefinitely, implement a Job Object with JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE on Windows and consider adding a configurable size cap for task-output files.

Guidance

  • Implement a Job Object with JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE when creating child bash processes on Windows to ensure they are killed when the parent session exits.
  • Add a configurable size cap for task-output files to prevent them from growing indefinitely, with a proposed default of 500 MB.
  • When the size cap is hit, truncate the file, terminate the child process, and return a tool-use error to notify the user.
  • Improve observability by adding a TaskSpawn entry in the session transcript for each subprocess whose output is captured in the tasks folder, including task ID, spawn source, command line, and timestamp.

Example

#include <windows.h>

// Create an unnamed Job Object to hold the session's child processes
HANDLE jobHandle = CreateJobObject(NULL, NULL);

// JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE can only be set through the extended limit structure
JOBOBJECT_EXTENDED_LIMIT_INFORMATION jobLimitInfo = {0};
jobLimitInfo.BasicLimitInformation.LimitFlags = JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE;
SetInformationJobObject(jobHandle, JobObjectExtendedLimitInformation, &jobLimitInfo, sizeof(jobLimitInfo));

// Assign each spawned child to the job; closing the last job handle then terminates them.
// childProcess stands for the hProcess that CreateProcess returns in PROCESS_INFORMATION.
AssignProcessToJobObject(jobHandle, childProcess);

Notes

The proposed fixes address the root causes of the issue, but additional testing and validation are necessary to ensure they work as expected in all scenarios.

Recommendation

Apply the proposed fixes, including implementing a Job Object with JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE and adding a configurable size cap for task-output files, to prevent similar incidents in the future.
