claude-code - 💡(How to fix) Fix Bug Report: Background subagent state sync — zombie 'running' agents cause stop hook infinite loop after all subagents have terminated [11 comments, 5 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
anthropics/claude-code#58637Fetched 2026-05-14 03:43:05
View on GitHub
Comments
11
Participants
5
Timeline
16
Reactions
0
Author
Timeline (top)
commented ×11labeled ×4cross-referenced ×1

When spawning multiple background subagents (≥6) in a single turn, the session registry fails to update subagent status from "running" to "completed" after all subagents have actually terminated. This causes the stop hook to block turn completion indefinitely, creating an infinite loop that consumes context until forced compaction.

Root Cause

The root cause appears to be in the subagent lifecycle management:

  1. Subagent processes terminate correctly (files written, no zombie processes)
  2. The session registry's in-memory state table does not receive (or drops) the "completed" status update for one or more subagent entries
  3. Stop hook reads the stale registry state, sees entries still marked "running"
  4. Main agent cannot force-clear or override the stale entries
  5. Loop continues until external intervention

Code Example

Stop hook feedback:
Background subagents are still running.
Use TaskOutput task_id="..." block=true to wait for their results before ending your turn.
RAW_BUFFERClick to expand / collapse

Summary

When spawning multiple background subagents (≥6) in a single turn, the session registry fails to update subagent status from "running" to "completed" after all subagents have actually terminated. This causes the stop hook to block turn completion indefinitely, creating an infinite loop that consumes context until forced compaction.

Environment

  • Claude Code version: 2.1.131
  • Session type: compacted continuation (but original spawning occurred before compaction)

Steps to Reproduce

  1. In a single turn, spawn 6+ background subagents using the Agent tool with run_in_background: true
  2. Wait for all subagents to complete (files written to disk, processes exited)
  3. Main agent attempts to end turn

Expected Behavior

Stop hook recognizes all subagents have terminated, allows turn to end cleanly.

Actual Behavior

Stop hook repeatedly blocks turn ending with:

Stop hook feedback:
Background subagents are still running.
Use TaskOutput task_id="..." block=true to wait for their results before ending your turn.

This repeats indefinitely even after:

  • All subagent processes have exited
  • All output files have been written
  • Subagent metadata files exist on disk

Evidence from Affected Session

  • Subagents spawned: 9 (confirmed by 9 agent-*.meta.json files in session subagent directory)
  • Stop hook blocks: 154 occurrences over ~4.5 hours (first at line 265, last at line 1199 of the JSONL transcript)
  • Time window: 2026-05-13T05:07:07 → 2026-05-13T09:27:04 UTC
  • Result: Context exhaustion → forced /compact, session lost mid-task context

Analysis

The root cause appears to be in the subagent lifecycle management:

  1. Subagent processes terminate correctly (files written, no zombie processes)
  2. The session registry's in-memory state table does not receive (or drops) the "completed" status update for one or more subagent entries
  3. Stop hook reads the stale registry state, sees entries still marked "running"
  4. Main agent cannot force-clear or override the stale entries
  5. Loop continues until external intervention

Additional Impact

After compaction, the stale "running" entries persist into the continued session, causing the same stop hook to fire even in the new session where no background agents were spawned.

Suggested Fix

  1. Add a timeout mechanism — if a registered background subagent has not produced output for N seconds, auto-mark as completed/stale
  2. Allow main agent to acknowledge-and-dismiss stale entries via TaskOutput or a new TaskStop variant
  3. Ensure subagent termination events are synchronously propagated to the session registry before the subagent process exits

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING