claude-code - 💡(How to fix) Fix Background agents silently die on session pause/resume — no completion notification, no work recovery

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Background agents (Agent tool invoked with run_in_background: true) are terminated when the Claude Code session is paused (e.g. closing the laptop, OS sleep, long idle). When the session resumes hours later, those agents are gone from the UI, but no completion notification ever arrives for them, and any uncommitted work in their isolated worktrees is permanently lost. The model has no signal that the agents died — they look indistinguishable from successfully-completed agents.

Root Cause

  • Token waste — agents that almost-finished get nuked, and follow-up agents have to redo work
  • Wall-clock waste — operators discover the loss hours later after dispatching dependent batches
  • Decision making — the model believes the work shipped and proceeds; downstream batches reference work that doesn't actually exist on origin
  • Trust — operators stop trusting background-agent reports because they can't tell live from dead

Fix Action

Fix / Workaround

  1. Dispatch N background agents via the Agent tool with run_in_background: true and isolation: "worktree".
  2. Close the laptop / let the OS sleep / leave the Claude Code session idle for several hours.
  3. Reopen / resume the Claude Code session the next day.
  4. Check the agent panel — agents are gone (no longer listed as running).
  5. Check task notifications — NO completion notifications arrive for the dispatched agents.
  6. Check the agents' isolated worktrees — partial uncommitted work; some may have committed locally but never pushed.
  7. The model continues as if the agents finished successfully and dispatches the next batch, only to discover via manual investigation (git ls-remote + worktree inspection) that nothing landed on origin.
  • Pause window 1 (overnight gap ~16 hours): 3 background agents (finishers for stalled PRs) dispatched at 18:07 IST. Session resumed at 14:51 IST the next day. All 3 agents gone; their worktrees in mid-rebase state, no completion notification ever fired. Required manual diagnosis to figure out what landed vs what was lost.

  • Pause window 2 (overnight gap ~16 hours): A batch of 6 background agents dispatched at 20:44 IST. Session resumed at 12:25 IST the next day. ZERO of the 6 agents produced a PR on origin. ZERO completion notifications fired. All 6 worktrees were in stale partial-work states.

  • Token waste — agents that almost-finished get nuked, and follow-up agents have to redo work

  • Wall-clock waste — operators discover the loss hours later after dispatching dependent batches

  • Decision making — the model believes the work shipped and proceeds; downstream batches reference work that doesn't actually exist on origin

  • Trust — operators stop trusting background-agent reports because they can't tell live from dead

RAW_BUFFERClick to expand / collapse

Summary

Background agents (Agent tool invoked with run_in_background: true) are terminated when the Claude Code session is paused (e.g. closing the laptop, OS sleep, long idle). When the session resumes hours later, those agents are gone from the UI, but no completion notification ever arrives for them, and any uncommitted work in their isolated worktrees is permanently lost. The model has no signal that the agents died — they look indistinguishable from successfully-completed agents.

Repro

  1. Dispatch N background agents via the Agent tool with run_in_background: true and isolation: "worktree".
  2. Close the laptop / let the OS sleep / leave the Claude Code session idle for several hours.
  3. Reopen / resume the Claude Code session the next day.
  4. Check the agent panel — agents are gone (no longer listed as running).
  5. Check task notifications — NO completion notifications arrive for the dispatched agents.
  6. Check the agents' isolated worktrees — partial uncommitted work; some may have committed locally but never pushed.
  7. The model continues as if the agents finished successfully and dispatches the next batch, only to discover via manual investigation (git ls-remote + worktree inspection) that nothing landed on origin.

Evidence from a recent multi-day workflow

  • Pause window 1 (overnight gap ~16 hours): 3 background agents (finishers for stalled PRs) dispatched at 18:07 IST. Session resumed at 14:51 IST the next day. All 3 agents gone; their worktrees in mid-rebase state, no completion notification ever fired. Required manual diagnosis to figure out what landed vs what was lost.
  • Pause window 2 (overnight gap ~16 hours): A batch of 6 background agents dispatched at 20:44 IST. Session resumed at 12:25 IST the next day. ZERO of the 6 agents produced a PR on origin. ZERO completion notifications fired. All 6 worktrees were in stale partial-work states.

Each loss required manual investigation by the operator + the model to figure out:

  • Which agents had committed locally but never pushed
  • Which had pushed but never PR'd
  • Which had PR'd but never merged
  • Which had done nothing at all

Expected behavior — pick one

Option A: Agents persist across session pause/resume. Long-running agents pick back up where they left off; their state is checkpointed to disk on pause.

Option B: On session resume, the harness reports clearly: "N background agents were terminated by session pause at TIMESTAMP. Their outputs may be incomplete. Last-known states: ..." — so the model and operator both know to investigate before continuing.

Actual behavior

Silent loss. Indistinguishable from "agents completed successfully and the notifications just haven't been read yet." Defeats the purpose of background agents for any workflow that spans more than a few hours of active session time.

Impact

  • Token waste — agents that almost-finished get nuked, and follow-up agents have to redo work
  • Wall-clock waste — operators discover the loss hours later after dispatching dependent batches
  • Decision making — the model believes the work shipped and proceeds; downstream batches reference work that doesn't actually exist on origin
  • Trust — operators stop trusting background-agent reports because they can't tell live from dead

Frequency

Encountered 4 times across multi-day sessions. Reliably reproduces every time a Claude Code session is paused for >several hours with running background agents.

Suggested mitigations

  • Persist agent state to disk across pause / resume (Option A above) — ideal.
  • At minimum, on session resume, emit a system reminder listing terminated-by-pause agents with their last-known IDs (Option B above) — would close the silent-loss trap.
  • Add a session_pause hook that lets the harness or user-defined scripts gracefully checkpoint / commit / push work-in-progress before the freeze.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING