claude-code - 💡(How to fix) Fix [BUG] Claude Code CLI hangs indefinitely after a successful tool_use response is lost in streaming receive [1 comments, 2 participants]

anthropics/claude-code#53328, fetched 2026-04-26 05:18:34


Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report (please file separate reports for different bugs)
  • I am using the latest version of Claude Code

What's Wrong?

Claude Code CLI hangs indefinitely after a successful tool_use response is lost in streaming receive

Closest existing issues: #46767 ("missing tool result" regression). Also related: #47996, #50727, #50185, #44783, #52544. None appear fully fixed — symptoms keep resurfacing.

Summary

When Claude Code runs in a tmux pane and executes a tool call that completes successfully server-side, there is a window in which the CLI can lose the streamed tool_result and the UI remains in the Running… / Undulating… (Ns · ↓ N tokens) state forever. The HTTP connections stay healthy and idle (no kernel-side queued data), the main thread enters a low-CPU livelock (wakes ~100×/sec in epoll_wait with no forward progress), and no transcript events are written. Pressing ESC in the terminal cleanly cancels the in-flight receive; the CLI then re-syncs and reports the operation had already succeeded.

Actual

  • UI stuck on Running… indefinitely. Receive indicator shows a fixed byte count (e.g. ↓ 23.5k tokens) for hours with no progression.
  • Queue events (queue-operation/enqueue) accumulate as the user types into the stuck prompt, but are never dequeued.
  • No transcript events are written for the duration of the hang.
  • ESC cancels the stuck receive; the CLI then discovers the tool had already succeeded and continues normally.

Forensic evidence

Captured while hung (before sending ESC):

| Source | Shows |
| --- | --- |
| `/proc/<pid>/status`, thread stacks | Main thread in `do_epoll_wait`; 11 threads; process alive |
| `lsof` + `ss -tnpo` | 4 ESTABLISHED TLS connections to the API endpoint with zero Recv-Q / Send-Q; kernel sees no buffered data |
| `/proc` polling (6 samples, 2s apart) | ~8% main-thread CPU, ~10% total over 20s. Normal idle CLI is < 1% |
| Process tree | tmux pane still attached; pty intact; not a detached-pty scenario |
| Transcript `.jsonl` | Last event before hang is the tool_use at 19:31:55; no further writes for 2h13m |

Empty Recv-Q / Send-Q on all four connections is the key data point: there is no buffered data for the CLI to read. Either the server's response was already fully consumed at the kernel-user-space boundary (and something in the CLI failed to process it), or it never arrived and no timeout fires.
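The per-thread CPU figures in the table can be gathered with a small `/proc` sampler. The following is a minimal sketch of one way to do it, not the reporter's actual tooling: the function names are hypothetical, and it assumes the standard Linux `/proc/<pid>/task/<tid>/stat` layout, where `utime` and `stime` are fields 14 and 15.

```python
import os
import time


def cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from one /proc/.../stat line."""
    # The comm field is wrapped in parentheses and may contain spaces,
    # so split on the last ')' before tokenizing the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so fields 14 and 15 sit at indexes 11 and 12.
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before: int, ticks_after: int, seconds: float, hz: int) -> float:
    """CPU use between two samples, as a percentage of one core."""
    return 100.0 * (ticks_after - ticks_before) / (hz * seconds)


def sample_thread(pid: int, tid: int, interval: float = 2.0) -> float:
    """Sample one thread twice, `interval` seconds apart (Linux only)."""
    path = f"/proc/{pid}/task/{tid}/stat"
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    with open(path) as f:
        before = cpu_ticks(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = cpu_ticks(f.read())
    return cpu_percent(before, after, interval, hz)
```

On Linux the main thread's tid equals the pid, so `sample_thread(pid, pid)` measures the main thread. A value that stays well above the idle baseline while the transcript stops growing matches the livelock signature described above.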

Happy to share full sanitized captures (proc-state, thread-stacks, fd-and-sockets, CPU-poll samples) if useful — drop a note here.

Hypotheses

  1. Race in SSE / streaming-response parser. A final event (likely the message_stop or equivalent) arrives concurrently with an internal state transition and is dropped; the fetch promise neither resolves nor rejects.
  2. Timer drift in keepalive / watchdog. The CLI's own watchdog for stuck requests may be reset by the process's hot-polling loop and never fires.
  3. Interaction with tmux SIGWINCH or raw-mode toggle. Two /dev/pts/* fds were opened around the time the hang set in (prior to the triggering tool call), suggesting a raw-mode / signal-handler pipe manipulation. A dropped signal could leave an event-loop handler dangling.

Suggested mitigations

  1. Hard deadline on streaming receives. If no bytes arrive for N seconds on an in-flight tool_result receive, abort with an error the UI can surface and optionally auto-retry.
  2. Dispatch-layer healthcheck. The main event loop waking ~100×/sec with no progress is a diagnosable signature; a lightweight heartbeat comparing expected-advancing state vs. observed could detect the livelock and emit diagnostics.
  3. Persist enough state that an in-flight receive can resume after an ESC. Today ESC simply drops the receive; the CLI then relies on re-sending the context. For long tool results this is wasteful. A checkpoint of last-seen server-side event id would let the client resume.
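The first mitigation can be illustrated with a generic idle-deadline wrapper around an asynchronous byte stream. This is a sketch of the technique only, written in Python with `asyncio`; it is not the CLI's code, and `with_idle_deadline` and `StreamStalled` are hypothetical names. The deadline restarts on every chunk, so it bounds silence rather than total response time.

```python
import asyncio
from typing import AsyncIterator


class StreamStalled(Exception):
    """Raised when no chunk arrives within the idle deadline."""


async def with_idle_deadline(
    stream: AsyncIterator[bytes], idle_seconds: float
) -> AsyncIterator[bytes]:
    """Re-yield chunks from `stream`, failing if it stays silent too long."""
    it = stream.__aiter__()
    while True:
        try:
            # The timeout applies to each chunk separately, so a slow but
            # steadily progressing stream is never interrupted.
            chunk = await asyncio.wait_for(it.__anext__(), timeout=idle_seconds)
        except StopAsyncIteration:
            return  # the stream ended normally
        except asyncio.TimeoutError:
            # Surface the stall so the caller can show an error or retry.
            raise StreamStalled(f"no data for {idle_seconds} s") from None
        yield chunk
```

A caller would iterate with `async for chunk in with_idle_deadline(response_stream, 60.0)` and handle `StreamStalled` by reporting the failure or re-issuing the request. The value of N is a policy choice that the issue leaves open.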

Notes

  • No destructive action was required — tmux send-keys -t <pane> Escape cleaned up the state fully.
  • A user-side watchdog now auto-detects and auto-recovers this specific pattern (sends Escape via tmux send-keys). That is a workaround, not a fix.
  • After the recent CLI upgrade this hang has become very common in our environment — multiple times per hour during active use. Filing now because the watchdog can mask the symptom but not the root cause.
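The issue does not show the watchdog itself, so the following is only a guess at its shape. It assumes a stall can be recognized by the `↓ N tokens` indicator staying unchanged in the pane for a threshold period; the regex, thresholds, and all names are hypothetical. The only tmux commands used are `capture-pane -p -t` and `send-keys -t`.

```python
import re
import subprocess
import time
from typing import Optional

# Assumed shape of the receive indicator, e.g. "↓ 23.5k tokens".
TOKENS_RE = re.compile(r"↓\s*([\d.]+k?)\s*tokens")


class StallDetector:
    """Flags a pane whose token counter has not changed for too long."""

    def __init__(self, threshold_seconds: float) -> None:
        self.threshold = threshold_seconds
        self.last_value: Optional[str] = None
        self.since = 0.0

    def observe(self, screen: str, now: float) -> bool:
        match = TOKENS_RE.search(screen)
        if match is None:
            self.last_value = None  # no receive in flight
            return False
        value = match.group(1)
        if value != self.last_value:
            self.last_value = value  # counter moved: restart the clock
            self.since = now
            return False
        return now - self.since >= self.threshold


def capture_pane(pane: str) -> str:
    """Return the visible contents of a tmux pane."""
    result = subprocess.run(
        ["tmux", "capture-pane", "-p", "-t", pane],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def send_escape(pane: str) -> None:
    subprocess.run(["tmux", "send-keys", "-t", pane, "Escape"], check=True)


def watch(pane: str, threshold_seconds: float = 600.0, poll_seconds: float = 10.0) -> None:
    detector = StallDetector(threshold_seconds)
    while True:
        if detector.observe(capture_pane(pane), time.monotonic()):
            send_escape(pane)
            detector.last_value = None  # start over after the recovery attempt
        time.sleep(poll_seconds)
```

This heuristic cannot tell a lost receive from a tool that is legitimately slow, so a threshold that is too short would cancel real work. That limitation is one more reason to treat it as a workaround rather than a fix.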

What Should Happen?

  • After a successful tool invocation, the tool_result is delivered and the conversation continues.
  • If the streaming receive fails, the CLI should surface an error or auto-retry, not hang silently.

Error Messages/Logs

Not deterministic on demand. Organic reproduction captured 2026-04-24 on:

- CLI version: `2.1.118`
- Platform: Linux (LXC container, Proxmox host, Debian 12)
- Terminal: `tmux` 3.3a
- Session start: 2026-04-23 16:09 CEST
- Hang start: 2026-04-24 19:31:55 UTC (`toolu_01KkzPcqKkt2FBQ6Mt7jHg8S`)
- Hang duration: 2 h 13 m 53 s until `ESC` recovered it at 21:45:48 UTC

The triggering tool call was a compound Bash command of the shape:


```bash
git stash push -m "pre-pull stash" <file-a> <file-b> && git pull --rebase && git stash pop
```


The command succeeded on disk (confirmed via `/status` after recovery: *"Git state: clean rebase (stash/pop completed successfully)"*). The hang was on the client's side of the receive stream.

Steps to Reproduce

  1. Run Claude in a detached tmux session on Ubuntu 24
  2. Start /remote-control
  3. Detach the terminal
  4. Remotely manage via mobile app or desktop app.
  5. Use the session actively for a longer period

Claude Model

Opus

Is this a regression?

No, this never worked

Last Working Version

No response

Claude Code Version

2.1.119

Platform

Anthropic API

Operating System

Ubuntu/Debian Linux

Terminal/Shell

Other

Additional Information

I suspect the hit rate is higher when auto-mode is not used, but I can't prove that. Sometimes it works well for 24 h, sometimes for just a few minutes.

extent analysis

TL;DR

Implement a hard deadline on streaming receives to abort and retry the operation if no bytes arrive for a specified time.

Guidance

  • Investigate the streaming-response parser for potential race conditions that could cause the tool_result to be lost.
  • Consider adding a dispatch-layer healthcheck to detect livelocks and emit diagnostics.
  • Implement a hard deadline on streaming receives, such as aborting with an error and optionally auto-retrying if no bytes arrive for N seconds.
  • Review the interaction with tmux SIGWINCH or raw-mode toggle to ensure that signal handlers are properly managed.
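The dispatch-layer healthcheck could take the form of a counter pair that is inspected once per fixed window. The sketch below shows the idea in Python under stated assumptions: the class name, the hooks, and the threshold are all hypothetical, and the issue does not describe where such hooks would sit in the CLI's event loop.

```python
from dataclasses import dataclass


@dataclass
class LoopHeartbeat:
    """Counts loop wakeups and forward progress within one check window."""

    min_wakeups: int = 50   # hypothetical threshold for "busy" per window
    wakeups: int = 0
    progress: int = 0
    in_flight: bool = False  # True while a request awaits its response

    def on_wakeup(self) -> None:
        """Call each time the event loop returns from its wait."""
        self.wakeups += 1

    def on_progress(self) -> None:
        """Call when bytes are parsed or a transcript event is written."""
        self.progress += 1

    def check(self) -> bool:
        """Run once per window; True means the loop is busy but stuck."""
        suspected = (
            self.in_flight
            and self.wakeups >= self.min_wakeups
            and self.progress == 0
        )
        # Reset so that each window is judged on its own.
        self.wakeups = 0
        self.progress = 0
        return suspected
```

With a one-second window, the reported rate of roughly 100 wakeups per second and zero progress would trip `check()`, at which point the client could dump diagnostics or abort the receive.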

Example

The issue does not include the CLI's source code, so no fix specific to its internals can be shown.

Notes

The provided information suggests that the issue is related to the streaming receive mechanism and the interaction with tmux. However, without more detailed information about the code and the specific conditions under which the issue occurs, it is difficult to provide a more specific solution.

Recommendation

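Until an upstream fix lands, the only user-side workaround reported in the issue is sending Escape to the stuck pane (tmux send-keys -t <pane> Escape), which can be automated with a watchdog. The durable fix belongs in the CLI: a hard deadline on streaming receives that aborts and retries the operation if no bytes arrive for a specified time.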
