claude-code - 💡(How to fix) Fix [BUG] Claude Code CLI hangs indefinitely after a successful tool_use response is lost in streaming receive [1 comments, 2 participants]

claude-code · 2026-04-25 18:30:41

anthropics/claude-code#53328 • Fetched 2026-04-26 05:18:34

Author

kimesbenjorgensen

Participants

github-actions[bot]

kimesbenjorgensen

When Claude Code runs in a tmux pane and executes a tool call that completes successfully server-side, there is a window in which the CLI can lose the streamed tool_result and the UI remains in the Running… / Undulating… (Ns · ↓ N tokens) state forever. The HTTP connections stay healthy and idle (no kernel-side queued data), the main thread enters a low-CPU livelock (wakes ~100×/sec in epoll_wait with no forward progress), and no transcript events are written. Pressing ESC in the terminal cleanly cancels the in-flight receive; the CLI then re-syncs and reports the operation had already succeeded.

Preflight Checklist

I have searched existing issues and this hasn't been reported yet
This is a single bug report (please file separate reports for different bugs)
I am using the latest version of Claude Code

What's Wrong?

Claude Code CLI hangs indefinitely after a successful tool_use response is lost in streaming receive

Closest existing issues: #46767 ("missing tool result" regression). Also related: #47996, #50727, #50185, #44783, #52544. None appear fully fixed — symptoms keep resurfacing.

Summary

Actual

- UI stuck on Running… indefinitely. Receive indicator shows a fixed byte count (e.g. ↓ 23.5k tokens) for hours with no progression.
- Queue events (queue-operation/enqueue) accumulate as the user types into the stuck prompt, but are never dequeued.
- No transcript events are written for the duration of the hang.
- ESC cancels the stuck receive; the CLI then discovers the tool had already succeeded and continues normally.

Forensic evidence

Captured while hung (before sending ESC):

| Source | Shows |
| --- | --- |
| `/proc/<pid>/status`, thread stacks | Main thread in `do_epoll_wait`; 11 threads; process alive |
| `lsof` + `ss -tnpo` | 4 `ESTABLISHED` TLS connections to the API endpoint with zero Recv-Q / Send-Q; the kernel sees no buffered data |
| `/proc` polling (6 samples, 2s apart) | ~8% main-thread CPU, ~10% total over 20s. Normal idle CLI is < 1% |
| Process tree | tmux pane still attached; pty intact; not a detached-pty scenario |
| Transcript `.jsonl` | Last event before hang is the tool_use at 19:31:55; no further writes for 2h13m |

Empty Recv-Q / Send-Q on all four connections is the key data point: there is no buffered data for the CLI to read. Either the server's response was already fully consumed at the kernel-user-space boundary (and something in the CLI failed to process it), or it never arrived and no timeout fires.
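The queue check above can be repeated mechanically. The following is a minimal sketch, assuming the usual column order of `ss -tn` output (State, Recv-Q, Send-Q, local address, peer address); the helper names and the sample addresses are hypothetical and not taken from the captures.

```python
from typing import List, NamedTuple


class SocketRow(NamedTuple):
    state: str
    recv_q: int
    send_q: int
    local: str
    peer: str


def parse_ss(output: str) -> List[SocketRow]:
    """Parse `ss -tn`-style text into rows, skipping the header line."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # A socket row has at least five columns and numeric queue sizes;
        # the header ("Recv-Q", "Send-Q") fails the isdigit() test.
        if len(fields) < 5 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        rows.append(
            SocketRow(fields[0], int(fields[1]), int(fields[2]), fields[3], fields[4])
        )
    return rows


def all_queues_empty(rows: List[SocketRow]) -> bool:
    """True if there is at least one ESTAB socket and none has queued bytes."""
    established = [r for r in rows if r.state == "ESTAB"]
    return bool(established) and all(
        r.recv_q == 0 and r.send_q == 0 for r in established
    )


# Hypothetical sample; the addresses are documentation-range placeholders.
SAMPLE = """\
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 0 10.0.0.5:51234 203.0.113.10:443
ESTAB 0 0 10.0.0.5:51236 203.0.113.10:443
"""
```

If `all_queues_empty(parse_ss(...))` holds while the UI is stuck, the kernel has nothing left to hand to the process, which matches the signature described above.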

Happy to share full sanitized captures (proc-state, thread-stacks, fd-and-sockets, CPU-poll samples) if useful — drop a note here.

Hypotheses

- Race in SSE / streaming-response parser. A final event (likely the message_stop or equivalent) arrives concurrently with an internal state transition and is dropped; the fetch promise neither resolves nor rejects.
- Timer drift in keepalive / watchdog. The CLI's own watchdog for stuck requests may be reset by the process's hot-polling loop and never fires.
- Interaction with tmux SIGWINCH or raw-mode toggle. Two /dev/pts/* fds were opened around the time the hang set in (prior to the triggering tool call), suggesting a raw-mode / signal-handler pipe manipulation. A dropped signal could leave an event-loop handler dangling.

Suggested mitigations

- Hard deadline on streaming receives. If no bytes arrive for N seconds on an in-flight tool_result receive, abort with an error the UI can surface and optionally auto-retry.
- Dispatch-layer healthcheck. The main event loop waking ~100×/sec with no progress is a diagnosable signature; a lightweight heartbeat comparing expected-advancing state vs. observed could detect the livelock and emit diagnostics.
- Persist enough state that an in-flight receive can resume after an ESC. Today ESC simply drops the receive; the CLI then relies on re-sending the context. For long tool results this is wasteful. A checkpoint of last-seen server-side event id would let the client resume.

Notes

- No destructive action was required: `tmux send-keys -t <pane> Escape` cleaned up the state fully.
- A user-side watchdog now auto-detects and auto-recovers this specific pattern (sends Escape via tmux send-keys). That is a workaround, not a fix.
- After the recent CLI upgrade this hang has become very common in our environment, occurring multiple times per hour during active use. Filing now because the watchdog can mask the symptom but not the root cause.
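The report does not show the watchdog itself. The sketch below is one way such a user-side watchdog could look, under the assumption that a stalled session is detected by the transcript `.jsonl` file not being modified for a while; that detection rule, the function names, and the thresholds are hypothetical. Only the `tmux send-keys -t <pane> Escape` command comes from the report.

```python
import os
import subprocess
import time
from typing import Callable, List, Optional


def escape_command(pane: str) -> List[str]:
    """Build the tmux command that sends Escape to the given pane."""
    return ["tmux", "send-keys", "-t", pane, "Escape"]


def is_stalled(
    transcript_path: str, stall_seconds: float, now: Optional[float] = None
) -> bool:
    """Assumed heuristic: transcript not written to for stall_seconds."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(transcript_path) > stall_seconds


def watchdog_tick(
    transcript_path: str,
    pane: str,
    stall_seconds: float,
    run: Callable[..., object] = subprocess.run,
) -> bool:
    """Send Escape once if the transcript looks stalled; report whether it did."""
    if not is_stalled(transcript_path, stall_seconds):
        return False
    # check=False: a missing pane should not crash the watchdog loop.
    run(escape_command(pane), check=False)
    return True
```

A stale transcript alone does not distinguish a hung receive from a session that is simply idle at the prompt, so a real watchdog would need a second signal, for example the pane still showing Running…, before sending Escape.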

What Should Happen?

After a successful tool invocation, the tool_result is delivered and the conversation continues.
If the streaming receive fails, the CLI should surface an error or auto-retry, not hang silently.

Error Messages/Logs

Not deterministic on demand. Organic reproduction captured 2026-04-24 on:

- CLI version: `2.1.118`
- Platform: Linux (LXC container, Proxmox host, Debian 12)
- Terminal: `tmux` 3.3a
- Session start: 2026-04-23 16:09 CEST
- Hang start: 2026-04-24 19:31:55 UTC (`toolu_01KkzPcqKkt2FBQ6Mt7jHg8S`)
- Hang duration: 2 h 13 m 53 s until `ESC` recovered it at 21:45:48 UTC

The triggering tool call was a compound Bash command of the shape:


git stash push -m "pre-pull stash" <file-a> <file-b> && git pull --rebase && git stash pop


The command succeeded on disk (confirmed via `/status` after recovery: *"Git state: clean rebase (stash/pop completed successfully)"*). The hang was on the client's side of the receive stream.

Steps to Reproduce

Run Claude in a detached tmux session on Ubuntu 24
Start /remote-control
Detach the terminal
Remotely manage the session via the mobile app or desktop app
Use the session actively for a longer period

Claude Model

Opus

Is this a regression?

No, this never worked

Last Working Version

No response

Claude Code Version

2.1.119

Platform

Anthropic API

Operating System

Ubuntu/Debian Linux

Terminal/Shell

Other

Additional Information

I suspect the hit rate is higher when auto-mode is not used, but I can't prove that. Sometimes it works well for 24 h, sometimes for just a few minutes.

extent analysis

TL;DR

Implement a hard deadline on streaming receives to abort and retry the operation if no bytes arrive for a specified time.

Guidance

Investigate the streaming-response parser for potential race conditions that could cause the tool_result to be lost.
Consider adding a dispatch-layer healthcheck to detect livelocks and emit diagnostics.
Implement a hard deadline on streaming receives, such as aborting with an error and optionally auto-retrying if no bytes arrive for N seconds.
Review the interaction with tmux SIGWINCH or raw-mode toggle to ensure that signal handlers are properly managed.
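The dispatch-layer healthcheck mentioned above can be sketched as two counters compared once per heartbeat interval. This is an illustrative Python sketch, not the CLI's code; `LivelockDetector` and its threshold are hypothetical, with the default of 500 chosen to match roughly 100 wakeups per second over an assumed 5-second interval.

```python
from dataclasses import dataclass


@dataclass
class LivelockDetector:
    """Flags intervals where the loop woke often but nothing advanced."""

    min_wakeups: int = 500  # assumed: ~100 wakeups/sec over a 5 s heartbeat
    wakeups: int = 0
    progress: int = 0
    _last_wakeups: int = 0
    _last_progress: int = 0

    def note_wakeup(self) -> None:
        """Call each time the event loop returns from its wait."""
        self.wakeups += 1

    def note_progress(self) -> None:
        """Call when real work advances, e.g. a stream chunk is processed."""
        self.progress += 1

    def check(self) -> bool:
        """Call once per heartbeat; True means spinning without progress."""
        spun = self.wakeups - self._last_wakeups
        advanced = self.progress - self._last_progress
        self._last_wakeups = self.wakeups
        self._last_progress = self.progress
        return spun >= self.min_wakeups and advanced == 0
```

Requiring a minimum wakeup count keeps a genuinely idle loop, which wakes rarely, from being reported as a livelock.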

Example

No code snippet is provided as the issue does not contain sufficient information to create a specific example.

Notes

The provided information suggests that the issue is related to the streaming receive mechanism and the interaction with tmux. However, without more detailed information about the code and the specific conditions under which the issue occurs, it is difficult to provide a more specific solution.

Recommendation

Implement a hard deadline on streaming receives in the CLI so that the operation is aborted and retried when no bytes arrive within a specified time. Until such a fix ships, sending Escape to the stuck pane (for example with tmux send-keys) recovers a hung session, as the reporter's watchdog does.
