claude-code: Mid-stream ECONNRESET on /v1/messages is fatal; SSE side-channel auto-reconnects, main stream does not

The Anthropic API returned 200 OK on a streaming /v1/messages request, then sent zero bytes for about 20s (the client's stall warning fired at the 15s mark), delivered one chunk, and reset the TCP connection (RST) about 1s later. The error surfaces to the user as:

API Error: The socket connection was closed unexpectedly. For more information, pass verbose: true in the second argument to fetch()

Notably, the SSE side-channel at /v1/code/sessions/.../worker/events/stream dropped within seconds of the main stream and auto-reconnected within ~1.2s. The main /v1/messages stream has no equivalent retry, so any mid-stream blip is fatal and surfaces directly to the user.

Recurring pattern — also caught on 2026-05-19 with the same signature (40 flow failures, 17 upstream RSTs to api.anthropic.com:443 captured via Network.NWError at the macOS network observer layer).

Environment

  • Claude Code: 2.1.146
  • macOS: 26.4
  • Shell: zsh
  • Node: v26.0.0
  • Running 6 parallel long-lived sessions in the same repo (via a claude-loop.sh wrapper that restarts on exit)

Forensic IDs (for server-log lookup)

Field                  Value
request-id             req_011CbFvtMSTLZ7dWV3KFiQ6k
x-client-request-id    d710bd12-4f52-4aeb-8b06-9bba36ff9cd8
cf-ray                 9ff48517ce9402d0-CWB
timestamp (UTC)        2026-05-21T15:12:55.017Z

Timeline (from claude --debug --debug-file)

15:12:33.917  POST /v1/messages?beta=true → 200 OK (durationMs=2287)
15:12:48.914  [WARN] [Stall] stream_idle_partial lastChunkAgeMs=15000 bytesTotal=0 idleDeadlineMs=300000
15:12:53.980  Stream started — first chunk after ~20s of idle
15:12:55.017  [ERROR] code=ECONNRESET, message=The socket connection was closed unexpectedly
15:12:55.018  [ERROR] API error x-client-request-id=d710bd12-4f52-4aeb-8b06-9bba36ff9cd8
15:13:01.810  [ERROR] SSETransport: Stream read error (worker/events/stream) — same error class
15:13:01.811  [DEBUG] SSETransport: Reconnecting in 892ms (attempt 1)
15:13:03.016  [DEBUG] SSETransport: Connected   ← side-channel recovers
15:13:05.023  [WARN]  CCRClient: PUT worker failed: The operation timed out.

Two independent connections from the same client dropped within seconds of each other, which points to a server-side or edge-network event rather than local network conditions.
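
The intervals in the timeline can be checked with a little arithmetic. The sketch below, in TypeScript, parses the HH:MM:SS.mmm timestamps copied from the debug log and computes the gaps between events. The helper names (parseClock, elapsedMs) are illustrative only and are not part of Claude Code.

```typescript
// Parse an "HH:MM:SS.mmm" debug-log timestamp into milliseconds since midnight.
function parseClock(ts: string): number {
  const m = /^(\d{2}):(\d{2}):(\d{2})\.(\d{3})$/.exec(ts);
  if (!m) throw new Error(`unrecognized timestamp: ${ts}`);
  const hours = Number(m[1]);
  const minutes = Number(m[2]);
  const seconds = Number(m[3]);
  const millis = Number(m[4]);
  return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis;
}

// Elapsed milliseconds between two timestamps on the same day.
function elapsedMs(from: string, to: string): number {
  return parseClock(to) - parseClock(from);
}

// Timestamps copied from the timeline above.
const requestSent = "15:12:33.917";
const firstChunk = "15:12:53.980";
const mainReset = "15:12:55.017";
const sideChannelError = "15:13:01.810";
const sideChannelRetryScheduled = "15:13:01.811";
const sideChannelConnected = "15:13:03.016";

console.log(elapsedMs(requestSent, firstChunk));                     // 20063
console.log(elapsedMs(firstChunk, mainReset));                       // 1037
console.log(elapsedMs(mainReset, sideChannelError));                 // 6793
console.log(elapsedMs(sideChannelRetryScheduled, sideChannelConnected)); // 1205
```

On these numbers the main stream was silent for about 20s before its first chunk and was reset about 1s after that. The side-channel read error was logged about 6.8s after the main-stream reset, and its reconnect took about 1.2s.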

Asks

  1. Server-side investigation. Please pull logs for the request IDs above. The stall-then-RST signature matches the 2026-05-19 incident, suggesting a recurring upstream issue (load balancer flap, edge restart, Cloudflare origin RST — exact failure mode opaque from the client side).

  2. Client-side: parity with SSETransport. Add retry-on-mid-stream-ECONNRESET to /v1/messages streaming. The SSE side-channel already does this cleanly (see SSETransport: Reconnecting in 892ms (attempt 1) above). The main message stream having no equivalent means every transient network blip is user-visible and breaks long-running sessions.
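
As an illustration of what ask #2 might look like, here is a minimal TypeScript sketch of a retry wrapper that re-runs an operation when it fails with code ECONNRESET, using capped exponential backoff with jitter. Everything in it is hypothetical: withResetRetry, RetryOptions, and isConnectionReset are names invented for this example and do not describe how Claude Code or SSETransport is implemented. It also assumes the stream cannot be resumed from an offset, so a retry would resend the whole request and any partial output from the failed attempt would have to be discarded.

```typescript
// Hypothetical helper; not part of Claude Code or any Anthropic SDK.
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first
  baseDelayMs: number; // backoff ceiling before the first retry
  maxDelayMs: number;  // upper bound on any single backoff
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

// True when the error carries the ECONNRESET code seen in the log above.
function isConnectionReset(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "ECONNRESET"
  );
}

// Run `attempt`, retrying only on connection resets, up to maxAttempts.
async function withResetRetry<T>(
  attempt: (attemptNumber: number) => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  const sleep =
    opts.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  for (let n = 1; ; n++) {
    try {
      return await attempt(n);
    } catch (err) {
      // Give up on non-reset errors or when the attempt budget is spent.
      if (!isConnectionReset(err) || n >= opts.maxAttempts) throw err;
      // Exponential backoff with full jitter, capped at maxDelayMs.
      const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** (n - 1));
      await sleep(Math.random() * ceiling);
    }
  }
}

// Usage sketch: a simulated stream that is reset twice, then succeeds.
let simulatedCalls = 0;
const demo = withResetRetry(
  async (n) => {
    simulatedCalls++;
    if (n < 3) {
      throw Object.assign(new Error("socket closed"), { code: "ECONNRESET" });
    }
    return `completed on attempt ${n}`;
  },
  { maxAttempts: 3, baseDelayMs: 10, maxDelayMs: 50 },
);
demo.then((result) => console.log(result)); // completed on attempt 3
```

Restricting the retry to ECONNRESET keeps genuine API errors visible, and the attempt cap prevents a persistent outage from turning into an endless loop.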

Local workaround (and why it's not enough)

I have a StopFailure hook (~/.claude/hooks/auto-retry-on-transient-error.sh) that on socket-closed errors plays a sound and sends a "try again" keystroke via AppleScript to the matching iTerm session, with a 3-retry-per-session_id cap.

In this incident, both hooks fired and exited 0:

15:12:55.025  StopFailure:unknown [afplay … Sosumi.aiff] completed with status 0
15:12:55.027  StopFailure:unknown [$HOME/.claude/hooks/auto-retry-on-transient-error.sh] completed with status 0

…but the retry did not actually take effect end-to-end — the session remained in the error state rather than recovering. Possible causes are local (AppleScript-iTerm session matching, terminal focus state, the "try again" keystroke arriving when Claude isn't accepting input), but they illustrate why a userland workaround at this layer is inherently brittle. Native client-side retry (ask #2) would resolve this whole class of issue without the user-built scaffolding.
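
The 3-retry-per-session_id cap mentioned above can be sketched as a small counter keyed by session ID. This is an illustration in TypeScript, not the author's shell hook: createRetryBudget and the in-memory Map are assumptions made for the example, and a hook that runs as a separate process on each failure would presumably need to persist the counts somewhere, for example in a file.

```typescript
// Hypothetical per-session retry budget; names are illustrative only.
function createRetryBudget(maxRetries: number) {
  const used = new Map<string, number>();
  return {
    // Consume one retry for the session; false once the cap is reached.
    tryConsume(sessionId: string): boolean {
      const count = used.get(sessionId) ?? 0;
      if (count >= maxRetries) return false;
      used.set(sessionId, count + 1);
      return true;
    },
    // Retries still available for the session.
    remaining(sessionId: string): number {
      return maxRetries - (used.get(sessionId) ?? 0);
    },
  };
}

const budget = createRetryBudget(3);
const sessionId = "session-a"; // placeholder ID for the example
const outcomes = [1, 2, 3, 4].map(() => budget.tryConsume(sessionId));
console.log(outcomes); // [ true, true, true, false ]
console.log(budget.remaining("session-b")); // 3
```

Keying the count by session keeps one noisy session from exhausting the retries of the other parallel sessions.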

Reproduction

Not deterministic — happens organically when long-running streaming sessions encounter a transient network event. Seen across all 6 parallel sessions, roughly daily, with the same error signature.

