claude-code - 💡(How to fix) Fix Frequent ECONNRESET failures with Opus 4.6 (1M context) [2 comments, 3 participants]

houmanb · 2026-04-08T11:48:44Z


Fix Action

Fix / Workaround

1. Start Claude Code CLI (v2.1.96) with model claude-opus-4-6 (1M context)
2. Open a conversation in a medium-to-large codebase (~50+ files)
3. Perform heavy tool use that grows context rapidly:
   - Read multiple large files (500+ lines each)
   - Dispatch Agent subagents for codebase exploration
   - Read a large spec/document (~1400 lines, ~25K tokens)
   - Write a large output file (~800 lines)
4. Continue the conversation past ~100K tokens of context
5. At some point during a tool call or response, the connection drops with ECONNRESET
6. Retries fail repeatedly (up to attempt 6-10 of 10) before eventually recovering or timing out

Observations:

The failure is always ECONNRESET — never a timeout, never a 429, never a 500. The TCP connection is reset by the remote side.
Retry backoff doesn't help: if attempt 1 fails, attempts 2-6 usually fail too, suggesting the issue is sustained (seconds to minutes), not a single dropped packet.
When it recovers, it works fine for a while before failing again — no gradual degradation.
This pattern started recently. The same workflow (heavy tool use, large context, Opus 4.6 1M) worked without these failures in prior weeks.
Not related to specific tool types — fails during file reads, Agent dispatches, and plain text responses.

Code Example

Unable to connect to API (ECONNRESET)
Retrying in 10 seconds… (attempt 1/10)

Unable to connect to API (ECONNRESET)
Retrying in 10 seconds… (attempt 2/10)

Unable to connect to API (ECONNRESET)
Retrying in 10 seconds… (attempt 3/10)

Unable to connect to API (ECONNRESET)
Retrying in 10 seconds… (attempt 4/10)

Unable to connect to API (ECONNRESET)
Retrying in 10 seconds… (attempt 5/10)

Unable to connect to API (ECONNRESET)
Retrying in 10 seconds… (attempt 6/10)

This repeats ~20 times over a 24-hour period. Occurs mid-conversation during tool use, not at startup or idle.

Preflight Checklist

I have searched existing issues and this hasn't been reported yet
This is a single bug report (please file separate reports for different bugs)
I am using the latest version of Claude Code

What's Wrong?

Description

Frequent ECONNRESET errors occur during extended conversations with claude-opus-4-6[1m]. The API drops the TCP connection, triggering retry loops that reach attempts 6 to 10 of 10. The session sometimes recovers after retries and sometimes requires restarting the conversation entirely.

Environment

CLI version: 2.1.96
Model: claude-opus-4-6[1m] (1M context)
Platform: macOS Darwin 24.5.0
Network: Stable — Docker builds, K8s deploys, and other HTTP traffic work fine during failures

Reproduction

1. Start a conversation with Opus 4.6 (1M context)
2. Use heavy tool use (multi-file reads, Agent subagents, codebase exploration)
3. Continue until context reaches ~100K-500K tokens
4. Failures start appearing: Unable to connect to API (ECONNRESET)

Frequency

~20 failures over a 24-hour period (2026-04-07 to 2026-04-08). Not transient — retries often fail multiple times before recovering.
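To put numbers on a frequency claim like this, saved terminal output could be tallied with a small script. The following is only a sketch: it assumes the output was copied into a text file by hand, relies solely on the `(attempt N/10)` lines shown in this report, and the name `burst_depths` is made up for illustration. The log carries no timestamps, so the script can count failure bursts and how deep each retry loop went, but not when they happened.

```python
import re
from typing import Iterable, List

# Matches the retry counter printed in the report, e.g. "(attempt 3/10)".
ATTEMPT_RE = re.compile(r"\(attempt (\d+)/(\d+)\)")


def burst_depths(lines: Iterable[str]) -> List[int]:
    """Return the highest attempt number reached in each retry burst.

    A new burst is assumed to start whenever the attempt counter
    does not increase compared with the previous retry line.
    """
    depths: List[int] = []
    previous = 0
    for line in lines:
        match = ATTEMPT_RE.search(line)
        if not match:
            continue  # not a retry line
        attempt = int(match.group(1))
        if not depths or attempt <= previous:
            depths.append(attempt)  # counter reset: a new burst begins
        else:
            depths[-1] = attempt  # same burst, deeper retry
        previous = attempt
    return depths


if __name__ == "__main__":
    sample = [
        "Unable to connect to API (ECONNRESET)",
        "Retrying in 10 seconds… (attempt 1/10)",
        "Unable to connect to API (ECONNRESET)",
        "Retrying in 10 seconds… (attempt 2/10)",
    ]
    depths = burst_depths(sample)
    print(f"{len(depths)} burst(s), deepest retry: {max(depths)}")
```

Counting bursts separately from retry depth keeps "how often does it fail" apart from "how long does each failure last", which are the two distinct claims made in this report.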

Error Output

Unable to connect to API (ECONNRESET)
Retrying in 10 seconds… (attempt 6/10)

Notes

Seems correlated with larger context sizes and heavy tool use
Local network is confirmed stable during failures
Other models/endpoints are not tested for comparison
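One way to back up the "local network is stable" note with data is to run an independent connection probe while a failure is in progress. The sketch below opens a fresh TCP connection, attempts a TLS handshake, and classifies the result. It is an illustration, not part of Claude Code: the function name `probe` and the injectable `connect` parameter are hypothetical, and the host passed in the usage example is an assumption about which endpoint the CLI talks to. A successful probe only shows that new connections can be established; it does not exercise a large-context request.

```python
import socket
import ssl
import time
from typing import Callable, Dict, Union


def probe(
    host: str,
    port: int = 443,
    timeout: float = 5.0,
    connect: Callable[..., socket.socket] = socket.create_connection,
) -> Dict[str, Union[str, float]]:
    """Open one TCP+TLS connection and report how the attempt ended."""
    started = time.monotonic()
    try:
        with connect((host, port), timeout=timeout) as raw:
            context = ssl.create_default_context()
            # The handshake happens inside wrap_socket.
            with context.wrap_socket(raw, server_hostname=host):
                outcome = "ok"
    except ConnectionResetError:
        outcome = "reset"  # the ECONNRESET case described in this report
    except (socket.timeout, TimeoutError):
        outcome = "timeout"
    except OSError as exc:
        outcome = f"error: {exc.__class__.__name__}"
    elapsed = round(time.monotonic() - started, 3)
    return {"host": host, "outcome": outcome, "seconds": elapsed}


if __name__ == "__main__":
    # Assumed endpoint; replace with whatever host your setup uses.
    print(probe("api.anthropic.com"))
```

If the probe reports "ok" while the CLI is still printing ECONNRESET, that would point away from basic reachability and toward something specific to the long-running requests.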

What Should Happen?

API connections should remain stable throughout extended conversations regardless of context size. When using the 1M context model with heavy tool use, the connection should not drop. If the server needs to close a connection, it should do so gracefully, not with a TCP reset, and the client should be able to reconnect on the first retry, not after 6+ attempts.

Steps to Reproduce

Notes on reproduction:

Not deterministic — happens intermittently but frequently (~20 times in 24 hours)
More likely during heavy parallel tool use (multiple file reads, Agent spawns)
Never happens at conversation start — only after significant context accumulation
Local network remains stable during failures (other HTTP traffic works fine)
Observed over multiple separate conversations, not just one session

Claude Model

Not sure / Multiple models

Is this a regression?

Yes, this worked in a previous version

Last Working Version

No response

Claude Code Version

2.1.96 (Claude Code)

Platform

Anthropic API

Operating System

macOS

Terminal/Shell

Terminal.app (macOS)

Additional Information

No proxy, no VPN — direct internet connection
Claude Code installed via npm
Using multiple MCP servers (GenMem, Playwright, Serena, Context7, Sequential Thinking)
Conversation typically involves 10-30 tool calls before failure occurs

Hypothesis: Possible connection pooling or keep-alive issue on the API server side when handling large-context 1M model requests. The sustained retry failures suggest the server is rejecting new connections temporarily, not just dropping one.

extent analysis

TL;DR

The frequent ECONNRESET errors during extended Claude Code conversations most likely stem from connection pooling or keep-alive issues on the API server side, so the most likely fix is to investigate and address those issues there.

Guidance

Investigate the API server's connection pooling and keep-alive settings to ensure they are properly configured to handle large-context 1M model requests.
Verify that the API server is not rejecting new connections temporarily, which could be causing the sustained retry failures.
Consider implementing a more robust retry mechanism on the client side to handle temporary connection drops.
Test the API with smaller context sizes and lighter tool use to see if the issue is specific to large-context requests.
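The "more robust retry mechanism" suggested above could look roughly like the sketch below, which retries only on connection resets and spaces the attempts out with capped exponential backoff plus random jitter instead of a fixed wait. This is not Claude Code's own retry logic and cannot be plugged into the CLI; it is an illustration for scripts that call an API directly. The name `retry_on_reset` and all default values are assumptions chosen for the example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_reset(
    call: Callable[[], T],
    max_attempts: int = 10,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on ConnectionResetError with jittered backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except ConnectionResetError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            # Ceiling doubles each attempt (1s, 2s, 4s, ...) up to max_delay;
            # the actual wait is drawn uniformly below it ("full jitter").
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

Other exceptions are deliberately left to propagate, so only the reset case is retried. If the server really is rejecting new connections for seconds to minutes, as the report suggests, no client-side schedule will avoid the outage; a longer capped backoff only makes it more likely that a later attempt lands after the server has recovered.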

Example

The issue appears to be related to API server configuration rather than client-side code, so no client-side code change is expected to fix it outright.

Notes

The issue may be specific to the Opus 4.6 1M model and heavy tool use, so testing with other models and use cases may help isolate the problem. The fact that the issue started recently and the same workflow worked without failures in prior weeks suggests a potential change on the API server side.

Recommendation

Until the root cause is addressed, work around the problem with a more robust client-side retry mechanism, and investigate the API server's connection pooling and keep-alive settings, since the issue appears to be related to API server configuration.
