claude-code: [BUG] Stream idle timeout / partial response during long tool-use turns on Claude Code Web (Opus 4.7, 1M and non-1M)

GitHub stats
anthropics/claude-code#49619 (fetched 2026-04-17 08:36:03)
Comments: 1
Participants: 2
Timeline: 6
Reactions: 0
Timeline (top): labeled ×5, commented ×1

During turns where the assistant is about to produce a long text output (e.g. drafting a ~400-line markdown design doc after a few Read/Bash tool calls), the stream terminates with:

API Error: Stream idle timeout - partial response received

The error is not triggered by the tool calls themselves — tool results arrive normally. It consistently fires in the window between the last tool result and the start (or middle) of the long text reply.

Workarounds tried (none fully effective)

  • Slimmed CLAUDE.md to reduce per-turn context overhead: still fails.
  • Switched from claude-opus-4-7[1m] to claude-opus-4-7: still fails.
  • Starting a fresh session helps temporarily, but the error returns after the session grows.


Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report (please file separate reports for different bugs)
  • I am using the latest version of Claude Code

What's Wrong?

Environment

  • Platform: Claude Code Web (claude.ai/code)
  • Model(s) affected:
    • claude-opus-4-7[1m] (1M context) — reproducible
    • claude-opus-4-7 (standard context) — also reproducible after switching
  • OS: Linux sandbox (provided by the web harness)
  • Session type: long-running conversation with multiple tool calls (Read / Bash / Grep), working in a git repo
  • Approx. transcript length when error first appeared: mid-session, after several dozen tool calls and a few long assistant messages

Reproduction

  1. Open a Claude Code Web session with Opus 4.7 (1M) on a non-trivial repo (mine: several hundred markdown/py files, custom CLAUDE.md).
  2. Hold a long design discussion (tens of turns, many Read/Grep/Bash calls, several multi-paragraph replies).
  3. Ask the assistant to draft a long markdown document (~400+ lines) into a file, preceded by 1–2 exploratory tool calls.
  4. Observe the "Stream idle timeout - partial response received" error fire after the tool calls complete but before or during the long write.

Retry attempts (even after slimming CLAUDE.md and switching from claude-opus-4-7[1m] to plain claude-opus-4-7) reproduce the same error.

Actual behavior

Stream aborts mid-turn with an idle timeout; partial response is discarded from the user's perspective and the Write tool call never executes. The session is usable afterwards, but the same turn cannot be completed — it fails repeatedly at roughly the same point.

Impact

  • Blocks any workflow that involves drafting a long file in a single turn (design docs, protocol revisions, report packs).
  • Forces the user to manually split work into smaller chunks, losing the model's ability to produce a coherent document in one pass.
  • Switching to the non-1M model does not resolve it, so it does not appear to be strictly a 1M-context issue.
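
The manual splitting mentioned above can be partly scripted. The following is a minimal sketch, assuming the document outline or text already exists as markdown and that each piece is then requested or written in its own turn. `split_markdown` and `max_lines` are hypothetical names chosen for this example, not part of Claude Code.

```python
from typing import List


def split_markdown(text: str, max_lines: int = 100) -> List[str]:
    """Split markdown into pieces of at most max_lines lines.

    A piece is closed early at a heading line ('#'-prefixed) once it is
    at least half full, so sections tend to stay together.
    Limitation: a '#' line inside a code fence is treated as a heading.
    """
    if max_lines < 1:
        raise ValueError("max_lines must be >= 1")
    pieces: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        at_heading = line.startswith("#")
        half_full = len(current) >= max_lines // 2
        if current and (len(current) >= max_lines or (at_heading and half_full)):
            pieces.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        pieces.append("\n".join(current))
    return pieces
```

Smaller pieces shorten each individual write, but as the issue notes, this gives up the coherence of a single-pass draft.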

Additional context

  • The error occurs most often when a long multi-turn conversation is followed by a request to generate long markdown.
  • The web version cannot call /help, and there is no way to report issues directly from the client, so I am submitting this issue manually.
  • Session ID: https://claude.ai/code/session_014AtytC2zxPcAqaoj9LwcGi

Ask

  1. Is the idle timeout threshold tunable (e.g. via a server-side setting or client flag)?
  2. Can the stream be kept alive with heartbeats during long text generation so long Write tool calls don't get cut off?
  3. Is there a known interaction between 1M context and long tail-end text streaming that we should avoid?
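
The heartbeat idea in question 2 can be sketched as a wrapper on the sending side of a stream. This is an illustration under stated assumptions, not a description of how Claude Code Web works: `with_heartbeat`, the interval, and the `: keep-alive` payload (a server-sent-events comment line, which SSE clients ignore) are all chosen for the example.

```python
import asyncio
from typing import AsyncIterator

HEARTBEAT = ": keep-alive\n\n"  # SSE comment line; assumed wire format


async def with_heartbeat(
    source: AsyncIterator[str], interval_s: float = 10.0
) -> AsyncIterator[str]:
    """Relay items from source, emitting HEARTBEAT whenever source is
    silent for interval_s seconds."""
    it = source.__aiter__()
    pending = asyncio.ensure_future(it.__anext__())
    try:
        while True:
            done, _ = await asyncio.wait({pending}, timeout=interval_s)
            if not done:
                # Source is still working: keep the connection non-idle.
                yield HEARTBEAT
                continue
            try:
                item = pending.result()
            except StopAsyncIteration:
                return
            pending = asyncio.ensure_future(it.__anext__())
            yield item
    finally:
        # Stop waiting on the source if the consumer goes away.
        pending.cancel()
```

Because the wrapper waits on the source with a timeout rather than pinging between chunks, it keeps emitting heartbeats even while the source produces nothing, which is the window the issue describes.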

What Should Happen?

Expected behavior

The assistant should finish streaming the long reply, or at minimum fail with a retriable error that preserves the in-progress Write/Edit tool call.
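
One way to read "a retriable error that preserves the in-progress work" is a client-side watchdog that keeps whatever text arrived before the stream went idle. The sketch below is hypothetical: `StreamIdleTimeout`, `read_with_idle_timeout`, and the timeout value are illustrative names, and nothing here reflects the actual Claude Code client.

```python
import asyncio
from typing import AsyncIterator, List


class StreamIdleTimeout(Exception):
    """Retriable error carrying the text received before the stall."""

    def __init__(self, partial: str) -> None:
        super().__init__("stream idle timeout; partial response preserved")
        self.partial = partial


async def read_with_idle_timeout(
    stream: AsyncIterator[str], idle_timeout_s: float = 30.0
) -> str:
    """Collect a streamed reply, failing only if no chunk arrives for
    idle_timeout_s seconds (total duration is unlimited)."""
    received: List[str] = []
    it = stream.__aiter__()
    while True:
        try:
            chunk = await asyncio.wait_for(it.__anext__(), idle_timeout_s)
        except StopAsyncIteration:
            return "".join(received)
        except asyncio.TimeoutError:
            # Hand the partial text to the caller instead of discarding it.
            raise StreamIdleTimeout("".join(received)) from None
        received.append(chunk)
```

A caller could catch `StreamIdleTimeout`, keep `partial`, and decide whether to retry or resume from that point.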

Claude Model

Opus

Is this a regression?

I don't know

Last Working Version

No response

Claude Code Version

Web (claude.ai/code), encountered on 2026-04-17

Platform

Anthropic API

Operating System

Windows

Terminal/Shell

Other

Additional Information

Additional observation (session 014AtytC2zxPcAqaoj9LwcGi, 2026-04-17): Stream timeout fired during a ~210-line Edit call. The file write itself completed successfully (verified via git status post-timeout); only the client-side stream terminated. This suggests the issue is in keep-alive / heartbeat between tool-result-accepted and next-assistant-token, not in the tool execution itself.
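
The post-timeout check described here (confirming with git status that the edit landed) can be scripted. This is a minimal sketch, assuming it runs inside the repository. `parse_porcelain` and `changed_paths` are hypothetical helper names, and the parser handles only plain `XY path` lines and renames from `git status --porcelain`, not quoted paths.

```python
import subprocess
from typing import List


def parse_porcelain(output: str) -> List[str]:
    """Extract paths from `git status --porcelain` (v1) output."""
    paths: List[str] = []
    for line in output.splitlines():
        if len(line) < 4:
            continue
        path = line[3:]  # skip the two status letters and one space
        if " -> " in path:  # rename entries read "old -> new"
            path = path.split(" -> ", 1)[1]
        paths.append(path)
    return paths


def changed_paths(repo_dir: str = ".") -> List[str]:
    """Return paths git reports as modified, added, renamed, or untracked."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)
```

If the target file appears in `changed_paths()` after a timeout, the tool call ran even though the stream was cut, which matches the observation above.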

extent analysis

TL;DR

A keep-alive or heartbeat during long text generation is the most plausible fix for the Stream idle timeout - partial response received error, but it would have to be added on the service side; the issue reports no fully effective user-side workaround.

Guidance

  • Check whether the idle timeout threshold can be raised, either through a server-side setting or a client flag, to accommodate longer text generation.
  • Consider a heartbeat or keep-alive between client and server during long text generation so that a silent stretch is not treated as an idle stream.
  • Reduce how long any single response takes, for example by requesting the document in smaller sections written across several turns.
  • Review the interaction between the 1M context model and long text streaming; the error also reproduces on the standard-context model, so 1M context alone does not explain it.

Example

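The issue does not expose the underlying implementation, so no exact fix can be shown. As an illustration only, a generator on the sending side could call a keep-alive hook at a fixed interval while it streams text. The chunk source and send_keep_alive below are hypothetical placeholders supplied by the caller:

import time
from typing import Callable, Iterable, Iterator

def generate_long_text(
    chunks: Iterable[str],
    send_keep_alive: Callable[[], None],
    interval_s: float = 15.0,
) -> Iterator[str]:
    """Yield text chunks, calling the keep-alive hook periodically."""
    last_ping = time.monotonic()
    for chunk in chunks:
        # Ping only when the interval has elapsed, not on every chunk.
        # Note: a stall inside the chunk source itself still goes silent.
        if time.monotonic() - last_ping >= interval_s:
            send_keep_alive()
            last_ping = time.monotonic()
        yield chunk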

Notes

The exact solution will depend on the specifics of the implementation and the requirements of the system. Further investigation and testing will be necessary to determine the best approach.

Recommendation

Until the idle timeout is addressed, for example with a keep-alive or heartbeat during long text generation, the only mitigations reported in the issue are splitting long documents into smaller writes and starting a fresh session, and neither is fully effective. A heartbeat would keep the stream from being treated as idle while the long reply is generated.
