claude-code - 💡(How to fix) Fix [BUG] Max effort silently retries API call on Opus 4.6, losing intermediate text visible to user [1 comments, 1 participants]

GitHub stats
anthropics/claude-code#46398 (fetched 2026-04-11 06:21:21)
Comments: 1 · Participants: 1 · Timeline events: 8 · Reactions: 0
Timeline (top): labeled ×5, cross-referenced ×2, commented ×1

Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report
  • I am using the latest version of Claude Code

What's Wrong?

When using max effort with Opus 4.6, I observed a rare pattern (seen once so far, across several hours of use) where:

  1. The model begins its response normally (thinking → text streaming to TUI)
  2. After ~10 lines of visible text, the output silently disappears from the display
  3. The TUI returns to a "thinking" state
  4. After another multi-minute thinking phase, a completely new final response appears
  5. The user sees no error, no notification, no indication that a retry happened

The intermediate visible text is not recoverable — Ctrl+O transcript does not show it, and it is not present anywhere in the session JSONL.

Hard Evidence From session.jsonl

The session log clearly shows this is not interleaved thinking — it is two distinct API responses sharing the same requestId:

line N+0 (user message): "[question about political economy / economic theory]"

line N+1: role=assistant, type=thinking
  msg.id:     msg_01KB5NA9...
  requestId:  req_011CZvcC...
  stop_reason: end_turn
  content: [1 thinking block ONLY, NO text block]
  timestamp:  19:03:01

line N+2: role=assistant, type=thinking
  msg.id:     msg_01A6zxNdV...   ← DIFFERENT msg.id from line N+1
  requestId:  req_011CZvcC...    ← SAME requestId
  stop_reason: end_turn
  content: [1 thinking block ONLY, NO text block]
  timestamp:  19:08:42            ← 5m41s after line N+1

line N+3: role=assistant, type=text
  msg.id:     msg_01A6zxNdV...   ← Same as N+2
  requestId:  req_011CZvcC...
  stop_reason: end_turn
  content: [1 text block, 5819 chars — the final output]
  timestamp:  19:10:26

What this tells us

  1. Two separate msg.id values under one requestId = two separate API calls, not one response with interleaved thinking.
  2. First API call returned only a thinking block and no text block, yet stop_reason=end_turn (as if it completed normally).
  3. Second API call produced both a new thinking block and the final text.
  4. Gap of 5 minutes 41 seconds between the two API calls — this is when the user saw text appear on screen and then disappear.
  5. The intermediate text that was streamed to the TUI is not recorded anywhere in the session file.
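
The check behind point 1 can be sketched as a small script. This is a minimal sketch, not Claude Code's own tooling: the record shape (`requestId` at the top level, `message.id` nested) is an assumption inferred from the field names quoted above, the identifiers are placeholders rather than the truncated real ones, and `parseJsonl` and `findSilentRetries` are hypothetical helper names.

```javascript
// Assumed record shape (inferred from the fields quoted in this report,
// not from a documented schema):
//   { requestId, timestamp, message: { id, stop_reason, content: [{ type }] } }

/** Parse JSONL text into objects, skipping blank or unparseable lines. */
function parseJsonl(text) {
  const records = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    try {
      records.push(JSON.parse(line));
    } catch {
      // Ignore malformed lines, for example a partially written last line.
    }
  }
  return records;
}

/** Return every requestId that maps to more than one distinct message id. */
function findSilentRetries(records) {
  const idsByRequest = new Map();
  for (const rec of records) {
    const msgId = rec.message && rec.message.id;
    if (!rec.requestId || !msgId) continue;
    if (!idsByRequest.has(rec.requestId)) idsByRequest.set(rec.requestId, []);
    const ids = idsByRequest.get(rec.requestId);
    if (!ids.includes(msgId)) ids.push(msgId);
  }
  return [...idsByRequest]
    .filter(([, ids]) => ids.length > 1)
    .map(([requestId, messageIds]) => ({ requestId, messageIds }));
}

// Sample mirroring lines N+1 to N+3; identifiers are placeholders.
const sampleJsonl = [
  { requestId: "req_shared", timestamp: "19:03:01",
    message: { id: "msg_first", stop_reason: "end_turn", content: [{ type: "thinking" }] } },
  { requestId: "req_shared", timestamp: "19:08:42",
    message: { id: "msg_second", stop_reason: "end_turn", content: [{ type: "thinking" }] } },
  { requestId: "req_shared", timestamp: "19:10:26",
    message: { id: "msg_second", stop_reason: "end_turn", content: [{ type: "text" }] } },
].map((rec) => JSON.stringify(rec)).join("\n");

console.log(findSilentRetries(parseJsonl(sampleJsonl)));
```

On the sample, one `requestId` is reported with two message ids, which is the pattern described in point 1. Grouping by `requestId` only shows that two messages were produced; it cannot say which component issued the second call.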

Timing Matches User Observation Exactly

User reported: "first thinking ~2min → output ~10 lines → retract → thinking ~4min → final output"

JSONL evidence:

  • First thinking: 19:01:43 → 19:03:01 = 1min 18s (~"2min")
  • Gap (visible text then retract then new thinking): 19:03:01 → 19:08:42 = 5min 41s (~"4min" of thinking)
  • Final text streaming: 19:08:42 → 19:10:26 = 1min 44s
  • Total turn time: 8min 43s
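
The durations above can be re-derived from the quoted clock times. The sketch below assumes all timestamps fall on the same day; `toSeconds` and `elapsed` are hypothetical helper names used only for this illustration.

```javascript
/** Convert an "HH:MM:SS" clock time to seconds since midnight. */
function toSeconds(clock) {
  const [h, m, s] = clock.split(":").map(Number);
  return h * 3600 + m * 60 + s;
}

/** Format the gap between two same-day clock times as "XmYYs". */
function elapsed(start, end) {
  const total = toSeconds(end) - toSeconds(start);
  const minutes = Math.floor(total / 60);
  const seconds = total % 60;
  return `${minutes}m${String(seconds).padStart(2, "0")}s`;
}

const phases = [
  ["first thinking", "19:01:43", "19:03:01"],
  ["gap before second response", "19:03:01", "19:08:42"],
  ["final text streaming", "19:08:42", "19:10:26"],
  ["total turn", "19:01:43", "19:10:26"],
];

for (const [label, start, end] of phases) {
  console.log(`${label}: ${elapsed(start, end)}`);
}
// first thinking: 1m18s
// gap before second response: 5m41s
// final text streaming: 1m44s
// total turn: 8m43s
```

The three phases sum to the total (78 s + 341 s + 104 s = 523 s), so the breakdown is internally consistent.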

What Should Happen

Either:

  • (A) If the first API call's text output was legitimate, it should be preserved in the session log and not discarded from the display.
  • (B) If the first API call's text output was rejected by some server-side validation (classifier, thinking-rewrite, etc.), the user should be explicitly notified rather than seeing a silent "retract" behavior.
  • (C) If this is a deliberate server-side retry mechanism, the TUI should at minimum show an indicator like "regenerating due to validation" so the user understands what happened.

Steps to Reproduce

This is hard to trigger reliably. I only saw it once across several hours of max effort usage. Conditions that seemed to contribute:

  1. /effort max (high-reasoning budget)
  2. Opus 4.6 (1M context)
  3. Long-running conversation (this session was 14,000+ lines into the JSONL)
  4. Question that is academically/intellectually complex and touches on politically or economically charged topics (in my case: a discussion of Marxist economic theory and its applicability to modern capitalism)

The combination of long thinking budget + politically charged topic + long context seems to increase the chance of the first API call producing something that triggers the silent retry.

Hypothesis About Root Cause

The evidence points to either:

  1. Server-side output validation rejecting the text block: first API call produces [thinking, text], validation layer rejects the text, API commits only the thinking, returns end_turn, client detects "no text block" and silently retries.
  2. Server-side thinking-rewrite mechanism (see #45804): some thinking validation pass decides the output needs regeneration and the client auto-retries.
  3. Streaming cancellation mid-text: the model started producing text, something interrupted the stream, only the already-finalized thinking block was committed.

All three would produce the same user-visible symptoms and JSONL pattern. The common feature is: intermediate text visible to user → lost → silent retry → final different output.
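
The shared JSONL signature (a message that ends with `end_turn` but never gets a text block) can be checked mechanically. This is a sketch under assumptions: it presumes each JSONL line carries one content block and that lines with the same `message.id` belong to one API response, as the excerpt above suggests; `findTextlessMessages` is a hypothetical name. It does not distinguish between the three hypotheses, since all of them leave the same trace.

```javascript
/**
 * Group content blocks by message id and return the ids of messages that
 * finished with stop_reason "end_turn" without any "text" block.
 */
function findTextlessMessages(records) {
  const byId = new Map();
  for (const rec of records) {
    const msg = rec.message;
    if (!msg || !msg.id) continue; // skip lines without an assistant message id
    const entry = byId.get(msg.id) || { stopReason: null, blockTypes: [] };
    entry.stopReason = msg.stop_reason || entry.stopReason;
    const blocks = Array.isArray(msg.content) ? msg.content : [];
    for (const block of blocks) entry.blockTypes.push(block.type);
    byId.set(msg.id, entry);
  }
  return [...byId]
    .filter(([, e]) => e.stopReason === "end_turn" && !e.blockTypes.includes("text"))
    .map(([id]) => id);
}

// Placeholder records mirroring lines N+1 to N+3 of the excerpt.
const excerpt = [
  { message: { id: "msg_first", stop_reason: "end_turn", content: [{ type: "thinking" }] } },
  { message: { id: "msg_second", stop_reason: "end_turn", content: [{ type: "thinking" }] } },
  { message: { id: "msg_second", stop_reason: "end_turn", content: [{ type: "text" }] } },
];

console.log(findTextlessMessages(excerpt)); // only the first message is flagged
```

Grouping by message id matters here: line N+2 on its own also has no text block, but it belongs to the same message as N+3 and so is not flagged.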

Why It's Worth Fixing Even If Rare

  • Users paying for max effort are already paying a premium; silent retries double-bill the same request (evidence from #45862)
  • The "vanishing text" experience is extremely disorienting — users think their terminal is broken or they hallucinated seeing the text
  • There is no way for the user to recover the original response, even if it was actually better than the retry
  • Violates the principle of least surprise: if a retry happens, the user should know

Claude Model

Opus 4.6 (1M context), max effort

Claude Code Version

2.1.101

Platform

Anthropic API (direct, Max plan)

Operating System

Arch Linux 6.19.11-arch1-1

Terminal/Shell

Ghostty + zsh

Related Issues

  • #45804 — thinking-rewrite validation layer (closest match, but different symptoms)
  • #45862 — medium effort default causing excessive token burn (same cost concern)
  • #42598 — Plan Mode text disappearing during Cogitating (related TUI symptom, different root cause)

Additional Information

I have the full JSONL fragment with exact msg.id, requestId, timestamps, and content block structures if the Anthropic team wants more detail for diagnosis — happy to share privately. The specific user question and model output contents are omitted here since they contain personal intellectual discussion, but the technical metadata is complete.

extent analysis

TL;DR

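The issue can be mitigated by modifying the client-side handling of API responses to detect silent retries and surface them to the user. This improves the user experience, but detection alone would not prevent any double-billing, since the second call has already been made by the time it is detected.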

Guidance

  1. Implement retry detection: Modify the client to detect when a silent retry occurs by checking for multiple API calls with the same requestId and different msg.id values.
  2. Notify the user: Display a notification to the user when a silent retry is detected, indicating that the response is being regenerated due to validation or other issues.
  3. Preserve intermediate text: Consider preserving the intermediate text output in the session log, even if it is later discarded, to allow for debugging and analysis.
  4. Review server-side validation: Investigate the server-side validation and thinking-rewrite mechanisms to determine if they are causing the silent retries and if they can be modified to provide more explicit notifications to the user.

Example

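// Example of a modified API response handler (state, sessionLog, notifyUser are hypothetical)
function handleResponse(response, state, sessionLog, notifyUser) {
  // Same requestId with a different msgId is treated as a silent retry
  const retried = response.requestId === state.requestId && response.msgId !== state.msgId;
  if (retried) {
    notifyUser("Silent retry detected, regenerating response...");
    // Preserve the intermediate text in the session log before it is discarded
    sessionLog.push({ text: state.intermediateText, timestamp: new Date() });
  }
  // Remember the latest identifiers for the next comparison
  state.requestId = response.requestId;
  state.msgId = response.msgId;
  return retried;
}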
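
Guidance item 3 (preserving intermediate text) could look roughly like the following. This is an illustrative sketch only: `StreamJournal`, its methods, and the `sink` array are hypothetical and do not correspond to any known Claude Code internals. The idea is to buffer streamed text per message and write a copy to a log at the moment the display discards it.

```javascript
/** Buffers streamed text per message id and journals it if the message is retracted. */
class StreamJournal {
  constructor(sink) {
    this.sink = sink;         // hypothetical append-only log (a plain array here)
    this.buffers = new Map(); // message id -> text streamed so far
  }

  /** Record a chunk of text as it is streamed to the display. */
  append(msgId, delta) {
    this.buffers.set(msgId, (this.buffers.get(msgId) || "") + delta);
  }

  /** Called when the display discards a message; keeps a copy instead of dropping it. */
  retract(msgId, reason) {
    const text = this.buffers.get(msgId) || "";
    this.buffers.delete(msgId);
    if (text !== "") {
      this.sink.push({ msgId, text, reason, retracted: true });
    }
    return text;
  }
}

// Usage with placeholder identifiers.
const journalLog = [];
const journal = new StreamJournal(journalLog);
journal.append("msg_first", "First visible line.\n");
journal.append("msg_first", "Second visible line.\n");
journal.retract("msg_first", "superseded by a new message under the same requestId");
console.log(journalLog);
```

Writing the entry at retraction time, rather than on every delta, keeps the log small while still making the discarded text recoverable afterwards.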

Notes

The provided solution focuses on client-side modifications to detect and handle silent retries. However, the root cause of the issue may lie in the server-side validation or thinking-rewrite mechanisms, which would require further investigation and modification.

Recommendation

Apply a workaround by implementing retry detection and notification on the client side. This makes retries visible to the user and preserves the discarded text while the root cause is investigated; any double-billing would still need to be addressed wherever the retry originates.
