claude-code: [Bug] Opus 4.7: 400 `thinking blocks ... cannot be modified` on long extended-thinking sessions, triggered by history-altering events (scheduled prompts / parallel tool-call cancellation)

StepCodex · 2026-05-28T15:17:02Z



On a long-running Claude Code session using Opus 4.7 with extended (interleaved) thinking, the session reaches a state where it returns this API error on essentially every turn:

API Error: 400 messages.<N>.content.<M>: `thinking` or `redacted_thinking` blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response.

The failure is not caused by user edits or by any client-side middleware. It appears to be Claude Code reconstructing/re-editing the conversation history in a way that desyncs an interleaved thinking block's signature, after which the API rejects the request. Once it starts, the session is effectively unusable: rewinding past the offending turn restores function only briefly before the next history-altering event re-triggers it.


Environment

- Claude Code: 2.1.148 (session is long-running and may span CC versions; please advise if a specific version range matters)
- Model: `claude-opus-4-7[1m]` (the 1M-context variant)
- Extended thinking: enabled, interleaved (multiple thinking blocks per assistant turn, interleaved with tool_use)
- Thinking display: omitted; stored/returned thinking blocks have empty `thinking` text but a populated `signature`
- Non-interactive/agentic usage with frequent tool calls, including parallel tool batches
- Scheduled/cron-style prompts injected into the session on a timer

Scale at which this manifested (the failure is scale-dependent)

This did not appear at modest session sizes. It emerged only once the session reached an extreme, and worsened as it grew. At the point of persistent failure the session was:

- ~382K input tokens per request (live context window; well past the standard 200K, hence the `[1m]` variant)
- ~329 MB on-disk session transcript (`.jsonl`)
- ~92,800 transcript events
- ~6,850 thinking blocks accumulated over the session, many interleaved into individual assistant turns
- ~7 weeks of continuous session age (single resumed session, not a fresh one)
- ~99% cache-read share in steady state, i.e., the session was otherwise healthy and cache-efficient right up until a history-altering event tripped the 400

The failure rate and how quickly it recurs after a rewind both scale with this size — consistent with "any history re-edit has a high probability of touching one of the many signed thinking blocks now in context." Smaller/shorter sessions either don't hit it or recover for much longer between occurrences.
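The event and thinking-block counts above can be gathered with a short script. The sketch below is only illustrative: it assumes each transcript line is a JSON object carrying a `message` with a `content` list of typed blocks, which is a guess about the `.jsonl` layout rather than a documented format, and `transcript_stats` is a hypothetical helper name.

```python
import json
from typing import Iterable

THINKING_TYPES = ("thinking", "redacted_thinking")


def transcript_stats(lines: Iterable[str]) -> dict:
    """Count parseable events and thinking blocks in JSONL transcript lines.

    Assumed (not documented) event shape:
        {"message": {"content": [{"type": "..."}, ...]}}
    """
    events = 0
    thinking_blocks = 0
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # ignore blank lines
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # ignore partially written lines
        events += 1
        message = event.get("message") if isinstance(event, dict) else None
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue  # plain-string content carries no typed blocks
        for block in content:
            if isinstance(block, dict) and block.get("type") in THINKING_TYPES:
                thinking_blocks += 1
    return {"events": events, "thinking_blocks": thinking_blocks}
```

For a transcript on disk, passing the open file object (`transcript_stats(open(path, encoding="utf-8"))`) streams it line by line, which matters for a file in the hundreds of megabytes.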

Symptom

- 400 referencing a `thinking`/`redacted_thinking` block in the latest assistant message (`messages.<N>.content.<M>`).
- The coordinate is stable across retries while the history is unchanged, then shifts when the history changes (e.g., `messages.157.content.8`, later `messages.149.content.16` after a rewind/relaunch).
- The referenced block is not the first thinking block in the turn (e.g., `content.8`, `content.16`). Earlier thinking blocks in the same request validate fine, which indicates one specific block in the turn is inconsistent rather than a systematic serialization problem.
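To compare coordinates across retries and rewinds, the `messages.<N>.content.<M>` pair can be pulled out of the error text and used to look up the offending block in the request that was sent. This is a minimal sketch: `parse_block_coordinate` and `locate_block` are hypothetical helper names, and the indices are assumed to be zero-based positions into the request's `messages` array and that message's `content` list.

```python
import re
from typing import List, Optional, Tuple

# Matches the coordinate in errors such as "messages.157.content.8: ...".
COORD_RE = re.compile(r"messages\.(\d+)\.content\.(\d+)")


def parse_block_coordinate(error_text: str) -> Optional[Tuple[int, int]]:
    """Return (message_index, block_index) from the error text, or None."""
    match = COORD_RE.search(error_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def locate_block(messages: List[dict], coord: Tuple[int, int]) -> Optional[dict]:
    """Return the content block at the coordinate, or None if out of range."""
    msg_idx, block_idx = coord
    if not 0 <= msg_idx < len(messages):
        return None
    content = messages[msg_idx].get("content")
    if not isinstance(content, list) or not 0 <= block_idx < len(content):
        return None
    return content[block_idx]
```

Logging the parsed pair alongside each history-altering event makes it easy to see whether the coordinate moved because the history changed or stayed fixed across plain retries.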

Triggers observed

The error consistently appears immediately after an event that forces Claude Code to rebuild/re-edit the assistant turn history:

1. A scheduled (cron-style) prompt firing into the session, injecting a new turn, after which the next request 400s. This happens even when the session is otherwise idle.
2. A parallel tool-call batch in which one tool errors: the sibling tool calls are cancelled, and the next request 400s. In our captures, the errored call was a benign command failure (e.g., a CLI call against a ref that 404s); the resulting cancellation of the parallel siblings preceded the 400.

Both triggers reduce to the same operation: Claude Code reconstructing an assistant turn that contains interleaved thinking blocks.

What we ruled out

- Not an intermediary. Reproduced with the client connecting directly to api.anthropic.com with no proxy or middleware in the request path. So this is not a third-party tool mangling the request body.
- Not stored-block corruption that's user-visible. The stored thinking blocks are structurally well-formed (`type`, empty `thinking`, populated `signature`). The inconsistency is between the block's signature and its content/position after Claude Code re-edits the turn, not a malformed block on disk.
- Not a single bad turn. Rewinding past the offending turn restores function, but the failure recurs within a short time on the next history-altering event. The failure surface scales with context size / number of interleaved thinking blocks, not with one specific turn.
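The structural check described above (type, thinking text, signature) can be expressed as a small validator. This is a sketch under assumptions: `thinking_block_problems` is a hypothetical helper, and it only confirms that the expected fields are present. It cannot verify a signature, which only the API can do, so a block that passes here can still be rejected.

```python
from typing import List


def thinking_block_problems(block: dict) -> List[str]:
    """List structural problems with a stored thinking block.

    An empty list means the block has the expected shape; it says
    nothing about whether the signature would validate server-side.
    """
    problems: List[str] = []
    kind = block.get("type")
    if kind == "thinking":
        # With thinking display omitted, the text may be an empty string.
        if not isinstance(block.get("thinking"), str):
            problems.append("thinking text is not a string")
        if not block.get("signature"):
            problems.append("signature is missing or empty")
    elif kind == "redacted_thinking":
        if not block.get("data"):
            problems.append("data is missing or empty")
    else:
        problems.append(f"not a thinking block: {kind!r}")
    return problems
```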

Diagnostic observations

- In steady state the session runs at ~99% cache-read share (stable prefix, cache working normally).
- On the failing turn, the request simultaneously shows a cache miss (full prefix re-creation) and the thinking-block 400. Both are consistent with Claude Code rewriting the prompt prefix on that turn: the cache break and the signature desync are two symptoms of the same re-edit.
- Larger contexts with more interleaved thinking blocks reach the failing state faster and recover for shorter periods, consistent with "any history re-edit has a high probability of touching one of many signed thinking blocks."
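The cache-read share quoted above can be computed per request from the token usage counters. The sketch below assumes a usage dictionary with `input_tokens`, `cache_read_input_tokens`, and `cache_creation_input_tokens` fields; `cache_read_share` is a hypothetical helper, and the token counts in the usage note are made up for illustration, not taken from the captures.

```python
def cache_read_share(usage: dict) -> float:
    """Fraction of prompt tokens that were served from the cache.

    Missing or null counters are treated as zero.
    """
    read = usage.get("cache_read_input_tokens") or 0
    created = usage.get("cache_creation_input_tokens") or 0
    uncached = usage.get("input_tokens") or 0
    total = read + created + uncached
    return read / total if total else 0.0
```

A healthy turn such as `{"input_tokens": 1000, "cache_read_input_tokens": 99000, "cache_creation_input_tokens": 0}` gives a share of 0.99, while a turn whose prefix was rewritten shows nearly all tokens under `cache_creation_input_tokens` and a share close to zero. A sudden drop of this kind on the same turn as the 400 is the pairing described in the observations.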

Hypothesis

When a history-altering event occurs (scheduled prompt injection, or reconstruction of a turn after parallel tool-call cancellation), Claude Code rebuilds the assistant message containing interleaved thinking blocks in a way that does not preserve a thinking block's original signed bytes/position. On the next request, the API validates the (now-inconsistent) thinking block against its signature and rejects with `thinking blocks ... cannot be modified`. The probability of hitting this scales with the number of interleaved thinking blocks in context, so long sessions degrade into a near-permanent failure state.

Impact

- A long-running extended-thinking session becomes unusable: every turn 400s.
- Rewind is only a temporary workaround; scheduled prompts in particular re-trigger it without any user action, so the session cannot be kept alive.
- The only durable recovery is starting a fresh session, losing the working context.

What would help

- When reconstructing/re-editing an assistant turn that contains `thinking`/`redacted_thinking` blocks (after tool-call cancellation, scheduled-prompt injection, compaction, or context editing), preserve those blocks byte-identical and in position, or drop them as a set per the API's interleaved-thinking rules, rather than emitting a turn whose thinking signatures no longer validate.
- If a thinking block can't be preserved exactly during such a reconstruction, surface a clear local diagnostic (which block, which event caused it) instead of an opaque API 400.
- Consider validating thinking-block signature consistency locally before send on long sessions, so the failure can be caught and self-healed (e.g., by dropping the unverifiable thinking block) rather than hard-failing every turn.
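One way a client could approximate the local check suggested above is to fingerprint each thinking block when the response arrives and compare the fingerprints against the rebuilt turn before sending. This is a sketch, not how Claude Code works: `fingerprint_turn` and `diff_thinking_blocks` are hypothetical names, the hash is taken over a canonical JSON form as a stand-in for byte identity, and treating a position change as a problem follows this report's hypothesis rather than a confirmed API rule.

```python
import hashlib
import json
from typing import List, Tuple

THINKING_TYPES = ("thinking", "redacted_thinking")


def fingerprint_turn(content: List[dict]) -> List[Tuple[int, str]]:
    """Return (position, sha256) for each thinking block in a turn."""
    prints = []
    for index, block in enumerate(content):
        if block.get("type") in THINKING_TYPES:
            canonical = json.dumps(block, sort_keys=True, separators=(",", ":"))
            digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
            prints.append((index, digest))
    return prints


def diff_thinking_blocks(original: List[dict], rebuilt: List[dict]) -> List[str]:
    """Describe how thinking blocks differ between two versions of a turn."""
    before = fingerprint_turn(original)
    after = fingerprint_turn(rebuilt)
    problems = []
    if len(before) != len(after):
        problems.append(
            f"thinking block count changed: {len(before)} -> {len(after)}"
        )
    # Compare blocks pairwise in order of appearance.
    for (pos_a, hash_a), (pos_b, hash_b) in zip(before, after):
        if pos_a != pos_b:
            problems.append(f"thinking block moved: content.{pos_a} -> content.{pos_b}")
        if hash_a != hash_b:
            problems.append(f"thinking block at content.{pos_a} changed")
    return problems
```

An empty result means the rebuilt turn kept every thinking block unchanged and in place. A non-empty result names the block and the kind of change, which is the local diagnostic asked for in the second item.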

Reproduction sketch

1. Run a long-lived `claude-opus-4-7[1m]` session with extended/interleaved thinking, resumed/continued over a long period rather than fresh. Reproduction requires reaching the scale above: on the order of a multi-hundred-K-token live context (~382K observed), thousands of accumulated interleaved thinking blocks (~6,850 observed), and a session age measured in weeks. It will not reproduce on a small or short session.
2. Drive frequent agentic tool use, including parallel tool batches where one call can fail; and/or inject scheduled prompts on a timer.
3. Once at that scale, a history-altering event (cancelled parallel batch, or scheduled prompt) produces the `messages.<N>.content.<M>: thinking blocks ... cannot be modified` 400 on the next request, then recurs on subsequent history-altering events, quickly enough that rewinding no longer keeps the session usable.
