claude-code: Brief mid-response token incoherence with self-recovery (Opus 4.6 1M, extended session) [6 comments, 3 participants]

GitHub stats
anthropics/claude-code#46141, fetched 2026-04-11 06:27:59
Comments: 6
Participants: 3
Timeline: 10
Reactions: 1
Timeline (top): commented ×6, labeled ×4

During an extended interactive coding session, the model produced a brief burst of incoherent output in the middle of a response (immediately after two parallel Read tool calls returned large memory files), then self-recovered within the same response with a meta comment ("Wait — I'm generating text noise. Let me stop and continue the actual task cleanly.") and completed the task normally.

No actual side effects on the work — the subsequent tool calls and file writes all executed correctly. This is a model output quality issue, not a correctness/tool issue. Reporting observationally so the team has a concrete example.

Environment

  • Claude Code version: 2.1.100
  • Model: claude-opus-4-6[1m] (Opus 4.6, 1M context variant)
  • Platform: macOS (Darwin 25.3.0, arm64)
  • Shell: zsh
  • Session type: interactive, extended (~1–1.5 hours of cumulative work), heavy tool use

Observed output (verbatim)

The model's next text block was intended to begin a memory update task (writing a summary in Korean). Instead it emitted the following:

tsy
:ok, : Resaers, $16�3,cation," pr6, :C9(18$.mm
Ic) para, : j2, : c1\2,;

ments1, 2/

.
ic2,

, ice2, ization,

 ch15, : : : ixity32,

 ure :
chsedbookm1 (c4, ion0, sphere, ic8 1, : ixity3, 1,

 ation0, 2, : ic0, : pam1, : erk4, 3," ich1, ic4, : ivid1, 1, : c4, ics0, 1, : 1, : 1, :,

, : 4, : ic3, : c3, : : t3, : : 3, : 3, : 1, 4, : 3, : 2, : : lavish2, : en3 1, : 2, : c5, : : 1, : : : 2, : 2, : : : 3, :

Wait — I'm generating text noise. Let me stop and continue the actual task cleanly.

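메모리 갱신 진행. Slice 1B 리뷰 반영 이력 신규 파일 작성 + `project_billing_design.md` 진행 상태 업데이트.
[English: "Proceeding with the memory update. Writing a new file for the Slice 1B review follow-up history + updating the progress status in `project_billing_design.md`."]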

Notable details:

  1. A U+FFFD replacement character appears mid-string ($16�3), which suggests a broken UTF-8 byte sequence somewhere in the generated token stream.
  2. Pattern resembles Elixir atom syntax (:ok, :c5) fragments fused with number suffixes and short subword pieces. The session had been heavily working with Elixir code (provider behaviour / error atoms), and the immediately preceding context was two Korean-language memory files containing Elixir code snippets.
  3. The model continued emitting fragments until it "noticed" and produced an English meta sentence breaking out of the loop, then switched back to Korean and proceeded normally.
  4. No tool call errors or system errors were observed in the transcript — this was pure text-layer incoherence.
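
As an illustration of detail 1, and not a diagnosis of what happened inside the model or Claude Code, the sketch below shows one common way a U+FFFD ends up in text: a multi-byte UTF-8 sequence loses a byte, and a lenient decoder substitutes the replacement character. The sample string is made up for the demonstration.

```python
# Illustration only: a truncated UTF-8 sequence decodes to U+FFFD.
text = "$16메3"                      # hypothetical sample, not the real output
raw = text.encode("utf-8")          # "메" encodes to three bytes: EB A9 94

# Drop the last byte of the three-byte sequence for "메".
start = raw.index("메".encode("utf-8"))
broken = raw[:start + 2] + raw[start + 3:]

# errors="replace" substitutes U+FFFD for the invalid bytes instead of raising.
decoded = broken.decode("utf-8", errors="replace")
print("\ufffd" in decoded)  # True
```

A strict decode (the default `errors="strict"`) would raise `UnicodeDecodeError` on the same bytes, so the presence of U+FFFD suggests some layer decoded leniently.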

Context immediately preceding the glitch

The glitch occurred immediately after a single turn with two parallel Read tool calls:

  • memory/project_billing_design.md — 459 lines
  • memory/project_billing_slice_1a_review.md — 91 lines

Both files contain mostly Korean prose with embedded Elixir code blocks (behaviour definitions, Ecto schemas, @callback typespecs, :atom enumerations). Combined payload was several thousand tokens of mixed Korean/English/Elixir.

Session state at the time:

  • Duration: ~1–1.5 hours of cumulative work (PR review → fix → commit → push → merge → memory update workflow)
  • Cumulative tool calls: ~80+ including many Read, Edit, Write, Bash, TaskCreate/TaskUpdate
  • Recent activity (last ~10 turns): 7 sequential commits with git commit -F /tmp/claude/*.txt, git push, PR creation via MCP, PR merge, local main sync with chflags sandbox-disable, branch cleanup
  • Active language mode: Korean (output style: Korean CoT; "caveman full" also persisted from a prior session, per memory)

Expected behavior

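The model should have begun the memory update task with a normal Korean sentence (something like "메모리 갱신 진행합니다 — project_billing_slice_1b_review.md 신규 파일 작성 + ...", roughly "Proceeding with the memory update: writing the new file project_billing_slice_1b_review.md + ..."), matching the flow of the rest of the response.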

Self-recovery pattern (interesting)

The recovery is noteworthy:

  1. Incoherent fragments for ~15 "paragraphs"
  2. Sudden shift to coherent English: "Wait — I'm generating text noise. Let me stop and continue the actual task cleanly."
  3. Next line: proper Korean task description
  4. Remainder of the response completed all intended work (file writes, HANDOFF.md creation) without further issues

It's unclear whether this recovery is a trained behavior (self-monitoring) or a lucky escape from a degenerate state.

Reproduction notes

I cannot provide a deterministic reproduction. Sampling is stochastic, so the exact same input likely won't reproduce it. Environmental factors that may be relevant:

  • Long session (large KV cache)
  • Heavy alternation between Korean and English modalities
  • Large Read tool result of mixed-language content with embedded code
  • Opus 4.6 1M context variant specifically (unknown whether the 200k variant would show the same behavior)

If the Anthropic team has access to the trace for this session, the turn in question can probably be located by searching for the literal substring "tsy\n:ok, : Resaers" in the model output logs (this prefix is very unlikely to occur anywhere else).
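
A minimal sketch of that literal-substring search, assuming a hypothetical JSON Lines export with `turn_id` and `text` fields. The real trace format is not described in this report, so the field names and the `find_turns` helper are placeholders.

```python
import json

# The literal prefix quoted in the report; "\n" here is a real newline.
NEEDLE = "tsy\n:ok, : Resaers"

def find_turns(jsonl_lines, needle=NEEDLE):
    """Return the turn ids whose output text contains the literal needle.

    Assumes a hypothetical JSON Lines export with "turn_id" and "text"
    fields; real trace formats will differ.
    """
    hits = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if needle in record.get("text", ""):
            hits.append(record["turn_id"])
    return hits

sample = [
    json.dumps({"turn_id": 1, "text": "normal output"}),
    json.dumps({"turn_id": 2, "text": "tsy\n:ok, : Resaers, $16"}),
]
print(find_turns(sample))  # [2]
```

The match is a plain substring test rather than a regular expression, so the colons and commas in the needle need no escaping.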

Impact

  • Functional: none. All tool calls after the glitch succeeded; the user-visible work (PR merge, memory files, HANDOFF.md) is correct.
  • Trust: moderate. In a less-recoverable variant, a garbled response mid-commit or mid-edit could have inserted corrupt content into a file. The self-recovery worked here but should not be relied on.
  • Observability: the user noticed and asked about it, triggering this report. A quieter variant (e.g. only a few corrupted tokens inside an otherwise valid response) could easily be missed.

Asks

  1. Please investigate whether Opus 4.6 1M shows higher incoherence risk in long sessions with heavy Korean/English/code interleaving.
  2. Consider whether there are telemetry signals (e.g. logit entropy spikes, repeat-token ratios) that could detect this class of degradation in-flight.
  3. If a fix is not imminent, consider documenting the self-recovery meta-sentence behavior — users may want to configure retry/rollback policies around it.
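
The two signals named in ask 2 can be sketched as follows. This assumes per-token probability distributions are available wherever the check runs, which Claude Code users do not have access to. The function names and both thresholds are placeholders, not calibrated values.

```python
import math
from collections import Counter

def shannon_entropy(probs):
    """Entropy in bits of one next-token probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def repeat_token_ratio(tokens):
    """Fraction of tokens in the window that repeat an earlier token."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(tokens)

def looks_degraded(entropies, tokens, entropy_limit=4.0, repeat_limit=0.5):
    """Flag a window when mean entropy or repetition crosses a limit.

    Both limits are placeholder values chosen for the example.
    """
    mean_entropy = sum(entropies) / len(entropies) if entropies else 0.0
    return mean_entropy > entropy_limit or repeat_token_ratio(tokens) > repeat_limit

# A window shaped like the tail of the glitch (": 3, : : 3, : 3,").
tail = [":", "3", ",", ":", ":", "3", ",", ":", "3", ","]
print(repeat_token_ratio(tail))          # 0.7
print(looks_degraded([1.0, 1.2], tail))  # True
```

A sliding window keeps the repeat ratio sensitive to short bursts. Over a whole response, ordinary punctuation would push the ratio up and blur the signal.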

Happy to provide any additional session details on request.

extent analysis

TL;DR

The model's incoherent output may be related to the large payload of mixed Korean/English/Elixir content from the parallel Read tool calls, and mitigating this issue could involve adjusting the model's handling of such inputs.

Guidance

  • Investigate the model's behavior when handling large payloads with mixed language and code content to identify potential causes of incoherence.
  • Consider implementing telemetry signals to detect degradation in-flight, such as logit entropy spikes or repeat-token ratios.
  • Review the model's self-recovery behavior and determine whether it can be relied upon or if additional measures are needed to prevent corrupt content insertion.
  • Examine the Opus 4.6 1M context variant's performance in long sessions with heavy Korean/English/code interleaving to identify potential areas for improvement.

Example

No specific code snippet is provided, as the issue is related to the model's behavior and output rather than a specific code implementation.

Notes

The exact cause of the incoherence is unclear, and reproducing the issue may be challenging due to the stochastic nature of the sampling process. However, by investigating the model's behavior and implementing telemetry signals, it may be possible to identify and mitigate similar issues in the future.

Recommendation

Apply workaround: Implement additional logging and monitoring to detect potential incoherence issues, and consider developing a retry/rollback policy to handle cases where the model's self-recovery behavior is not reliable. This will help to prevent corrupt content insertion and improve the overall reliability of the model.
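
One possible shape for such a retry/rollback policy, as a sketch under stated assumptions: `generate` stands in for whatever produces the text to be written, and `is_suspicious` uses two heuristics taken from this report (a U+FFFD character and the recovery sentence). Neither is an official Claude Code mechanism.

```python
import os
import tempfile

def is_suspicious(text):
    """Cheap text-level check: a U+FFFD character, or the recovery
    sentence quoted in this report. Both heuristics are assumptions."""
    return "\ufffd" in text or "I'm generating text noise" in text

def write_with_rollback(path, generate, max_attempts=3):
    """Call generate() until its output passes the check, then replace
    the file atomically. The existing file is untouched on failure."""
    for _ in range(max_attempts):
        text = generate()
        if is_suspicious(text):
            continue  # retry instead of writing questionable content
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                handle.write(text)
            os.replace(tmp, path)  # atomic swap on the same filesystem
        except BaseException:
            if os.path.exists(tmp):
                os.remove(tmp)
            raise
        return True
    return False
```

Writing to a temporary file in the same directory and swapping it in with `os.replace` means a rejected or failed attempt never leaves partial content at the target path.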
