claude-code - Lone UTF-16 surrogate in assistant output bricks session with 400 'no low surrogate'

Summary

A single unpaired UTF-16 surrogate code unit emitted by the model gets persisted into the project JSONL session file, after which every subsequent turn in that session fails with:

API Error: 400 The request body is not valid JSON: no low surrogate in string: line 1 column 138398 (char 138397)

The only recovery is /clear — but a non-technical user has no way to know that, and may lose hours of context.

Repro (observed in the wild)

Claude Code (Opus 4.7, 1M-context build, macOS, 2026-05-21) emitted the following inside an AskUserQuestion option description:

"스킬·백트래\ud82d·깊이 한도"

The intended Korean word was 백트래킹 ("backtracking"). The model's token stream broke a surrogate pair: \uD82D was emitted with no following low-surrogate (\uDC00–\uDFFF), so the resulting UTF-16 sequence is invalid.

The lone surrogate was written verbatim into the project JSONL (verified at ~/.claude/projects/<proj>/<session>.jsonl, lines 53 and 54, identical offset, byte 787 / 995). From that point on, the client kept replaying the full history to the Anthropic API, which rejects the body at JSON-parse time. Every retry produced the same 400 with the same column number.
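
One plausible reason the corruption survives the local round trip is that the standard JavaScript JSON functions tolerate unpaired surrogates, while a stricter server-side parser does not. The sketch below only demonstrates that standard behavior on the string from the report; whether the CLI serializes history with exactly these calls is an assumption.

```javascript
// The string from the report: a high surrogate (\uD82D) with no low surrogate after it.
const sample = "스킬·백트래\ud82d·깊이 한도";

// JSON.stringify (ES2019 and later) escapes the lone surrogate instead of throwing...
const serialized = JSON.stringify(sample);

// ...and JSON.parse accepts that escape again, so a JS-only round trip never fails.
const roundTripped = JSON.parse(serialized);

console.log(serialized.includes("\\ud82d"));         // true: stored as a six-character escape
console.log(roundTripped === sample);                // true: the lone surrogate survives intact
console.log(roundTripped.charCodeAt(6).toString(16)); // "d82d"
```

Nothing in this path raises an error, which is consistent with the failure only appearing once the body reaches the API.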

Why this is sharp

  1. Silent corruption. The bad char is a single code unit; the user can't see it in any UI.
  2. Session is permanently bricked. Adding a new user message doesn't help — the broken history goes out on every request.
  3. Error message points at the wrong layer. column 138398 looks like a serializer bug, not "a model produced an invalid surrogate 30 messages ago."
  4. Cache is also poisoned. Prompt-cache hits keep replaying the same broken prefix.
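
Because the bad code unit is invisible in the UI, a small scanner can help locate it. The following is a minimal sketch that takes the text of a JSONL file and reports which lines contain a string with an unpaired surrogate. The function names and the record shape in the demo are hypothetical, and it assumes the surrogate is stored as a `\ud82d`-style escape, as in the report.

```javascript
// Matches a high surrogate not followed by a low one,
// or a low surrogate not preceded by a high one.
const LONE_SURROGATE = /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/;

// Recursively yield [path, string] for every string value in a parsed JSON value.
function* stringsIn(value, path = "$") {
  if (typeof value === "string") {
    yield [path, value];
  } else if (Array.isArray(value)) {
    for (let i = 0; i < value.length; i++) yield* stringsIn(value[i], `${path}[${i}]`);
  } else if (value !== null && typeof value === "object") {
    for (const [key, child] of Object.entries(value)) yield* stringsIn(child, `${path}.${key}`);
  }
}

// Returns one entry per affected string: { line (1-based), path, index of first bad code unit }.
function findLoneSurrogates(jsonlText) {
  const hits = [];
  jsonlText.split("\n").forEach((raw, i) => {
    if (raw.trim() === "") return;
    let record;
    try { record = JSON.parse(raw); } catch { return; } // skip lines that are not valid JSON
    for (const [path, text] of stringsIn(record)) {
      const index = text.search(LONE_SURROGATE);
      if (index !== -1) hits.push({ line: i + 1, path, index });
    }
  });
  return hits;
}

// Demo with a made-up two-line session; real session records may look different.
const demo = [
  JSON.stringify({ role: "user", content: "hello" }),
  JSON.stringify({ role: "assistant", content: [{ text: "백트래\ud82d" }] }),
].join("\n");

console.log(findLoneSurrogates(demo)); // one hit: line 2, path $.content[0].text, index 3
```

To run it against a real file, the text could be loaded with Node's `fs.readFileSync(path, "utf8")` and passed to `findLoneSurrogates`.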

Suggested client-side fix

An upstream model decoder that occasionally leaks an unpaired surrogate is hard to eliminate. The CLI can make this non-fatal in two cheap places, and add telemetry as a third step:

  1. Before persisting assistant text to JSONL, run a surrogate-pair scrubber:

    s.replace(/[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/g, '\uFFFD')

    Replace with U+FFFD (or drop) — either is recoverable; the current behavior is not.

  2. Before constructing the API request body from history, run the same scrub as defense-in-depth. This rescues already-broken sessions on first retry, instead of forcing /clear.

  3. Telemetry-only: when the scrubber fires, emit a structured event with the model id and a short context window. This keeps the upstream decoder bug visible to Anthropic without surfacing scary warnings to users.
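
The three steps above can be sketched together as follows. This is an illustration under stated assumptions, not the CLI's actual code: `scrubString`, `scrubDeep`, and the `onScrub` callback are hypothetical names, and the message shape in the demo is made up. The same helper would be called once before persisting and once before building the request body.

```javascript
const LONE_SURROGATE_G = /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/g;

// Replace each unpaired surrogate with U+FFFD.
// onScrub is a hypothetical telemetry hook, called only when something was replaced.
function scrubString(text, onScrub) {
  let count = 0;
  let first = -1;
  const cleaned = text.replace(LONE_SURROGATE_G, (_match, offset) => {
    if (first === -1) first = offset;
    count++;
    return "\uFFFD";
  });
  if (count > 0 && onScrub) {
    // Offsets are unchanged because one code unit replaces one code unit.
    const context = cleaned.slice(Math.max(0, first - 10), first + 11);
    onScrub({ count, context });
  }
  return cleaned;
}

// Apply the scrub to every string inside a nested history structure (returns a copy).
function scrubDeep(value, onScrub) {
  if (typeof value === "string") return scrubString(value, onScrub);
  if (Array.isArray(value)) return value.map((v) => scrubDeep(v, onScrub));
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, scrubDeep(v, onScrub)])
    );
  }
  return value;
}

// Demo: a made-up history entry containing the string from the report.
const events = [];
const history = [{ role: "assistant", content: "스킬·백트래\ud82d·깊이 한도" }];
const safe = scrubDeep(history, (e) => events.push(e));

console.log(safe[0].content === "스킬·백트래\uFFFD·깊이 한도"); // true
console.log(events.length);                                     // 1
```

On runtimes that support it (ES2024, for example Node 20 and later), the built-in `String.prototype.toWellFormed()` performs the same U+FFFD replacement and could stand in for the regex; the explicit version is shown here because it also yields the count and offset needed for telemetry.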

Environment

  • Claude Code: latest as of 2026-05-21
  • Model: claude-opus-4-7 (1M context build)
  • OS: macOS (Darwin 25.3.0)
  • Shell: zsh
  • Tool that emitted the bad char: AskUserQuestion (within an option description field)

Workaround for users who hit this

/clear is the only known recovery. /compact does not help because it re-serializes the broken history.
