claude-code - 💡(How to fix) Fix Korean characters corrupted during streaming (U+FFFD replacement) [1 comments, 2 participants]

anthropics/claude-code#47013 (fetched 2026-04-13 05:43:49)

Description

Korean (Hangul) characters are intermittently corrupted during response streaming, appearing as U+FFFD (REPLACEMENT CHARACTER) or diamond symbols (◆) in the terminal. The corruption is already present in the session JSONL log, confirming this is not a terminal rendering issue.

Evidence

Case 1: stock-analysis PM session

  • Expected: 봐드리겠습니다
  • Actual: 봐드리겠���니다 (3x U+FFFD replacing 습)
  • Session log hex dump confirms U+FFFD (not font fallback):
U+BD10 봐
U+B4DC 드
U+B9AC 리
U+ACA0 겠
U+FFFD �
U+FFFD �
U+FFFD �
U+B2C8 니
U+B2E4 다
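
A dump like the one above can be reproduced with a few lines of Python. This is a minimal sketch: the helper names `dump_code_points` and `count_replacement_chars` are made up for illustration, and the sample string is rebuilt from the code points listed in this report rather than read from a real session log.

```python
def dump_code_points(text: str) -> list[str]:
    """Return one 'U+XXXX <char>' line per code point in text."""
    return [f"U+{ord(ch):04X} {ch}" for ch in text]


def count_replacement_chars(text: str) -> int:
    """Count U+FFFD REPLACEMENT CHARACTER occurrences in text."""
    return text.count("\ufffd")


# Sample rebuilt from the code points listed in Case 1 (not a real log line).
sample = "봐드리겠\ufffd\ufffd\ufffd니다"

for line in dump_code_points(sample):
    print(line)
print("replacement characters:", count_replacement_chars(sample))
```

If the count is non-zero for text taken straight from the stored log, the corruption happened before the terminal ever rendered it.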

Case 2: another session

  • Expected: 걸러야하니까
  • Actual: 걸���야 ��니까

Root Cause Hypothesis

The Korean character 습 (U+C2B5) is encoded as 3 bytes in UTF-8: EC 8A B5. When a streaming chunk boundary falls in the middle of a multi-byte UTF-8 sequence and each chunk is decoded on its own, the orphaned bytes are invalid in isolation and are decoded as U+FFFD. Three replacement characters in place of one 3-byte Korean character is consistent with a byte-level split.
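
The hypothesis can be illustrated in isolation. The sketch below assumes the failure mode is "each chunk decoded independently with replacement on error"; it uses Python's built-in decoder, not Claude Code's actual streaming path, which is not shown in the issue. The function name `decode_chunks_independently` is hypothetical.

```python
syllable = "습"                       # U+C2B5
encoded = syllable.encode("utf-8")    # three bytes: EC 8A B5


def decode_chunks_independently(chunks: list[bytes]) -> str:
    """Decode every chunk on its own, as a stateless decoder would."""
    return "".join(chunk.decode("utf-8", errors="replace") for chunk in chunks)


# Chunk boundary after the first byte: EC | 8A B5
split_after_first = decode_chunks_independently([encoded[:1], encoded[1:]])

# Chunk boundary after the second byte: EC 8A | B5
split_after_second = decode_chunks_independently([encoded[:2], encoded[2:]])

# No boundary inside the character.
unsplit = decode_chunks_independently([encoded])

print(encoded.hex(" "))
print([f"U+{ord(ch):04X}" for ch in split_after_first])
print([f"U+{ord(ch):04X}" for ch in split_after_second])
print(unsplit)
```

With Python's decoder, the split after the first byte produces three U+FFFD characters, matching Case 1. Other decoders may emit a different number of replacement characters for the same split, so the exact count alone does not identify the decoder involved.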

Environment

  • Claude Code: latest (via claude CLI)
  • OS: macOS 15.4.1 (Darwin 25.3.0)
  • Terminal: Ghostty 1.3.1 (ruled out as cause — corruption exists in session JSONL)
  • Model: Claude Opus 4.6, Claude Sonnet 4.6
  • Language: Korean (Hangul)

Reproduction

  • Occurs intermittently during longer Korean-language responses
  • More likely to occur mid-sentence on common Hangul syllables
  • Not specific to any particular character — any multi-byte Korean char can be affected

Related

  • Possibly related to #46471 (Enclosed Alphanumerics rendering), but this is a distinct UTF-8 streaming issue, not a font/glyph problem

Co-Authored-By: Claude Opus 4.6

extent analysis

TL;DR

Ensure proper handling of multi-byte UTF-8 sequences during response streaming to prevent character corruption.

Guidance

  • Verify that the streaming response is properly handling chunk boundaries to avoid splitting multi-byte UTF-8 sequences.
  • Review the code responsible for encoding and decoding UTF-8 characters to ensure it correctly handles sequences that span chunk boundaries.
  • Consider implementing a mechanism to detect and reassemble split UTF-8 sequences during decoding.
  • Test the fix with a variety of Korean characters and sentence structures to ensure the issue is fully resolved.
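
The testing step can be made systematic by splitting the encoded text at every possible byte offset and checking that decoding still round-trips. The harness below is a sketch under that assumption; `find_failing_splits`, `decode_naively`, and `decode_after_joining` are illustrative names, and a real test would pass in the project's own stream decoder instead.

```python
from typing import Callable, Sequence

DecodeFn = Callable[[Sequence[bytes]], str]


def decode_naively(chunks: Sequence[bytes]) -> str:
    """Stateless decoding: each chunk is decoded on its own (the suspected bug)."""
    return "".join(chunk.decode("utf-8", errors="replace") for chunk in chunks)


def decode_after_joining(chunks: Sequence[bytes]) -> str:
    """Reference behaviour: join all bytes first, then decode once."""
    return b"".join(chunks).decode("utf-8", errors="replace")


def find_failing_splits(decode_fn: DecodeFn, text: str) -> list[int]:
    """Return every byte offset where a two-chunk split corrupts the text."""
    data = text.encode("utf-8")
    failing = []
    for offset in range(len(data) + 1):
        if decode_fn([data[:offset], data[offset:]]) != text:
            failing.append(offset)
    return failing


# Strings taken from the "Expected" lines in this report.
for expected in ("봐드리겠습니다", "걸러야하니까"):
    print(expected, find_failing_splits(decode_naively, expected))
    print(expected, find_failing_splits(decode_after_joining, expected))
```

A decoder that handles chunk boundaries correctly should yield an empty list for every input, while the naive decoder fails at every offset that is not a character boundary.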

Example

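The issue does not include the affected streaming code, so an exact patch cannot be given. The fix will likely involve decoding the byte stream with a stateful (incremental) UTF-8 decoder that carries incomplete sequences over to the next chunk, rather than decoding each chunk independently.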
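
As an illustration only, and not the project's code, the general technique looks like this in Python using the standard library's incremental decoder. The function name `decode_utf8_stream` and the chunk list are assumptions made for the example.

```python
import codecs
from typing import Iterable, Iterator


def decode_utf8_stream(chunks: Iterable[bytes]) -> Iterator[str]:
    """Decode byte chunks to text, holding back incomplete UTF-8 sequences.

    The incremental decoder keeps the trailing bytes of an unfinished
    character and completes it when the next chunk arrives.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush: bytes still pending at end of stream are genuinely invalid.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


data = "봐드리겠습니다".encode("utf-8")
# Hypothetical chunking that cuts 습 (EC 8A B5) after its first byte.
cut = data.index(b"\xec\x8a\xb5") + 1
print("".join(decode_utf8_stream([data[:cut], data[cut:]])))
```

In JavaScript runtimes the analogous facility is `TextDecoder.decode(chunk, { stream: true })`. Whether Claude Code's streaming path uses it is not stated in the issue.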

Notes

The root cause appears to be related to the handling of multi-byte UTF-8 sequences during response streaming, but the exact fix will depend on the specifics of the implementation.

Recommendation

The issue does not mention a fixed version to upgrade to, so apply a workaround that handles multi-byte UTF-8 sequences properly, such as buffering and reassembling sequences that span chunk boundaries.
