claude-code - 💡(How to fix) Fix [MODEL] Korean output corruption — words replaced with "영역" token in long responses

Preflight Checklist

  • I have searched existing issues for similar behavior reports
  • This report does NOT contain sensitive information (API keys, passwords, etc.)

Type of Behavior Issue

Other unexpected behavior

What You Asked Claude to Do

I asked Claude in Korean to perform various coding tasks across a session (restructuring README files, defining new EAS scripts in package.json, summarizing a list of queued tasks in a table, etc.). Most prompts were in Korean.

What Claude Actually Did

In multiple responses, Claude produced Korean text where many content words were replaced with the literal string "영역" (the Korean word for "area"/"region"). The replacement appeared in:

  1. Table body cells — most cells contained only "영역 영역 영역"
  2. Long Korean paragraphs — multiple consecutive words replaced
  3. Sometimes entire sentences became "영역 영역 = 영역 영역 영역 영역 영역 영역 영역 영역"

Claude did not detect the corruption while emitting it. The user had to point it out each time, after which Claude rewrote the response in correct Korean.
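The substitution described above is mechanical enough to flag after the fact. The following is a minimal sketch, not part of Claude Code: the function names, the 0.5 threshold, and the 4-word minimum are assumptions chosen for illustration. It measures what share of the Hangul words in a response are exactly the bare string "영역".

```python
import re

PLACEHOLDER = "영역"  # the literal string reported in this issue

# Runs of Hangul syllables (U+AC00..U+D7A3). Latin words such as "README",
# digits, and punctuation are ignored.
HANGUL_WORD = re.compile(r"[\uAC00-\uD7A3]+")


def placeholder_ratio(text: str, placeholder: str = PLACEHOLDER) -> float:
    """Return the fraction of Hangul words that are exactly the placeholder."""
    words = HANGUL_WORD.findall(text)
    if not words:
        return 0.0
    # Exact match only: "영역은" (with a particle attached) does not count,
    # which keeps ordinary uses of the word from raising the ratio.
    return sum(1 for w in words if w == placeholder) / len(words)


def looks_corrupted(text: str, threshold: float = 0.5, min_words: int = 4) -> bool:
    """Heuristic flag: enough Hangul words, and most of them are the placeholder."""
    words = HANGUL_WORD.findall(text)
    return len(words) >= min_words and placeholder_ratio(text) >= threshold


if __name__ == "__main__":
    reported = "영역 영역 = 영역 영역 README 영역 영역 영역 영역 영역 영역 영역 영역 영역 영역 영역."
    print(placeholder_ratio(reported), looks_corrupted(reported))
```

A check like this only runs on finished output, so it can trigger a retry but does not explain why the model emits the placeholder.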

Expected Behavior

Claude should emit coherent Korean text. The string "영역" should appear only when it is actually the intended word (rare in normal conversation), not as a placeholder or token substitute for other Korean words.

Files Affected

Not applicable — this is an output text issue. No files were modified incorrectly.

Permission Mode

Accept Edits was ON (auto-accepting changes)

Can You Reproduce This?

Yes, every time with the same prompt

Steps to Reproduce

  1. Start a Claude Code session with Korean as the primary user language
  2. Ask Claude (in Korean) to produce a response that contains:
    • Markdown tables with Korean text in body cells, OR
    • Long Korean paragraphs (multiple sentences)
  3. The "영역" substitution appears with high probability
  4. Short Korean responses, English text, code blocks, and identifiers rarely corrupt — the issue is concentrated in long Korean prose / table cells
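To put a number on "high probability" in the steps above, the same prompt can be sent several times and the corrupted responses counted. This is a rough sketch under stated assumptions: `generate` is a hypothetical callable that you supply (it takes a prompt and returns the response text) and is not a Claude Code or Anthropic API function, and the rule "3 or more consecutive placeholders" is an arbitrary heuristic.

```python
from typing import Callable


def has_placeholder_run(text: str, placeholder: str = "영역", min_run: int = 3) -> bool:
    """True if `min_run` or more whitespace-separated placeholders occur in a row."""
    run = 0
    # Treat table pipes as separators so "| 영역 영역 |" tokenizes cleanly.
    for token in text.replace("|", " ").split():
        # Strip trailing sentence punctuation so "영역." still counts.
        run = run + 1 if token.strip(".,") == placeholder else 0
        if run >= min_run:
            return True
    return False


def corruption_rate(generate: Callable[[str], str], prompt: str, trials: int = 10) -> float:
    """Share of `trials` responses to `prompt` that contain a placeholder run."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    corrupted = sum(1 for _ in range(trials) if has_placeholder_run(generate(prompt)))
    return corrupted / trials
```

Running it once with a table-heavy Korean prompt and once with a short-answer prompt would give a direct comparison of the two cases described in steps 2 and 4.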

Claude Model

Opus

Relevant Conversation

Example of corrupted output (verbatim):

  > "영역 영역 = 영역 영역 README 영역 영역 영역 영역 영역 영역 영역 영역 영역 영역 영역."

  Intended meaning was approximately:
  "This task means a full restructure of all READMEs."

  Another example (table cell, verbatim):

  | 영역 | 영역 |
  |---|---|
  | 영역 영역 영역 영역 | 영역 영역 영역 영역 영역 영역 영역 영역 |

  After the user pointed out the corruption, Claude self-diagnosed:
  "Tables and long Korean paragraphs corrupt more often. Short responses and code blocks rarely corrupt."

  However, the very next response again used a large table with long Korean cells — Claude failed to apply its own diagnosis.

Impact

Critical - Data loss or corrupted project

Claude Code Version

2.1.146 (Claude Code)

Platform

Anthropic API

Additional Context

Observed patterns:

  • Triggers more often with: tables (especially body cells), long Korean paragraphs
  • Rarely affects: English, identifiers, file paths, code blocks, short 1-2 sentence Korean responses
  • Recurs within the same session — 5+ times in this session
  • Model cannot self-detect during emission
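The patterns above could be checked against a saved transcript by splitting each response into table rows, fenced code, and prose, then computing the placeholder share for each. The sketch below is illustrative only: the function name and the line-based classification (a line starting with `|` is a table row, lines between triple-backtick fences are code) are simplifying assumptions, not a full Markdown parser.

```python
import re
from collections import defaultdict

HANGUL_WORD = re.compile(r"[\uAC00-\uD7A3]+")  # runs of Hangul syllables


def ratio_by_segment(markdown: str, placeholder: str = "영역") -> dict[str, float]:
    """Placeholder share of Hangul words, reported per segment kind.

    Kinds are "code", "table", and "prose". Kinds with no Hangul words
    are omitted from the result.
    """
    counts = defaultdict(lambda: [0, 0])  # kind -> [placeholder hits, Hangul words]
    in_code = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("```"):
            in_code = not in_code  # fence lines toggle code mode and are skipped
            continue
        if in_code:
            kind = "code"
        elif stripped.startswith("|"):
            kind = "table"
        else:
            kind = "prose"
        words = HANGUL_WORD.findall(stripped)
        counts[kind][0] += sum(1 for w in words if w == placeholder)
        counts[kind][1] += len(words)
    return {kind: hits / total for kind, (hits, total) in counts.items() if total}
```

If the report's observation holds, the "table" and long "prose" ratios should sit well above the "code" ratio across the affected responses.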

Self-diagnosis failure: After analyzing the pattern itself ("tables and long paragraphs corrupt more often"), Claude immediately produced another corrupted large table in the next response — indicating the model cannot act on its own behavioral analysis even within the same session.

OS: Windows 11 Pro 10.0.26200
Shell: PowerShell

<img width="1773" height="770" alt="Image" src="https://github.com/user-attachments/assets/70b1aad0-e0ec-4ca4-b5e2-87aab26f2789" />
