[Bug] Claude Code generates mixed CJK characters in Korean text instead of pure Hangul


Bug Description

Hello Anthropic team,

I am a Claude Code user writing to report a persistent and resource-intensive bug.

Issue Summary

Claude Code repeatedly mixes CJK (Chinese) characters into Korean text. Specifically, for the word meaning 'analysis' it writes '분析' (the second character is Chinese) or sometimes '分析' (both characters are Chinese) instead of '분석' (pure Korean Hangul).
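To make the two failure forms concrete, here is a minimal Python sketch that flags words containing Han (Chinese) characters and labels them as mixed or Han-only. The function name `classify_words` and the choice of Unicode ranges are my own assumptions for illustration. They cover the main CJK Unified Ideographs block, Extension A, the compatibility ideographs, and the precomposed Hangul syllables, not every CJK block.

```python
import re

# Han (Chinese-origin) ideographs: main CJK Unified block, Extension A,
# and the compatibility ideographs block (an assumed, non-exhaustive choice).
HAN = r"\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff"
# Precomposed Hangul syllables.
HANGUL = r"\uac00-\ud7a3"

# A run of consecutive Hangul and/or Han characters is treated as one word.
WORD = re.compile(f"[{HAN}{HANGUL}]+")
HAN_CHAR = re.compile(f"[{HAN}]")
HANGUL_CHAR = re.compile(f"[{HANGUL}]")


def classify_words(text: str) -> list[tuple[str, str]]:
    """Return (word, kind) for every word that contains a Han character.

    kind is "mixed" when the word also contains Hangul, else "han_only".
    """
    findings = []
    for match in WORD.finditer(text):
        word = match.group()
        if not HAN_CHAR.search(word):
            continue  # pure Hangul, nothing to report
        kind = "mixed" if HANGUL_CHAR.search(word) else "han_only"
        findings.append((word, kind))
    return findings


print(classify_words("데이터 분析 결과와 分析 보고서, 그리고 분석"))
# [('분析', 'mixed'), ('分析', 'han_only')]
```

The correct form '분석' is not reported, so the check only fires on the two forms described above.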

Impact

  • File lookups fail when Claude Code puts CJK characters into filenames or task descriptions.
  • Duplicate files and a tangled history: when further updates are meant for an existing file, Claude Code instead creates a new file whose name contains Chinese characters.
  • I spend a lot of time manually checking and fixing content to work around the problem.
  • Even when I ask Claude Code to restore or partially fix the files instead of doing it by hand, it cannot distinguish the original file from the Chinese-character file, and the attempt creates further, more persistent problems.
  • Significant token and time waste, because Claude must search again for files it cannot find.
  • Additional token consumption when hooks or prompts attempt to correct the output.
  • More than 20 rounds of corrections to CLAUDE.md, rules, memory .md files, other .md files, and the context window have had no lasting effect.
  • The model verbally agrees to stop using CJK characters, then repeats the behavior in the very next response.
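For the filename problem in particular, a small Python sketch like the following can list files whose names contain Han characters, so the stray copies can be told apart from the originals. The helper names `han_names` and `scan_tree` and the sample filenames are hypothetical. This only finds the files; deciding which copy to keep is still manual.

```python
import re
from pathlib import Path

# Assumed, non-exhaustive set of Han ranges: main block, Extension A,
# and compatibility ideographs.
HAN_CHAR = re.compile(r"[\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff]")


def han_names(names):
    """Return the names that contain at least one Han character."""
    return [name for name in names if HAN_CHAR.search(name)]


def scan_tree(root="."):
    """Return paths under root whose file or directory name contains Han."""
    return sorted(p for p in Path(root).rglob("*") if HAN_CHAR.search(p.name))


# Hypothetical filenames: only the one with a Han character is reported.
print(han_names(["분석_보고서.md", "분析_보고서.md", "notes.md"]))
# ['분析_보고서.md']
```

Calling `scan_tree("docs")` would walk a directory named `docs` (an assumed path) and return the matching paths.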

Root Cause Assessment

This appears to be a model weight issue, namely a deeply embedded pattern of mixed Hanja (Chinese characters) and Hangul (Korean) from the training data, rather than a prompt-level problem. No amount of prompt engineering fully resolves it.

What I have tried

  • Explicit prohibitions in CLAUDE.md (repeated, in multiple forms)
  • Context window instructions
  • Writing a "do not use Chinese characters" rule, together with the prohibited text, in the rules .mdc file
  • PostToolUse hooks to block CJK character output
  • Git pre-commit hooks
  • Requesting correction 10+ times within the same session

None of these fully prevent the behavior at the generation level, meaning token waste occurs regardless.
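For reference, the kind of check a hook can run looks roughly like the Python sketch below. It is not my exact hook: the function names `find_han_lines` and `check_files` are made up for illustration, and the wiring into Claude Code hooks or a Git pre-commit hook is deliberately left out because it depends on the setup. It also shows the limitation: the check runs only after the text has been generated.

```python
import re
from pathlib import Path

# Assumed, non-exhaustive set of Han ranges: main block, Extension A,
# and compatibility ideographs.
HAN_CHAR = re.compile(r"[\u4e00-\u9fff\u3400-\u4dbf\uf900-\ufaff]")


def find_han_lines(text):
    """Return (line_number, line) pairs for lines containing a Han character."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if HAN_CHAR.search(line)
    ]


def check_files(paths):
    """Report Han characters in the given files; return 1 if any were found, else 0."""
    status = 0
    for path in paths:
        # Undecodable bytes are replaced so one bad file does not abort the check.
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        for number, line in find_han_lines(text):
            print(f"{path}:{number}: Han character found: {line.strip()}")
            status = 1
    return status
```

A hook script could pass the changed file paths to `check_files` and exit with its return value, so a non-zero status blocks the commit or flags the tool output. Either way the tokens for the bad output have already been spent.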

Request

Please address this at the model level, through fine-tuning, output filtering, or logit suppression for CJK characters when the user's language context is Korean. This is causing real, measurable resource waste for Korean-language Claude Code users.
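As a user-side stopgap, and not the model-level fix requested here, output filtering could look something like this Python sketch. The mapping `HANJA_TO_HANGUL` is hypothetical and contains only the two characters from the example in this report. A per-character table is not a general solution, since many Hanja have more than one Korean reading and legitimately Chinese text would be corrupted.

```python
# Illustrative mapping only: the two characters from the example in this report.
HANJA_TO_HANGUL = {"分": "분", "析": "석"}


def replace_known_hanja(text: str) -> str:
    """Replace characters listed in HANJA_TO_HANGUL; leave everything else as is."""
    return text.translate(str.maketrans(HANJA_TO_HANGUL))


print(replace_known_hanja("분析"))  # 분석
print(replace_known_hanja("分析"))  # 분석
```

Characters outside the table pass through unchanged, so this only repairs the forms already known to occur.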

Thank you for your attention to this issue.

Environment Info

  • Platform: win32
  • Terminal: cursor
  • Version: 2.1.131
  • Feedback ID: 19b9468d-7b05-4ae3-8ded-60c63dc5bd0c
