claude-code - 💡(How to fix) Fix [BUG] Model drifts full-width CJK punctuation to half-width in Edit's old_string, causing silent "String to replace not found" [1 comments, 2 participants]

carrotRakko · 2026-04-23T17:47:44Z


anthropics/claude-code#52482 • Fetched 2026-04-24 06:06:01


Author: carrotRakko

Participants: carrotRakko, github-actions[bot]

Timeline (top): labeled ×4, commented ×1, mentioned ×1, subscribed ×1

Root Cause

When editing a file that contains full-width CJK punctuation (e.g. `（`, `）`, `、`, `。`, `：`, `；`), the `Edit` tool frequently fails with `String to replace not found`, even though the content visibly matches. The root cause is not the `Edit` tool itself, which byte-matches correctly. It is the model silently generating half-width ASCII punctuation (`(`, `)`, `,`, `.`, `:`, `;`) when it intended full-width, producing an `old_string` whose bytes do not match the file.
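To make the "visibly matches but bytes differ" point concrete, here is a minimal Python sketch. The sample strings are illustrative and not taken from the affected file; they only show that the full-width and half-width parentheses are different code points with different UTF-8 encodings, so an exact comparison fails.

```python
# Full-width and half-width parentheses look alike but encode differently.
full = "説明\uff08前提条件\uff09"  # what the file contains (U+FF08 / U+FF09)
half = "説明(前提条件)"            # what a drifted old_string contains (U+0028 / U+0029)

print("\uff08".encode("utf-8"))  # b'\xef\xbc\x88' (three bytes)
print("(".encode("utf-8"))       # b'(' (one byte)

# An exact comparison, which is what a byte-matching tool effectively does, fails.
print(half == full)              # False
print(len(full.encode("utf-8")) - len(half.encode("utf-8")))  # 4 extra bytes in total
```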


What's Wrong?

This is especially destructive because:

- The model's self-report is "I'm writing full-width punctuation"; it doesn't know it drifted.
- Debugging looks like a tool bug: the user sees a mismatch between the visible text and the reported `old_string`, and assumes `Edit` has a normalization pass.
- The drift is context-dependent: short pure-Japanese sentences tend to keep full-width; long technical paragraphs that interleave Japanese with English tech terms drift to half-width at a much higher rate.
- It is specific to *generation* in the `old_string` parameter. Characters copied via a Python helper using `\uff08` / `\uff09` escapes reproduce fine; characters generated token-by-token as part of a long Japanese sentence do not.

The same drift direction exists in #50975 but that is a Write-tool bug (tool silently half-widthing on overwrite). This report is a separate issue about model-level drift affecting Edit's old_string, not tool behavior. #50975's author initially conflated the two and then corrected themselves in a follow-up comment, which is the exact same misdiagnosis path users will walk on first encounter.

What Should Happen?

The model should generate the full-width punctuation it intends. When the target file contains （） (U+FF08 / U+FF09), the generated old_string should contain （）, not () (U+0028 / U+0029). At minimum, the model's self-narration and its old_string bytes should agree.

Error Messages/Logs

<tool_use_error>String to replace not found in file.
String: | 項目 | 未 | — | 説明(前提条件)と技術用語(ID / 状態 / 時刻)が交じる行 |
</tool_use_error>

The file actually contained full-width （ (U+FF08) and ） (U+FF09), but the model emitted the half-width ( / ) you see above while claiming to output full-width. Edit correctly rejected the mismatch.
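When this error appears, a quick check can tell whether the mismatch is purely a width drift. The sketch below is a hypothetical diagnostic, not part of Claude Code: `WIDTH_FOLD`, `fold_width`, and `is_width_only_mismatch` are names invented here, and the folding table covers only the punctuation pairs mentioned in this report.

```python
# Hypothetical diagnostic: is a failed old_string off only by punctuation width?
WIDTH_FOLD = str.maketrans({
    "\uff08": "(",  # FULLWIDTH LEFT PARENTHESIS
    "\uff09": ")",  # FULLWIDTH RIGHT PARENTHESIS
    "\uff1a": ":",  # FULLWIDTH COLON
    "\uff1b": ";",  # FULLWIDTH SEMICOLON
    "\uff1f": "?",  # FULLWIDTH QUESTION MARK
    "\uff01": "!",  # FULLWIDTH EXCLAMATION MARK
    "\u3001": ",",  # IDEOGRAPHIC COMMA
    "\u3002": ".",  # IDEOGRAPHIC FULL STOP
})


def fold_width(text: str) -> str:
    """Map the listed full-width punctuation to its ASCII counterpart."""
    return text.translate(WIDTH_FOLD)


def is_width_only_mismatch(file_text: str, old_string: str) -> bool:
    """True when old_string is absent verbatim but present after folding widths."""
    if old_string in file_text:
        return False
    return fold_width(old_string) in fold_width(file_text)


file_text = "| 項目 | 説明\uff08前提条件\uff09と技術用語\uff08ID\uff09 |\n"
drifted = "| 項目 | 説明(前提条件)と技術用語(ID) |"
print(is_width_only_mismatch(file_text, drifted))  # True
```

A `True` result suggests the file and the tool are fine and the emitted `old_string` drifted; a `False` result on a failed edit points to some other difference.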

Steps to Reproduce

1. Create a file with a line that mixes Japanese prose with English identifiers, using full-width CJK parentheses in the Japanese prose. Example line (the ASCII `()` below are placeholders; substitute U+FF08 / U+FF09 so the file actually contains full-width):
   ```
   | item | 未 | — | 説明(前提条件)と技術用語(id / state / timestamp)が交じる行 |
   ```
2. Ask Claude Code to edit that row: `Edit old_string=<row verbatim>, new_string=<anything>`.
3. Observe that the model's generated `old_string` contains half-width `()` instead of the file's full-width `（）`, producing `String to replace not found`.
4. Verify by inspecting the bytes the model actually emitted (e.g. by instrumenting the transport, or by running an `Edit` with `old_string` constructed via Python `\uFF08` / `\uFF09` escapes; that one matches correctly).

Reproduction is probabilistic — the shorter and more purely Japanese the text, the more reliably the model emits full-width; the longer and more technical (with English identifiers interleaved), the more reliably it drifts.
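The file-creation and verification steps can be sketched in Python as follows. This is an illustration under stated assumptions: the file name `repro.md` is arbitrary, and `exact_match` is a stand-in written for this example that models the exact, normalization-free matching described in this report; it is not the `Edit` tool's actual code.

```python
import tempfile
from pathlib import Path

# Build the row with explicit escapes so the file really contains U+FF08 / U+FF09.
row = (
    "| item | 未 | — | 説明\uff08前提条件\uff09と技術用語"
    "\uff08id / state / timestamp\uff09が交じる行 |"
)


def exact_match(file_text: str, old_string: str) -> bool:
    """Stand-in for an exact, normalization-free substring match (assumption)."""
    return old_string in file_text


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "repro.md"  # arbitrary file name for the sketch
    path.write_text(row + "\n", encoding="utf-8")
    content = path.read_text(encoding="utf-8")

good_old_string = row  # constructed from escapes, so the widths are right
drifted_old_string = row.replace("\uff08", "(").replace("\uff09", ")")

print(exact_match(content, good_old_string))     # True
print(exact_match(content, drifted_old_string))  # False
```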

Additional Information

- **Verification that `Edit` is innocent**: I ran a small experiment with short test strings (`全角（カッコ）テスト`). `Edit` matched byte-for-byte when `old_string` contained full-width `（）`, and correctly failed when `old_string` used half-width `()` while the file had full-width. The tool does no normalization; the issue is entirely upstream in the model output.
- **Workaround**: For bulk updates I gave up on having the model re-emit Japanese-laden `old_string` verbatim. Instead I wrote a small Python script that does line-level regex replacement (e.g. `^\| item \|[^\n]*$`) and supplies only the new content. This bypasses the drift since the original line never needs to be regenerated.
- **Observed affected punctuation** (same drift direction as #50975):
  - `（` U+FF08 ↔ `(` U+0028
  - `）` U+FF09 ↔ `)` U+0029
  - `：` U+FF1A ↔ `:` U+003A
  - Likely also `、。；？！` based on priors, though I only verified `（）：` directly.
- **Unaffected**: `「」『』`, per #50975's observation.
- **User-visible impact**: Any workflow that mixes Japanese (or Chinese/Korean) prose with English identifiers (technical writing, PRDs, LaTeX sources, localized docs) hits this often enough that `Edit` becomes unreliable without a workaround.
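The workaround script itself is not included in the report. A minimal sketch of the line-level approach might look like the following; `replace_row` and its parameters are hypothetical names, and the row layout (a table row whose first cell is an ASCII key) is assumed from the example pattern above.

```python
import re


def replace_row(text: str, row_key: str, new_row: str) -> tuple[str, int]:
    """Replace every table row whose first cell equals row_key.

    The old row is matched by its ASCII key only, so its Japanese content
    never has to be reproduced. Returns (new_text, number_of_replacements).
    """
    pattern = re.compile(
        r"^\| " + re.escape(row_key) + r" \|[^\n]*$",
        re.MULTILINE,
    )
    # A function replacement keeps backslashes in new_row from being
    # interpreted as regex escapes.
    return pattern.subn(lambda _match: new_row, text)


doc = (
    "| item | 未 | — | 説明\uff08前提条件\uff09 |\n"
    "| other | 済 | — | 別の行 |\n"
)
updated, count = replace_row(doc, "item", "| item | 済 | — | 説明\uff08前提条件\uff09 |")
print(count)  # 1
```

Checking the returned count before writing the file back guards against a key that matches no row, or more rows than intended.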

Related but distinct: #50975 (Write silently half-widths on overwrite).

✍️ Author: Claude Code with @carrotRakko (AI-written, human-approved)

extent analysis

TL;DR

The model should be adjusted to correctly generate full-width CJK punctuation in the old_string parameter to match the intended output.

Guidance

- Verify the model's output by inspecting the bytes emitted for `old_string` to confirm the presence of half-width punctuation instead of full-width.
- Use a Python script with regex replacement as a temporary workaround to bypass the model's drift issue.
- Test the model with short, purely Japanese sentences to observe the reliable emission of full-width punctuation, and longer technical paragraphs to see the drift to half-width.
- Check for other affected punctuation characters, such as `：` (U+FF1A), `、` (U+3001), `。` (U+3002), and `；` (U+FF1B), in addition to `（` (U+FF08) and `）` (U+FF09).
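For the byte-inspection step, one possible helper is sketched below. `describe_punctuation` is a hypothetical name; it relies only on Python's standard `unicodedata` module to list each punctuation character with its code point and Unicode name, which makes a width drift visible at a glance.

```python
import unicodedata


def describe_punctuation(text: str) -> list[tuple[str, str, str]]:
    """List (char, code point, Unicode name) for every punctuation character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if unicodedata.category(ch).startswith("P")  # all punctuation categories
    ]


# Mixed sample: full-width parentheses followed by ASCII parentheses.
result = describe_punctuation("説明\uff08前提\uff09(id)")
for entry in result:
    print(entry)
```

Running this on both the file line and the emitted `old_string` and comparing the two lists shows exactly which characters changed width.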

Example

No code snippet is provided as the issue is related to the model's output, but a Python script using \uFF08 and \uFF09 escapes can be used to construct old_string and verify the correct output.

Notes

The issue is specific to the model's generation of old_string and does not affect the Edit tool itself. The drift is context-dependent and more pronounced in longer technical paragraphs with interleaved English identifiers.

Recommendation

Apply a workaround using a Python script with regex replacement until the model is adjusted to correctly generate full-width CJK punctuation. This will ensure reliable output and prevent errors when using the Edit tool.
