claude-code - 💡(How to fix) Fix Opus 4.7 (1M): Parallel Explore sub-agents drift from Japanese to Korean in long Markdown output

StepCodex · 2026-05-16T03:59:33Z



Two parallel Explore sub-agents (both spawned from the same parent Opus 4.7 (1M) main session, same instant, with prompts written entirely in Japanese) produced long-form Markdown reports where the second half of the output drifted from Japanese into Korean, with words and sentences mixed in a single line (e.g., 상태遷移, 자동으로, 부작용). The Korean text is semantically the translation of the Japanese text the agent intended to write — not random or malicious content.

This is reproducible enough to be concerning: both parallel agents exhibited the same drift pattern at roughly the same point in their respective outputs.


Environment

Model: Opus 4.7 (1M context), exact ID claude-opus-4-7[1m]
CLI: Claude Code (Mac desktop / Kitty terminal)
Date: 2026-05-16 (JST)
Session ID: <redacted; available on request via private channel>
Parent prompt language: 100% Japanese (chat, instructions, agent prompts)
Sub-agent type: Explore (read-only research agent)
Concurrency: 2 agents launched in a single message (parallel)
Input files: ~1,500 lines of Japanese Markdown user documentation, verified Korean-character-free (grep -rE '[가-힣]' returned zero hits across all inputs)
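
The grep check above matches only precomposed Hangul syllables (U+AC00 to U+D7A3). A minimal Python sketch of a slightly broader scan that also covers the Jamo blocks, assuming the inputs are UTF-8 Markdown files. The function names and the `*.md` glob are illustrative and not part of the original report:

```python
import re
from pathlib import Path

# Hangul Syllables, Jamo, Compatibility Jamo, Jamo Extended-A and Extended-B.
HANGUL = re.compile(
    r"[\uAC00-\uD7A3\u1100-\u11FF\u3130-\u318F\uA960-\uA97F\uD7B0-\uD7FF]"
)


def find_hangul(text):
    """Return (line_number, line) pairs for every line containing Hangul."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), 1)
        if HANGUL.search(line)
    ]


def scan_tree(root, pattern="*.md"):
    """Map each file under root that contains Hangul to its offending lines."""
    hits = {}
    for path in sorted(Path(root).rglob(pattern)):
        found = find_hangul(path.read_text(encoding="utf-8", errors="replace"))
        if found:
            hits[str(path)] = found
    return hits
```

An empty dictionary from `scan_tree` would correspond to the "zero hits" result reported above.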

What happened

Both agents produced 200–400 line Markdown reports as expected
The first 60–70% of each report is clean Japanese
From a certain point (different per report, but roughly mid-document), Korean words start appearing inside Japanese sentences
The Korean tokens are direct equivalents of the Japanese the agent was clearly trying to write (e.g., expected 状態遷移, produced 상태遷移 — same word, but the first half is Korean Hangul)
Section headers, paragraph bodies, and Mermaid note over strings were all affected
Code blocks (Mermaid diagrams, file paths, code snippets) were unaffected
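
The observations above (drift starting partway through, prose affected, fenced content mostly not) could be measured on a saved report with something like the following sketch. It assumes the report is Markdown using triple-backtick fences; `drift_profile` is a hypothetical helper, not an existing tool:

```python
import re

HANGUL = re.compile(r"[\uAC00-\uD7A3]")


def drift_profile(markdown):
    """Report where Hangul first appears and whether hits fall inside fences."""
    lines = markdown.splitlines()
    in_fence = False
    prose_hits, fenced_hits = [], []
    for number, line in enumerate(lines, 1):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # toggle on every fence delimiter line
            continue
        if HANGUL.search(line):
            (fenced_hits if in_fence else prose_hits).append(number)
    all_hits = sorted(prose_hits + fenced_hits)
    onset = all_hits[0] / len(lines) if all_hits else None
    return {
        "onset_fraction": onset,      # position of the first hit, 0 to 1
        "prose_lines": prose_hits,    # line numbers outside code fences
        "fenced_lines": fenced_hits,  # line numbers inside code fences
    }
```

Running this on both reports would give comparable onset fractions, which makes the "same point in both outputs" claim checkable rather than anecdotal.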

Sample

Expected:

### [状態遷移] Spec 承認後の Plan Mode 入りの自動化条件不明

draft v2 では R-S5「Bon が Spec を OK と明示するまで Plan Mode に進まない」とあるが…

Actually produced:

### [상태遷移] Spec 승인 후 Plan Mode 入りの自動化条件不明

draft v2 에서は R-S5 「Bon が Spec を OK と明示するまで Plan Mode に進まない」とあるが、「Bon의 OK 발화」後…

(Full outputs available on request — local Markdown files retained.)
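
To make the mixing in the sample explicit, here is a small sketch that splits text into runs by script and flags tokens that combine Hangul with Han or kana. It classifies characters by the prefix of their Unicode name, which is a rough heuristic chosen for brevity, and the function names are illustrative:

```python
import unicodedata
from itertools import groupby


def script_of(ch):
    """Rough script classification based on the Unicode character name."""
    name = unicodedata.name(ch, "")
    if name.startswith("HANGUL"):
        return "Hangul"
    if name.startswith("CJK"):
        return "Han"
    if name.startswith("HIRAGANA"):
        return "Hiragana"
    if name.startswith("KATAKANA"):
        return "Katakana"
    return "Other"


def script_runs(text):
    """Group consecutive characters that share a script."""
    return [(script, "".join(chars)) for script, chars in groupby(text, key=script_of)]


def mixed_script_tokens(line):
    """Return whitespace-separated tokens mixing Hangul with Han or kana."""
    mixed = []
    for token in line.split():
        scripts = {script_of(ch) for ch in token}
        if "Hangul" in scripts and scripts & {"Han", "Hiragana", "Katakana"}:
            mixed.append(token)
    return mixed
```

For the produced heading, `script_runs("상태遷移")` splits the word into a Hangul run followed by a Han run, which is the half-and-half pattern described above.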

Why I am reporting this

Reproducibility: Two parallel agents showing the same drift pattern (Japanese → Korean, late in output, semantically equivalent) is unlikely to be random sampling noise
Multilingual integrity concern: A user instructed in language A should not silently receive output in language B — workflow risk if not caught (uploaded to shared docs, read by teammates, etc.)
Security adjacent: Although the produced Korean text was semantically benign (translations, not injection), the mechanism — long-output drift into a typographically-adjacent language — is a class of bug that could in principle be exploited if attacker-controlled text appeared in the drift target
No input contamination: Verified all input files contain zero Korean characters before agent invocation, so the drift originated from the model itself

Hypotheses (user-side speculation only)

Japanese and Korean tokens occupy nearby regions in the model's embedding space; long-form generation with attention bias toward the later half of the output drifts across the boundary
Two parallel agents with identical model + similar input + similar task length triggered similar drift trajectories
Possible interaction with the 1M context variant — uncertain

What I am not claiming

Not claiming malice or prompt injection — input files are confirmed Korean-free
Not claiming all multilingual output is broken — earlier sessions same day were clean
Not claiming reliable on-demand reproduction — single occurrence so far, but multi-agent simultaneous occurrence makes it worth reporting

Asks

Acknowledge if this is a known pattern with Opus 4.7 (1M)
If not known: any guidance on how to capture more diagnostic data if it reoccurs (full raw response, tokenizer trace, etc.)?
Any workaround beyond "shorter outputs" (e.g., explicit language lock in agent prompt, model parameter)?
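
Until there is an answer to the asks above, one user-side stopgap would be to add an explicit language instruction to the sub-agent prompt and validate the result before it is used. The sketch below is hedged accordingly: `run_agent` is a caller-supplied callable standing in for whatever actually launches the sub-agent, the lock text is an assumption, and nothing here is known to prevent the drift. It only detects it, retries, and records a hash of each raw output for later diagnosis:

```python
import hashlib
import re
from typing import Callable

HANGUL = re.compile(r"[\uAC00-\uD7A3\u1100-\u11FF\u3130-\u318F]")

# Assumed wording, in English: "Write all output in Japanese.
# Do not use any Hangul (Korean characters)."
LANGUAGE_LOCK = (
    "\n\n出力はすべて日本語で書いてください。"
    "ハングル(韓国語の文字)は一切使用しないでください。"
)


def guarded_run(run_agent: Callable[[str], str], prompt: str, max_attempts: int = 2):
    """Call run_agent with a language-locked prompt; retry while output has Hangul.

    Returns (output, log). Each log entry holds the attempt number, a SHA-256
    of the raw output, and the number of Hangul characters found in it.
    """
    log = []
    output = ""
    for attempt in range(1, max_attempts + 1):
        output = run_agent(prompt + LANGUAGE_LOCK)
        hangul_count = len(HANGUL.findall(output))
        log.append({
            "attempt": attempt,
            "sha256": hashlib.sha256(output.encode("utf-8")).hexdigest(),
            "hangul_chars": hangul_count,
        })
        if hangul_count == 0:
            break  # clean output, stop retrying
    return output, log
```

If every attempt still contains Hangul, the last output is returned along with the log, so the caller can decide whether to discard it or attach it to a report.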

I am happy to share the two output Markdown files privately if useful.

Thanks.
