openclaw - [Bug]: Feishu inbound JSON file_name CJK mojibake (distinct from Content-Disposition path)

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

CJK filenames in inbound Feishu messages still produce mojibake when delivered via the JSON file_name field (distinct from the Content-Disposition path fixed in #72388); the media.test.ts suite contains an explicit test that asserts the broken behavior.

Steps to reproduce

  1. Run any OpenClaw build with the Feishu channel configured (verified on installed 2026.4.5; behavior also present on current main, commit 6861d8a6, by inspection).
  2. From the Feishu app, send a file whose filename contains CJK characters (e.g. 武汉15座山登山信息汇总.csv or 何不同舟渡_2.txt) to a bot bridged through OpenClaw.
  3. Inspect the saved file under ~/.openclaw/media/inbound/.

Expected behavior

The saved filename preserves the original CJK characters, matching the behavior already guaranteed by recoverUtf8FileNameFromLatin1Header for the Content-Disposition path after #72388. A clean Latin-1 filename such as café-©.txt continues to be preserved unchanged (the helper rejects when recovery would produce U+FFFD and requires the recovered string to contain East Asian script).
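The two guards described above (reject on U+FFFD, require East Asian script) can be illustrated with a minimal sketch. This is not the upstream recoverUtf8FileNameFromLatin1Header implementation: the function name recoverCjkFromLatin1Sketch is hypothetical, and the script check is approximated with a few BMP code-point ranges as an assumption.

```typescript
// Hypothetical sketch; NOT the upstream recoverUtf8FileNameFromLatin1Header code.
function recoverCjkFromLatin1Sketch(name: string): string {
  // A string produced by decoding bytes as Latin-1 only contains U+0000..U+00FF.
  if (!/^[\u0000-\u00ff]*$/.test(name)) return name;

  // Map each character back to one byte, then decode those bytes as UTF-8.
  const recovered = Buffer.from(name, "latin1").toString("utf8");

  // Guard 1: invalid UTF-8 decodes to U+FFFD, so the input was real Latin-1 text.
  if (recovered.indexOf("\uFFFD") !== -1) return name;

  // Guard 2: only accept results containing East Asian script.
  // Assumption: Hiragana/Katakana, CJK Ext. A, CJK Unified, Hangul syllables (BMP only).
  const eastAsian = /[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af]/;
  if (!eastAsian.test(recovered)) return name;

  return recovered;
}

// Same construction the existing test uses to build a mojibake filename.
const sampleMojibake = Buffer.from("武汉15座山登山信息汇总.csv", "utf8").toString("latin1");
console.log(recoverCjkFromLatin1Sketch(sampleMojibake)); // 武汉15座山登山信息汇总.csv
console.log(recoverCjkFromLatin1Sketch("café-©.txt"));   // café-©.txt (unchanged)
```

In this sketch café-©.txt is rejected by the first guard: the byte 0xE9 (é) starts a multi-byte UTF-8 sequence that the following "-" cannot continue, so decoding yields U+FFFD and the original name is kept.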

Actual behavior

The file lands on disk with mojibake (UTF-8 bytes interpreted as Latin-1), e.g. 武汉15座山登山信息汇总.csv is saved as æ­¦æ±15座山ç»å±±ä¿¡æ¯æ±æ»¥.csv. The JSON path never reaches the existing recover helper.

extensions/feishu/src/media.test.ts:862 (on main) actively asserts this broken behavior:

it("keeps JSON-derived file_name metadata unchanged", async () => {
  const fileName = "武汉15座山登山信息汇总.csv";
  const latin1LookingFileName = Buffer.from(fileName, "utf8").toString("latin1");
  messageResourceGetMock.mockResolvedValueOnce({
    data: Buffer.from("fake-file-data"),
    file_name: latin1LookingFileName,
  });
  const result = await downloadMessageResourceFeishu({ ... });
  expect(result.fileName).toBe(latin1LookingFileName); // pinned mojibake
});
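To make the pinned value concrete, the following sketch traces a single character through the same Buffer calls the test uses. It only relies on Node's Buffer API; the variable names are illustrative.

```typescript
// One CJK character, encoded as UTF-8 and then misread as Latin-1.
const singleChar = "武"; // U+6B66
const utf8Bytes = Buffer.from(singleChar, "utf8");

// Each of the three UTF-8 bytes becomes its own Latin-1 character.
const singleCharMojibake = utf8Bytes.toString("latin1");

// Reversing the misinterpretation restores the original character.
const roundTripped = Buffer.from(singleCharMojibake, "latin1").toString("utf8");

console.log(utf8Bytes.toString("hex"));   // e6ada6
console.log(singleCharMojibake.length);   // 3
console.log(roundTripped === singleChar); // true
```

Because the misinterpretation is lossless at the byte level, the original name stays recoverable as long as no later step (such as filename sanitization) drops or replaces those characters.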

OpenClaw version

2026.4.5 (also verified by source inspection on main commit 6861d8a6)

Operating system

macOS 15 (Darwin 25.2.0)

Install method

npm global

Model

minimax-portal/MiniMax-M2.7

Provider / routing chain

openclaw -> minimax-portal

Additional provider/model setup details

Not relevant — this is a filename-encoding bug in the inbound media pipeline; no model traffic is involved.

Logs, screenshots, and evidence

# saved filename observed on disk (mojibake, pre-patch):
$ ls -la ~/.openclaw/media/inbound/ | tail -3
-rw------- 1 user staff 1613681 May 13 01:30 Agentå_ç_æ_å_å_¹è_-...---<uuid>.pdf
-rw------- 1 user staff   84406 May 13 01:41 ä¼_ä_å¾_ä_20260413-133035_2x---<uuid>.png
-rw------- 1 user staff  391395 May 13 01:31 æ_æ_codexæ_å_ç_ç_å_½ä_æ_ç_æ_äº_ä_å¼_å_¾_...---<uuid>.jpg

# saved filename observed after applying a local recover patch:
$ ls -la ~/.openclaw/media/inbound/ | grep 01:59
-rw------- 1 user staff 650261 May 13 01:59 微信图片_20260509120820_848_2166---<uuid>.png

# upstream main code paths (commit 6861d8a6):
# 1. extensions/feishu/src/bot-content.ts:312 — parseMediaKeys returns raw parsed.file_name
# 2. extensions/feishu/src/bot.ts (inbound resolveFeishuMediaList) — feeds to saveMediaBuffer with no recover
# 3. extensions/feishu/src/media.ts:183 — recoverUtf8FileNameFromLatin1Header exists but only used at line 203 (Content-Disposition path)

Impact and severity

Affected: every Feishu user uploading files with CJK / Hiragana / Katakana / Hangul filenames to OpenClaw-bridged bots. In our deployment (an enterprise multi-agent system) this is ~30% of inbound files.

Severity: Medium-High (files save successfully but become hard to identify, audit, and search by name).

Frequency: 100% reproducible for any filename containing CJK characters.

Consequence: user-facing filenames are unreadable; downstream Skills that key off the filename (e.g. routing, knowledge-base ingestion) misbehave; users have to rename files manually.

Additional information

This is filed per @vincentkoc's invite on #48388 ("If this still reproduces on current main with a different path, reply here and we can reopen or split it back out"). The original issue is now locked, so I am opening a separate ticket.

Proposed fix scope (~10 LOC + test rewrite) — happy to send a PR if scope is confirmed:

  1. Apply recoverUtf8FileNameFromLatin1Header (possibly renamed to drop Header since it no longer applies only to headers) to parsed.file_name in parseMediaKeys — or once at the saveMediaBuffer call site, which catches both result.fileName and mediaKeys.fileName in one place.
  2. Rewrite media.test.ts:862 to assert the corrected behavior, plus add a guard case showing café-©.txt is still preserved unchanged.
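The second option in step 1 (recover once at the save call site) might look roughly like the sketch below. Everything here is hypothetical: resolveInboundFileName, its parameters, and the preference for the downloaded name over the JSON name are assumptions for illustration, not the actual bot.ts code. The recover function is injected so that the real helper could be passed in.

```typescript
type RecoverFn = (name: string) => string;

// Hypothetical choke point: pick the filename to save, then recover it exactly once.
// Assumption: the downloaded name wins over the JSON-derived name when both exist.
function resolveInboundFileName(
  downloadedName: string | undefined,
  jsonName: string | undefined,
  recover: RecoverFn,
): string | undefined {
  const candidate = downloadedName !== undefined ? downloadedName : jsonName;
  return candidate === undefined ? undefined : recover(candidate);
}

// Stand-in for the real helper; deliberately unguarded to keep the sketch short.
const naiveRecover: RecoverFn = (name) => Buffer.from(name, "latin1").toString("utf8");

// A JSON-derived name that arrived as mojibake, built the same way as in the test.
const jsonDerived = Buffer.from("何不同舟渡_2.txt", "utf8").toString("latin1");
console.log(resolveInboundFileName(undefined, jsonDerived, naiveRecover)); // 何不同舟渡_2.txt
```

Routing both name sources through one function keeps the recovery rule in a single place, which is the advantage the call-site option claims over patching parseMediaKeys alone. A real change would pass the guarded helper rather than naiveRecover, so that names like café-©.txt stay unchanged.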

Local workaround currently in production: patching dist/monitor-*.js with marker RnB-PATCH(2026-05-13). Verified working on 2026.4.5 — same 01:59 evidence above.
