hermes - ✅(Solved) Fix Bug: Feishu messages truncated prematurely due to len() counting UTF-8 instead of UTF-16 [2 pull requests, 1 comments, 2 participants]

haoqimeng1992 · 2026-04-17T12:25:59Z



NousResearch/hermes-agent#11589•Fetched 2026-04-18 06:00:05


Author

haoqimeng1992

Participants

haoqimeng1992

hclsys

Timeline (top)

cross-referenced ×2, referenced ×2, commented ×1

Error Message

Long system prompts or error messages get silently truncated

Root Cause

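The truncate_message() function in the Feishu adapter uses Python's len() to measure string length, which counts Unicode code points. However, the Feishu API is reported to limit message content in UTF-16 code units. The two measures differ for characters outside the Basic Multilingual Plane (most emoji and rare CJK extension ideographs), which take 2 code units each; common Chinese characters such as 你 and 好 take 1 code unit each, not 2 as the report assumes.

Python len("你好") = 2 (code points), and its UTF-16 length is also 2
Python len("😀") = 1 (code point), but its UTF-16 length is 2
Feishu UTF-16 limit: 4000 units
So len() admits up to 4000 surrogate-pair characters, but only ~2000 of them fit in 4000 UTF-16 units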

The fix is to use UTF-16 length instead:

def utf16_len(s: str) -> int:
    return len(s.encode('utf-16-le')) // 2

def truncate_message(content: str, max_utf16: int = 4000) -> str:
    if utf16_len(content) <= max_utf16:
        return content
    # ... truncate to max_utf16

Then replace all len(content) calls with utf16_len(content) in the Feishu adapter's truncate logic.
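
The snippet above leaves the truncation step elided. As a minimal sketch of what that step could look like, assuming the limit really is counted in UTF-16 code units, the hypothetical helper `truncate_utf16` below (not taken from the hermes-agent source) keeps the longest prefix that fits and never cuts a surrogate pair in half:

```python
def utf16_len(s: str) -> int:
    """Number of UTF-16 code units needed to encode s."""
    return len(s.encode("utf-16-le")) // 2


def truncate_utf16(content: str, max_utf16: int = 4000) -> str:
    """Hypothetical helper: longest prefix of content within max_utf16 units."""
    if utf16_len(content) <= max_utf16:
        return content
    units = 0
    for i, ch in enumerate(content):
        # Code points above U+FFFF are encoded as a surrogate pair (2 units).
        width = 2 if ord(ch) > 0xFFFF else 1
        if units + width > max_utf16:
            return content[:i]
        units += width
    return content


print(len("你好"), utf16_len("你好"))                  # 2 2
print(len("\U0001F600"), utf16_len("\U0001F600"))      # 1 2
print(len(truncate_utf16("\U0001F600" * 3, 5)))        # 2
```

Because the cut is made on code point boundaries, a 5-unit budget holds only two emoji (4 units) rather than two and a half.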

Fix Action

Fixed

Fixed by PR: fix: use utf16_len for Feishu message truncation to fix Chinese character counting (https://github.com/NousResearch/hermes-agent/pull/11644)
Fixed by PR: fix(feishu): use utf16_len for message truncation (https://github.com/NousResearch/hermes-agent/pull/11827)

PR fix notes

PR #11644: fix: use utf16_len for Feishu message truncation to fix Chinese character counting

Repository: NousResearch/hermes-agent
Author: nightq
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/11644

Description (problem / solution / changelog)

Summary

Fixes premature message truncation for Feishu messages containing Chinese and other multi-byte Unicode characters.

Root Cause

The PR states that the Feishu API limits message content in UTF-16 code units (max 4000 for post messages, 8000 for text). In UTF-16, characters outside the Basic Multilingual Plane (most emoji, rare CJK ideographs) count as 2 units each, while common Chinese characters count as 1. However, the Feishu adapter was using Python's len(), which counts Unicode code points, not UTF-16 units. The PR reports that this caused messages with ~2000 Chinese characters to be truncated or split incorrectly.

Fix

Import utf16_len from gateway.platforms.base (already used by Telegram adapter)
Pass len_fn=utf16_len to truncate_message() in the Feishu adapter's send() method
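
To illustrate the `len_fn` pattern described in the fix, here is a rough sketch of a splitter that takes the length function as a parameter. `split_message` and its signature are hypothetical stand-ins; the real `truncate_message()` in `gateway.platforms.base` may differ in its parameters and splitting strategy.

```python
from typing import Callable, List


def utf16_len(s: str) -> int:
    """Number of UTF-16 code units needed to encode s."""
    return len(s.encode("utf-16-le")) // 2


def split_message(content: str, max_length: int,
                  len_fn: Callable[[str], int] = len) -> List[str]:
    """Hypothetical splitter: greedy chunks whose len_fn size fits max_length."""
    chunks: List[str] = []
    current = ""
    for ch in content:
        # Start a new chunk when adding ch would exceed the limit.
        if current and len_fn(current + ch) > max_length:
            chunks.append(current)
            current = ch
        else:
            current += ch
    if current:
        chunks.append(current)
    return chunks


text = "\U0001F600" * 6  # 6 code points, 12 UTF-16 code units
print(len(split_message(text, 8)))                    # 1
print(len(split_message(text, 8, len_fn=utf16_len)))  # 2
```

With the default `len`, all six emoji land in a single chunk that is 12 UTF-16 units long, over the 8-unit budget; with `len_fn=utf16_len` the same text is split into chunks of 8 and 4 units. The per-character re-measurement is quadratic and is kept only for readability.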

Test Plan

pytest tests/gateway/test_feishu.py -q -k send passes (18 tests)
The fix follows the same pattern used by the Telegram adapter

Closes NousResearch/hermes-agent#11589

Changed files

gateway/platforms/feishu.py (modified, +2/-1)

PR #11827: fix(feishu): use utf16_len for message truncation

Repository: NousResearch/hermes-agent
Author: Mibayy
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/11827

Description (problem / solution / changelog)

Summary

Feishu's API enforces an 8000 UTF-16 code unit limit on messages, but FeishuAdapter was passing Python's len() (code point count) to truncate_message, which undercounts characters that need two UTF-16 code units (such as most emoji) and causes messages to be split at the wrong point.

Changes

gateway/platforms/feishu.py: import utf16_len from gateway.platforms.base and pass len_fn=utf16_len to truncate_message, matching the existing pattern in telegram.py

Testing

Messages with Chinese characters and emoji now split at the correct UTF-16 boundary.

Closes #11589

Changed files

gateway/platforms/feishu.py (modified, +2/-1)
scripts/release.py (modified, +1/-0)

Bug Description

Feishu messages get truncated prematurely when they contain Chinese (or other multi-byte Unicode) characters. A message of ~2000 Chinese characters gets split into multiple messages or cut off entirely.

Impact

Chinese users cannot send/receive messages longer than ~2000 characters
Long system prompts or error messages get silently truncated
Multi-language content (Emoji + Chinese + Korean) gets double-counted

extent analysis

TL;DR

To fix the premature truncation of Feishu messages containing Chinese characters, replace the len() function with a custom utf16_len() function that accurately measures the string length in UTF-16 code units.

Guidance

Identify all occurrences of len() in the Feishu adapter's truncate logic and replace them with utf16_len() to ensure accurate measurement of string length.
Verify that the utf16_len() function is correctly implemented to measure the length of strings in UTF-16 code units.
Test the updated truncate_message() function with different types of input, including Chinese characters, emojis, and multi-language content, to ensure it correctly truncates messages without premature cutting.
Consider adding additional logging or monitoring to detect and report any further issues with message truncation.
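
The verification and testing steps above could start from a small table of expected values. This is only a sketch: the sample strings and the `check_cases` helper are illustrative, not part of the project's test suite.

```python
def utf16_len(s: str) -> int:
    """Number of UTF-16 code units needed to encode s."""
    return len(s.encode("utf-16-le")) // 2


# (name, text, expected UTF-16 code units)
CASES = [
    ("ascii", "hello", 5),
    ("chinese", "你好", 2),          # BMP characters: 1 unit each
    ("korean", "안녕", 2),           # BMP characters: 1 unit each
    ("emoji", "\U0001F600", 2),      # surrogate pair: 2 units
    ("mixed", "hi你好\U0001F600", 6),
]


def check_cases() -> list:
    """Return the names of cases whose measured length differs from the expectation."""
    return [name for name, text, expected in CASES if utf16_len(text) != expected]


print(check_cases())  # [] when every case matches
```

Only the emoji cases give a different result from `len()`, so they are the inputs most likely to expose a counting bug.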

Notes

This fix assumes that the Feishu API limit is indeed based on UTF-16 code units. If the API uses a different encoding or limit, additional adjustments may be necessary.
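
One way to check this assumption against real API behavior is to compare the candidate measures for a message that gets rejected or cut. The helper below is a sketch with a hypothetical name (`measures`); it only reports counts and does not call Feishu.

```python
def measures(s: str) -> dict:
    """Length of s under three common counting schemes."""
    return {
        "code_points": len(s),
        "utf16_units": len(s.encode("utf-16-le")) // 2,
        "utf8_bytes": len(s.encode("utf-8")),
    }


print(measures("hi"))          # {'code_points': 2, 'utf16_units': 2, 'utf8_bytes': 2}
print(measures("你好"))        # {'code_points': 2, 'utf16_units': 2, 'utf8_bytes': 6}
print(measures("\U0001F600"))  # {'code_points': 1, 'utf16_units': 2, 'utf8_bytes': 4}
```

If observed cutoffs track `utf8_bytes` rather than `utf16_units`, the limit is probably byte-based and `utf16_len` would still let messages exceed it.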

Recommendation

Apply the fix by replacing len() with utf16_len() in the Feishu adapter's truncate logic, so that message length is measured in the same unit the API limit is reported to use (UTF-16 code units).


#api #network issue #logging issue #authentication issue #prompt issue
