hermes - ✅(Solved) Fix Bug: Feishu messages truncated prematurely due to len() counting UTF-8 instead of UTF-16 [2 pull requests, 1 comments, 2 participants]

GitHub stats
NousResearch/hermes-agent#11589 (fetched 2026-04-18 06:00:05)
Comments: 1 · Participants: 2 · Timeline events: 5 · Reactions: 0
Timeline (top): cross-referenced ×2, referenced ×2, commented ×1

Error Message

  • Long system prompts or error messages get silently truncated

Root Cause

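The truncate_message() function in the Feishu adapter uses Python's len() to measure string length, which counts Unicode code points. However, the Feishu API limits message content in UTF-16 code units. The two counts agree for characters in the Basic Multilingual Plane, which covers most common Chinese characters, but any character above U+FFFF (emoji, rarer CJK ideographs) is encoded as a surrogate pair and takes 2 code units.

  • Python len("你好") = 2 code points, which is also 2 UTF-16 code units
  • Python len("😀") = 1 code point, but 2 UTF-16 code units
  • Feishu UTF-16 limit: 4000 units
  • So Python allows 4000 emoji, but Feishu only accepts ~2000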
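These counts can be checked directly. The following is a minimal sketch: the describe() helper and the sample strings are illustrative only and are not part of the adapter. It reports code points, UTF-16 code units, and UTF-8 bytes side by side, since these three measures are easy to confuse.

```python
def utf16_len(s: str) -> int:
    """Number of UTF-16 code units needed to encode s."""
    return len(s.encode("utf-16-le")) // 2


def describe(s: str) -> dict:
    """Return three length measures for s (hypothetical helper)."""
    return {
        "code_points": len(s),
        "utf16_units": utf16_len(s),
        "utf8_bytes": len(s.encode("utf-8")),
    }


samples = {
    "bmp_chinese": "你好",       # U+4F60 U+597D, both inside the BMP
    "emoji": "😀",               # U+1F600, outside the BMP
    "rare_cjk": "\U00020000",    # CJK Extension B ideograph, outside the BMP
}

for name, text in samples.items():
    print(name, describe(text))
```

Only the last two samples show a gap between len() and utf16_len(), which is where a len()-based check would let an oversized message through.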

The fix is to use UTF-16 length instead:

def utf16_len(s: str) -> int:
    return len(s.encode('utf-16-le')) // 2

def truncate_message(content: str, max_utf16: int = 4000) -> str:
    if utf16_len(content) <= max_utf16:
        return content
    # ... truncate to max_utf16

Then replace all len(content) calls with utf16_len(content) in the Feishu adapter's truncate logic.
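The elided truncation step could look like the sketch below. It is one possible completion, not the adapter's actual code: truncate_utf16() is a hypothetical name, and the approach (walking code points and charging 2 units for anything above U+FFFF) is an assumption chosen so that a surrogate pair is never cut in half.

```python
def utf16_len(s: str) -> int:
    return len(s.encode("utf-16-le")) // 2


def truncate_utf16(content: str, max_utf16: int = 4000) -> str:
    """Return the longest prefix of content that fits in max_utf16 code units."""
    if utf16_len(content) <= max_utf16:
        return content
    used = 0
    kept = []
    for ch in content:
        # Code points above U+FFFF need a surrogate pair (2 units) in UTF-16.
        cost = 2 if ord(ch) > 0xFFFF else 1
        if used + cost > max_utf16:
            break
        used += cost
        kept.append(ch)
    return "".join(kept)
```

Because Python strings are indexed by code point, building the result one character at a time keeps every emoji whole; the budget may end one unit short when the next character would need two.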

Fix Action

Fixed

PR fix notes

PR #11644: fix: use utf16_len for Feishu message truncation to fix Chinese character counting

Description (problem / solution / changelog)

Summary

Fixes premature message truncation for Feishu messages containing Chinese and other multi-byte Unicode characters.

Root Cause

The Feishu API limits message content in UTF-16 code units (max 4000 for post messages, 8000 for text), where characters outside the Basic Multilingual Plane (emoji, rarer CJK ideographs) count as 2 units each. However, the Feishu adapter was using Python's len(), which counts Unicode code points, not UTF-16 units. Messages containing such characters could therefore pass the len() check while exceeding the API limit, and were truncated or split incorrectly.

Fix

  • Import utf16_len from gateway.platforms.base (already used by Telegram adapter)
  • Pass len_fn=utf16_len to truncate_message() in the Feishu adapter's send() method
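The pattern of injecting a length function can be sketched as follows. The real truncate_message() signature in gateway/platforms/base is not shown in this issue, so split_message() below is a hypothetical stand-in that borrows only the len_fn keyword named in the PR; the chunking strategy is an assumption.

```python
from typing import Callable, List


def utf16_len(s: str) -> int:
    return len(s.encode("utf-16-le")) // 2


def split_message(
    content: str,
    max_length: int,
    len_fn: Callable[[str], int] = len,
) -> List[str]:
    """Split content into chunks whose len_fn() never exceeds max_length."""
    if max_length < 2:
        # A single emoji already costs 2 units under utf16_len.
        raise ValueError("max_length must be at least 2")
    chunks: List[str] = []
    current = ""
    for ch in content:
        if current and len_fn(current + ch) > max_length:
            chunks.append(current)
            current = ch
        else:
            current += ch
    if current:
        chunks.append(current)
    return chunks


text = "😀" * 5
print(split_message(text, 4))                    # measured in code points
print(split_message(text, 4, len_fn=utf16_len))  # measured in UTF-16 units
```

With the default len, the first chunk holds four emoji, which is 8 UTF-16 units and twice the limit; with len_fn=utf16_len every chunk stays within 4 units. Re-measuring the chunk on every character is quadratic and is kept here only for clarity.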

Test Plan

  • pytest tests/gateway/test_feishu.py -q -k send passes (18 tests)
  • The fix follows the same pattern used by the Telegram adapter

Closes NousResearch/hermes-agent#11589

Changed files

  • gateway/platforms/feishu.py (modified, +2/-1)

PR #11827: fix(feishu): use utf16_len for message truncation

Description (problem / solution / changelog)

Summary

Feishu's API enforces an 8000 UTF-16 code unit limit on messages, but FeishuAdapter was letting truncate_message measure length with Python's len() (a code point count), which undercounts emoji and other characters outside the Basic Multilingual Plane and leads to incorrect truncation.

Changes

  • gateway/platforms/feishu.py: import utf16_len from gateway.platforms.base and pass len_fn=utf16_len to truncate_message, matching the existing pattern in telegram.py

Testing

Messages with Chinese characters and emoji now split at the correct UTF-16 boundary.

Closes #11589

Changed files

  • gateway/platforms/feishu.py (modified, +2/-1)
  • scripts/release.py (modified, +1/-0)


Bug Description

Feishu messages get truncated prematurely when they contain Chinese (or other multi-byte Unicode) characters. A message of ~2000 Chinese characters gets split into multiple messages or cut off entirely.

Impact

  • Chinese users cannot send/receive messages longer than ~2000 characters
  • Long system prompts or error messages get silently truncated
  • Multi-language content (emoji + Chinese + Korean) is measured incorrectly, because each emoji counts as 1 in len() but takes 2 UTF-16 code units

extent analysis

TL;DR

To fix the incorrect truncation of Feishu messages, measure string length in UTF-16 code units with utf16_len() (already available in gateway.platforms.base) instead of len().

Guidance

  • Identify all occurrences of len() in the Feishu adapter's truncate logic and replace them with utf16_len() to ensure accurate measurement of string length.
  • Verify that the utf16_len() function is correctly implemented to measure the length of strings in UTF-16 code units.
  • Test the updated truncate_message() function with different types of input, including Chinese characters, emojis, and multi-language content, to ensure it correctly truncates messages without premature cutting.
  • Consider adding additional logging or monitoring to detect and report any further issues with message truncation.
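The logging suggestion could be sketched as below. The logger name, the FEISHU_MAX_UTF16 constant, and the check_length() helper are all hypothetical; the 4000 figure is the limit quoted in the issue and may differ by message type.

```python
import logging

logger = logging.getLogger("feishu.truncation")  # hypothetical logger name

FEISHU_MAX_UTF16 = 4000  # limit quoted in the issue; adjust per message type


def utf16_len(s: str) -> int:
    return len(s.encode("utf-16-le")) // 2


def check_length(content: str, limit: int = FEISHU_MAX_UTF16) -> bool:
    """Return True if content fits; otherwise log both counts and return False."""
    units = utf16_len(content)
    if units <= limit:
        return True
    logger.warning(
        "message exceeds limit: %d UTF-16 units (%d code points), limit %d",
        units,
        len(content),
        limit,
    )
    return False
```

Logging the code point count next to the UTF-16 count makes it easy to spot messages where the two measures diverge.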

Notes

This fix assumes that the Feishu API limit is indeed based on UTF-16 code units. If the API uses a different encoding or limit, additional adjustments may be necessary.

Recommendation

Apply the fix by passing len_fn=utf16_len to truncate_message() in the Feishu adapter, as both PRs do. This addresses the root cause directly by measuring string length in the same unit the API enforces.
