Root Cause

When streaming responses are generated rapidly, the tail of intermediate assistant messages is silently truncated. The _throttle_calls decorator in GatewayStreamConsumer._send_or_edit() skips edits that arrive faster than edit_interval, and the skipped content is never recovered when the stream finalizes.

This is NOT the same as #10942. That bug was about the entire final response being suppressed; this one is about partial content loss within streaming edits.

The issue manifests consistently when:

- The model generates text rapidly (many tokens per second)
- A tool call or segment break interrupts the stream
- The last partial edit before the break is truncated

Example from actual usage (the original text was Chinese; translated here):

Expected: "Let me take a closer look at the relevant logic in your current code and confirm" (original: "让我深入检查一下你当前代码中相关的逻辑，确认")
Actual: "Let me take a closer..." (original: "让我深入...") → then cut off


Bug Description

Steps to Reproduce

1. Start any Hermes session (CLI or any gateway platform: Feishu, Telegram, etc.).
2. Trigger a response that generates text quickly and is followed by a tool call, e.g. "Please help me analyze the contents of this file [attach a large file]" or "Search for the latest information about xxx".
3. Observe the streaming message: the text before the tool call is truncated at the tail.
4. The truncated text looks as if the sentence was cut off midway.

Reproducibility: 100% when rapid generation meets a tool boundary.

Expected Behavior

All accumulated text up to the point of a segment break or tool call should be preserved and delivered. If _throttle_calls skips an intermediate edit, the accumulated text should be flushed when the next mandatory edit (got_done, got_segment_break) fires.

Actual Behavior

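The _accumulated buffer contains the full text, but during the final flush only a prefix is sent. The last few words or characters before the tool boundary are silently lost.

From the user's perspective:

"Let me take a closer..." → [tool call happens] → the final reply starts here

The first part should have been the complete sentence: "Let me take a closer look at the relevant logic in your current code and confirm" (original: "让我深入检查一下你当前代码中相关的逻辑，确认").

No error is logged; the truncation is silent.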

Affected Component

Gateway (Telegram/Discord/Slack/WhatsApp)

Messaging Platform (if gateway-related)

Telegram, Discord, Slack, WhatsApp

Debug Report

root@localhost:~# hermes version
Hermes Agent v0.13.0 (2026.5.7)
Project: /root/.hermes/hermes-agent
Python: 3.11.15
OpenAI SDK: 2.32.0

Operating System

Ubuntu 24.04.4 LTS

Python Version

3.11.15

Hermes Version

v0.13.0

Additional Logs / Traceback (optional)

No error or traceback — the bug is silent. This is a logic error in the stream consumer.

Root Cause Analysis (optional)

File: gateway/stream_consumer.py
Problem location: Lines 337-347 and the interaction with _throttle_calls

stream_consumer.py:337-347

    now = time.monotonic()
    elapsed = now - self._last_edit_time
    should_edit = (
        got_done
        or got_segment_break
        or commentary_text is not None
    )
    if not self.cfg.buffer_only:
        should_edit = should_edit or (
            (elapsed >= self._current_edit_interval
                and self._accumulated)
            or len(self._accumulated) >= self.cfg.buffer_threshold
        )
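To make the condition easier to reason about in isolation, the excerpt can be restated as a standalone function. This is only a sketch: the function name should_edit and its keyword arguments are hypothetical stand-ins for the instance attributes in the excerpt, and the values in the usage lines are made up for illustration.

```python
def should_edit(*, got_done, got_segment_break, commentary_text,
                buffer_only, elapsed, edit_interval,
                accumulated, buffer_threshold):
    """Standalone restatement of the decision in the excerpt (hypothetical helper)."""
    # Forced edits: stream finished, segment break, or commentary present.
    forced = got_done or got_segment_break or commentary_text is not None
    if buffer_only:
        return forced
    # Timed edit: the interval has elapsed and there is something to send.
    timed = elapsed >= edit_interval and bool(accumulated)
    # Size-based edit: the buffer grew past the threshold.
    oversized = len(accumulated) >= buffer_threshold
    return forced or timed or oversized


# Illustrative values only (not taken from the Hermes configuration).
common = dict(commentary_text=None, buffer_only=False, elapsed=0.1,
              edit_interval=1.0, accumulated="Let me take", buffer_threshold=40)

throttled = should_edit(got_done=False, got_segment_break=False, **common)
forced = should_edit(got_done=False, got_segment_break=True, **common)
print(throttled, forced)
```

With these values the first call is skipped by the interval check and the second is forced by the segment break, which is the transition the report focuses on.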

The bug: When elapsed < _current_edit_interval, edits are skipped. However:

- The _accumulated buffer continues to grow with new tokens
- When got_done or got_segment_break finally fires, the code enters _send_or_edit(self._accumulated)
- By this time, the visible prefix (_last_sent_text) has diverged from _accumulated
- The final edit sends what it thinks is the "remaining content", but it actually truncates

Additional evidence: The _throttle_calls decorator (used by the Feishu adapter and potentially others) compounds this by queuing or skipping calls when rate-limited, potentially losing the latest accumulated text.
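The suspected mechanism can be illustrated with a small simulation. The real _throttle_calls implementation is not shown in this report, so drop_when_throttled below is a hypothetical decorator that simply drops calls arriving inside the minimum interval; FakeClock, edit_message, and the sample chunks are also assumptions made for the sketch.

```python
import functools


def drop_when_throttled(min_interval, clock):
    """Hypothetical throttle: calls arriving too soon are silently dropped."""
    def decorator(fn):
        last_call = [None]  # time of the last call that was let through

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            now = clock()
            if last_call[0] is not None and now - last_call[0] < min_interval:
                return None  # skipped, and nothing remembers the argument
            last_call[0] = now
            return fn(*args, **kwargs)
        return wrapper
    return decorator


class FakeClock:
    """Manually advanced clock so the simulation is deterministic."""
    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
visible = []  # what the user would see after each successful edit


@drop_when_throttled(min_interval=1.0, clock=clock)
def edit_message(text):
    visible.append(text)


accumulated = ""
chunks = ["Let me ", "take a closer look ", "at the relevant logic"]
for i, chunk in enumerate(chunks):
    accumulated += chunk
    clock.now = i * 0.1  # tokens arrive much faster than the interval
    edit_message(accumulated)

# Segment break: the "forced" edit still passes through the same throttle.
edit_message(accumulated)

print(visible[-1])
print(accumulated)
```

In this sketch only the first edit gets through, so the visible message stays at the first chunk while the buffer holds the full sentence. Whether the actual decorator drops, queues, or coalesces calls would need to be confirmed against the source.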

Proposed Fix (optional)

Option A: Ensure forced edits always send the full _accumulated text, not a diff against _last_sent_text.
Option B: When an edit is skipped due to throttling, mark _accumulated as "dirty" and guarantee a flush on the next should_edit=True cycle.
Option C: Remove or relax _throttle_calls for the stream consumer path, since the consumer already has its own edit_interval throttling mechanism.
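A minimal sketch of how Options A and B might combine is shown below. It is not the Hermes implementation: ThrottledEditor, submit, flush, and FakeClock are hypothetical names, and the sketch assumes the platform accepts a forced edit that bypasses the interval.

```python
class FakeClock:
    """Manually advanced clock so the example is deterministic."""
    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


class ThrottledEditor:
    """Hypothetical throttle that remembers skipped text instead of dropping it."""

    def __init__(self, send, min_interval, clock):
        self._send = send
        self._min_interval = min_interval
        self._clock = clock
        self._last_sent_at = None
        self._last_sent_text = None
        self._pending = None  # latest skipped text, i.e. the "dirty" state

    def submit(self, text, force=False):
        # Always keep the newest full text, even if it is not sent right now.
        self._pending = text
        now = self._clock()
        due = (self._last_sent_at is None
               or now - self._last_sent_at >= self._min_interval)
        if force or due:
            self.flush()

    def flush(self):
        # Send the full pending text (Option A), never a diff.
        if self._pending is not None and self._pending != self._last_sent_text:
            self._send(self._pending)
            self._last_sent_text = self._pending
            self._last_sent_at = self._clock()
        self._pending = None


clock = FakeClock()
sent = []
editor = ThrottledEditor(sent.append, min_interval=1.0, clock=clock)

accumulated = ""
for i, chunk in enumerate(["Let me ", "take a closer look ", "at the relevant logic"]):
    accumulated += chunk
    clock.now = i * 0.1
    editor.submit(accumulated)

# Segment break or end of stream: force the flush (Option B).
editor.submit(accumulated, force=True)
print(sent)
```

Here the intermediate edits are still throttled, but the forced call delivers the complete buffer. A real fix would also have to decide what happens when the platform itself rejects the forced edit for rate-limit reasons, for example by retrying after a delay.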

Are you willing to submit a PR for this?

I'd like to fix this myself and submit a PR


hermes - 💡(How to fix) Fix [Bug]: Streaming chunks lost when _throttle_calls skips intermediate edits during fast token generation
