hermes - [Bug]: Streaming chunks lost when _throttle_calls skips intermediate edits during fast token generation

Error Message

No error or traceback — the bug is silent. This is a logic error in the stream consumer.

Root Cause

When a response streams rapidly, the tail of an intermediate assistant message is silently truncated. The _throttle_calls decorator in GatewayStreamConsumer._send_or_edit() skips edits that arrive faster than edit_interval, and the skipped content is never recovered when the stream finalizes.

This is NOT the same as #10942, which was about the entire final response being suppressed. This bug is about partial content loss within streaming edits.

The issue manifests consistently when the model generates text rapidly (many tokens per second) and a tool call or segment break interrupts the stream: the last partial edit before the break is truncated.

Example from actual usage (translated from Chinese):

Expected: "Let me take a deeper look at the relevant logic in your current code to confirm"
Actual: "Let me take a deeper..." → then cut off
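The failure mode described above can be reproduced in isolation. The following is a minimal sketch, assuming a drop-style throttle that discards calls arriving too soon. `throttle_calls`, `simulate_stream`, and the injected `clock` are hypothetical stand-ins for illustration and are not Hermes's actual `_throttle_calls` implementation.

```python
from typing import Callable, List, Tuple


def throttle_calls(min_interval: float, clock: Callable[[], float]):
    """Hypothetical drop-style throttle: calls that arrive too soon are skipped."""
    def decorator(fn):
        last_call = [float("-inf")]  # time of the last call that went through

        def wrapper(*args, **kwargs):
            now = clock()
            if now - last_call[0] < min_interval:
                return None  # skipped: the argument is discarded, not queued
            last_call[0] = now
            return fn(*args, **kwargs)

        return wrapper
    return decorator


def simulate_stream(tokens: List[str], token_gap: float,
                    min_interval: float) -> Tuple[str, List[str]]:
    """Feed tokens through a throttled sender; return (accumulated, sent edits)."""
    fake_time = [0.0]
    sent: List[str] = []

    @throttle_calls(min_interval, clock=lambda: fake_time[0])
    def send_or_edit(text: str) -> None:
        sent.append(text)

    accumulated = ""
    for token in tokens:
        accumulated += token
        send_or_edit(accumulated)
        fake_time[0] += token_gap

    # Segment break: a "mandatory" edit that still passes through the throttle.
    send_or_edit(accumulated)
    return accumulated, sent


accumulated, sent = simulate_stream(
    ["Let ", "me ", "take ", "a ", "deeper ", "look"],
    token_gap=0.1, min_interval=1.0,
)
print(repr(accumulated))  # full text held in the buffer
print(sent)               # what the platform actually received
```

With tokens arriving every 0.1 s and a 1.0 s interval, only the first edit goes through in this sketch. The buffer holds the whole sentence, but the final edit at the boundary is dropped like any other.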

Code Example

root@localhost:~# hermes version
Hermes Agent v0.13.0 (2026.5.7)
Project: /root/.hermes/hermes-agent
Python: 3.11.15
OpenAI SDK: 2.32.0

Steps to Reproduce

1. Start any Hermes session (CLI or any gateway platform: Feishu, Telegram, etc.).
2. Trigger a response that generates text quickly and is followed by a tool call, e.g. "Please help me analyze the contents of this file [attach a large file]" or "Search for the latest information about xxx" (prompts translated from Chinese).
3. Observe the streaming message: the text before the tool call is truncated at the tail, as if the sentence had been cut off midway.

Reproducibility: 100% when rapid generation meets a tool boundary.

Expected Behavior

All accumulated text up to the point of a segment break or tool call should be preserved and delivered. If _throttle_calls skips an intermediate edit, the accumulated text should be flushed when the next mandatory edit (got_done, got_segment_break) fires.
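One way to get this behavior at the throttle layer is to keep the newest skipped text instead of discarding it, and to let mandatory edits bypass the interval. This is a sketch under stated assumptions: `LatestWinsThrottle`, `submit`, `flush`, and the `force` flag are hypothetical names, not part of Hermes.

```python
from typing import Callable, List, Optional


class LatestWinsThrottle:
    """Hypothetical throttle that remembers the newest skipped text."""

    def __init__(self, send: Callable[[str], None], min_interval: float,
                 clock: Callable[[], float]) -> None:
        self._send = send
        self._min_interval = min_interval
        self._clock = clock
        self._last_sent_at = float("-inf")
        self._pending: Optional[str] = None  # newest text not yet delivered

    def submit(self, text: str, force: bool = False) -> bool:
        """Send `text` if allowed, else keep it as pending. True if sent."""
        now = self._clock()
        if force or now - self._last_sent_at >= self._min_interval:
            self._send(text)
            self._last_sent_at = now
            self._pending = None
            return True
        self._pending = text  # overwrite: only the latest text matters
        return False

    def flush(self) -> bool:
        """Deliver the pending text, if any, regardless of the interval."""
        if self._pending is None:
            return False
        return self.submit(self._pending, force=True)


fake_time = [0.0]
delivered: List[str] = []
throttle = LatestWinsThrottle(delivered.append, 1.0, lambda: fake_time[0])

accumulated = ""
for token in ["Let ", "me ", "take ", "a ", "deeper ", "look"]:
    accumulated += token
    throttle.submit(accumulated)
    fake_time[0] += 0.1

throttle.flush()  # would be called on got_done / got_segment_break
print(delivered)
```

Forced delivery trades strict rate limiting for completeness. Whether a given platform adapter tolerates the extra edit at a boundary is not settled by the report and would need checking per adapter.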

Actual Behavior

The _accumulated buffer contains the full text, but during the final flush only a prefix is sent. The last few words or characters before the tool boundary are silently lost. From the user's perspective (translated from Chinese):

    "Let me take a deeper..." → [tool call happens] → "The final reply starts here"

The first part should have been the complete sentence "Let me take a deeper look at the relevant logic in your current code to confirm".

No error is logged; the truncation is silent.
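Because nothing is logged, a small diagnostic at each boundary could make the loss visible while the fix is being developed. The helpers below are a hypothetical sketch: `unsent_tail` and `warn_if_truncated` do not exist in Hermes, and the comparison assumes the visible text carries no cursor or ellipsis decoration.

```python
import logging

logger = logging.getLogger("stream_consumer_sketch")


def unsent_tail(accumulated: str, last_sent: str) -> str:
    """Return the part of `accumulated` that the visible message is missing."""
    if accumulated.startswith(last_sent):
        return accumulated[len(last_sent):]
    # The visible text is not a prefix (e.g. it was reformatted): report all.
    return accumulated


def warn_if_truncated(accumulated: str, last_sent: str, boundary: str) -> bool:
    """Log a warning when a boundary is crossed with undelivered text."""
    tail = unsent_tail(accumulated, last_sent)
    if tail:
        logger.warning("%d chars undelivered at %s: %r",
                       len(tail), boundary, tail)
        return True
    return False


# Example: the visible message stopped short of the buffer at a tool boundary.
lost = warn_if_truncated("Let me take a deeper look",
                         "Let me take a deeper", "segment_break")
```

Calling such a check right after the forced edit at got_done or got_segment_break would turn the silent truncation into a log line that can be matched against user reports.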

Affected Component

Gateway (Telegram/Discord/Slack/WhatsApp)

Messaging Platform (if gateway-related)

Telegram, Discord, Slack, WhatsApp

Operating System

Ubuntu 24.04.4 LTS

Python Version

3.11.15

Hermes Version

v0.13.0

Root Cause Analysis (optional)

File: gateway/stream_consumer.py
Problem location: Lines 337-347 and the interaction with _throttle_calls

    # stream_consumer.py:337-347
    now = time.monotonic()
    elapsed = now - self._last_edit_time
    should_edit = (
        got_done
        or got_segment_break
        or commentary_text is not None
    )
    if not self.cfg.buffer_only:
        should_edit = should_edit or (
            (elapsed >= self._current_edit_interval
                and self._accumulated)
            or len(self._accumulated) >= self.cfg.buffer_threshold
        )
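To reason about the gate in the excerpt above, it helps to isolate it as a pure function. This is a sketch, not the project's code: `EditConfig`, its default `buffer_threshold` of 40, and the standalone `should_edit` signature are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EditConfig:
    """Hypothetical stand-in for the consumer's config object."""
    buffer_only: bool = False
    buffer_threshold: int = 40  # assumed value, not taken from Hermes


def should_edit(cfg: EditConfig, accumulated: str, elapsed: float,
                edit_interval: float, got_done: bool = False,
                got_segment_break: bool = False,
                commentary_text: Optional[str] = None) -> bool:
    """Mirror of the excerpt's decision, with state passed in explicitly."""
    forced = got_done or got_segment_break or commentary_text is not None
    if forced or cfg.buffer_only:
        # Forced edits always go through; buffer_only allows nothing else.
        return forced
    # Otherwise edit when the interval has elapsed with text waiting,
    # or when the buffer has grown past the size threshold.
    return ((elapsed >= edit_interval and bool(accumulated))
            or len(accumulated) >= cfg.buffer_threshold)


cfg = EditConfig()
print(should_edit(cfg, "Let me", elapsed=0.1, edit_interval=1.0))
print(should_edit(cfg, "Let me", elapsed=0.1, edit_interval=1.0,
                  got_segment_break=True))
```

If the excerpt is accurate, this gate always requests an edit at a boundary, which suggests the tail is lost after the decision is made, somewhere in the send path.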

The bug: When elapsed < _current_edit_interval, edits are skipped. However:

1. The _accumulated buffer continues to grow with new tokens.
2. When got_done or got_segment_break finally fires, the code enters _send_or_edit(self._accumulated).
3. By this time, the visible prefix (_last_sent_text) has diverged from _accumulated.
4. The final edit sends what it treats as the "remaining content", but the result is truncated.

Additional evidence: The _throttle_calls decorator (used by the Feishu adapter and potentially others) compounds this by queuing or skipping calls when rate-limited, potentially losing the latest accumulated text.

Proposed Fix (optional)

Option A: Ensure forced edits always send the full _accumulated text, not a diff against _last_sent_text.
Option B: When an edit is skipped due to throttling, mark _accumulated as "dirty" and guarantee a flush on the next should_edit=True cycle.
Option C: Remove or relax _throttle_calls for the stream consumer path, since the consumer already has its own edit_interval throttling mechanism.
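Options A and B can be combined at the consumer level. The class below is a minimal sketch of that idea, assuming the consumer owns the send callback. `DirtyFlushConsumer`, `on_token`, and `on_boundary` are hypothetical names, and resetting the buffer at a boundary is an assumption about how segments map to messages.

```python
from typing import Callable, List


class DirtyFlushConsumer:
    """Hypothetical consumer: dirty flag plus guaranteed flush at boundaries."""

    def __init__(self, send_full_text: Callable[[str], None],
                 edit_interval: float, clock: Callable[[], float]) -> None:
        self._send_full_text = send_full_text
        self._edit_interval = edit_interval
        self._clock = clock
        self._accumulated = ""
        self._last_edit_time = float("-inf")
        self._dirty = False  # True while _accumulated holds undelivered text

    def on_token(self, token: str) -> None:
        self._accumulated += token
        self._dirty = True
        if self._clock() - self._last_edit_time >= self._edit_interval:
            self._flush()

    def on_boundary(self) -> None:
        """Call on got_done or got_segment_break; this path is never skipped."""
        if self._dirty:
            self._flush()
        self._accumulated = ""  # assumption: the next segment is a new message

    def _flush(self) -> None:
        # Option A: always send the full text rather than a diff.
        self._send_full_text(self._accumulated)
        self._last_edit_time = self._clock()
        self._dirty = False  # Option B: cleared only after a real send


fake_time = [0.0]
delivered: List[str] = []
consumer = DirtyFlushConsumer(delivered.append, 1.0, lambda: fake_time[0])

for token in ["Let ", "me ", "take ", "a ", "deeper ", "look"]:
    consumer.on_token(token)
    fake_time[0] += 0.1

consumer.on_boundary()  # tool call arrives before the interval elapses
print(delivered)
```

The flag is cleared only inside `_flush`, so a skipped intermediate edit cannot leave the visible message behind the buffer once a boundary is reached. Any downstream throttle would still need to let this final call through for the guarantee to hold.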

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR
