hermes - ✅(Solved) Fix Gateway hygiene hard message cap counts tool rows, causing early compaction in tool-heavy Telegram sessions [1 pull requests, 1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
NousResearch/hermes-agent#15195Fetched 2026-04-25 06:23:54
View on GitHub
Comments
1
Participants
2
Timeline
6
Reactions
0
Author
Timeline (top)
labeled ×4commented ×1cross-referenced ×1

Fix Action

Fixed

PR fix notes

PR #15439: fix(gateway): count only substantive messages for hard message cap, not tool plumbing rows (#15195)

Description (problem / solution / changelog)

Summary

Fixes #15195.

The gateway hygiene hard message cap uses len(history) to count messages, which includes tool-call wrappers and tool-result rows. A single user exchange with 10 tool calls produces 21+ raw rows. With _HARD_MSG_LIMIT = 400, a session with just 20 real user exchanges that use tools heavily can produce 400+ raw rows, triggering premature compaction and losing conversation context.

Changes

  • gateway/run.py: Replaced _msg_count = len(history) with a filtered count that only includes user messages and assistant messages with actual content. Tool-call-only assistant messages (no content) and tool role messages are excluded. Added _raw_row_count for logging. Updated the log message to show both substantive message count and raw row count for debugging clarity.

Validation

Tests: tests/gateway/test_session_hygiene.py — 5 new tests in TestHardMessageCapExcludesToolRows:

  • Tool-heavy session (30 user + 30 assistant + 340 tool = 400 raw) does NOT trigger cap
  • Substantive-heavy session (200 user + 200 assistant = 400 substantive) DOES trigger
  • Tool-call-only assistant messages are not counted
  • Empty-content user messages still count
  • Raw row count preserved for logging

Tested on macOS (Python 3.11).

Changed files

  • gateway/run.py (modified, +14/-3)
  • tests/gateway/test_session_hygiene.py (modified, +112/-0)

Code Example

_msg_count = len(history)
_HARD_MSG_LIMIT = 400
_needs_compress = (
    _approx_tokens >= _compress_token_threshold
    or _msg_count >= _HARD_MSG_LIMIT
)

---

reason=message_count raw_rows=403 tokens=312077 threshold=892500
RAW_BUFFERClick to expand / collapse

Question / possible bug

Gateway session hygiene has a hard message cap:

_msg_count = len(history)
_HARD_MSG_LIMIT = 400
_needs_compress = (
    _approx_tokens >= _compress_token_threshold
    or _msg_count >= _HARD_MSG_LIMIT
)

Is it intentional that _msg_count counts raw transcript rows, including tool result rows and empty assistant tool-call wrapper rows?

In tool-heavy Telegram sessions, this can trigger hygiene compression far below the token threshold.

Example

I saw hygiene compactions like:

  • 400 messages, ~258,155 tokens
  • 409 messages, ~443,475 tokens
  • 403 messages, ~312,077 tokens

while the logged threshold was:

  • 85% of 1,050,000 = 892,500 tokens

In one real session before compaction, the transcript composition was roughly:

  • 21 user messages
  • 19 visible assistant messages
  • 188 tool rows
  • 135 empty assistant/tool-call wrapper rows
  • 364 total raw rows

So the hard cap can be approached after ~20 actual user turns in a tool-heavy session.

Why I’m asking

I understand the 400 cap was added as a safety valve for #2153 / PR #4750 to prevent API-disconnect compression death spirals.

The question is whether the hard cap should count:

  • raw transcript rows, current behavior
  • only user + visible assistant messages
  • weighted/effective messages
  • or be configurable, e.g. gateway.session_hygiene.max_messages

At minimum, it might help if the log included the trigger reason:

reason=message_count raw_rows=403 tokens=312077 threshold=892500

Currently the log only shows the token threshold, which makes it look like compression fired despite being far below threshold.

extent analysis

TL;DR

The hard message cap in gateway session hygiene may be triggering compression too aggressively in tool-heavy sessions due to counting raw transcript rows, and a more nuanced approach to counting messages may be needed.

Guidance

  • Review the current implementation of _msg_count and consider alternative methods for counting messages, such as only counting user and visible assistant messages.
  • Evaluate the effectiveness of the current _HARD_MSG_LIMIT value and consider making it configurable, e.g., via a gateway.session_hygiene.max_messages setting.
  • Enhance logging to include the trigger reason for compression, such as the example log message provided, to improve visibility into when and why compression is occurring.
  • Consider implementing a weighted or effective message count to more accurately reflect the actual message volume in tool-heavy sessions.

Example

No code snippet is provided as the issue is more related to the logic and configuration of the gateway session hygiene rather than a specific code implementation.

Notes

The current implementation may lead to premature compression in tool-heavy sessions, and a more nuanced approach to counting messages may be necessary to prevent this. The exact solution will depend on the specific requirements and constraints of the system.

Recommendation

Apply a workaround by modifying the _msg_count calculation to only include user and visible assistant messages, or make the _HARD_MSG_LIMIT value configurable to allow for more fine-grained control over compression triggering. This will help prevent premature compression in tool-heavy sessions.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

hermes - ✅(Solved) Fix Gateway hygiene hard message cap counts tool rows, causing early compaction in tool-heavy Telegram sessions [1 pull requests, 1 comments, 2 participants]