hermes - ✅(Solved) Fix BUG: Anti-thrashing protection permanently disables auto-compression with no recovery [1 pull requests, 1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
NousResearch/hermes-agent#14694Fetched 2026-04-24 06:15:12
View on GitHub
Comments
0
Participants
1
Timeline
4
Reactions
0
Participants
Timeline (top)
labeled ×3cross-referenced ×1

Root Cause

In agent/context_compressor.py, the anti-thrashing counter only resets on effective compression or session reset:

# should_compress() — lines 417-427
if self._ineffective_compression_count >= 2:
    # ... warning log ...
    return False  # Permanently blocked, no recovery path

# compress() — lines 1267-1273
if savings_pct < 10:
    self._ineffective_compression_count += 1  # Only increases
else:
    self._ineffective_compression_count = 0   # Only resets on effective compression

# on_session_reset() — line 298
self._ineffective_compression_count = 0  # Only resets on /new

The gap: If two consecutive compressions are ineffective (e.g., the middle region has few messages because most are in the protected head/tail), the counter hits 2 and never decreases. Subsequent context growth is completely ignored.

Fix Action

Fix

Add a time-based auto-recovery: if enough time has passed since the last compression attempt, reset the counter. This preserves the anti-thrashing protection (preventing rapid-fire ineffective compressions) while allowing recovery when the conversation has grown significantly.

# In __init__:
self._last_compression_time: float = 0.0
self._ANTI_THRASH_RECOVERY_SECONDS: float = 300.0  # 5 minutes

# In should_compress():
if self._ineffective_compression_count >= 2:
    _elapsed = time.monotonic() - self._last_compression_time
    if _elapsed > self._ANTI_THRASH_RECOVERY_SECONDS:
        self._ineffective_compression_count = 0
        logger.info("Anti-thrashing reset: %.0fs since last compression attempt", _elapsed)
    else:
        return False

# In compress() — after updating savings:
self._last_compression_time = time.monotonic()

The 300-second (5-minute) recovery window is conservative enough to prevent thrashing while ensuring that a session isn't permanently locked out of compression.

PR fix notes

PR #14696: fix(compression): three bugs causing auto-compression to never trigger

Description (problem / solution / changelog)

Summary

Fixes three bugs in the context auto-compression system that collectively cause compression to never trigger for models with context_length at or near MINIMUM_CONTEXT_LENGTH (64000 tokens).

Bug 1: MINIMUM_CONTEXT_LENGTH floor makes threshold=100% when context_length==64000

Closes #14690

When context_length == MINIMUM_CONTEXT_LENGTH == 64000, the floor value in threshold_tokens calculation dominates:

# Before: max(44800, 64000) = 64000 = 100% of context → compression never triggers
self.threshold_tokens = max(
    int(self.context_length * threshold_percent),
    MINIMUM_CONTEXT_LENGTH,
)

Fix: Fall back to percentage-based value when floor >= context_length:

if self.threshold_tokens >= self.context_length:
    self.threshold_tokens = int(self.context_length * threshold_percent)

Applied in both __init__ and update_model.

Bug 2: Anti-thrashing protection permanently disables compression with no recovery

Closes #14694

After 2 consecutive ineffective compressions (<10% savings each), should_compress() returns False forever. No timeout, decay, or auto-recovery mechanism exists.

Fix: Add time-based auto-recovery (300 seconds). If enough time has passed since the last compression attempt, reset the counter:

if self._ineffective_compression_count >= 2:
    _elapsed = time.monotonic() - self._last_compression_time
    if _elapsed > self._ANTI_THRASH_RECOVERY_SECONDS:
        self._ineffective_compression_count = 0
    else:
        return False

Bug 3: Post-compression token estimate excludes tools schema

Closes #14695

After compression, last_prompt_tokens is set using estimate_messages_tokens_rough() which omits tools schema tokens (20-30K with 50+ tools). This causes the next compression cycle to trigger much later than the configured threshold.

Fix: Use estimate_request_tokens_rough() which includes tools schema, consistent with the preflight compression check pattern:

# Before:
_compressed_est = estimate_tokens_rough(new_system_prompt) + estimate_messages_tokens_rough(compressed)

# After:
_compressed_est = estimate_request_tokens_rough(
    compressed, system_prompt=new_system_prompt or "", tools=self.tools or None,
)

Testing

Verified with unit-level tests:

  • Bug 1: context_length=64000, threshold=0.7threshold_tokens=44800 (70%), should_compress(44800)=True
  • Bug 2: Anti-thrashing blocks within 300s window, auto-recovers after 300s elapsed
  • Bug 3: estimate_request_tokens_rough includes tools schema in token count

Files Changed

  • agent/context_compressor.py: Bug 1 fix (L320-321, L363-368) + Bug 2 fix (L299, L398-401, L418-436, L1283)
  • run_agent.py: Bug 3 fix (L7596-7607)

Changed files

  • agent/context_compressor.py (modified, +34/-8)
  • run_agent.py (modified, +8/-3)

Code Example

# should_compress() — lines 417-427
if self._ineffective_compression_count >= 2:
    # ... warning log ...
    return False  # Permanently blocked, no recovery path

# compress() — lines 1267-1273
if savings_pct < 10:
    self._ineffective_compression_count += 1  # Only increases
else:
    self._ineffective_compression_count = 0   # Only resets on effective compression

# on_session_reset() — line 298
self._ineffective_compression_count = 0  # Only resets on /new

---

# In __init__:
self._last_compression_time: float = 0.0
self._ANTI_THRASH_RECOVERY_SECONDS: float = 300.0  # 5 minutes

# In should_compress():
if self._ineffective_compression_count >= 2:
    _elapsed = time.monotonic() - self._last_compression_time
    if _elapsed > self._ANTI_THRASH_RECOVERY_SECONDS:
        self._ineffective_compression_count = 0
        logger.info("Anti-thrashing reset: %.0fs since last compression attempt", _elapsed)
    else:
        return False

# In compress() — after updating savings:
self._last_compression_time = time.monotonic()
RAW_BUFFERClick to expand / collapse

Bug Description

When the anti-thrashing protection triggers (2 consecutive compressions saving <10% each), should_compress() permanently returns False for the rest of the session. There is no timeout, decay, or auto-recovery mechanism — the only way to restore auto-compression is to manually run /new or /reset.

This means a session that had two ineffective compressions early on will never auto-compress again, even as the context grows far beyond the configured threshold, eventually hitting the model's context limit and getting forcefully degraded.

Root Cause

In agent/context_compressor.py, the anti-thrashing counter only resets on effective compression or session reset:

# should_compress() — lines 417-427
if self._ineffective_compression_count >= 2:
    # ... warning log ...
    return False  # Permanently blocked, no recovery path

# compress() — lines 1267-1273
if savings_pct < 10:
    self._ineffective_compression_count += 1  # Only increases
else:
    self._ineffective_compression_count = 0   # Only resets on effective compression

# on_session_reset() — line 298
self._ineffective_compression_count = 0  # Only resets on /new

The gap: If two consecutive compressions are ineffective (e.g., the middle region has few messages because most are in the protected head/tail), the counter hits 2 and never decreases. Subsequent context growth is completely ignored.

Trigger Scenario

  1. User has a conversation where the middle region is small (most messages are in the protected head=3 + tail=20)
  2. First compression saves only 8% → counter = 1
  3. Second compression saves only 5% → counter = 2
  4. Anti-thrashing kicks in, should_compress() returns False forever
  5. User continues the conversation, context grows to 90%+ of limit
  6. No auto-compression fires → context hits limit → forced degradation

Fix

Add a time-based auto-recovery: if enough time has passed since the last compression attempt, reset the counter. This preserves the anti-thrashing protection (preventing rapid-fire ineffective compressions) while allowing recovery when the conversation has grown significantly.

# In __init__:
self._last_compression_time: float = 0.0
self._ANTI_THRASH_RECOVERY_SECONDS: float = 300.0  # 5 minutes

# In should_compress():
if self._ineffective_compression_count >= 2:
    _elapsed = time.monotonic() - self._last_compression_time
    if _elapsed > self._ANTI_THRASH_RECOVERY_SECONDS:
        self._ineffective_compression_count = 0
        logger.info("Anti-thrashing reset: %.0fs since last compression attempt", _elapsed)
    else:
        return False

# In compress() — after updating savings:
self._last_compression_time = time.monotonic()

The 300-second (5-minute) recovery window is conservative enough to prevent thrashing while ensuring that a session isn't permanently locked out of compression.

Environment

  • Hermes Agent version: latest main (ce089169)
  • OS: Linux (ROCm)

extent analysis

TL;DR

Implement a time-based auto-recovery mechanism to reset the anti-thrashing counter after a specified time period since the last compression attempt.

Guidance

  • Introduce a new attribute self._last_compression_time to track the time of the last compression attempt.
  • Define a recovery time window (self._ANTI_THRASH_RECOVERY_SECONDS) to determine when to reset the counter.
  • Modify the should_compress() method to check the elapsed time since the last compression attempt and reset the counter if the recovery time window has passed.
  • Update the compress() method to record the current time after each compression attempt.

Example

self._last_compression_time = time.monotonic()
self._ANTI_THRASH_RECOVERY_SECONDS = 300.0  # 5 minutes

# In should_compress():
if self._ineffective_compression_count >= 2:
    _elapsed = time.monotonic() - self._last_compression_time
    if _elapsed > self._ANTI_THRASH_RECOVERY_SECONDS:
        self._ineffective_compression_count = 0
        logger.info("Anti-thrashing reset: %.0fs since last compression attempt", _elapsed)
    else:
        return False

Notes

The proposed fix assumes that the time.monotonic() function is available and suitable for measuring elapsed time. The choice of a 5-minute recovery window is conservative and may need to be adjusted based on specific use cases.

Recommendation

Apply the workaround by implementing the time-based auto-recovery mechanism, as it provides a balance between preventing thrashing and allowing recovery when the conversation has grown significantly.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

hermes - ✅(Solved) Fix BUG: Anti-thrashing protection permanently disables auto-compression with no recovery [1 pull requests, 1 participants]