hermes - ✅ (Solved) Fix MessageDeduplicator TTL never expires entries unless the cache overflows [1 pull request, 1 comment, 2 participants]

GitHub stats
NousResearch/hermes-agent#13829 · Fetched 2026-04-23 07:48:43
Comments: 1
Participants: 2
Timeline: 4
Reactions: 0
Timeline (top): labeled ×3, commented ×1

MessageDeduplicator claims to be TTL-based, but is_duplicate() never expires old entries on lookup. A message ID remains a duplicate indefinitely until the cache exceeds max_size and pruning happens.

PR fix notes

PR #10345: fix(gateway): purge MessageDeduplicator entries by TTL on every lookup

Description (problem / solution / changelog)

Fixes #10306

Problem

MessageDeduplicator.is_duplicate() in gateway/platforms/helpers.py advertises TTL-based expiry but only purges expired entries when len(self._seen) > self._max_size. In low-traffic deployments where the cache never reaches max_size, stale entries accumulate indefinitely — a message sent again after the TTL window would still be incorrectly deduplicated.

Fix

Purge expired entries unconditionally at the start of every is_duplicate() call, before the existence check. The existing max_size guard is kept as a defensive cap.
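
The PR diff is not reproduced on this page, so the following is only a minimal sketch of the approach described above, not the merged code. The class name PurgingDeduplicator, the max_size constructor argument, the default values, the injectable clock parameter, and the oldest-entry eviction used for the defensive cap are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict


class PurgingDeduplicator:
    """Hypothetical stand-in for MessageDeduplicator; not the project's code."""

    def __init__(self, ttl_seconds: float = 300.0, max_size: int = 2000,
                 clock: Callable[[], float] = time.time) -> None:
        self._ttl = ttl_seconds
        self._max_size = max_size
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._seen: Dict[str, float] = {}

    def is_duplicate(self, msg_id: str) -> bool:
        now = self._clock()
        cutoff = now - self._ttl
        # Purge expired entries on every call, before the existence check.
        self._seen = {k: v for k, v in self._seen.items() if v > cutoff}
        if msg_id in self._seen:
            return True
        self._seen[msg_id] = now
        # Defensive cap (assumed policy): evict the oldest entries if still too large.
        while len(self._seen) > self._max_size:
            oldest = min(self._seen, key=self._seen.get)
            del self._seen[oldest]
        return False


fake_now = [0.0]
dedup = PurgingDeduplicator(ttl_seconds=10, clock=lambda: fake_now[0])
first = dedup.is_duplicate("x")    # False: first sighting
fake_now[0] = 11.0                 # advance past the TTL
second = dedup.is_duplicate("x")   # False: the stale entry was purged
```

Rebuilding the dictionary on every lookup costs time proportional to the cache size. That is cheap for the low-traffic deployments described here; a per-entry timestamp check on lookup would avoid the full scan at the price of leaving unrelated stale entries in memory until overflow.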

Tests

  • Added/extended tests/gateway/test_message_deduplicator.py with a regression test that verifies TTL expiry without cache overflow
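
The contents of the test file are not shown here. As a rough idea of what such a regression test could look like, the sketch below patches time.time instead of sleeping. FixedDeduplicator is a hypothetical stand-in so the example runs on its own; a real test would presumably import MessageDeduplicator from gateway.platforms.helpers instead.

```python
import time
from unittest import mock


class FixedDeduplicator:
    """Hypothetical stand-in with the fixed behavior, so the test is self-contained."""

    def __init__(self, ttl_seconds=300.0, max_size=2000):
        self._ttl = ttl_seconds
        self._max_size = max_size
        self._seen = {}

    def is_duplicate(self, msg_id):
        now = time.time()
        cutoff = now - self._ttl
        # Drop everything older than the TTL before checking for the ID.
        self._seen = {k: v for k, v in self._seen.items() if v > cutoff}
        if msg_id in self._seen:
            return True
        self._seen[msg_id] = now
        return False


def test_ttl_expiry_without_overflow():
    # max_size is far larger than the entry count, so overflow pruning never runs.
    with mock.patch("time.time") as fake_time:
        fake_time.return_value = 1000.0
        dedup = FixedDeduplicator(ttl_seconds=60, max_size=2000)
        assert dedup.is_duplicate("x") is False   # first sighting
        assert dedup.is_duplicate("x") is True    # still inside the TTL window

        fake_time.return_value = 1061.0           # 61 s later, past the TTL
        assert dedup.is_duplicate("x") is False   # expired entry no longer counts
        assert dedup.is_duplicate("x") is True    # timestamp was refreshed
```

Controlling the clock keeps the test fast and avoids the flakiness of a real time.sleep() call.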

Changed files

  • gateway/platforms/helpers.py (modified, +18/-10)
  • tests/gateway/test_message_deduplicator.py (modified, +41/-83)

Summary

MessageDeduplicator claims to be TTL-based, but is_duplicate() never expires old entries on lookup. A message ID remains a duplicate indefinitely until the cache exceeds max_size and pruning happens.

Affected code

  • gateway/platforms/helpers.py:25-57

Why this is a bug

The docstring and method contract say the cache is TTL-based and should only treat IDs as duplicates within the TTL window.

Current logic:

now = time.time()
if msg_id in self._seen:
    return True
self._seen[msg_id] = now
if len(self._seen) > self._max_size:
    cutoff = now - self._ttl
    self._seen = {k: v for k, v in self._seen.items() if v > cutoff}

This means TTL is only applied during overflow cleanup. On normal/low-volume adapters, a seen message can stay marked as duplicate forever.
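
To make the overflow dependency concrete, here is a small self-contained sketch that copies the lookup logic quoted above into a stand-in class. The class name OverflowOnlyDeduplicator, the constructor, the injectable clock, and the trailing return False are assumptions; only the body of the lookup comes from the issue.

```python
class OverflowOnlyDeduplicator:
    """Stand-in that copies the quoted lookup logic; the constructor is assumed."""

    def __init__(self, ttl_seconds, max_size, clock):
        self._ttl = ttl_seconds
        self._max_size = max_size
        self._clock = clock
        self._seen = {}

    def is_duplicate(self, msg_id):
        now = self._clock()
        if msg_id in self._seen:
            return True  # no TTL check on this path
        self._seen[msg_id] = now
        if len(self._seen) > self._max_size:
            cutoff = now - self._ttl
            self._seen = {k: v for k, v in self._seen.items() if v > cutoff}
        return False


now = [0.0]
dedup = OverflowOnlyDeduplicator(ttl_seconds=10, max_size=2, clock=lambda: now[0])
dedup.is_duplicate("a")
now[0] = 100.0                            # far past the 10 s TTL
stale_hit = dedup.is_duplicate("a")       # True: no overflow, so "a" was never pruned

dedup.is_duplicate("b")                   # 2 entries, still not over max_size
dedup.is_duplicate("c")                   # 3 entries > max_size: pruning finally runs
after_overflow = dedup.is_duplicate("a")  # False: "a" was dropped during the prune
```

The stale ID is only forgotten once unrelated traffic pushes the cache past max_size, which is exactly the dependency the issue describes.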

Minimal reproduction

import time
from gateway.platforms.helpers import MessageDeduplicator

d = MessageDeduplicator(ttl_seconds=0.1)
print(d.is_duplicate('x'))
time.sleep(0.2)
print(d.is_duplicate('x'))

Observed output:

False
True

Expected behavior

After waiting longer than the TTL, the second call should return False and refresh the timestamp.

Actual behavior

The second call still returns True because the stale entry is never revalidated.

Suggested investigation

  • On lookup, compare the stored timestamp against now - ttl before returning True.
  • Add a small regression test that exercises expiry without requiring the cache to overflow.

Extent analysis

TL;DR

Update the is_duplicate() method to check the TTL on each lookup, not just during cache overflow.

Guidance

  • Modify the is_duplicate() method to compare the stored timestamp against the current time minus the TTL before returning True.
  • When an entry is found but has outlived the TTL, treat the ID as new: return False and store the current time as its timestamp.
  • Create a regression test to verify the expiry behavior without relying on cache overflow.
  • Review the MessageDeduplicator class to ensure it correctly handles edge cases, such as a TTL of 0 or a negative value.
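
The issue does not say how the project wants to treat a zero or negative TTL. One possible policy, sketched below, is to reject such values when the deduplicator is constructed. The helper name validate_ttl is hypothetical and does not exist in the codebase.

```python
import math


def validate_ttl(ttl_seconds: float) -> float:
    """Hypothetical helper: reject TTLs that would make deduplication meaningless."""
    if isinstance(ttl_seconds, bool) or not isinstance(ttl_seconds, (int, float)):
        raise TypeError("ttl_seconds must be a number")
    if math.isnan(ttl_seconds) or ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be a positive number")
    return float(ttl_seconds)
```

Without such a guard, the outcome depends on the comparison used. With a strict `v > cutoff` purge and a TTL of 0, an entry stamped at the current time is already at the cutoff, so no ID would ever be reported as a duplicate.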

Example

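def is_duplicate(self, msg_id):
    now = time.time()
    seen_at = self._seen.get(msg_id)
    # Duplicate only while the stored timestamp is still inside the TTL window.
    if seen_at is not None and now - seen_at <= self._ttl:
        return True
    # First sighting or expired entry: record (or refresh) the timestamp.
    self._seen[msg_id] = now
    # ... (overflow pruning by max_size remains the same)
    return False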

Notes

The snippet covers only is_duplicate(). The overflow pruning by max_size and the rest of the MessageDeduplicator class are unchanged and should be reviewed separately.

Recommendation

Apply the suggested fix: have is_duplicate() check the TTL on every lookup. This directly addresses the reported issue and makes the cache behave as documented.
