hermes - ✅ (Solved) Fix [Bug]: Matrix Gateway: Race condition between auto-redaction and message delivery with high-speed models [1 pull request, 1 participant]

GitHub stats
NousResearch/hermes-agent#19075 (fetched 2026-05-04 05:18:15)
Comments: 0
Participants: 1
Timeline: 5
Reactions: 0
Timeline (top): labeled ×4, cross-referenced ×1

Error Message

The gateway logs confirm successful delivery (a sent event followed by a redaction), but the user interface and logs report a delivery error: "response truncated due to output length limit". The reporter expected a non-error response to every message regardless of length, and saw this error even when sending a simple "test" message.

Root Cause

Root Cause Analysis (optional)

Fix Action

Fixed

PR fix notes

PR #19223: fix(matrix): defer reaction cleanup redactions

Description (problem / solution / changelog)

Summary

Fixes #19075.

Matrix reaction cleanup currently runs inline at processing completion. On fast model turns, that can place reaction redactions immediately alongside the final message delivery path, matching the issue report where disabling redaction avoids the false delivery/truncation error.

This PR:

  • defers Matrix reaction redactions by a short configurable delay (HERMES_MATRIX_REACTION_REDACTION_DELAY_SECONDS, default 5.0)
  • applies the delay to processing-status reaction cleanup and approval seed reaction cleanup
  • tracks delayed cleanup tasks so disconnect cancels them cleanly
  • keeps terminal success/failure reactions immediate while only delaying redaction cleanup
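
The deferral and task tracking described in the bullets above could look roughly like the following. This is a minimal sketch, not the code from PR #19223: only the environment variable name and its 5.0 default come from the PR notes, while `redaction_delay_seconds`, `DeferredRedactor`, and the `redact` callable are hypothetical names introduced for illustration.

```python
import asyncio
import math
import os

# Variable name and 5.0 default are taken from the PR notes.
# Everything else in this sketch is a hypothetical illustration.
DELAY_ENV = "HERMES_MATRIX_REACTION_REDACTION_DELAY_SECONDS"


def redaction_delay_seconds(default: float = 5.0) -> float:
    """Read the delay from the environment, falling back on unusable values."""
    raw = os.environ.get(DELAY_ENV)
    if raw is None:
        return default
    try:
        value = float(raw)
    except ValueError:
        return default
    # Reject NaN, infinity, and negative delays.
    return value if math.isfinite(value) and value >= 0 else default


class DeferredRedactor:
    """Hypothetical helper: runs redactions after a delay and tracks the tasks."""

    def __init__(self, redact, delay: float):
        self._redact = redact  # assumed async callable: (room_id, event_id) -> None
        self._delay = delay
        self._tasks = set()

    def schedule(self, room_id: str, event_id: str) -> asyncio.Task:
        """Queue one redaction; must be called from a running event loop."""
        task = asyncio.create_task(self._run(room_id, event_id))
        self._tasks.add(task)
        # Drop the reference once the task finishes or is cancelled.
        task.add_done_callback(self._tasks.discard)
        return task

    async def _run(self, room_id: str, event_id: str) -> None:
        await asyncio.sleep(self._delay)
        await self._redact(room_id, event_id)

    async def cancel_all(self) -> None:
        """Cancel pending redactions, e.g. when the adapter disconnects."""
        tasks = list(self._tasks)
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
        self._tasks.clear()
```

Keeping strong references to the tasks in a set serves two purposes in this sketch: it prevents pending tasks from being garbage-collected, and it gives the disconnect path a single place to cancel them.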

Verification

  • scripts/run_tests.sh tests/gateway/test_matrix.py::TestMatrixReactions
    • 9 passed

Additional check:

  • scripts/run_tests.sh tests/gateway/test_matrix.py
    • changed behavior passed, but the file still fails locally at TestMatrixProxyConfig::test_no_proxy_by_default because this macOS environment auto-detects system proxy http://127.0.0.1:10808; that failure is unrelated to this PR and occurs outside the modified reaction tests.

Changed files

  • gateway/platforms/matrix.py (modified, +57/-7)
  • tests/gateway/test_matrix.py (modified, +43/-2)

---

Bug Description

NOTE: This is Matrix gateway related but your bug report form does not have a Matrix gateway option.

When using low-latency LLM models (I was using gemini-3.1-flash-lite-preview), the Matrix gateway would only respond with "response truncated due to output length limit," even when the actual content was well within the 4,000-character limit. I was using the Element X client on Android. This was really frustrating, as I could not even self-diagnose from my phone while using Matrix and had to wait until I was back at my laptop to get access to hermes chat.

Steps to Reproduce

  1. Use a high-speed model (gemini-3.1-flash-lite-preview).
  2. Observe that the bot sends a message followed immediately by a redaction of system reactions.
  3. The gateway logs confirm successful delivery (sent event) followed by redacted, but the user interface/logs report a delivery error.
  4. Comment out the redaction, then try again. Message comes through fine.

Expected Behavior

I expected to receive a non-error response to all my messages regardless of length. I even got this error when sending a simple "test" message.

Actual Behavior

"response truncated due to output length limit"

Affected Component

Gateway (Telegram/Discord/Slack/WhatsApp), Other

Messaging Platform (if gateway-related)

No response

Debug Report

I am not comfortable outputting all of the content in there as some of those logs included PII from a brief scan.

I am a human writing this, this report is not clanker slop reporting.

Operating System

Ubuntu 24.04.4 LTS (Noble Numbat)

Python Version

3.12.3

Hermes Version

v2026.4.30-161-gf98b5d00a

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

This seems like it may be an architectural race condition in gateway/platforms/matrix.py. The MatrixAdapter performs automated cleanup (redacting 👀 processing reactions and bot-seeded approval ✅ ❌ buttons) in tight succession with the actual message delivery.

With high-speed models, the redaction request is reaching the Matrix homeserver before or immediately alongside the message delivery confirmation. This triggers a false-positive in the gateway's monitoring/tracking logic, where it interprets the sudden "event missing" state (due to redaction) as a failure or truncation of the primary message delivery.
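
The ordering described above can be illustrated with a toy asyncio model. This is a simplified simulation under stated assumptions, not the gateway's code: `deliver`, `redact`, and `run_turn` are hypothetical names, and `asyncio.sleep` stands in for the homeserver round trip. It only shows how an undelayed redaction can be issued before the delivery confirmation arrives; it does not reproduce the gateway's tracking logic.

```python
import asyncio


async def deliver(log: list, latency: float) -> None:
    """Stand-in for sending the final message and awaiting confirmation."""
    log.append("send:start")
    await asyncio.sleep(latency)  # simulated homeserver round trip
    log.append("send:confirmed")


async def redact(log: list, delay: float) -> None:
    """Stand-in for reaction cleanup, optionally deferred by `delay` seconds."""
    await asyncio.sleep(delay)
    log.append("redact")


async def run_turn(redaction_delay: float, send_latency: float = 0.05) -> list:
    """Run delivery and cleanup concurrently and return the observed order."""
    log = []
    await asyncio.gather(
        deliver(log, send_latency),
        redact(log, redaction_delay),
    )
    return log


if __name__ == "__main__":
    # Inline cleanup: the redaction lands before the send is confirmed.
    print(asyncio.run(run_turn(redaction_delay=0.0)))
    # Deferred cleanup: the send is confirmed first.
    print(asyncio.run(run_turn(redaction_delay=0.2)))
```

In this model, a zero delay yields `send:start`, `redact`, `send:confirmed`, while a delay longer than the send latency moves `redact` to the end.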

Proposed Fix (optional)

Decouple redaction logic from the immediate message delivery/processing loop. Implementing a mandatory delay of, say, 5-10 seconds before performing auto-redaction cleanup might allow the message delivery status to stabilize, preventing the race condition and the subsequent false-positive error reporting.

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

extent analysis

TL;DR

Decoupling the redaction logic from the immediate message delivery loop by introducing a delay may resolve the "response truncated due to output length limit" error.

Guidance

  • Investigate the gateway/platforms/matrix.py file to understand the current implementation of the MatrixAdapter and its automated cleanup process.
  • Consider implementing a delay, as suggested, of 5-10 seconds before performing auto-redaction cleanup to prevent the race condition.
  • Verify the proposed fix by testing with high-speed models and checking for the error message.
  • Review the gateway's monitoring/tracking logic to ensure it correctly handles the "event missing" state after implementing the delay.

Example

No code snippet is provided due to the lack of specific implementation details in the issue.

Notes

The proposed fix assumes that the issue is indeed caused by a race condition between the message delivery and redaction processes. Further investigation may be necessary to confirm this.

Recommendation

Apply the proposed workaround by introducing a delay in the redaction logic, as this seems to be a plausible solution to the identified race condition.
