hermes - ✅ (Solved) Fix [Bug]: Matrix "Hall of Mirrors": System/Bridge events trigger recursive pairing loops [1 pull request, 1 comment, 1 participant]

GitHub stats
NousResearch/hermes-agent#15763 (fetched 2026-04-26 05:25:14)
Comments: 1 · Participants: 1 · Timeline: 6 · Reactions: 0
Timeline (top): labeled ×4, commented ×1, cross-referenced ×1

Fix Action

Fixed

PR fix notes

PR #15822: fix(matrix): drop bridge / appservice ghost senders before pairing flow

Description (problem / solution / changelog)

What does this PR do?

Fixes the "hall of mirrors" loop reported in #15763. Matrix application-service ghost users (Telegram, IRC, Signal, … bridges) puppet remote users via MXIDs in their declared namespace and deliver system-level events (status notices, "interruption" relays) with valid-but-unauthorized MXIDs. The runner only short-circuited on source.user_id is None, so these events triggered the unauthorized-DM pairing flow; once an operator approved one, every relay re-emission of the gateway's outbound traffic was processed as a new user turn — an infinite agent-response loop with no workaround.

The fix layers two defenses, matching the issue author's "Integrated Defense" proposal:

  1. Prevention — recognize bridge / appservice ghost senders by MXID prefix (default @_, the namespacing convention from the Matrix appservice spec) and an operator allowlist, and drop them before the pairing flow at both the adapter (MatrixAdapter._on_room_message) and the runner (GatewayRunner._is_system_identity). The runner check is placed before authorization so an accidentally-paired bridge user is still neutralized at runtime.
  2. Robustness — the adapter keeps a bounded ring of recently-sent event IDs and drops inbound events whose ID matches one we just sent, catching relay reflections regardless of sender.

internal=True synthetic events continue to bypass both gates (regression for #6540 / cac61781).
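
The ordering of the runner-side gates matters, so a small sketch may help. The `Source` dataclass, the `route` function, and the returned action strings are hypothetical stand-ins rather than the gateway's real types, and placing the `internal` bypass first is an assumption; only the relative order of the `user_id is None` check, the system-identity check, and authorization is taken from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Source:
    """Hypothetical stand-in for the gateway's message source object."""
    platform: str
    user_id: Optional[str]
    internal: bool = False


def is_system_identity(source):
    """Dispatch by platform; only the Matrix default prefix is modeled here."""
    if source.platform == "matrix" and source.user_id:
        return source.user_id.lower().startswith("@_")
    return False


def route(source, authorized_ids):
    """Return the action a runner would take for an inbound message."""
    if source.internal:
        return "process"  # synthetic events bypass both gates
    if source.user_id is None:
        return "drop"     # the pre-existing short-circuit
    if is_system_identity(source):
        return "drop"     # checked before authorization on purpose
    if source.user_id in authorized_ids:
        return "process"
    return "pair"         # unauthorized-DM pairing flow


# A bridge ghost that was paired before the fix is still dropped.
paired = {"@alice:matrix.org", "@_telegram_bridge_:matrix.org"}
print(route(Source("matrix", "@_telegram_bridge_:matrix.org"), paired))
print(route(Source("matrix", "@bob:matrix.org"), paired))
```

Because the system-identity check runs before the authorization lookup, an entry for a bridge ghost in the pairing store has no effect at runtime.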

Related Issue

Fixes #15763

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • gateway/platforms/matrix.py — module-level is_matrix_bridge_sender / _parse_bridge_prefixes / _parse_bridge_users helpers; new MATRIX_BRIDGE_PREFIXES (default @_) and MATRIX_BRIDGE_USERS env vars documented in the module docstring; MatrixAdapter parses both at init, records every outbound event_id from send() into a bounded (maxlen=512) deque + lookup set, and drops bridge senders / outbound echoes inside _on_room_message.
  • gateway/run.py — new GatewayRunner._is_system_identity(source) dispatching by platform; _handle_message now drops system identities before _is_user_authorized, between the existing user_id is None short-circuit and the pairing branch.
  • tests/gateway/test_matrix_bridge_filter.py — 18 cases covering helper parsing, default prefix, env-var allowlists, custom prefix overrides, case-insensitive matching, the real-user pairing regression guard, pre-paired bridge user, internal=True bypass, group-room echoes, ring eviction, idempotent recording, and empty-event-id handling.
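
The bounded deque plus lookup set mentioned for `gateway/platforms/matrix.py` could look something like the sketch below. The class name `OutboundEchoRing` and its method names are hypothetical; only the `maxlen=512` default and the deque-and-set pairing are taken from the change list.

```python
from collections import deque


class OutboundEchoRing:
    """Bounded memory of recently sent event IDs (hypothetical class name)."""

    def __init__(self, maxlen=512):
        self._order = deque(maxlen=maxlen)  # insertion order, for eviction
        self._seen = set()                  # O(1) membership checks

    def record(self, event_id):
        """Remember an outbound event ID; empty IDs and repeats are ignored."""
        if not event_id or event_id in self._seen:
            return
        if len(self._order) == self._order.maxlen:
            # Evict the oldest ID from both structures before appending.
            self._seen.discard(self._order.popleft())
        self._order.append(event_id)
        self._seen.add(event_id)

    def is_echo(self, event_id):
        """True when an inbound event ID matches one that was recently sent."""
        return bool(event_id) and event_id in self._seen


ring = OutboundEchoRing(maxlen=2)
for sent in ("$a", "$b", "$b", "$c"):
    ring.record(sent)
print(ring.is_echo("$a"), ring.is_echo("$c"))
```

The deque alone would make lookups linear in the ring size; keeping a set in sync trades a little memory for constant-time checks on every inbound event.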

How to Test

  1. Run the new suite: pytest tests/gateway/test_matrix_bridge_filter.py -v → 18 passed.
  2. Regression suite for the affected area:
    pytest tests/gateway/test_matrix_bridge_filter.py \
           tests/gateway/test_unauthorized_dm_behavior.py \
           tests/gateway/test_internal_event_bypass_pairing.py \
           tests/gateway/test_pairing.py \
           tests/gateway/test_matrix.py \
           tests/gateway/test_matrix_voice.py \
           tests/gateway/test_matrix_mention.py \
           tests/gateway/test_auth_fallback.py -q
    → 242 passed.
  3. Manual reproduction of the issue scenario: with a bridge MXID (@_telegram_bridge_:matrix.org) sending a DM to the gateway, pairing_store.generate_code is not called and no pairing message is sent. Same outcome even when pairing_store.is_approved is True — covers the case where the operator already paired a bridge before this fix lands.

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass (the pre-existing failures in test_discord_free_response.py and test_whatsapp_connect.py are unrelated and reproduce on main)
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: Ubuntu (Linux 5.15), Python 3.11.15

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — added the two new env vars to the MatrixAdapter module docstring
  • I've updated cli-config.yaml.example if I added/changed config keys — N/A (env-var only, no YAML key)
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — pure stdlib (collections.deque, env vars, string ops); no platform-specific behaviour
  • I've updated tool descriptions/schemas if I changed tool behavior — N/A

Changed files

  • gateway/platforms/matrix.py (modified, +109/-0)
  • gateway/run.py (modified, +41/-0)
  • tests/gateway/test_matrix_bridge_filter.py (added, +370/-0)


Bug Description

A recursive loop occurs in Matrix DMs and rooms when a system-level or bridge-level event (e.g., "Interruptions") is delivered as a MessageEvent with a valid, but unauthorized, user_id. The gateway treats this as a new user requesting a pairing, and once the pairing is approved, the gateway's own outbound messages are relayed back as incoming messages from that authorized "system" user, triggering an infinite loop of agent responses.

Steps to Reproduce

  1. A system-level event occurs in a Matrix room/DM (e.g., an "interruption" event) that carries a user_id (not None).
  2. The hermes-gateway receives this event and, finding the user_id unauthorized, triggers a pairing request: Hi~ I don't recognize you yet! ....
  3. The user approves this "system" user via hermes pairing approve matrix <code_from_message>.
  4. The gateway is now authorized to interact with this "system" user.
  5. Every time the gateway sends a message, the Matrix adapter relays it back to the gateway as an incoming message from the now-authorized "system" user.
  6. The agent sees these "incoming" messages and responds, creating a continuous loop of messages.
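
Steps 5 and 6 can be reproduced in a toy model. Everything here is illustrative: `simulate`, the `$out` ID scheme, and the assumption that the relay re-delivers each event under its original event ID are simplifications, not the gateway's behaviour verbatim.

```python
def simulate(echo_filter, max_turns=5):
    """Count agent replies when a relay reflects every outbound message.

    Toy model: the relay re-delivers each sent event as an inbound message
    from the already-paired "system" user, and the agent replies to
    anything it accepts. max_turns caps the otherwise endless loop.
    """
    sent_ids = set()
    replies = 0
    inbox = [("$seed", "hello")]  # one genuine inbound message
    next_id = 0
    while inbox and replies < max_turns:
        event_id, _body = inbox.pop(0)
        if echo_filter and event_id in sent_ids:
            continue  # drop reflections of our own traffic
        next_id += 1
        out_id = f"$out{next_id}"
        sent_ids.add(out_id)
        replies += 1
        inbox.append((out_id, "reply"))  # the relay echoes the reply back
    return replies


print(simulate(echo_filter=False))
print(simulate(echo_filter=True))
```

Without the filter, the run stops only because of the `max_turns` cap; with it, the agent replies once to the genuine message and the reflected copy is discarded.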

Expected Behavior

The gateway should neither be triggered for pairing by system/bridge identities nor respond to "echoes" of its own outbound traffic.

Actual Behavior

The gateway is triggered for pairing by system-level events and then falls into a "hall of mirrors" loop by responding to its own echoed messages.

Affected Component

Gateway (Telegram/Discord/Slack/WhatsApp)

Messaging Platform (if gateway-related)

N/A (CLI only)

Debug Report

Report       https://paste.rs/eGnFv
agent.log    https://paste.rs/60CFe
gateway.log  https://paste.rs/1a6ef

Operating System

Linux 6.12.63+deb13-amd64 x86_64

Python Version

3.11.15

Hermes Version

0.11.0 (2026.4.23) [a9fa73a6]

Additional Logs / Traceback (optional)

Operation interrupted: waiting for model response (0.9s elapsed).
Too many pairing requests right now~ Please try again later!

Root Cause Analysis (optional)

The bug is in gateway/run.py lines 3103-3109. The check elif source.user_id is None: is insufficient because Matrix system/bridge events often carry a valid user_id. When not self._is_user_authorized(source) (line 3110) evaluates to True for these IDs, the pairing workflow is triggered. Once the session is authorized, the gateway's lack of outbound-echo filtering allows the loop to persist.

Possibly related to https://github.com/NousResearch/hermes-agent/pull/6540

Proposed Fix (optional)

A two-pronged "Integrated Defense" approach:

  1. Primary (Prevention): Expand the source.user_id is None check in gateway/run.py to also include known platform-specific system or bridge identities to prevent the initial pairing trigger.
  2. Secondary (Robustness): Implement outbound-echo filtering. The gateway should track the message_ids of recently sent outbound messages and explicitly ignore any incoming MessageEvents that are relays of these IDs.

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

extent analysis

TL;DR

The most likely fix involves modifying the gateway to prevent pairing triggers from system-level events and implementing outbound-echo filtering to break the recursive loop.

Guidance

  • Review gateway/run.py lines 3103-3109 to understand the current pairing trigger logic, and consider expanding the source.user_id is None check to include known platform-specific system or bridge identities.
  • Implement outbound-echo filtering by tracking the message_ids of recently sent outbound messages and ignoring any incoming MessageEvents that are relays of these IDs.
  • Consider referencing the proposed fix in the issue and the related pull request https://github.com/NousResearch/hermes-agent/pull/6540 for more context.
  • Before making changes, ensure you have a clear understanding of the gateway's logic and the implications of modifying the pairing trigger and echo filtering mechanisms.

Example

No code snippet is provided due to the complexity of the issue and the need for a thorough review of the gateway's logic.

Notes

The provided Root Cause Analysis and Proposed Fix sections offer valuable insights into the issue. However, it's essential to carefully evaluate these suggestions and consider potential side effects before implementing any changes.

Recommendation

Apply the proposed "Integrated Defense" approach, which includes both preventing the initial pairing trigger and implementing outbound-echo filtering, as it addresses the root cause of the issue and provides a robust solution.
