hermes: Multi-bot free-response channels enter infinite ack-loops; bot-allow guard bypassed; human STOP ignored



Multi-bot free-response channels enter infinite ack-loops; DISCORD_ALLOW_BOTS=mentions bypassed; human STOP signals ignored

Summary

A live death-loop was triggered today (2026-05-26 09:18 EDT) in a multi-bot Discord channel between two Hermes profiles (default/Ghost and syn/Syn). The incident confirms a structural safety defect with three distinct layers. The same trigger topology exists in at least 6 other channels across our deployment. We have manually hardened them by removing free-response (forcing @mention), but the underlying bug should be fixed upstream so that safe-by-default behavior protects all operators.

Three-layer defect

Layer 1 — Config-semantics defect

DISCORD_ALLOW_BOTS=mentions (env var, set on all 13 profiles in our deployment) is documented to gate bot-to-bot replies behind an explicit @mention. But discord.free_response_channels silently overrides this. In any channel listed in free_response_channels, bot messages from other Hermes profiles trigger replies WITHOUT @mention.

Expected: DISCORD_ALLOW_BOTS=mentions is the canonical guard regardless of channel config.

Actual: free_response_channels bypasses it. Result: any multi-bot channel listed in free_response_channels is a structural ack-loop hazard.
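The expected behavior can be sketched as two independent gates. This is a minimal illustration, not Hermes' actual code: `Inbound`, `should_reply`, and the `bots_require_mention` flag (standing in for DISCORD_ALLOW_BOTS=mentions) are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inbound:
    """Hypothetical view of an inbound Discord message."""
    author_is_bot: bool
    mentions_me: bool
    channel_id: int


def should_reply(msg: Inbound, free_response_channels: set[int],
                 bots_require_mention: bool) -> bool:
    """Decide whether to reply, checking the bot-source gate first."""
    # Gate 1: bot-authored messages need an explicit @mention,
    # regardless of whether the channel is a free-response channel.
    if msg.author_is_bot and bots_require_mention:
        return msg.mentions_me
    # Gate 2: in free-response channels, human messages need no @mention.
    if msg.channel_id in free_response_channels:
        return True
    # Everywhere else, an @mention is required.
    return msg.mentions_me
```

The ordering is the point of the sketch: because the bot-source check runs before the channel check, listing a channel in free_response_channels cannot relax it.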

Layer 2 — Conversation-safety defect

When two or more Hermes profiles list the same channel in free_response_channels, there is no anti-loop guard: they will continue to reply to each other indefinitely. In our incident:

  • 09:18:55 Ghost → "Locked."
  • 09:18:55 Syn → "State unchanged."
  • 09:19:00 Ghost → "Ack."
  • 09:19:09 Syn → "No change. DO"
  • 09:19:13 Ghost → "Ack."
  • 09:19:21 Syn → "State unchanged. DO"
  • 09:19:23 Ghost → "Ack."
  • ... (continued until operator intervention + force-kill)

Loops can be sustained both by ack-pattern messages (single words, terminal-decision phrases) and by semantic agreement (longer paraphrases that re-affirm prior state without adding new content). A content-only filter would catch the former but miss the latter.
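A guard keyed on who is replying to whom, rather than on message content, would cover both cases. The following is a minimal sketch of such a per-channel counter, using the N=3, M=10 defaults proposed under "Suggested fixes" in this issue. The `LoopBreaker` class and its method names are hypothetical, not part of Hermes.

```python
import time
from collections import defaultdict


class LoopBreaker:
    """Hypothetical per-channel cap on consecutive bot-triggered replies."""

    def __init__(self, max_bot_replies=3, cooldown_s=600.0, clock=time.monotonic):
        self.max_bot_replies = max_bot_replies  # N
        self.cooldown_s = cooldown_s            # M minutes, in seconds
        self._clock = clock
        self._streak = defaultdict(int)         # channel_id -> consecutive bot-triggered replies
        self._suspended_until = {}              # channel_id -> clock value when replies resume

    def allow_reply(self, channel_id, inbound_from_bot):
        """Return True if a reply to this inbound message may be sent."""
        now = self._clock()
        if now < self._suspended_until.get(channel_id, float("-inf")):
            return False  # channel is in its cooldown window
        if not inbound_from_bot:
            self._streak[channel_id] = 0  # a human turn breaks the streak
            return True
        self._streak[channel_id] += 1
        if self._streak[channel_id] >= self.max_bot_replies:
            # The Nth bot-triggered reply is still sent, then the channel is suspended.
            self._suspended_until[channel_id] = now + self.cooldown_s
            self._streak[channel_id] = 0
        return True
```

In this sketch the suspension also silences replies to human messages in that channel until the cooldown expires, which matches "suspend auto-reply for that channel" but is a design choice an implementation might relax.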

Layer 3 — Operator-interrupt defect

At 09:19:38 the human operator posted: <@bot_id> you are in a loop, stop

The bot processed this inbound message as just another conversation turn and replied anyway (a 96-character response). There is no recognized human STOP / HALT / KILL signal at the gateway layer that suspends a profile's auto-reply behavior. The only working remediation was launchctl bootout followed by pkill -9 on the host.

This means: in a runaway, the operator has no in-band escape hatch. They must reach the host. For a tool that proxies through Discord, this is a critical safety gap.

Reproduction

  1. Two Hermes profiles (A and B) running. Both have DISCORD_ALLOW_BOTS=mentions set in .env.
  2. Add channel ID C to discord.free_response_channels in both profiles' config.yaml.
  3. Restart both gateways.
  4. Operator (any third sender — user or third bot) posts a question in channel C that mentions both bots.
  5. Both reply. Both see each other's replies in free_response_channels mode → both reply again. Loop.
  6. Operator posts unmentioned text like "stop", "halt", or "you are in a loop" — bots ignore the semantic content and continue replying.

Verified on Hermes 0.14.0 (2026.5.16), Discord platform adapter.
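The reproduction steps can be modeled without Discord at all. The toy simulation below is a sketch under stated assumptions: two bots "A" and "B" share one free-response channel, bot replies never @mention anyone, and `simulate` and its parameters are hypothetical names. It counts bot replies with and without an unconditional bot-source gate.

```python
from collections import deque


def simulate(bot_gate_enforced: bool, max_messages: int = 20) -> int:
    """Count bot replies until the channel goes quiet or the cap is reached."""
    bots = ("A", "B")
    # Step 4: the operator's opening message mentions both bots.
    queue = deque([{"author": "operator", "is_bot": False, "mentions": set(bots)}])
    replies = 0
    while queue and replies < max_messages:
        msg = queue.popleft()
        for bot in bots:
            if msg["author"] == bot:
                continue  # a bot never replies to its own message
            if msg["is_bot"] and bot_gate_enforced:
                # Bot-authored messages require an explicit @mention.
                will_reply = bot in msg["mentions"]
            else:
                # The channel is in free_response_channels, so no mention is needed.
                will_reply = True
            if will_reply and replies < max_messages:
                replies += 1
                queue.append({"author": bot, "is_bot": True, "mentions": set()})
    return replies


print(simulate(bot_gate_enforced=True))   # both bots answer the operator once
print(simulate(bot_gate_enforced=False))  # replies continue until the cap
```

With the gate enforced, the exchange ends after each bot's first reply; without it, the only thing that stops the model is the artificial `max_messages` cap.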

Suggested fixes (in priority order)

  1. (Layer 1) DISCORD_ALLOW_BOTS=mentions must apply unconditionally. free_response_channels removes the @mention requirement for human messages, but should NOT remove the bot-source filter. Treat these as independent gates: user messages get a free response (if the channel is listed); bot messages still require an @mention (regardless of channel listing). Document the precedence clearly.

  2. (Layer 2) Add a per-channel anti-loop circuit breaker at the gateway. Minimum viable: after N consecutive replies in the same channel where the inbound message was from another Hermes bot, suspend auto-reply for that channel for M minutes. Suggested defaults: N=3, M=10. Because a loop can be sustained by many different content patterns, a cap on the reply topology is safer than content filtering.

  3. (Layer 3) Add a documented in-band operator HALT signal. When a user message in a bot-accessible channel matches a configurable pattern (default: ^(STOP|HALT|KILL|FREEZE)\b) and comes from a known operator user ID, suspend that profile's auto-reply across all channels for M minutes. This gives operators a real-time emergency stop that doesn't require host access.
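The HALT signal in the third fix could look roughly like the sketch below. It uses the default pattern exactly as proposed (case-sensitive). Everything else is an assumption made for illustration: the `HaltSwitch` class, the 10-minute cooldown, and the step that strips leading Discord user mentions (`<@id>` or `<@!id>`) before matching.

```python
import re
import time

# Default pattern from the proposal, used as written (case-sensitive).
HALT_PATTERN = re.compile(r"^(STOP|HALT|KILL|FREEZE)\b")
# Assumption: strip leading Discord user mentions so "<@123> STOP" still matches.
LEADING_MENTIONS = re.compile(r"^(?:<@!?\d+>\s*)+")


class HaltSwitch:
    """Hypothetical profile-wide emergency stop driven by operator messages."""

    def __init__(self, operator_ids, cooldown_s=600.0, clock=time.monotonic):
        self.operator_ids = set(operator_ids)  # known operator user IDs
        self.cooldown_s = cooldown_s           # assumed default: 10 minutes
        self._clock = clock
        self._halted_until = float("-inf")

    def observe(self, author_id, content):
        """Inspect an inbound message; return True if it triggered a halt."""
        if author_id not in self.operator_ids:
            return False
        text = LEADING_MENTIONS.sub("", content.strip())
        if HALT_PATTERN.match(text):
            self._halted_until = self._clock() + self.cooldown_s
            return True
        return False

    def replies_allowed(self):
        """Return False while the profile is halted, across all channels."""
        return self._clock() >= self._halted_until
```

One thing the sketch makes visible: the message posted during the incident ("you are in a loop, stop", after the mention) would not match the default pattern, since the keyword is neither uppercase nor at the start. Operators would need to know the exact keyword form, or the pattern would need to be loosened in configuration.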

Operator impact

Until fixes land, any operator running >1 Hermes profile with overlapping free_response_channels configuration has a latent runaway hazard. Mitigation in our deployment: removed free-response from all 6 multi-bot channels in our fleet, forcing @mention. This costs the no-mention multi-agent collaboration convenience but eliminates the structural risk.

Notes

Tested on:

  • Hermes 0.14.0 (2026.5.16)
  • 13-profile fleet
  • macOS Sequoia host, Tailscale-routed Discord traffic

Full state was captured locally; happy to share gateway.log excerpts on request.

Companion issue (related but distinct): #32790 — also describes how free_response_channels interacts surprisingly with other guards.
