hermes - ✅(Solved) Fix [Bug]: Discord adapter creates zombie websocket connection on reconnect, causing double responses [3 pull requests, 1 participants]

GitHub stats
NousResearch/hermes-agent#18187 · Fetched 2026-05-02 05:50:00
Comments: 0 · Participants: 1 · Timeline: 8 · Reactions: 0
Timeline (top): labeled ×4, cross-referenced ×3, referenced ×1

When the gateway restarts (or the Discord adapter reconnects), DiscordAdapter.connect() creates a new commands.Bot client but never closes the old one. Discord doesn't immediately terminate the old websocket, leaving two live connections for a window of time. Both connections receive every incoming message, resulting in two separate agent turns being spawned — each generating a different response.

Root Cause

In gateway/platforms/discord.py, connect() unconditionally creates a new commands.Bot instance:

self._client = commands.Bot(
    command_prefix="!",
    intents=intents,
    ...
)

When connect() is called a second time (e.g. during reconnect in run.py line ~2848), the old self._client is orphaned — still connected to Discord's gateway — while the new client also connects. Both are alive simultaneously and both fire on_message for every event.

The MessageDeduplicator (per-adapter instance) cannot prevent duplicates because both websockets deliver the event independently, and the two on_message coroutines may check is_duplicate before either has marked the ID as seen (race condition).
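The check-then-mark race described above can be reproduced in isolation. The sketch below is a toy, not the project's code: ToyDeduplicator, mark_seen, and on_message are hypothetical stand-ins, and it assumes the real handler yields to the event loop somewhere between the duplicate check and the point where the ID is recorded.

```python
import asyncio


class ToyDeduplicator:
    """Hypothetical stand-in; not the real MessageDeduplicator."""

    def __init__(self):
        self._seen = set()

    def is_duplicate(self, message_id):
        return message_id in self._seen

    def mark_seen(self, message_id):
        self._seen.add(message_id)


async def on_message(dedup, message_id, handled):
    if dedup.is_duplicate(message_id):
        return
    # Any await between the check and the mark lets the other delivery run.
    await asyncio.sleep(0)
    dedup.mark_seen(message_id)
    handled.append(message_id)


async def deliver_twice():
    # Two websockets delivering the same event is modelled as two
    # concurrent on_message calls with the same message ID.
    dedup = ToyDeduplicator()
    handled = []
    await asyncio.gather(
        on_message(dedup, 42, handled),
        on_message(dedup, 42, handled),
    )
    return handled


handled = asyncio.run(deliver_twice())
print(handled)  # both deliveries pass the check: [42, 42]
```

Marking the ID in the same step as the check (with no await in between) would close this particular window, but it would still leave two live connections, which is why the issue targets connect() instead.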

Fix Action

Fix

Before creating the new Bot instance in connect(), close and await the old client if one exists:

# Add before: self._client = commands.Bot(...)
if self._client and not self._client.is_closed():
    await self._client.close()
    self._client = None
self._ready_event.clear()

This ensures only one Discord websocket connection is ever active for the adapter at any time.
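To see the effect of the guard without a Discord connection, here is a minimal sketch. FakeBot and ToyAdapter are hypothetical stand-ins for commands.Bot and DiscordAdapter; only the guard itself is taken from the issue, and the class-level counter is just an illustration of "live websockets".

```python
import asyncio


class FakeBot:
    """Hypothetical stand-in for commands.Bot; counts open connections."""

    open_connections = 0

    def __init__(self):
        self._closed = False
        FakeBot.open_connections += 1

    def is_closed(self):
        return self._closed

    async def close(self):
        if not self._closed:
            self._closed = True
            FakeBot.open_connections -= 1


class ToyAdapter:
    """Hypothetical adapter holding one client and a ready flag."""

    def __init__(self):
        self._client = None
        self._ready_event = asyncio.Event()

    async def connect(self):
        # Guard from the issue: close the previous client first.
        if self._client and not self._client.is_closed():
            await self._client.close()
            self._client = None
        self._ready_event.clear()
        self._client = FakeBot()
        self._ready_event.set()


async def reconnect_twice():
    adapter = ToyAdapter()
    await adapter.connect()
    await adapter.connect()  # simulated reconnect
    return FakeBot.open_connections


open_after_reconnect = asyncio.run(reconnect_twice())
print(open_after_reconnect)  # 1
```

Without the guard, the same two connect() calls would leave the counter at 2, which corresponds to the zombie connection in the report.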

Workaround

Avoid gateway restarts. The zombie connection eventually times out on its own (~minutes), after which responses return to normal until the next restart.

PR fix notes

PR #18224: fix(discord): close old client before reconnect to prevent zombie websockets

Description (problem / solution / changelog)

Summary

Fixes #18187 — Discord adapter creates zombie websocket connection on reconnect, causing double responses.

Problem

When DiscordAdapter.connect() is called during a reconnect cycle (e.g. gateway restart), it unconditionally creates a new commands.Bot client without closing the previous one. The old client's websocket remains connected to Discord's gateway, so both clients fire on_message for every incoming event — resulting in double responses with different wording.

The existing MessageDeduplicator cannot prevent this because the two websockets deliver events independently, and the two on_message coroutines race on the dedup check.

Fix

Before creating a new Bot instance in connect(), check if a previous client exists and close it:

if self._client is not None:
    try:
        if not self._client.is_closed():
            await self._client.close()
    except Exception:
        logger.debug("[%s] Failed to close previous Discord client", self.name)
    finally:
        self._client = None
        self._ready_event.clear()

This ensures only one Discord websocket connection is ever active for the adapter.
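The try/except/finally shape matters when close() itself fails. The following sketch exercises that path under stated assumptions: FlakyClient and ToyAdapter are hypothetical, the guard is moved into a helper method purely for illustration, and the logger name is invented.

```python
import asyncio
import logging

logger = logging.getLogger("toy.discord")  # hypothetical logger name


class FlakyClient:
    """Hypothetical client whose close() always fails."""

    def is_closed(self):
        return False

    async def close(self):
        raise RuntimeError("socket already gone")


class ToyAdapter:
    name = "discord"

    def __init__(self, client):
        self._client = client
        self._ready_event = asyncio.Event()
        self._ready_event.set()

    async def close_previous_client(self):
        # Same structure as the PR snippet, wrapped in a helper here.
        if self._client is not None:
            try:
                if not self._client.is_closed():
                    await self._client.close()
            except Exception:
                logger.debug("[%s] Failed to close previous Discord client", self.name)
            finally:
                self._client = None
                self._ready_event.clear()


async def run():
    adapter = ToyAdapter(FlakyClient())
    await adapter.close_previous_client()
    return adapter._client, adapter._ready_event.is_set()


client_after, ready_after = asyncio.run(run())
print(client_after, ready_after)  # None False
```

The state is reset even though close() raised, so a failed teardown does not block the reconnect. The trade-off is that the failure is only visible at debug log level.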

Scope

  • 1 file changed: gateway/platforms/discord.py (+15 lines)
  • All 375 Discord-related tests pass

Testing

python -m pytest tests/tools/test_discord_tool.py tests/gateway/ -k discord
# 375 passed, 1 skipped

Changed files

  • gateway/platforms/discord.py (modified, +15/-0)

PR #18297: fix(discord): close stale client before reconnect to prevent zombie websocket (#18187)

Description (problem / solution / changelog)

Problem

On gateway restart or adapter reconnect, connect() created a new commands.Bot instance without closing the previous one. Discord's gateway does not immediately terminate an orphaned websocket — both the old and new client remain live for a window of time. During that window, every inbound event is delivered to both connections independently.

MessageDeduplicator cannot prevent this because the two on_message coroutines may check is_duplicate before either has marked the ID as seen (race condition). The result: two separate agent turns are spawned per message, each producing a different response.

With auto_thread: true this is especially visible: one response lands in the thread (correct path) and a second response lands in the parent channel (incorrect path).

Fix

Before instantiating a new commands.Bot, await the old client's close() if it is not already closed, then clear _ready_event:

if self._client is not None and not self._client.is_closed():
    await self._client.close()
self._client = None
self._ready_event.clear()

This mirrors the guard already present in disconnect() (~line 779) and ensures only one Discord websocket is ever active for the adapter at any time.

Testing

  • Gateway restart no longer produces double responses in any channel configuration
  • auto_thread mode: only one response appears in the thread, no response leaks to the parent channel
  • Single connect() call on first start: no change in behavior (self._client is None, guard is a no-op)
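The checks listed above are described as manual observations. A regression test for the guard could look roughly like the sketch below. It assumes the guard is lifted into a standalone helper (close_stale_client is a hypothetical name, as are run_guard and the test functions) and uses unittest.mock instead of a real Discord client.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock


async def close_stale_client(adapter):
    """Guard from the PR, lifted into a hypothetical helper for testing."""
    if adapter._client is not None and not adapter._client.is_closed():
        await adapter._client.close()
    adapter._client = None
    adapter._ready_event.clear()


async def run_guard(client):
    # Build a minimal adapter-like object inside the running loop.
    adapter = SimpleNamespace(_client=client, _ready_event=asyncio.Event())
    adapter._ready_event.set()
    await close_stale_client(adapter)
    return adapter


def test_open_client_is_closed():
    client = MagicMock()
    client.is_closed.return_value = False
    client.close = AsyncMock()
    adapter = asyncio.run(run_guard(client))
    client.close.assert_awaited_once()
    assert adapter._client is None
    assert not adapter._ready_event.is_set()


def test_first_connect_is_noop():
    # No previous client: nothing to close, state is simply reset.
    adapter = asyncio.run(run_guard(None))
    assert adapter._client is None
    assert not adapter._ready_event.is_set()


test_open_client_is_closed()
test_first_connect_is_noop()
```

Under pytest the two explicit calls at the bottom would be unnecessary; they are included so the sketch also runs as a plain script.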

Closes #18187

Changed files

  • gateway/platforms/discord.py (modified, +14/-0)

PR #17246: fix: resolve 7 identified issues [automated]

Description (problem / solution / changelog)

Summary

This automated PR resolves 7 identified open issues with focus on bugs, cross-platform reliability, and operational hardening.

Fixed issues

  1. #18594 — get_hermes_home() now fails fast in profile-scoped subprocesses when HERMES_HOME is missing (prevents silent cross-profile writes).
  2. #18588 — context compression now retries on the main model when summary_model_override is unset and summary model path fails.
  3. #18586 — delegate_task now passes target_model into runtime provider resolution, fixing wrong api_mode/base_url for providers like opencode-go.
  4. #18187 — Discord adapter now closes any existing bot client before reconnecting, preventing duplicate websocket consumers and double responses.
  5. #18221 — QQBot _open_ws() now bounds stale websocket/session close operations with timeouts to avoid reconnect hangs.
  6. #18437 — Weixin direct send now avoids reusing live adapters across event loops; falls back safely to one-shot adapter/session path.
  7. #18485 — Slack channel directory now warns once per team and downgrades repeated failures to debug, reducing recurring gateway log noise.

Files changed

  • hermes_constants.py
  • agent/context_compressor.py
  • tools/delegate_tool.py
  • gateway/platforms/discord.py
  • gateway/platforms/qqbot/adapter.py
  • gateway/platforms/weixin.py
  • gateway/channel_directory.py

Notes

  • Commits were created with descriptive English messages.
  • Push was performed at the end after all fixes were committed.

Changed files

  • Dockerfile (modified, +2/-1)
  • acp_adapter/session.py (modified, +12/-0)
  • agent/auxiliary_client.py (modified, +280/-28)
  • agent/context_compressor.py (modified, +496/-52)
  • agent/title_generator.py (modified, +2/-2)
  • agent/transports/chat_completions.py (modified, +14/-0)
  • agent/usage_pricing.py (modified, +4/-0)
  • cli-config.yaml.example (modified, +5/-0)
  • cli.py (modified, +27/-3)
  • cron/scheduler.py (modified, +8/-2)
  • docker/entrypoint.sh (modified, +5/-1)
  • gateway/channel_directory.py (modified, +14/-4)
  • gateway/platforms/discord.py (modified, +33/-7)
  • gateway/platforms/email.py (modified, +12/-2)
  • gateway/platforms/feishu.py (modified, +34/-1)
  • gateway/platforms/qqbot/adapter.py (modified, +8/-2)
  • gateway/platforms/telegram_network.py (modified, +7/-2)
  • gateway/platforms/weixin.py (modified, +10/-1)
  • gateway/run.py (modified, +99/-32)
  • gateway/status.py (modified, +8/-1)
  • hermes_cli/auth.py (modified, +1/-1)
  • hermes_cli/commands.py (modified, +1/-1)
  • hermes_cli/config.py (modified, +271/-40)
  • hermes_cli/copilot_auth.py (modified, +1/-1)
  • hermes_cli/gateway.py (modified, +16/-13)
  • hermes_cli/main.py (modified, +69/-3)
  • hermes_cli/memory_setup.py (modified, +1/-1)
  • hermes_cli/model_switch.py (modified, +6/-1)
  • hermes_cli/models.py (modified, +59/-1)
  • hermes_cli/profiles.py (modified, +16/-3)
  • hermes_cli/runtime_provider.py (modified, +16/-13)
  • hermes_cli/setup.py (modified, +8/-2)
  • hermes_cli/slack_cli.py (modified, +1/-2)
  • hermes_cli/status.py (modified, +17/-2)
  • hermes_cli/web_server.py (modified, +1/-1)
  • hermes_constants.py (modified, +16/-3)
  • model_tools.py (modified, +44/-13)
  • run_agent.py (modified, +389/-82)
  • setup-hermes.sh (modified, +23/-12)
  • skills/red-teaming/godmode/scripts/load_godmode.py (modified, +9/-8)
  • tests/agent/test_context_compressor.py (modified, +389/-0)
  • tests/gateway/test_compress_command.py (modified, +49/-0)
  • tests/run_agent/test_413_compression.py (modified, +81/-1)
  • tests/run_agent/test_compression_boundary_hook.py (modified, +42/-0)
  • tests/run_agent/test_run_agent.py (modified, +100/-13)
  • tests/tools/test_skill_manager_tool.py (modified, +270/-0)
  • tools/approval.py (modified, +1/-1)
  • tools/delegate_tool.py (modified, +4/-1)
  • tools/environments/docker.py (modified, +36/-5)
  • tools/environments/local.py (modified, +7/-1)
  • tools/file_operations.py (modified, +70/-67)
  • tools/file_tools.py (modified, +4/-1)
  • tools/send_message_tool.py (modified, +66/-2)
  • tools/session_search_tool.py (modified, +2/-2)
  • tools/skill_manager_tool.py (modified, +82/-21)
  • tools/skills_tool.py (modified, +13/-1)
  • tools/terminal_tool.py (modified, +6/-0)
  • tools/tool_backend_helpers.py (modified, +15/-5)
  • tools/tts_tool.py (modified, +27/-16)
  • tools/voice_mode.py (modified, +23/-10)
  • tui_gateway/server.py (modified, +5/-3)
  • ui-tui/src/app/turnController.ts (modified, +1/-1)
  • ui-tui/src/app/useInputHandlers.ts (modified, +8/-3)
  • ui-tui/src/app/useSessionLifecycle.ts (modified, +1/-1)
  • ui-tui/src/gatewayTypes.ts (modified, +1/-0)
  • utils.py (modified, +9/-0)
  • uv.lock (modified, +161/-2)


Description

When the gateway restarts (or the Discord adapter reconnects), DiscordAdapter.connect() creates a new commands.Bot client but never closes the old one. Discord doesn't immediately terminate the old websocket, leaving two live connections for a window of time. Both connections receive every incoming message, resulting in two separate agent turns being spawned — each generating a different response.

Symptoms

  • Every Discord message triggers two responses with different wording (not a duplicate of the same response)
  • When auto_thread is enabled: one response appears in the auto-created thread (correct), a second response appears directly in the parent channel (incorrect)
  • Gateway log shows the same message arriving twice ~400ms apart:
    inbound message: platform=discord user=X chat=Y msg='hello'
    inbound message: platform=discord user=X chat=Y msg='hello'   ← ~400ms later
  • Only one gateway process is running (ps aux confirms)
  • MessageDeduplicator exists and is correctly placed, but fails due to the race condition between two concurrent websocket deliveries

Root Cause

In gateway/platforms/discord.py, connect() unconditionally creates a new commands.Bot instance:

self._client = commands.Bot(
    command_prefix="!",
    intents=intents,
    ...
)

When connect() is called a second time (e.g. during reconnect in run.py line ~2848), the old self._client is orphaned — still connected to Discord's gateway — while the new client also connects. Both are alive simultaneously and both fire on_message for every event.

The MessageDeduplicator (per-adapter instance) cannot prevent duplicates because both websockets deliver the event independently, and the two on_message coroutines may check is_duplicate before either has marked the ID as seen (race condition).

Fix

Before creating the new Bot instance in connect(), close and await the old client if one exists:

# Add before: self._client = commands.Bot(...)
if self._client and not self._client.is_closed():
    await self._client.close()
    self._client = None
self._ready_event.clear()

This ensures only one Discord websocket connection is ever active for the adapter at any time.

Environment

  • Hermes gateway running in Docker container
  • Discord platform adapter
  • auto_thread: true in config (makes the symptom very visible — thread response + channel response)
  • Triggered by any gateway restart or reconnect cycle

Workaround

Avoid gateway restarts. The zombie connection eventually times out on its own (~minutes), after which responses return to normal until the next restart.

extent analysis

TL;DR

Close the old Discord client instance before creating a new one in the connect() method to prevent duplicate responses.

Guidance

  • Verify that the MessageDeduplicator is correctly placed and functioning as expected to prevent duplicates in normal operation.
  • Check the gateway logs to confirm that the duplicate messages are arriving ~400ms apart, indicating the presence of two live connections.
  • Implement the proposed fix in gateway/platforms/discord.py to ensure only one Discord websocket connection is active at any time.
  • Test the fix by restarting the gateway or triggering a reconnect cycle to verify that duplicate responses are no longer generated.
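The log check in the guidance can be partly automated. The sketch below flags back-to-back identical "inbound message" lines. It assumes the log line format quoted in the issue and does not check the ~400ms gap, because the quoted lines carry no timestamps; find_repeated_inbound is a hypothetical helper, not part of the gateway.

```python
def find_repeated_inbound(lines):
    """Return inbound-message lines that repeat the previous inbound line."""
    repeats = []
    previous = None
    for line in lines:
        text = line.strip()
        if not text.startswith("inbound message:"):
            continue  # ignore unrelated log lines
        if text == previous:
            repeats.append(text)
        previous = text
    return repeats


sample_log = [
    "inbound message: platform=discord user=X chat=Y msg='hello'",
    "inbound message: platform=discord user=X chat=Y msg='hello'",
    "inbound message: platform=discord user=X chat=Y msg='bye'",
]

repeats = find_repeated_inbound(sample_log)
print(len(repeats))  # 1
```

A hit is only a hint: a user can legitimately send the same text twice, so any match should be confirmed against the message ID or timing in the real logs.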

Example

if self._client and not self._client.is_closed():
    await self._client.close()
    self._client = None
self._ready_event.clear()

Notes

This fix assumes that the connect() method is the only point where a new commands.Bot instance is created. If there are other places where a new instance is created, those will also need to be updated to close the old client first.

Recommendation

Apply the proposed fix: close the old client instance in connect() before creating a new one. This directly addresses the root cause and prevents duplicate responses. The workaround of avoiding gateway restarts only sidesteps the problem until the next reconnect.
