hermes - ✅ (Solved) Fix Bug: TUI session eagerly spawns duplicate 'hermes mcp serve' children from both tui_gateway.entry and slash_worker [1 pull request, 1 participant]

GitHub stats

NousResearch/hermes-agent#15275 · Fetched 2026-04-25 06:23:18
Comments: 0 · Participants: 1 · Timeline: 5 · Reactions: 0
Timeline (top): labeled ×4, cross-referenced ×1

PR fix notes

PR #15440: fix(tui): suppress MCP discovery in slash_worker to prevent duplicate serve children (#15275)

Description (problem / solution / changelog)

Summary

Fixes #15275.

The TUI spawns duplicate hermes mcp serve child processes per session. Both tui_gateway/server.py (agent creation path) and tui_gateway/slash_worker.py (slash command worker) independently bootstrap MCP tools, each triggering discover_mcp_tools() and spawning its own set of MCP server processes. The slash worker only needs CLI command processing, not MCP-backed tools.

Changes

  • tui_gateway/slash_worker.py: Set HERMES_MCP_DISCOVERY=0 unconditionally before importing/constructing HermesCLI, suppressing MCP discovery in the worker process.
  • tui_gateway/server.py: When spawning the slash worker subprocess, explicitly set HERMES_MCP_DISCOVERY=0 in the subprocess environment to prevent parent env leakage (e.g., if parent has HERMES_MCP_DISCOVERY=1).
  • model_tools.py: Added a guard in the MCP discovery path that checks HERMES_MCP_DISCOVERY env var — skips discover_mcp_tools() when set to "0".
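
The three changes above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual code: the helper names `mcp_discovery_enabled`, `build_worker_env`, and `maybe_discover` are hypothetical, and only the `HERMES_MCP_DISCOVERY` variable, its "0" value, and the `discover_mcp_tools()` name come from the PR description.

```python
import os

ENV_FLAG = "HERMES_MCP_DISCOVERY"  # variable name taken from the PR description


def mcp_discovery_enabled(environ=None):
    """Return False only when the flag is explicitly "0" (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return env.get(ENV_FLAG) != "0"


def build_worker_env(parent_env):
    """Copy the parent environment and force discovery off for the slash worker.

    Setting the value explicitly means a parent that has
    HERMES_MCP_DISCOVERY=1 cannot leak that value into the worker.
    """
    env = dict(parent_env)
    env[ENV_FLAG] = "0"
    return env


def maybe_discover(discover, environ=None):
    """Run discovery unless it is suppressed.

    `discover` stands in for discover_mcp_tools(); its real signature is not
    shown in the source, so it is passed in as a zero-argument callable.
    """
    if not mcp_discovery_enabled(environ):
        return []
    return discover()
```

In this sketch the server side would pass the result of `build_worker_env(os.environ)` as the `env=` argument when it spawns the worker, while the worker sets the flag itself before constructing HermesCLI, so the suppression holds even if the worker is launched some other way.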

Validation

Tests: tests/tui_gateway/test_slash_worker_mcp.py — 6 tests covering:

  • Env var is set before CLI import in worker
  • CLI is not imported at module level (AST check)
  • Guard logic for both suppressed and default paths
  • Subprocess env override at spawn time
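
The AST check in the list above could look something like the following. It is a sketch of the general technique, not the contents of the PR's test file: the function name `module_level_imports` and the module path `cli` in the sample source are assumptions made for illustration.

```python
import ast


def module_level_imports(source):
    """Return the names of modules imported by top-level statements only."""
    names = set()
    # ast.Module.body holds only top-level statements, so imports nested
    # inside functions are not visited here.
    for node in ast.parse(source).body:
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module)
    return names


# Hypothetical worker source: the env var is set first, then the CLI is
# imported inside the function rather than at module level.
WORKER_SOURCE = '''
import os

def main():
    os.environ["HERMES_MCP_DISCOVERY"] = "0"
    from cli import HermesCLI  # deferred import; module path "cli" is assumed
    return HermesCLI
'''

assert "cli" not in module_level_imports(WORKER_SOURCE)
assert "os" in module_level_imports(WORKER_SOURCE)
```

A check of this shape does not look inside module-level `if` or `try` blocks, so a stricter test would need to walk those as well.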

Tested on macOS (Python 3.11).

Changed files

  • model_tools.py (modified, +13/-6)
  • tests/tui_gateway/test_slash_worker_mcp.py (added, +128/-0)
  • tui_gateway/server.py (modified, +4/-1)
  • tui_gateway/slash_worker.py (modified, +27/-7)

Original issue (raw text)

Summary

A single TUI session appears to eagerly spawn two separate hermes mcp serve subprocesses during normal operation:

  • one under tui_gateway.entry
  • one under tui_gateway.slash_worker

This is distinct from the already-known orphan-cleanup problem. In this case, the duplicate children are live and parented, not stale zombies.

The result is unnecessary subprocess fan-out, extra MCP sessions, and likely contribution to downstream contention issues such as intermittent SQLite WAL write pressure (fact_store lock symptoms) and avoidable resource growth.

Why this looks like a bug

This does not appear to be an intentional isolation boundary. The observed behavior is that one logical TUI session creates two Hermes MCP server children because both startup paths reach MCP discovery / tool bootstrap.

That is incorrect lifecycle behavior, not merely a performance enhancement request.

Environment

  • Repo: NousResearch/hermes-agent
  • Host: Linux VPS
  • Hermes TUI/gateway in active use
  • Config includes Hermes itself as an MCP server via ~/.hermes/config.yaml

Relevant config shape:

mcp_servers:
  hermes:
    command: hermes
    args: [mcp, serve]

Evidence

Code-path inspection showed:

  • model_tools.py calls discover_mcp_tools() at import time
  • tui_gateway/server.py creates one persistent slash worker per TUI session
  • tui_gateway/slash_worker.py creates one HermesCLI per TUI session
  • both tui_gateway.entry and tui_gateway.slash_worker therefore appear able to trigger MCP discovery / stdio server startup

Live process mapping showed 8 active hermes mcp serve children at one point:

  • 3 under python3 -m tui_gateway.slash_worker
  • 3 under python3 -m tui_gateway.entry
  • 2 under direct hermes / hermes --resume sessions

Representative mapping from ps:

PID=3643062 PPID=3643047  child of python3 -m tui_gateway.slash_worker --session-key ...
PID=3643966 PPID=3643953  child of python3 -m tui_gateway.slash_worker --session-key ...
PID=3656716 PPID=3656702  child of python3 -m tui_gateway.slash_worker --session-key ...
PID=3681559 PPID=3642998  child of python3 -m tui_gateway.entry
PID=3681598 PPID=3643872  child of python3 -m tui_gateway.entry
PID=3681628 PPID=3656657  child of python3 -m tui_gateway.entry
PID=3674837 PPID=3674807  child of direct /usr/bin/python3 /home/openclaw/.local/bin/hermes
PID=3677397 PPID=3677390  child of direct /usr/bin/python3 /home/openclaw/.local/bin/hermes --resume ...

The important part is not the absolute count, but the topology: each inspected TUI session effectively had two Hermes MCP children, one from entry and one from slash worker.
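
One way to reproduce this kind of mapping is to group `hermes mcp serve` processes by the command line of their parent. The sketch below assumes its input is the text output of `ps -eo pid,ppid,args`; the function names are hypothetical and the issue does not say how the original mapping was produced.

```python
from collections import Counter


def parse_ps(text):
    """Parse `ps -eo pid,ppid,args` output into {pid: (ppid, args)}."""
    procs = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) < 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue  # skip the header row and malformed lines
        procs[int(parts[0])] = (int(parts[1]), parts[2])
    return procs


def mcp_children_by_parent(procs, needle="hermes mcp serve"):
    """Count MCP server processes, keyed by the parent's command line."""
    counts = Counter()
    for _pid, (ppid, args) in procs.items():
        if needle in args:
            parent_args = procs.get(ppid, (0, "<unknown parent>"))[1]
            counts[parent_args] += 1
    return counts
```

If a session's entry process and its slash worker each show up as a parent with a count of one, that matches the duplicated topology described above.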

Existing mitigation is insufficient

There is already a restart-scoped cleanup mitigation:

ExecStartPre=/bin/bash -c 'pkill -f "hermes.*mcp" || true'

That helps clean up before a gateway restart, but it does not prevent normal runtime duplication during active sessions.

Expected behavior

For a normal TUI session, Hermes should either:

  1. create one shared MCP stdio child for the session, or
  2. explicitly avoid MCP discovery in one of the two startup paths unless needed

A single logical session should not eagerly double-spawn Hermes MCP subprocesses by default.

Actual behavior

Both tui_gateway.entry and tui_gateway.slash_worker appear to reach MCP bootstrap, causing duplicate hermes mcp serve children during ordinary session startup.

Impact

  • unnecessary subprocess duplication
  • extra MCP sessions and pipe handles
  • avoidable memory / process growth over time
  • likely contributor to transient lock/contention symptoms in other subsystems
  • operational confusion, because restart cleanup can hide the symptom without fixing the source

Suspected root cause

The likely root cause is the combination of:

  • Hermes being configured as an MCP server in config.yaml
  • eager discover_mcp_tools() in model_tools.py
  • slash_worker creating its own HermesCLI
  • both the main TUI path and slash-worker path performing tool bootstrap independently

Proposed direction

Near-term safe fix:

  • prevent slash_worker from eagerly triggering MCP discovery unless it actually needs MCP-backed tools

More durable architectural fix:

  • make MCP server lifecycle shared / singleton per relevant scope, instead of per bootstrap path
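
The "shared / singleton per scope" idea can be illustrated with a small registry that creates a child on first request and reuses it afterwards. This is only a sketch of the pattern: `SharedChildRegistry`, `spawn`, and `stop` are hypothetical names, and because the registry lives in one process it would not by itself coordinate tui_gateway.entry and tui_gateway.slash_worker, which run as separate processes. The issue does not specify how cross-process sharing would work.

```python
import threading


class SharedChildRegistry:
    """Hand out one child per (scope, server name), creating it on first use."""

    def __init__(self, spawn):
        self._spawn = spawn  # callable(name) -> child handle (assumed shape)
        self._children = {}
        self._lock = threading.Lock()

    def get(self, scope, name):
        key = (scope, name)
        # Hold the lock across check and spawn so two bootstrap paths in the
        # same process cannot both create a child for the same key.
        with self._lock:
            if key not in self._children:
                self._children[key] = self._spawn(name)
            return self._children[key]

    def close_scope(self, scope, stop):
        """Stop and forget every child that belongs to `scope`."""
        with self._lock:
            keys = [k for k in self._children if k[0] == scope]
            for key in keys:
                stop(self._children.pop(key))
```

Keying on the scope (for example a session key) keeps separate sessions isolated while still preventing a second spawn inside the same session.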

Related issues

This seems related to, but distinct from:

  • #11202 — gateway leaks stdio-MCP subprocess children over time (cleanup / orphan problem)
  • #11115 — lazy non-core discovery for faster first tool-enabled turn

This issue is specifically about duplicate creation during normal TUI session startup, not just orphan reaping or performance tuning.

extent analysis

TL;DR

Prevent slash_worker from eagerly triggering MCP discovery unless it actually needs MCP-backed tools to avoid duplicate hermes mcp serve subprocesses.

Guidance

  • Review the model_tools.py and tui_gateway/slash_worker.py code to understand how discover_mcp_tools() is called and how HermesCLI is created.
  • Consider adding a check in slash_worker to only trigger MCP discovery when necessary, as a near-term safe fix.
  • Investigate making the MCP server lifecycle shared or singleton per relevant scope for a more durable architectural fix.
  • Verify the fix by monitoring the process mapping and checking for duplicate hermes mcp serve children.

Notes

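The issue proposes preventing slash_worker from eagerly triggering MCP discovery but does not give implementation details. PR #15440 implements that direction by setting HERMES_MCP_DISCOVERY=0 for the slash worker and guarding discover_mcp_tools() in model_tools.py. The issue is specific to the NousResearch/hermes-agent repository and to configurations that register Hermes itself as an MCP server.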

Recommendation

Apply the near-term safe fix by preventing slash_worker from eagerly triggering MCP discovery unless it actually needs MCP-backed tools, as this is a more targeted and less invasive change.
