hermes - ✅ (Solved) Fix: MCP config watcher records failed reloads as applied and stops retrying [2 pull requests, 1 participant]

GitHub stats

NousResearch/hermes-agent#14716 (fetched 2026-04-24 06:15:03)

  • Comments: 0
  • Participants: 1
  • Timeline events: 7 (labeled ×5, cross-referenced ×2)
  • Reactions: 0

Fix Action

Fix / Workaround

Minimal reproduction

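cd /Users/genie/.hermes/hermes-agent
source venv/bin/activate
python - <<'PY'
from types import MethodType
from pathlib import Path
from unittest.mock import patch
import tempfile
import cli

# Bare stand-in that carries only the watcher state the method touches.
class Dummy: pass
obj = Dummy()
obj._config_mtime = 0
obj._config_mcp_servers = {}
obj._last_config_check = 0
obj._reload_calls = 0

# Reload stub that counts attempts and always fails.
def boom(self):
    self._reload_calls += 1
    raise RuntimeError("reload failed")

obj._reload_mcp = MethodType(boom, obj)
obj._check_config_mcp_changes = MethodType(cli.HermesCLI._check_config_mcp_changes, obj)

with tempfile.TemporaryDirectory() as td:
    cfg = Path(td) / "config.yaml"
    cfg.write_text("mcp_servers:\n  demo:\n    command: echo\n", encoding="utf-8")
    with patch("hermes_cli.config.get_config_path", return_value=cfg):
        obj._check_config_mcp_changes()
        print("after_first", obj._reload_calls, obj._config_mcp_servers)
        obj._last_config_check = 0  # reset the check timestamp before polling again
        obj._check_config_mcp_changes()
        print("after_second", obj._reload_calls, obj._config_mcp_servers)
PY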

Observed output:

  • first check triggers one reload attempt and stores {'demo': {'command': 'echo'}}
  • second check does not retry (after_second 1 ...) even though the previous reload failed

PR fix notes

PR #14745: fix(cli): MCP config watcher retries after failed reload (#14716)

Description (problem / solution / changelog)

Problem

The MCP config watcher updated _config_mtime / _config_mcp_servers before _reload_mcp() succeeded, so a transient failure poisoned the watcher until the file changed again (see issue body).

Fix

  • Parse YAML first; only bump _config_mtime when mcp_servers is unchanged (other section edit) or after a successful reload.
  • On reload failure or 30s join timeout, roll _config_mtime back so the next check retries.
  • _reload_mcp() now returns bool; manual /reload-mcp prints a warning when reload fails.
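The ordering described in these bullets can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class `McpConfigWatcher`, its constructor arguments, and the `check()` method are hypothetical stand-ins, and only `_config_mtime`, `_config_mcp_servers`, and the 30s join timeout come from the description above. Instead of bumping the mtime and rolling it back, the sketch never advances either field until the reload has succeeded, which should have the same effect on the next poll.

```python
import threading


class McpConfigWatcher:
    """Hypothetical stand-in for the watcher state kept on the CLI object."""

    def __init__(self, get_mtime, load_servers, reload_mcp, join_timeout=30.0):
        self._get_mtime = get_mtime        # returns the config file's mtime
        self._load_servers = load_servers  # parses and returns mcp_servers
        self._reload_mcp = reload_mcp      # returns True on success
        self._join_timeout = join_timeout
        self._config_mtime = 0.0
        self._config_mcp_servers = {}

    def check(self) -> bool:
        """Return True if watcher state matches the file after this pass."""
        mtime = self._get_mtime()
        if mtime == self._config_mtime:
            return True  # file untouched since the last applied state

        new_mcp = self._load_servers()  # parse before touching any state
        if new_mcp == self._config_mcp_servers:
            # Another section was edited; safe to record the new mtime.
            self._config_mtime = mtime
            return True

        result = {"ok": False}

        def worker():
            try:
                result["ok"] = bool(self._reload_mcp())
            except Exception:
                result["ok"] = False

        thread = threading.Thread(target=worker, daemon=True)
        thread.start()
        thread.join(self._join_timeout)

        if thread.is_alive() or not result["ok"]:
            # Failure or timeout: state is left as it was, so the next
            # check sees the same difference and retries.
            return False

        # Commit only after a confirmed successful reload.
        self._config_mtime = mtime
        self._config_mcp_servers = new_mcp
        return True
```

Running the reload in a worker thread and joining with a timeout mirrors the description; a reload that is still running when the timeout expires is treated the same as a failed one.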

Tests

  • Extended tests/cli/test_cli_mcp_config_watch.py with success/failure snapshot assertions.

Fixes #14716

Changed files

  • cli.py (modified, +29/-8)
  • tests/cli/test_cli_mcp_config_watch.py (modified, +33/-0)

PR #14855: fix(cli): retry MCP config reload after failed apply

Description (problem / solution / changelog)

Summary

  • only mark config watcher state as applied after MCP reload succeeds
  • keep failed or timed out reloads dirty so the next poll retries automatically
  • add a regression test covering a failed reload followed by a successful retry
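A regression test of the kind mentioned in the last bullet might look like the sketch below. It does not reproduce the test added in the PR: `WatcherState` and its `poll()` method are hypothetical, reduced to the one behaviour under test, and treating a truthy return value as success is an assumption.

```python
from unittest.mock import Mock


class WatcherState:
    """Hypothetical minimal watcher with only the fields the test needs."""

    def __init__(self, reload_mcp):
        self.applied = {}
        self.reload_mcp = reload_mcp

    def poll(self, desired):
        if desired == self.applied:
            return  # already applied, nothing to do
        try:
            ok = self.reload_mcp()
        except Exception:
            ok = False
        if ok:
            # Mark the config as applied only after a successful reload.
            self.applied = desired


def test_failed_reload_is_retried_on_next_poll():
    desired = {"demo": {"command": "echo"}}
    # First call raises, second call succeeds.
    reload_mcp = Mock(side_effect=[RuntimeError("reload failed"), True])
    watcher = WatcherState(reload_mcp)

    watcher.poll(desired)
    assert reload_mcp.call_count == 1
    assert watcher.applied == {}  # failed reload stays dirty

    watcher.poll(desired)
    assert reload_mcp.call_count == 2
    assert watcher.applied == desired  # retry succeeded

    watcher.poll(desired)
    assert reload_mcp.call_count == 2  # no further reloads once applied
```

The final poll checks the other direction as well: once the config has been applied, the watcher should stop calling the reload.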

Closes #14716

Testing

  • python3 -m pytest -o addopts= tests/cli/test_cli_mcp_config_watch.py
  • python3 -m pytest -o addopts= tests/tools/test_mcp_stability.py -k reload_timeout

Changed files

  • cli.py (modified, +20/-3)
  • tests/cli/test_cli_mcp_config_watch.py (modified, +24/-0)

Bug Description

The CLI MCP config watcher updates _config_mcp_servers before _reload_mcp() succeeds. If reload fails, the watcher believes the new config is already active and will not retry until the config changes again.

Affected files / lines

  • cli.py:6538-6550 — updates _config_mtime and _config_mcp_servers before reload success is known
  • cli.py:6556-6562 — reload runs in a thread and failures are not rolled back

Why this is a bug

A transient reload failure leaves runtime state stale while the watcher records the failed config as applied. Subsequent watcher passes see new_mcp == self._config_mcp_servers and skip reloading altogether.

Minimal reproduction

cd /Users/genie/.hermes/hermes-agent
source venv/bin/activate
python - <<'PY'
from types import MethodType
from pathlib import Path
from unittest.mock import patch
import tempfile
import cli

class Dummy: pass
obj = Dummy()
obj._config_mtime = 0
obj._config_mcp_servers = {}
obj._last_config_check = 0
obj._reload_calls = 0

def boom(self):
    self._reload_calls += 1
    raise RuntimeError("reload failed")

obj._reload_mcp = MethodType(boom, obj)
obj._check_config_mcp_changes = MethodType(cli.HermesCLI._check_config_mcp_changes, obj)

with tempfile.TemporaryDirectory() as td:
    cfg = Path(td) / "config.yaml"
    cfg.write_text("mcp_servers:\n  demo:\n    command: echo\n", encoding="utf-8")
    with patch("hermes_cli.config.get_config_path", return_value=cfg):
        obj._check_config_mcp_changes()
        print("after_first", obj._reload_calls, obj._config_mcp_servers)
        obj._last_config_check = 0
        obj._check_config_mcp_changes()
        print("after_second", obj._reload_calls, obj._config_mcp_servers)
PY

Observed output:

  • first check triggers one reload attempt and stores {'demo': {'command': 'echo'}}
  • second check does not retry (after_second 1 ...) even though the previous reload failed

Expected Behavior

Failed reloads should either roll back watcher state or be marked dirty so the next poll retries.

Actual Behavior

The watcher is effectively poisoned until the config file changes again.

Suggested investigation direction

Only commit _config_mcp_servers after successful reload, or preserve a separate “desired config” vs “applied config” state and retry while they differ.
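The second option, keeping "desired" and "applied" state apart, could be sketched as below. All names here (`ReloadTracker`, `observe`, `reconcile`, `last_error`) are hypothetical and chosen for illustration; nothing in the issue prescribes this shape.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ReloadTracker:
    """Tracks what the config file asks for versus what is actually live."""

    apply: Callable[[dict], None]  # performs the reload; raises on failure
    applied: dict = field(default_factory=dict)
    desired: dict = field(default_factory=dict)
    last_error: Optional[str] = None

    @property
    def dirty(self) -> bool:
        return self.desired != self.applied

    def observe(self, config: dict) -> None:
        """Record the latest parsed config without applying it."""
        self.desired = copy.deepcopy(config)

    def reconcile(self) -> bool:
        """Try to apply the desired config; return True when in sync."""
        if not self.dirty:
            return True
        try:
            self.apply(self.desired)
        except Exception as exc:
            # Keep `applied` unchanged so the tracker stays dirty.
            self.last_error = str(exc)
            return False
        self.applied = copy.deepcopy(self.desired)
        self.last_error = None
        return True
```

Each poll would call `observe()` with the freshly parsed `mcp_servers` and then `reconcile()`. Because `applied` only changes on success, a failed reload is retried on every later poll until it succeeds or the desired config changes.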

Extent analysis

TL;DR

Update the CLI MCP config watcher to commit _config_mcp_servers only after a successful reload.

Guidance

  • Review the code in cli.py:6538-6550 and cli.py:6556-6562 to understand how the watcher updates _config_mcp_servers and runs the reload in a thread.
  • Consider introducing a "desired config" vs "applied config" state to track the difference between the intended and actual configurations.
  • Modify the _check_config_mcp_changes method to only update _config_mcp_servers after a successful reload, or to mark the config as "dirty" if the reload fails.
  • Use the provided minimal reproduction code to test and verify the changes.

Example

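def _check_config_mcp_changes(self):
    # ... (elided: read the config file and parse mcp_servers into new_mcp)
    if new_mcp == self._config_mcp_servers:
        return  # nothing new to apply
    try:
        self._reload_mcp()
    except Exception:
        # Reload failed: leave _config_mcp_servers untouched so the next
        # poll still sees a difference and retries.
        return
    # Commit the snapshot only after the reload succeeded.
    self._config_mcp_servers = new_mcp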

Notes

The suggested investigation direction provides two possible approaches: committing _config_mcp_servers after successful reload or preserving a separate "desired config" vs "applied config" state. The chosen approach will depend on the specific requirements and constraints of the system.

Recommendation

Update the _check_config_mcp_changes method so that it commits _config_mcp_servers only after a successful reload. Of the two approaches, this is the more straightforward and immediate fix.
