hermes: Windows gateway /restart can leave gateway stopped when helper inherits _HERMES_GATEWAY [3 pull requests]


Fix Action

Fixed


Summary

On Windows, triggering the gateway /restart flow can leave Hermes Gateway stopped instead of bringing up the replacement process.

The failure happens in the detached restart helper spawned from the running gateway. Because the helper inherits the parent gateway environment, it also inherits _HERMES_GATEWAY=1. The replacement command then runs hermes gateway restart with that marker still set, so the CLI's self-targeting guard treats the command as if it is being run from inside the live gateway and refuses to proceed.

Impact

  • A gateway /restart or update-triggered restart can interrupt all configured messaging platforms.
  • The old gateway exits, but the replacement process may not start.
  • From the user's perspective, the bot goes silent until the gateway is manually restarted.
  • This is especially risky on unattended Windows installs because the gateway is the control plane for remote recovery.

Environment

  • OS: Windows
  • Affected surface: Hermes Gateway restart flow, especially slash-command/update-driven restart from a running gateway process
  • Gateway platforms observed in the affected installation: QQBot and Weixin

Observed behavior

  1. A restart is requested from the running gateway.
  2. Gateway begins shutdown/restart handling.
  3. The detached helper inherits _HERMES_GATEWAY=1.
  4. The helper runs hermes gateway restart.
  5. The CLI restart guard sees _HERMES_GATEWAY=1 and refuses to restart the gateway.
  6. Gateway remains stopped until manually restarted.

Expected behavior

The detached restart helper should preserve normal runtime configuration such as HERMES_HOME, profile, PATH, virtualenv, and credentials, but it must not inherit the gateway self-targeting marker. The replacement command should be able to run hermes gateway restart from outside the old gateway process.

Root cause

The Windows detached restart watcher in gateway/run.py inherited the entire gateway environment unchanged. That included _HERMES_GATEWAY=1, which is intentionally used to prevent self-targeting gateway lifecycle operations from inside the running gateway.

In the detached helper case, however, the helper is no longer the gateway process; it is the intended replacement launcher. Keeping the marker causes a false positive in the guard.
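The false positive can be illustrated with a minimal sketch. The marker name comes from this report; the guard function `refuses_lifecycle_command` is a hypothetical stand-in, since the actual CLI guard code is not shown here and may check the marker differently.

```python
import os
from typing import Mapping

GATEWAY_MARKER = "_HERMES_GATEWAY"  # marker name taken from this report


def refuses_lifecycle_command(env: Mapping[str, str]) -> bool:
    """Hypothetical guard: True when the marker claims we run inside the gateway."""
    return env.get(GATEWAY_MARKER) == "1"


# Simulated environment of the running gateway process.
gateway_env = {**os.environ, GATEWAY_MARKER: "1"}

# The detached helper receives an unchanged copy, so the guard still fires
# even though the helper is no longer the gateway process.
helper_env = dict(gateway_env)
print(refuses_lifecycle_command(helper_env))  # True: the false positive
```

A guard of this shape cannot tell the gateway apart from a child that merely inherited its environment, which is why the marker has to be removed at the spawn site.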

Proposed fix

When building the Windows detached helper environment:

  • copy os.environ
  • remove only _HERMES_GATEWAY
  • pass that sanitized env to subprocess.Popen
  • keep stdout/stderr/stdin detached from the old console/process

This preserves required configuration while avoiding the false self-targeting guard.
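The steps above could be sketched roughly as follows. This is not the code from gateway/run.py: `build_helper_env` and `spawn_detached` are hypothetical names, and the Windows creation flags are an assumption about how the helper is detached.

```python
import os
import subprocess
import sys
from typing import Dict, Mapping, Optional, Sequence

GATEWAY_MARKER = "_HERMES_GATEWAY"  # marker name taken from this report


def build_helper_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and drop only the self-targeting marker."""
    env = dict(os.environ if base is None else base)
    env.pop(GATEWAY_MARKER, None)  # everything else (PATH, HERMES_HOME, ...) is kept
    return env


def spawn_detached(cmd: Sequence[str]) -> subprocess.Popen:
    """Start cmd with the sanitized env and no ties to the parent's stdio."""
    kwargs = {}
    if sys.platform == "win32":
        # Assumed flags: detach from the old console and process group.
        kwargs["creationflags"] = (
            subprocess.DETACHED_PROCESS | subprocess.CREATE_NEW_PROCESS_GROUP
        )
    else:
        # POSIX equivalent, included only so the sketch runs anywhere.
        kwargs["start_new_session"] = True
    return subprocess.Popen(
        list(cmd),
        env=build_helper_env(),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        close_fds=True,
        **kwargs,
    )
```

Under these assumptions, the gateway would call something like `spawn_detached(["hermes", "gateway", "restart"])`. Removing a single key rather than building a fresh environment keeps the virtualenv, profile, and credentials intact.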

Verification

A targeted regression test was added for the Windows detached restart helper. It asserts that the generated watcher strips _HERMES_GATEWAY before invoking hermes gateway restart.
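A regression check of this kind might look like the sketch below. It is not the test in tests/gateway/test_restart_drain.py: the helper names and the "example-home" value are hypothetical, and it probes a child interpreter rather than the real generated watcher.

```python
import os
import subprocess
import sys

GATEWAY_MARKER = "_HERMES_GATEWAY"  # marker name taken from this report


def sanitized_env(base):
    """Hypothetical helper: copy the env and drop only the gateway marker."""
    env = dict(base)
    env.pop(GATEWAY_MARKER, None)
    return env


def marker_seen_by_child(env):
    """Run a child interpreter and report whether it sees the marker."""
    probe = f"import os; print({GATEWAY_MARKER!r} in os.environ)"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


def test_helper_env_strips_gateway_marker():
    gateway_env = {**os.environ, GATEWAY_MARKER: "1", "HERMES_HOME": "example-home"}
    # Unchanged inheritance leaks the marker into the child.
    assert marker_seen_by_child(gateway_env)
    # The sanitized copy hides the marker but keeps other configuration.
    helper_env = sanitized_env(gateway_env)
    assert not marker_seen_by_child(helper_env)
    assert helper_env["HERMES_HOME"] == "example-home"
```

Checking the environment as seen by a real child process, rather than only inspecting the dictionary, also covers the case where the sanitized env is built but never passed to the spawn call.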

Local targeted test run:

PYTHONPATH="$(pwd)" uv run --with pytest --with pytest-asyncio --with pytest-xdist --with pyyaml pytest tests/gateway/test_restart_drain.py -q -o 'addopts='
..................                                                       [100%]
18 passed in 2.66s

Live recovery was also verified after manual restart on the affected Windows install: hermes gateway status reported a registered Hermes_Gateway task and a running gateway process, and QQBot messages were delivered again.
