hermes - ✅(Solved) Fix [Bug]: gateway session:end event not emitted from idle-expiry watcher or auto-reset path [3 pull requests, 1 participant]

GitHub stats
NousResearch/hermes-agent#28746 (fetched 2026-05-20 04:02:19)
Comments: 0 · Participants: 1 · Timeline: 6 · Reactions: 0
Timeline (top): cross-referenced ×3, labeled ×3

Fix Action

Fixed

PR fix notes

PR #28750: fix(gateway): emit session:end from idle-expiry watcher + auto-reset paths

Description (problem / solution / changelog)

What does this PR do?

Fix the gateway session:end hook so it fires from every session-close path, not just /new and /reset. Without this, external hook subscribers (under ~/.hermes/hooks/) silently miss every close driven by idle expiry or auto-reset, leaving stale mirror state behind forever.

The gateway emits the lifecycle event session:end from exactly one call site, the /new / /reset handler. Two other paths close sessions but don't emit:

  1. Idle-expiry watcher (gateway/run.py:_session_expiry_watcher): fires the plugin-level on_session_finalize (per #14981), evicts the cached agent, and marks entry.expiry_finalized = True, but never calls self.hooks.emit("session:end", ...).

  2. Auto-reset branch (gateway/session.py:get_or_create_session) — when a stale entry is rolled over for a new session_id, the local SQLite session ends but no gateway event fires for the old session_id.

This PR makes session:end symmetric with session:start — every close path now emits, every subscriber sees the close.

Related Issue

Fixes #28746

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)

Changes Made

  • gateway/run.py:_session_expiry_watcher — emit session:end with reason="idle_expiry" right after entry.expiry_finalized = True. Wrapped in try/except logger.debug(..., exc_info=True) so a misbehaving subscriber can't break the watcher loop. (platform is pulled from the session_key just like the existing on_session_finalize invocation; user_id from entry.origin with the same or "" guard pattern used elsewhere in this file.)

  • gateway/run.py:_handle_message_with_agent — in the _is_new_session branch, before emitting session:start, check getattr(session_entry, "auto_reset_prior_session_id", None). If set, emit session:end with reason="auto_reset", then clear the field so a follow-up turn cannot re-emit for the same close.

  • gateway/session.py — add a transient field auto_reset_prior_session_id: Optional[str] = None to SessionEntry. get_or_create_session populates it (with the prior session_id) whenever the auto-reset branch fires. Not added to to_dict() — this is a transient signal consumed by the next emit pass, never persisted to sessions.json.

  • gateway/hooks.py — docstring update to reflect the new firing contract: session:end now lists the three close paths and notes the reason key on the new emit paths.

  • tests/gateway/test_session_boundary_hooks.py — three new tests:

    • test_idle_expiry_emits_session_end — runs _session_expiry_watcher against a mocked-out store with one expired entry; asserts the emit fires with the expired session_id, session_key, and reason="idle_expiry". Regression guard for the watcher path.
    • test_auto_reset_emits_session_end_for_prior_session — exercises the emit fragment in _handle_message_with_agent against a SessionEntry that was just produced by auto-reset (with auto_reset_prior_session_id="sess-old-prior"); asserts session:end fires before session:start, with the right session_id + reason="auto_reset", and that the transient field is cleared after consume.
    • test_session_entry_has_auto_reset_prior_session_id_field — dataclass-level sanity check (default None, writable, not in to_dict()).

The existing 6 tests in this file (test_reset_*, test_finalize_before_reset, test_shutdown_*, test_hook_error_*, test_idle_expiry_fires_finalize_hook) all still pass — no existing call site changes behavior.
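The transient field and its consume-once emit described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's code: SessionEntry is reduced to two fields, RecordingHooks and emit_new_session_events are hypothetical names, the context keys are illustrative, and treating emit as awaitable is an assumption.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class SessionEntry:
    """Simplified stand-in for the gateway's SessionEntry (most fields omitted)."""
    session_key: str
    session_id: str
    # Transient signal: set by the auto-reset branch, consumed by the next emit pass.
    auto_reset_prior_session_id: Optional[str] = None

    def to_dict(self) -> Dict[str, Any]:
        # The transient field is deliberately left out so it is never persisted.
        return {"session_key": self.session_key, "session_id": self.session_id}


class RecordingHooks:
    """Hypothetical hook registry that records emitted events in order."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, Dict[str, Any]]] = []

    async def emit(self, event_type: str, context: Dict[str, Any]) -> None:
        self.events.append((event_type, context))


async def emit_new_session_events(hooks: RecordingHooks, entry: SessionEntry) -> None:
    """Emit session:end for an auto-reset predecessor, then session:start."""
    prior = getattr(entry, "auto_reset_prior_session_id", None)
    if prior:
        await hooks.emit(
            "session:end",
            {"session_id": prior, "session_key": entry.session_key, "reason": "auto_reset"},
        )
        # Clear the field so a follow-up turn cannot re-emit for the same close.
        entry.auto_reset_prior_session_id = None
    await hooks.emit(
        "session:start",
        {"session_id": entry.session_id, "session_key": entry.session_key},
    )


if __name__ == "__main__":
    demo_hooks = RecordingHooks()
    demo_entry = SessionEntry("telegram:dm:1", "sess-new", auto_reset_prior_session_id="sess-old-prior")
    asyncio.run(emit_new_session_events(demo_hooks, demo_entry))
    print([name for name, _ in demo_hooks.events])
```

Clearing the field immediately after the emit is what keeps the close event idempotent: a second pass over the same entry only produces session:start.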

How to Test

Automated

pytest tests/gateway/test_session_boundary_hooks.py -v --override-ini="addopts="

All 9 tests pass (6 pre-existing + 3 new).

Manual reproduction

  1. Install a gateway hook in ~/.hermes/hooks/test-session-end/:

    # HOOK.yaml
    name: test-session-end
    events: [session:start, session:end]

    # handler.py
    import datetime
    import json
    import pathlib

    # Append-only JSON-lines log in the home directory.
    LOG = pathlib.Path.home() / "session_end_test.log"

    async def handle(event_type, context):
        LOG.parent.mkdir(parents=True, exist_ok=True)
        record = {"event": event_type, "ts": datetime.datetime.now().isoformat(), "ctx": context}
        with LOG.open("a") as f:
            # default=str keeps the log writable if the context holds non-JSON values.
            f.write(json.dumps(record, default=str) + "\n")
  2. Start the gateway. Open a Telegram DM session. Send one message. Confirm session:start lands in ~/session_end_test.log.

  3. Stop messaging. Wait for the configured idle-reset window to pass plus one 5-min watcher tick.

  4. Before this fix: the log shows only session:start (no session:end). ~/.hermes/sessions/sessions.json shows expiry_finalized: true for the session.

  5. After this fix: the log shows session:end with reason: "idle_expiry", session_id, session_key, platform, user_id. sessions.json still shows expiry_finalized: true.

  6. Repeat for the auto-reset path: start a fresh session, idle past the reset window, then send a NEW message (not /new). The handler logs session:end for the OLD session_id (with reason: "auto_reset"), then session:start for the NEW one.
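To check the result of the manual steps above without reading the log by eye, a small script along these lines could list sessions that started but never ended. It is a sketch that assumes the JSON-lines format written by the test handler (event, ts, ctx) and that ctx carries session_id for both events; find_unclosed_sessions is a hypothetical helper name.

```python
import json
from typing import Iterable, List


def find_unclosed_sessions(lines: Iterable[str]) -> List[str]:
    """Return session_ids that have a session:start but no later session:end."""
    open_ids: List[str] = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        record = json.loads(raw)
        # Assumption: the handler stored the hook context under "ctx".
        session_id = (record.get("ctx") or {}).get("session_id")
        if session_id is None:
            continue
        event = record.get("event")
        if event == "session:start" and session_id not in open_ids:
            open_ids.append(session_id)
        elif event == "session:end" and session_id in open_ids:
            open_ids.remove(session_id)
    return open_ids


if __name__ == "__main__":
    sample = [
        '{"event": "session:start", "ts": "t0", "ctx": {"session_id": "a"}}',
        '{"event": "session:start", "ts": "t1", "ctx": {"session_id": "b"}}',
        '{"event": "session:end", "ts": "t2", "ctx": {"session_id": "a", "reason": "idle_expiry"}}',
    ]
    print(find_unclosed_sessions(sample))
```

Against the real log, the input would be the lines of ~/session_end_test.log. Before the fix, every session closed by idle expiry or auto-reset should show up in the returned list; after the fix it should be empty once the watcher tick has passed.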

Note on Hermes version

Tested against the fork's main branch (currently aligned with hermes-agent 0.14.0 per pyproject.toml). The patched code paths exist identically in 0.13.0 and earlier — the line numbers in the original issue reference the v0.13.0 layout. The semantic anchors (function names, code shapes) are unchanged.

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(gateway): ...)
  • I searched for existing PRs to make sure this isn't a duplicate (#24982 adds chat_id to the existing emit sites — different bug, cleanly compatible with this change)
  • My PR contains only changes related to this fix (no unrelated commits)
  • I've run pytest tests/gateway/test_session_boundary_hooks.py -v and all tests pass (9/9)
  • I've added tests for my changes (3 new tests in test_session_boundary_hooks.py)
  • I've tested on my platform: macOS 25.4 (Darwin 25.4.0)

Documentation & Housekeeping

  • I've updated relevant documentation — gateway/hooks.py Events docstring reflects the new firing contract for session:end
  • I've updated cli-config.yaml.example if I added/changed config keys — N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — N/A (no platform-specific code paths touched)
  • I've updated tool descriptions/schemas if I changed tool behavior — N/A

Changed files

  • gateway/hooks.py (modified, +6/-1)
  • gateway/run.py (modified, +54/-0)
  • gateway/session.py (modified, +16/-0)
  • tests/gateway/test_session_boundary_hooks.py (modified, +192/-0)

PR #29048: feat: Typed Hook Payloads (Phase 2) + Path Parity Tests (Phase 3)

Description (problem / solution / changelog)

Summary

Implements Phase 2 (Typed Hook Payloads) and Phase 3 (Path Parity Tests) of FR #28984.

Phase 1 was submitted separately as PR #28995.


Phase 2 — Typed Hook Payloads

New file: hermes_cli/hook_payloads.py (276 lines)

Defines frozen dataclass payloads for all 14 hook types:

Hook                      Dataclass
pre_tool_call             PreToolCallPayload
post_tool_call            PostToolCallPayload
transform_tool_result     TransformToolResultPayload
on_session_start          SessionStartPayload
on_session_end            SessionEndPayload
on_session_finalize       SessionFinalizePayload
on_session_reset          SessionResetPayload
pre_approval_request      PreApprovalRequestPayload
post_approval_response    PostApprovalResponsePayload
pre_llm_call              PreLlmCallPayload
post_llm_call             PostLlmCallPayload
pre_api_request           PreApiRequestPayload
post_api_request          PostApiRequestPayload
subagent_stop             SubagentStopPayload

payload_to_kwargs() converts dataclass → plain dict, preserving cb(**kwargs) compatibility for all existing plugin callbacks.
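A rough sketch of the idea, assuming the conversion can be done with dataclasses.asdict: the field names on SessionEndPayload below are illustrative, legacy_callback is a hypothetical plugin callback, and the actual hermes_cli/hook_payloads.py may define both differently.

```python
from dataclasses import FrozenInstanceError, asdict, dataclass, is_dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class SessionEndPayload:
    """Illustrative payload; the real field set is not shown in the PR text."""
    session_id: str
    platform: str = ""
    reason: Optional[str] = None


def payload_to_kwargs(payload: Any) -> Dict[str, Any]:
    """Convert a dataclass instance into a plain dict for cb(**kwargs) calls."""
    if not is_dataclass(payload) or isinstance(payload, type):
        raise TypeError("expected a dataclass instance")
    return asdict(payload)


def legacy_callback(session_id: str, platform: str = "", **kwargs: Any) -> str:
    """Hypothetical pre-existing plugin callback that only knows keyword args."""
    return f"{platform}:{session_id}"


if __name__ == "__main__":
    payload = SessionEndPayload(session_id="sess-1", platform="telegram", reason="idle_expiry")
    print(legacy_callback(**payload_to_kwargs(payload)))
    try:
        payload.reason = "auto_reset"  # frozen dataclasses reject mutation
    except FrozenInstanceError:
        print("payload is immutable")
```

Because the callback still receives plain keyword arguments, existing plugins do not need to know that a dataclass was involved at the call site.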

5 call sites migrated (adds type safety, zero breaking changes):

File                          Hook
hermes_cli/plugins.py         get_pre_tool_call_block_message()
agent/conversation_loop.py    on_session_start, pre_llm_call
model_tools.py                post_tool_call, transform_tool_result

New tests: tests/hermes_cli/test_hook_payloads.py — 21 tests covering all payloads, frozen immutability, round-trip, registry completeness.


Phase 3 — Path Parity Tests

New file: hermes_cli/path_parity.py (155 lines)

assert_field_parity(
    "fallback_model",
    {
        "gateway_run_agent": lambda: extract_gateway_fields(),
        "tui_make_agent": lambda: extract_tui_fields(),
    },
)

When any path does not consume the field, raises PathParityError with a diff showing which paths are missing it.
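One way such a helper might work is sketched below. It assumes each extractor returns the collection of field names that its path consumes, and that PathParityError subclasses AssertionError; neither detail is confirmed by the PR text, and the extractors in the demo are hypothetical stand-ins.

```python
from typing import Callable, Collection, Dict


class PathParityError(AssertionError):
    """Raised when at least one code path does not consume the expected field."""


def assert_field_parity(
    field: str,
    paths: Dict[str, Callable[[], Collection[str]]],
) -> None:
    """Check that every named path consumes `field`; report the ones that do not."""
    missing = sorted(name for name, extract in paths.items() if field not in extract())
    if missing:
        consuming = sorted(set(paths) - set(missing))
        raise PathParityError(
            f"field {field!r} missing from: {', '.join(missing)}; "
            f"consumed by: {', '.join(consuming) or 'none'}"
        )


if __name__ == "__main__":
    try:
        assert_field_parity(
            "fallback_model",
            {
                # Hypothetical extractors standing in for the real ones.
                "gateway_run_agent": lambda: {"model", "fallback_model"},
                "tui_make_agent": lambda: {"model"},
            },
        )
    except PathParityError as exc:
        print(exc)
```

Running the extractors lazily (as callables) keeps the check cheap to declare and lets each path decide how it discovers the fields it passes through.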

3 known divergences documented as failing tests:

  • #28753: TUI doesn't pass fallback_model to AIAgent
  • #28746: idle-expiry path doesn't emit session:end event
  • #28637: /model switch loses per-model token usage

New tests: tests/hermes_cli/test_path_parity.py — 13 tests covering the helper functions and documented divergence patterns.


What This Prevents

Future scenario                   Without Phase 2             With Phase 2
New param added to pre_tool_call  2 of 3 call sites miss it   Dataclass forces all sites to provide it
Hook field renamed                Silent breakage in plugins  Frozen dataclass catches at construction

Future scenario                                       Without Phase 3                                  With Phase 3
fallback_model added to gateway but forgotten in TUI  Silent divergence (#28753)                       assert_field_parity fails in CI
session:end forgotten in one exit path                Bug hidden until user hits expiry path (#28746)  Test documents expected parity

Test Results

tests/hermes_cli/test_hook_payloads.py  21 passed
tests/hermes_cli/test_path_parity.py   13 passed
tests/hermes_cli/test_plugins.py       105 passed
tests/test_model_tools.py               passed
tests/test_transform_tool_result_hook.py passed
tests/agent/test_plugin_llm.py          passed
tests/agent/test_shell_hooks.py         passed
─────────────────────────────────────────────
Total                                     139 passed

Refs: FR #28984 | PR #28995 (Phase 1)

Changed files

  • .dev-workflow/code-graph.db (added, +0/-0)
  • agent/conversation_loop.py (modified, +6/-4)
  • hermes_cli/hook_payloads.py (added, +276/-0)
  • hermes_cli/path_parity.py (added, +155/-0)
  • hermes_cli/plugins.py (modified, +7/-2)
  • hook_system_analysis.md (added, +348/-0)
  • model_tools.py (modified, +6/-4)
  • plans/path_parity_analysis.md (added, +135/-0)
  • tests/hermes_cli/test_hook_payloads.py (added, +245/-0)
  • tests/hermes_cli/test_path_parity.py (added, +198/-0)

Bug Description

The gateway-level session:end hook event is emitted from exactly one call site (gateway/run.py:7667, inside _handle_reset_command). Two other paths that terminate a gateway session both close the session locally and fire the plugin-level on_session_finalize hook, but neither emits the gateway-level session:end event:

  1. Idle-expiry watcher: gateway/run.py:_session_expiry_watcher (line 3666). Runs every 5 min by default. Fires on_session_finalize plugin hook (#14981 fix), evicts the cached agent, sets entry.expiry_finalized = True. Does not emit gateway session:end.
  2. Auto-reset branch: gateway/session.py:get_or_create_session (lines 893-905, where _should_reset is consulted and a stale session is closed before a new one is created for the same session_key). Calls self._db.end_session(...) against local SQLite. Does not emit gateway session:end.

So session:end only fires for explicit /new / /reset. Any session closed by idle-expiry, suspended-state reset, or daily-reset-policy turnover silently disappears from the perspective of any gateway hook subscriber.

Documentation states (gateway/hooks.py line 12):

session:end -- Session ends (user ran /new or /reset)

…which appears to document the bug as intentional, but in practice external observers expect session:end to fire whenever a session ends — symmetric with session:start.

Why this matters / impact

Any hook in ~/.hermes/hooks/ that subscribes to session:end (intended to: close mirror rows in an external DB, finalize logging, persist a transcript checkpoint, notify a remote observer, etc.) silently misses every idle-expiry- and auto-reset-driven close. The hook keeps cached state for those sessions forever.

In the specific install this was caught in, an external substrate-ingest hook subscribing to session:start / agent:start / agent:end / session:end produced orphan running rows in a downstream session-tracking DB: 3 cases confirmed across 3 days of operation, all on the idle-expiry path (sessions.json shows expiry_finalized: true, while the hook's local state.json still holds the substrate row id — indicating session:end never fired). The substrate-side cleanup needed an out-of-band SQL UPDATE.

The asymmetry is also confusing for hook authors: per the doc on on_session_finalize (plugin-level), on_session_finalize correctly fires on idle-expiry (PR #1725 / issue #14981). But the gateway-level session:end event — documented as the gateway-side counterpart — does not. Hook authors who subscribe only to session:end will be surprised.

Reproduction steps

  1. Install a gateway-level hook in ~/.hermes/hooks/<name>/HOOK.yaml subscribing to session:start + session:end. Handler logs both events with the session_id.
  2. Start a gateway session (e.g., DM the agent on Telegram).
  3. Path A — idle-expiry: stop messaging. Wait for the configured reset window (default ~30 min for gateway.session.reset_after) PLUS one watcher tick (5 min). Observe: the watcher loop logs Session expiry finalized for <session_id>. ~/.hermes/sessions/sessions.json shows expiry_finalized: true for the session_key. The hook log shows the session:start from earlier but no session:end.
  4. Path B — auto-reset: start a session, let it idle past the reset window, then send a new message that triggers auto-reset (different from /new). The new turn produces a new session_id for the same session_key. The hook log shows a new session:start for the new session_id but no session:end for the OLD session_id.

Expected behavior

session:end fires whenever a session ends, regardless of which path closed it. Specifically:

  • Idle-expiry watcher: after entry.expiry_finalized = True is set, emit session:end with the now-closed session_id.
  • Auto-reset in get_or_create_session: before (or as part of) emitting session:start for the new session_id, emit session:end for the OLD session_id that was just closed.

This makes session:end symmetric with session:start and unblocks external observers from tracking session lifecycle correctly.
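From a subscriber's point of view, the symmetry matters because a handler typically opens state on session:start and releases it on session:end. The sketch below uses the same handle(event_type, context) signature as the test hook in the PR; the in-memory OPEN_SESSIONS map is a hypothetical stand-in for an external mirror DB, and the presence of session_id in the context is an assumption.

```python
import asyncio
from typing import Any, Dict

# Hypothetical mirror state: session_id -> details of the open session.
OPEN_SESSIONS: Dict[str, Dict[str, Any]] = {}


async def handle(event_type: str, context: Dict[str, Any]) -> None:
    """Track open sessions; relies on session:end arriving for every close path."""
    session_id = context.get("session_id")
    if not session_id:
        return
    if event_type == "session:start":
        OPEN_SESSIONS[session_id] = {"session_key": context.get("session_key")}
    elif event_type == "session:end":
        # pop with a default so a duplicate or unexpected end is harmless.
        OPEN_SESSIONS.pop(session_id, None)


if __name__ == "__main__":
    async def demo() -> None:
        await handle("session:start", {"session_id": "old", "session_key": "k"})
        await handle("session:end", {"session_id": "old", "session_key": "k", "reason": "auto_reset"})
        await handle("session:start", {"session_id": "new", "session_key": "k"})

    asyncio.run(demo())
    print(sorted(OPEN_SESSIONS))
```

If the session:end for the old session_id never arrives, as in the reported bug, the old entry stays in the map indefinitely, which is the orphan-row symptom described in the impact section.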

Actual behavior

session:end is silent on both paths. External observers see open-ended session:start events that never close.

Affected code

  • gateway/run.py ≈ line 3666 — _session_expiry_watcher (loop body around line 3706-3743)
  • gateway/run.py ≈ line 6525-6541 — _handle_message_with_agent, the _is_new_session branch (where session:start fires for auto-reset; this is also where session:end for the OLD session_id should fire on auto-reset)
  • gateway/session.py ≈ line 893-905, 914-926 — get_or_create_session auto-reset branch + SessionEntry construction (needs to plumb the prior session_id so the caller in run.py knows what to emit)
  • gateway/hooks.py line 12 — docstring update to reflect the new contract (session:end fires on all session ends, not just /new / /reset)

Hermes version observed

hermes-agent 0.13.0 (per ~/.hermes/hermes-agent/pyproject.toml).

Related history

  • #14981 — fixed the plugin-level on_session_finalize not firing on idle expiry. Same code path, different event.
  • #1725 / #1432 / #1415 — original introduction of the session:start / session:end lifecycle events.
  • #19831 — a separate OpenViking-specific edge case on the same idle-expiry path. Distinct from this bug.
  • #24982 — adds chat_id to the existing emit sites. Cleanly compatible with this change (different field; the new emit sites can include chat_id for consistency once #24982 lands or vice versa).
  • #12176 — adjacent: on_session_finalize fires with session_id=None when no prior session exists on /new. Distinct bug.

Suggested fix (sketch)

Three small edits, all additive (no existing call sites change behavior; the explicit /new path remains as-is):

  1. gateway/run.py:_session_expiry_watcher — after entry.expiry_finalized = True (line 3742), emit session:end with session_id, session_key, derived platform, and reason: "idle_expiry". Wrap in try/except so a misbehaving subscriber can't break the watcher (mirrors the existing try/except pass around the on_session_finalize plugin invoke at line 3712).
  2. gateway/session.py:get_or_create_session — when the auto-reset branch fires (line 901, was_auto_reset = True), capture entry.session_id (already done as db_end_session_id) and store it on the new SessionEntry as a transient field auto_reset_prior_session_id. Add this field to the SessionEntry dataclass (after auto_reset_reason at line 459). No persistence needed — it's consumed once by the next caller.
  3. gateway/run.py:_handle_message_with_agent — in the _is_new_session branch (line 6535), before emitting session:start, check getattr(session_entry, "auto_reset_prior_session_id", None) — if set, emit session:end for that prior session_id with reason: "auto_reset", then clear the field to make it idempotent.

The emit payload shape should match the existing session:end shape at line 7667 (platform, user_id, session_key), plus session_id so subscribers can disambiguate close events for the same session_key over time. Also recommend adding reason so subscribers can distinguish idle_expiry from auto_reset from manual_reset (the existing /new emit could add "reason": "manual_reset" for symmetry).
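The payload shape and the guarded emit for the idle-expiry path could look roughly like this. It is a sketch, not the gateway's code: build_session_end_context and emit_session_end_safely are hypothetical helper names, the emit callable is assumed to be awaitable, and how platform is derived from session_key is left to the caller.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable, Dict, Optional

logger = logging.getLogger(__name__)

# Assumed shape of the hook registry's emit: (event_type, context) -> awaitable.
Emit = Callable[[str, Dict[str, Any]], Awaitable[None]]


def build_session_end_context(
    session_id: str,
    session_key: str,
    platform: str,
    user_id: Optional[str],
    reason: str,
) -> Dict[str, Any]:
    """Existing keys (platform, user_id, session_key) plus session_id and reason."""
    return {
        "platform": platform,
        "user_id": user_id or "",  # same `or ""` guard the PR describes
        "session_key": session_key,
        "session_id": session_id,
        "reason": reason,
    }


async def emit_session_end_safely(emit: Emit, context: Dict[str, Any]) -> bool:
    """Emit session:end; swallow subscriber errors so the caller's loop survives."""
    try:
        await emit("session:end", context)
    except Exception:
        logger.debug("session:end emit failed for %s", context.get("session_id"), exc_info=True)
        return False
    return True


if __name__ == "__main__":
    async def failing_emit(event_type: str, context: Dict[str, Any]) -> None:
        raise RuntimeError("misbehaving subscriber")

    ctx = build_session_end_context("sess-1", "telegram:dm:1", "telegram", None, "idle_expiry")
    print(asyncio.run(emit_session_end_safely(failing_emit, ctx)))
```

Returning a boolean rather than re-raising keeps the watcher loop running while still letting the caller log or count failed emits if it wants to.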

PR

Happy to submit a PR — opening one in parallel with this issue.

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR


