hermes - ✅(Solved) Fix [Bug]: Gateway idle expiry can finalize OpenViking sessions without commit when cached AIAgent is unavailable [1 pull requests, 1 participants]

GitHub stats
NousResearch/hermes-agent#19831 (fetched 2026-05-05 06:04:50)
Comments: 0
Participants: 1
Timeline: 5
Reactions: 0
Timeline (top): labeled ×4, cross-referenced ×1

Error Message

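After gateway idle expiry, some OpenViking sessions can be left with synced messages but no commit: message_count > 0 and pending_tokens > 0, while commit_count == 0 and memories_extracted.total == 0.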

Root Cause

OpenViking per-turn sync can succeed while long-term memory extraction silently fails. That is particularly confusing because message_count > 0 makes the session look stored, but the useful memory artifacts are never created because commit never ran.

This causes real memory loss for Telegram/gateway sessions that expire while idle or after process/cache lifecycle changes.

Fix Action

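Fix / Workaround

PR #19844 (closes #19831) adds a fallback commit for this case. Related but distinct earlier fixes: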

  • #14981 fixed on_session_finalize hook dispatch on idle expiry.
  • #15165 fixed shutdown paths passing empty messages to shutdown_memory_provider().
  • #7759 fixed /new and /reset not triggering OpenViking commit.

PR fix notes

PR #19844: fix(gateway): OpenViking fallback commit on idle expiry when agent is evicted

Description (problem / solution / changelog)

Summary

When the session expiry watcher fires but the AIAgent has already been evicted from the cache, the normal cleanup path (shutdown_memory_provider() → on_session_end()) cannot run. If OpenViking is configured and turns were synced during the session, this leaves the OpenViking session uncommitted, so memories are never extracted.

Problem

The idle-expiry watcher in gateway/run.py looks up the cached/running agent to call _cleanup_agent_resources(). When the agent is gone (evicted by cache cap enforcement, process lifecycle, etc.), the cleanup is silently skipped. For OpenViking users, this means:

  • message_count > 0, pending_tokens > 0
  • commit_count == 0, memories_extracted.total == 0
  • Session marked expiry_finalized=True despite never being committed

Fix

Add _openviking_fallback_commit() to GatewayRunner that directly calls the OpenViking /api/v1/sessions/{session_id}/commit endpoint using environment variable configuration when no cached agent exists. The method is a no-op when OPENVIKING_ENDPOINT is not set.

In _session_expiry_watcher, the fallback fires in the elif branch where _cached_agent is None or the pending sentinel.
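A minimal sketch of what such a fallback could look like, assuming a plain HTTP POST with no request body. The function name, the injectable `env` and `opener` parameters, and the 10-second timeout are hypothetical choices made for illustration; the PR's actual method lives on GatewayRunner and may also send account/user/agent credentials, which are omitted here.

```python
import os
import urllib.error
import urllib.request


def openviking_fallback_commit(session_id, env=None, opener=urllib.request.urlopen):
    """Commit an OpenViking session by id; return True only if the commit succeeded."""
    env = os.environ if env is None else env
    endpoint = env.get("OPENVIKING_ENDPOINT", "").rstrip("/")
    if not endpoint:
        # No-op when OpenViking is not configured (nothing was committed).
        return False
    url = f"{endpoint}/api/v1/sessions/{session_id}/commit"
    # Assumption: the commit endpoint accepts an empty POST body.
    request = urllib.request.Request(url, data=b"", method="POST")
    try:
        with opener(request, timeout=10) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Network or HTTP failure: report it so the caller can retry later.
        return False
```

Returning a boolean rather than raising lets the caller decide whether to mark the session finalized or leave it for the next sweep.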

Scope

  • gateway/run.py: new _openviking_fallback_commit() method + call site in expiry watcher
  • tests/gateway/test_session_boundary_hooks.py: 2 new regression tests

Testing

pytest tests/gateway/test_session_boundary_hooks.py -v
# 8 passed (6 existing + 2 new)

Closes #19831

Changed files

  • gateway/run.py (modified, +43/-0)
  • tests/gateway/test_session_boundary_hooks.py (modified, +151/-0)

Code Example

message_count > 0
pending_tokens > 0
commit_count == 0
memories_extracted.total == 0

---

client.post(f"/api/v1/sessions/{session_id}/commit")

Bug description

Gateway idle session expiry can mark a session as finalized even though an OpenViking-backed memory session was never committed. This leaves OpenViking with synced turn messages but no session commit/extraction, so memories are never indexed and the gateway will not retry because the local session is already marked finalized.

This looks like a follow-up edge case to #14981 rather than the same bug. #14981 fixed firing on_session_finalize on idle expiry, but this case concerns OpenViking MemoryProvider.on_session_end() / session commit when the cached AIAgent or provider instance is unavailable by the time the expiry watcher runs.

Related but distinct:

  • #14981 fixed on_session_finalize hook dispatch on idle expiry.
  • #15165 fixed shutdown paths passing empty messages to shutdown_memory_provider().
  • #7759 fixed /new and /reset not triggering OpenViking commit.

Observed behavior

After gateway idle expiry, some OpenViking sessions can be left in this state:

message_count > 0
pending_tokens > 0
commit_count == 0
memories_extracted.total == 0

At the same time, the gateway logs/session store indicate the idle expiry sweep completed/finalized the session. Because the local session is finalized, the watcher does not retry and OpenViking extraction never happens.

In a real local audit, several root Hermes OpenViking sessions had synced messages but commit_count == 0 and memories_extracted.total == 0; manually running ov session commit <session_id> repaired them and produced extracted memories.
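An audit like the one described could be approximated with a small filter over session stats. This is only a sketch: the dictionary layout below (including the nested `memories_extracted` mapping) and the sample sessions are assumptions made for illustration, not the documented OpenViking response shape.

```python
def needs_commit(stats):
    """Return True when a session has synced turns but was never committed."""
    extracted = stats.get("memories_extracted", {}).get("total", 0)
    return (
        stats.get("message_count", 0) > 0
        and stats.get("commit_count", 0) == 0
        and extracted == 0
    )


# Hypothetical sample data: s1 matches the orphaned state, s2 was committed.
sessions = {
    "s1": {"message_count": 12, "pending_tokens": 3400,
           "commit_count": 0, "memories_extracted": {"total": 0}},
    "s2": {"message_count": 8, "pending_tokens": 0,
           "commit_count": 1, "memories_extracted": {"total": 5}},
}

orphaned = [sid for sid, stats in sessions.items() if needs_commit(stats)]
print(orphaned)  # ['s1']
```

Each id in `orphaned` would then be a candidate for the manual ov session commit <session_id> repair mentioned above.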

Expected behavior

For an expired gateway session using OpenViking:

  1. If a cached/running AIAgent exists, the existing cleanup path should call shutdown_memory_provider() / MemoryProvider.on_session_end() and commit the OpenViking session.
  2. If no cached/running AIAgent exists but OpenViking has been configured and turns may already have been synced, the gateway should still commit the OpenViking session directly by session_id, or otherwise leave the local session unfinalized so a later retry/repair path can handle it.
  3. The session should only be marked expiry_finalized=True after the OpenViking commit/finalization step succeeds.

Actual behavior

The idle expiry watcher can reach a no-agent path. If the cached/running AIAgent is gone, _cleanup_agent_resources(agent) is not called, so the OpenViking provider's on_session_end() path does not run. The session can still be marked finalized locally, leaving the synced OpenViking session uncommitted forever.

Why this matters

OpenViking per-turn sync can succeed while long-term memory extraction silently fails. That is particularly confusing because message_count > 0 makes the session look stored, but the useful memory artifacts are never created because commit never ran.

This causes real memory loss for Telegram/gateway sessions that expire while idle or after process/cache lifecycle changes.

Suggested minimal fix

In gateway/run.py, inside _session_expiry_watcher:

  • Preserve the existing behavior when a cached/running agent exists.
  • If no cached/running agent exists and OpenViking is configured, perform a tiny provider-specific fallback commit:
client.post(f"/api/v1/sessions/{session_id}/commit")

using the configured OpenViking endpoint/account/user/agent.

  • Only set entry.expiry_finalized = True if cleanup or fallback commit succeeds.
  • If fallback commit fails, do not save the session as finalized, so the next watcher sweep can retry.
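The retry behavior in the bullets above can be sketched as a single watcher pass. Everything here is a simplified stand-in: `SessionEntry`, `sweep_expired`, and the injected `cleanup`, `fallback_commit`, and `save` callables are hypothetical names, not the real structures in gateway/run.py.

```python
from dataclasses import dataclass


@dataclass
class SessionEntry:
    session_id: str
    expiry_finalized: bool = False


def sweep_expired(entries, agents, cleanup, fallback_commit, save):
    """One watcher pass: finalize an entry only after its commit step succeeds."""
    for entry in entries:
        if entry.expiry_finalized:
            continue
        agent = agents.get(entry.session_id)
        if agent is not None:
            # Existing path: the cached/running agent commits via its provider.
            ok = cleanup(agent)
        else:
            # No-agent path: commit the OpenViking session directly by id.
            ok = fallback_commit(entry.session_id)
        if ok:
            entry.expiry_finalized = True
            save(entry)
        # On failure the entry stays unfinalized, so the next sweep retries it.
```

Because a failed commit leaves `expiry_finalized` as False, calling `sweep_expired` again on the next interval naturally retries the same session.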

Regression tests to add

Add coverage in tests/gateway/test_session_boundary_hooks.py:

  1. Expired gateway session, OpenViking configured, no cached/running AIAgent, fallback commit succeeds:

    • fallback commit called once with the expired session id
    • expiry_finalized becomes True
  2. Same setup but fallback commit fails:

    • expiry_finalized remains False
    • session store is not saved as finalized
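The two regression cases could be written roughly as follows. This is a sketch against a hypothetical stand-in, `finalize_without_agent`, since the real tests would exercise GatewayRunner and its session store; only the assertions mirror the list above.

```python
from types import SimpleNamespace
from unittest.mock import Mock


def finalize_without_agent(entry, fallback_commit, save):
    """Hypothetical stand-in for the watcher's no-agent branch."""
    if fallback_commit(entry.session_id):
        entry.expiry_finalized = True
        save(entry)


def test_fallback_commit_success_finalizes():
    entry = SimpleNamespace(session_id="expired-1", expiry_finalized=False)
    commit, save = Mock(return_value=True), Mock()
    finalize_without_agent(entry, commit, save)
    commit.assert_called_once_with("expired-1")
    assert entry.expiry_finalized is True
    save.assert_called_once_with(entry)


def test_fallback_commit_failure_leaves_session_retryable():
    entry = SimpleNamespace(session_id="expired-1", expiry_finalized=False)
    commit, save = Mock(return_value=False), Mock()
    finalize_without_agent(entry, commit, save)
    commit.assert_called_once_with("expired-1")
    assert entry.expiry_finalized is False
    save.assert_not_called()
```

Both functions follow pytest naming, so they would be collected automatically if placed in a test module.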

Notes

This is not meant to replace the generic lifecycle fixes from #14981 or #15165. It is a defensive OpenViking-specific fallback for orphaned gateway sessions where turns were already synced but the provider object is gone before idle-expiry finalization runs.

extent analysis

TL;DR

The most likely fix is to modify the _session_expiry_watcher in gateway/run.py to perform a fallback commit for OpenViking sessions when no cached/running agent exists.

Guidance

  • Check if the AIAgent instance is available before attempting to commit the OpenViking session.
  • If the AIAgent instance is not available, perform a fallback commit using the OpenViking endpoint and only mark the session as finalized if the commit succeeds.
  • Add regression tests to cover the scenarios where the fallback commit succeeds or fails.
  • Verify that the expiry_finalized flag is set correctly based on the outcome of the fallback commit.

Example

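if agent is not None:
    # Existing cleanup path (runs shutdown_memory_provider / on_session_end)
    self._cleanup_agent_resources(agent)
    committed = True
else:
    # Fallback commit for OpenViking sessions
    response = client.post(f"/api/v1/sessions/{session_id}/commit")
    committed = 200 <= response.status_code < 300  # assumes a requests/httpx-style client
# Only set expiry_finalized to True if the commit succeeded
if committed:
    entry.expiry_finalized = True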

Notes

This fix is specific to OpenViking sessions and is intended to address the issue of orphaned sessions where turns were synced but the provider object is gone before idle-expiry finalization runs.

Recommendation

Apply the suggested minimal fix in gateway/run.py: perform a fallback commit for OpenViking sessions when no cached/running agent exists, and mark the session as finalized only if that commit succeeds.


