hermes - ✅(Solved) Fix [Bug]: post_llm_call response overrides are applied after persistence, causing final_response/history mismatch [1 pull requests, 1 participants]

hermes2026-04-24 03:43:38

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

NousResearch/hermes-agent#14894•Fetched 2026-04-24 10:44:29

View on GitHub

Comments

Participants

Timeline

Reactions

Author

M1p0

Participants

M1p0

Timeline (top)

labeled ×4cross-referenced ×1

Error Message

Additional Logs / Traceback (optional)

Root Cause

Root Cause Analysis (optional)

Fix Action

Fix / Workaround

def on_post_llm_call(**kwargs):
    return {"response": "patched response"}

- the user may see `"patched response"`
- `result["messages"]` still contains `"original response"`
- persisted history still contains `"original response"`
- the next turn uses `"original response"` as prior assistant context

PR fix notes

PR #14913: fix(agent): run post_llm_call hook before session persistence

Repository: NousResearch/hermes-agent
Author: aniruddhaadak80
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/14913

Description (problem / solution / changelog)

Summary

Fixes #14894

The post_llm_call plugin hook was invoked after _persist_session(), so any response overrides applied by plugins were never captured in the persisted session history. This caused a mismatch between the final_response returned to the caller and the history stored in SQLite/JSON logs.

Root Cause

In run_conversation():

_persist_session(messages, conversation_history) writes to DB
post_llm_call hook fires — plugin can modify final_response

The hook result was never re-persisted.

Changes

run_agent.py — run_conversation():

Moved _persist_session() to fire after the post_llm_call hook block
Added support for plugins to return {"override_response": "..."} dicts
When an override is returned, both final_response and the last assistant message are updated before persistence
Restored the turn-exit diagnostic log block in its original position

Testing

The hook block is guarded by try/except — plugin failures cannot break persistence
_persist_session() still runs unconditionally after the hook
Turn-exit diagnostic logging is preserved (no regression)
Existing test_plugins.py tests remain compatible (invoke_hook return value was previously ignored)

Checklist

Follows Conventional Commits format
Single logical change
No breaking changes
Backward compatible (plugins that don't return override_response work as before)

Changed files

run_agent.py (modified, +34/-21)

Code Example

def on_post_llm_call(**kwargs):
    return {"response": "patched response"}

ctx.register_hook("post_llm_call", on_post_llm_call)

---

original response

---

self._persist_session(messages, conversation_history)

_post_results = invoke_hook("post_llm_call", ...)
for r in _post_results:
    final_response = ...

---

Report     https://paste.rs/pK9p0
  agent.log  https://paste.rs/N7B6x

---

RAW_BUFFERClick to expand / collapse

Bug Description

post_llm_call hooks can return a replacement response, but the override is currently applied after session persistence.

This means the user-facing final_response can differ from the assistant message stored in result["messages"], the SQLite session DB, and the JSON session log. On the next turn or after resuming the session, Hermes replays the original model response instead of the hook-modified response.

This creates inconsistent behavior for plugins that use post_llm_call for response post-processing, rendering, policy transforms, or persona/style shaping.

Steps to Reproduce

def on_post_llm_call(**kwargs):
    return {"response": "patched response"}

ctx.register_hook("post_llm_call", on_post_llm_call)

Run a normal conversation turn where the model produces a response, for example:

original response

Observe the returned/displayed final_response.
Inspect any of the following:

result["messages"]
the persisted session transcript
the SQLite session DB
the next turn's replayed conversation context
a resumed session

Expected Behavior

If post_llm_call supports response overrides, the overridden response should be applied consistently to the completed assistant turn.

The following should all agree:

returned result["final_response"]
last assistant message in result["messages"]
persisted session DB / JSON session log
next-turn conversation replay
resumed session transcript

Alternatively, if post_llm_call overrides are intended to be display-only, this should be documented explicitly to avoid plugin authors assuming durable response mutation.

Actual Behavior

The hook override affects the returned user-facing final_response, but does not update the already-persisted assistant message.

Current order is effectively:

self._persist_session(messages, conversation_history)

_post_results = invoke_hook("post_llm_call", ...)
for r in _post_results:
    final_response = ...

As a result:

the user may see "patched response"
result["messages"] still contains "original response"
persisted history still contains "original response"
the next turn uses "original response" as prior assistant context

Affected Component

CLI (interactive chat), Agent Core (conversation loop, context compression, memory)

Messaging Platform (if gateway-related)

No response

Debug Report

Report     https://paste.rs/pK9p0
  agent.log  https://paste.rs/N7B6x

Operating System

ubuntu 24.04

Python Version

No response

Hermes Version

No response

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

No response

Proposed Fix (optional)

No response

Are you willing to submit a PR for this?

I'd like to fix this myself and submit a PR

extent analysis

TL;DR

Update the session persistence to occur after the post_llm_call hook has been invoked to ensure consistency between the user-facing response and the stored session data.

Guidance

Verify the current order of operations by checking the code that invokes the post_llm_call hook and persists the session data.
Consider updating the code to persist the session data after the post_llm_call hook has been invoked, ensuring that the overridden response is stored correctly.
Review the documentation for post_llm_call to ensure it clearly states whether the hook is intended for display-only overrides or durable response mutation.
Test the updated code to ensure that the final_response, result["messages"], and persisted session data are consistent.

Example

# Updated code to persist session data after post_llm_call hook
_post_results = invoke_hook("post_llm_call", ...)
for r in _post_results:
    final_response = ...
self._persist_session(messages, conversation_history)

Notes

The exact implementation details may vary depending on the specific codebase and requirements. It is essential to review the code and documentation carefully to ensure a correct and consistent fix.

Recommendation

Apply workaround: Update the session persistence to occur after the post_llm_call hook has been invoked, as this ensures consistency between the user-facing response and the stored session data.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #generation error #database connection #vector store #embedding generation

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

hermes - ✅(Solved) Fix [Bug]: post_llm_call response overrides are applied after persistence, causing final_response/history mismatch [1 pull requests, 1 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Additional Logs / Traceback (optional)

Root Cause

Root Cause Analysis (optional)

Fix Action

Fix / Workaround

PR fix notes

PR #14913: fix(agent): run post_llm_call hook before session persistence

Description (problem / solution / changelog)

Summary

Root Cause

Changes

Testing

Checklist

Changed files

Code Example

Bug Description

Steps to Reproduce

Expected Behavior

Actual Behavior

Affected Component

Messaging Platform (if gateway-related)

Debug Report

Operating System

Python Version

Hermes Version

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

Proposed Fix (optional)

Are you willing to submit a PR for this?

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING