hermes - ✅(Solved) Fix [Bug]: post_llm_call response overrides are applied after persistence, causing final_response/history mismatch [1 pull requests, 1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
NousResearch/hermes-agent#14894Fetched 2026-04-24 10:44:29
View on GitHub
Comments
0
Participants
1
Timeline
5
Reactions
0
Author
Participants
Timeline (top)
labeled ×4cross-referenced ×1

Error Message

Additional Logs / Traceback (optional)

Root Cause

Root Cause Analysis (optional)

Fix Action

Fix / Workaround

def on_post_llm_call(**kwargs):
    return {"response": "patched response"}

- the user may see `"patched response"`
- `result["messages"]` still contains `"original response"`
- persisted history still contains `"original response"`
- the next turn uses `"original response"` as prior assistant context

PR fix notes

PR #14913: fix(agent): run post_llm_call hook before session persistence

Description (problem / solution / changelog)

Summary

Fixes #14894

The post_llm_call plugin hook was invoked after _persist_session(), so any response overrides applied by plugins were never captured in the persisted session history. This caused a mismatch between the final_response returned to the caller and the history stored in SQLite/JSON logs.

Root Cause

In run_conversation():

  1. _persist_session(messages, conversation_history) writes to DB
  2. post_llm_call hook fires — plugin can modify final_response

The hook result was never re-persisted.

Changes

run_agent.pyrun_conversation():

  • Moved _persist_session() to fire after the post_llm_call hook block
  • Added support for plugins to return {"override_response": "..."} dicts
  • When an override is returned, both final_response and the last assistant message are updated before persistence
  • Restored the turn-exit diagnostic log block in its original position

Testing

  • The hook block is guarded by try/except — plugin failures cannot break persistence
  • _persist_session() still runs unconditionally after the hook
  • Turn-exit diagnostic logging is preserved (no regression)
  • Existing test_plugins.py tests remain compatible (invoke_hook return value was previously ignored)

Checklist

  • Follows Conventional Commits format
  • Single logical change
  • No breaking changes
  • Backward compatible (plugins that don't return override_response work as before)

Changed files

  • run_agent.py (modified, +34/-21)

Code Example

def on_post_llm_call(**kwargs):
    return {"response": "patched response"}

ctx.register_hook("post_llm_call", on_post_llm_call)

---

original response

---

self._persist_session(messages, conversation_history)

_post_results = invoke_hook("post_llm_call", ...)
for r in _post_results:
    final_response = ...

---

Report     https://paste.rs/pK9p0
  agent.log  https://paste.rs/N7B6x

---
RAW_BUFFERClick to expand / collapse

Bug Description

post_llm_call hooks can return a replacement response, but the override is currently applied after session persistence.

This means the user-facing final_response can differ from the assistant message stored in result["messages"], the SQLite session DB, and the JSON session log. On the next turn or after resuming the session, Hermes replays the original model response instead of the hook-modified response.

This creates inconsistent behavior for plugins that use post_llm_call for response post-processing, rendering, policy transforms, or persona/style shaping.

Steps to Reproduce

  1. Register a post_llm_call hook that returns a replacement response:
def on_post_llm_call(**kwargs):
    return {"response": "patched response"}

ctx.register_hook("post_llm_call", on_post_llm_call)
  1. Run a normal conversation turn where the model produces a response, for example:
original response
  1. Observe the returned/displayed final_response.

  2. Inspect any of the following:

  • result["messages"]
  • the persisted session transcript
  • the SQLite session DB
  • the next turn's replayed conversation context
  • a resumed session

Expected Behavior

If post_llm_call supports response overrides, the overridden response should be applied consistently to the completed assistant turn.

The following should all agree:

  • returned result["final_response"]
  • last assistant message in result["messages"]
  • persisted session DB / JSON session log
  • next-turn conversation replay
  • resumed session transcript

Alternatively, if post_llm_call overrides are intended to be display-only, this should be documented explicitly to avoid plugin authors assuming durable response mutation.

Actual Behavior

The hook override affects the returned user-facing final_response, but does not update the already-persisted assistant message.

Current order is effectively:

self._persist_session(messages, conversation_history)

_post_results = invoke_hook("post_llm_call", ...)
for r in _post_results:
    final_response = ...

As a result:

  • the user may see "patched response"
  • result["messages"] still contains "original response"
  • persisted history still contains "original response"
  • the next turn uses "original response" as prior assistant context

Affected Component

CLI (interactive chat), Agent Core (conversation loop, context compression, memory)

Messaging Platform (if gateway-related)

No response

Debug Report

Report     https://paste.rs/pK9p0
  agent.log  https://paste.rs/N7B6x

Operating System

ubuntu 24.04

Python Version

No response

Hermes Version

No response

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

No response

Proposed Fix (optional)

No response

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

extent analysis

TL;DR

Update the session persistence to occur after the post_llm_call hook has been invoked to ensure consistency between the user-facing response and the stored session data.

Guidance

  • Verify the current order of operations by checking the code that invokes the post_llm_call hook and persists the session data.
  • Consider updating the code to persist the session data after the post_llm_call hook has been invoked, ensuring that the overridden response is stored correctly.
  • Review the documentation for post_llm_call to ensure it clearly states whether the hook is intended for display-only overrides or durable response mutation.
  • Test the updated code to ensure that the final_response, result["messages"], and persisted session data are consistent.

Example

# Updated code to persist session data after post_llm_call hook
_post_results = invoke_hook("post_llm_call", ...)
for r in _post_results:
    final_response = ...
self._persist_session(messages, conversation_history)

Notes

The exact implementation details may vary depending on the specific codebase and requirements. It is essential to review the code and documentation carefully to ensure a correct and consistent fix.

Recommendation

Apply workaround: Update the session persistence to occur after the post_llm_call hook has been invoked, as this ensures consistency between the user-facing response and the stored session data.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

hermes - ✅(Solved) Fix [Bug]: post_llm_call response overrides are applied after persistence, causing final_response/history mismatch [1 pull requests, 1 participants]