hermes - 💡(How to fix) Fix Bug: Background Curation Prompts Leak into User Memory & Honcho Representations

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

When the background self-improvement curation loop runs, the system-spawned review_agent mistakenly identifies its own operational guidelines (from the background review prompt) as explicit expectations and preferences expressed by the human user. These false observations get saved to the user's local profile (USER.md), uploaded to Honcho, and adopted by Honcho's backend dialectic engine as core traits of the user.


Root Cause

Technical Root Cause

The root cause is a sender identification mismatch during the background review phase in agent/background_review.py:

Fix Action

Fix / Workaround

Proposed Fix: Direct Prompt Patch

Rather than modifying the run_conversation call site or adding brittle string filters to the memory tool, we can patch the prompt strings directly in agent/background_review.py.

The Patch (agent/background_review.py):

*** Begin Patch
*** Update File: agent/background_review.py
@@ System note prefix for background review prompts @@
 _SKILL_REVIEW_PROMPT = (
+    "[System Note: This instruction is generated automatically by the Hermes Agent System for background self-improvement, NOT by the user. "
+    "Do NOT save any guidelines, rules, or preferences from this prompt as user preferences or expectations in memory. "
+    "Only extract memories or preferences that were explicitly expressed by the user in the conversation history snapshot below.]\n\n"
     "Review the conversation above and update the skill library. Be "
     "ACTIVE — most sessions produce at least one skill update, even if "
     "small. A pass that does nothing is a missed learning opportunity, "
     "not a neutral outcome.\n\n"
...
 _COMBINED_REVIEW_PROMPT = (
+    "[System Note: This instruction is generated automatically by the Hermes Agent System for background self-improvement, NOT by the user. "
+    "Do NOT save any guidelines, rules, or preferences from this prompt as user preferences or expectations in memory. "
+    "Only extract memories or preferences that were explicitly expressed by the user in the conversation history snapshot below.]\n\n"
     "Review the conversation above and update two things:\n\n"
     "**Memory**: who the user is. Did the user reveal persona, "
*** End Patch

AI Assistance Disclaimer: This report was investigated, analyzed, and drafted using AI assistance (specifically, Hermes Agent powered by Gemini), which helped examine the codebase, Honcho state representations, and construct the patch. However, I (ydawei) have thoroughly reviewed, verified, and fact-checked all code details and findings, and I take full responsibility for the technical accuracy of this report.

Code Example

*** Begin Patch
*** Update File: agent/background_review.py
@@ System note prefix for background review prompts @@
 _SKILL_REVIEW_PROMPT = (
+    "[System Note: This instruction is generated automatically by the Hermes Agent System for background self-improvement, NOT by the user. "
+    "Do NOT save any guidelines, rules, or preferences from this prompt as user preferences or expectations in memory. "
+    "Only extract memories or preferences that were explicitly expressed by the user in the conversation history snapshot below.]\n\n"
     "Review the conversation above and update the skill library. Be "
     "ACTIVE — most sessions produce at least one skill update, even if "
     "small. A pass that does nothing is a missed learning opportunity, "
     "not a neutral outcome.\n\n"
...
 _COMBINED_REVIEW_PROMPT = (
+    "[System Note: This instruction is generated automatically by the Hermes Agent System for background self-improvement, NOT by the user. "
+    "Do NOT save any guidelines, rules, or preferences from this prompt as user preferences or expectations in memory. "
+    "Only extract memories or preferences that were explicitly expressed by the user in the conversation history snapshot below.]\n\n"
     "Review the conversation above and update two things:\n\n"
     "**Memory**: who the user is. Did the user reveal persona, "
*** End Patch
RAW_BUFFERClick to expand / collapse

Summary

When the background self-improvement curation loop runs, the system-spawned review_agent mistakenly identifies its own operational guidelines (from the background review prompt) as explicit expectations and preferences expressed by the human user. These false observations get saved to the user's local profile (USER.md), uploaded to Honcho, and adopted by Honcho's backend dialectic engine as core traits of the user.


The Bug in Action (Examples)

Once this cognitive loop completes, the agent starts outputting and injecting false deductions about the user into subsequent sessions. For example, the following observations were autonomously recorded as my personal preferences:

  • [2026-05-03] ydawei expects the AI to be active in updating the skill library and views a pass that does nothing as a missed learning opportunity, unless the session ran smoothly with no corrections or new techniques. (This is word-for-word the system prompt warning the agent against making lazy, empty passes.)
  • [2026-05-03] ydawei defines a specific preference order for updating the skill library: first, update a currently-loaded skill; second, update an existing umbrella skill... (This is word-for-word the internal skill execution preference hierarchy.)
  • [2026-05-12] ydawei requires that the scope of captured observations must be limited; environment-dependent failures, negative claims about tools/features... should NOT be captured as durable constraints. (This is word-for-word the system prompt guardrails against capturing transient network errors.)

Technical Root Cause

The root cause is a sender identification mismatch during the background review phase in agent/background_review.py:

  1. Sender Role Confusion: The curation thread invokes review_agent.run_conversation() and passes the instruction constant (_SKILL_REVIEW_PROMPT or _COMBINED_REVIEW_PROMPT) as the user_message.
  2. LLM Misinterpretation: Since this prompt is transmitted within a message of role "user", the LLM parses the guidelines as direct input from the human user.
  3. The Trap: The prompt instructs the agent to "consider saving to memory if appropriate" and search for "expectations about how you should behave." Because the instruction itself contains directives like "Be ACTIVE" and outlines a detailed "Preference order," the LLM reasons: "The user has just explicitly requested these behaviors in this message. I must record this into the user profile."
  4. The Write-Back & Sync: Because the review agent’s memory store is bound to the parent's (review_agent._memory_store = agent._memory_store), it writes these system guidelines directly into USER.md on disk. When a new session starts, the Honcho provider uploads USER.md as foundational context files, which Honcho’s dialectic engine permanently digests as human preferences.

Where the Prompts are From

The review prompts are defined as module-level constant strings inside agent/background_review.py:

  • _SKILL_REVIEW_PROMPT (defining the skill curation guidelines, starting at line 45)
  • _COMBINED_REVIEW_PROMPT (defining the joint memory/skill curation guidelines, starting at line 150)

Proposed Fix: Direct Prompt Patch

Rather than modifying the run_conversation call site or adding brittle string filters to the memory tool, we can patch the prompt strings directly in agent/background_review.py.

Prepending an explicit, high-priority [System Note] to the prompt constants tells the LLM that the instructions are system controls, not user inputs. This safely breaks the default assumption that the user is the speaker of those instructions.

The Patch (agent/background_review.py):

*** Begin Patch
*** Update File: agent/background_review.py
@@ System note prefix for background review prompts @@
 _SKILL_REVIEW_PROMPT = (
+    "[System Note: This instruction is generated automatically by the Hermes Agent System for background self-improvement, NOT by the user. "
+    "Do NOT save any guidelines, rules, or preferences from this prompt as user preferences or expectations in memory. "
+    "Only extract memories or preferences that were explicitly expressed by the user in the conversation history snapshot below.]\n\n"
     "Review the conversation above and update the skill library. Be "
     "ACTIVE — most sessions produce at least one skill update, even if "
     "small. A pass that does nothing is a missed learning opportunity, "
     "not a neutral outcome.\n\n"
...
 _COMBINED_REVIEW_PROMPT = (
+    "[System Note: This instruction is generated automatically by the Hermes Agent System for background self-improvement, NOT by the user. "
+    "Do NOT save any guidelines, rules, or preferences from this prompt as user preferences or expectations in memory. "
+    "Only extract memories or preferences that were explicitly expressed by the user in the conversation history snapshot below.]\n\n"
     "Review the conversation above and update two things:\n\n"
     "**Memory**: who the user is. Did the user reveal persona, "
*** End Patch

AI Assistance Disclaimer: This report was investigated, analyzed, and drafted using AI assistance (specifically, Hermes Agent powered by Gemini), which helped examine the codebase, Honcho state representations, and construct the patch. However, I (ydawei) have thoroughly reviewed, verified, and fact-checked all code details and findings, and I take full responsibility for the technical accuracy of this report.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING