hermes - Memory pollution: irrelevant memories injected into system prompt cause model misunderstanding

StepCodex · 2026-05-09T13:06:43Z



Error Message

This is a design issue, not a crash bug (no traceback)

Root Cause

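The memory system has three design flaws: no relevance filtering of injected memories, overly aggressive memory writing, and no forgetting mechanism.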

Fix Action

Fixed

Fixed by PR: fix(memory): tighten MEMORY_GUIDANCE against ephemeral PR/issue/SHA notes (https://github.com/NousResearch/hermes-agent/pull/22781)



Bug Description

Hermes' self-learning memory mechanism causes prompt pollution. When irrelevant memories are injected into the system prompt, the model misunderstands the user's intent.

Update (2026-05-09): Some memories were incorrectly saved (e.g., _priority_key() was wrongly recorded as a "bug" when it was actually a new feature). This shows the memory system's tendency to save incorrect information.

Steps to Reproduce

1. Use Hermes with memory_enabled: true in ~/.hermes/config.yaml
2. Complete tasks related to Project A (e.g., CocoIndex POC, bug fixes, PR submissions)
3. Hermes saves detailed memories via the memory tool (9+ §-separated sections in system prompt)
4. Start a new session with an unrelated query: "测试模型访问" (meaning "test model access")
5. Check the system prompt: it contains all 9 memory sections, including CocoIndex details
6. Model reasoning: "The user wants to test the CocoIndex code search tools" ❌
7. Model calls mcp_cocoindex_list_projects and mcp_cocoindex_search_code tools ❌
8. Final response: "CocoIndex MCP 工具正常工作..." (meaning "CocoIndex MCP tools working normally...") ❌

Expected behavior: Model should respond to "测试模型访问" with a simple model access test, not CocoIndex tool testing.

Environment

Hermes version: v0.13.0 (2026.5.7)
Python version: 3.12.8 (system), 3.11.14 (Hermes uses)
OS: macOS Darwin 25.3.0 (arm64)
Model used: MiniMax-M2.7
Base URL: https://api.minimaxi.com/v1

Local Modifications

YES — run_agent.py has local modifications:

# Added _priority_key() function for semantic search tool ordering
# (Lines 3396-10075 in local version)
+ _SEMANTIC_SEARCH_PRIORITY: dict[str, int] = {
+     "mcp_codeindex_search_code": 0,
+     "codeindex_search": 0,
+     "search_code": 1,
+ }
+
+ def _priority_key(item: tuple) -> tuple[int, int]:
+     ...

Note: This is a feature, not a bug. But the model incorrectly saved it as a "bug" in memory.

Also modified: README.md

Branch: main (125 commits behind origin/main)

Session Information

Session ID: 20260509_202227_358459
Session file: ~/.hermes/sessions/session_20260509_202227_358459.json (71.1 KB)

System Prompt Analysis

Extracted from the session file:

System prompt length: 32,490 characters
System prompt lines: 468 lines
Memory sections (§-separated): 9 sections

Memory Sections Injected (ALL irrelevant to current query)

Section 1: CocoIndex POC at /Users/zhuangjs/cocoindex-poc/ — pipeline_poc.py details
Section 2: CocoIndex POC Phase 1 done — 2,865 files indexed, 59,863 chunks
Section 3: incremental-code-indexing skill updated to v1.1.0
Section 4: ~~Bug: _priority_key() in _execute_tool_calls_concurrent unpacks wrong structure~~ (INCORRECT — this is a feature, not a bug)
Section 5: ~~Bug fix submitted as PR/issue #21937~~ (INCORRECT)
Section 6: ~~PR #21951 submitted to NousResearch/hermes-agent~~ (INCORRECT)
Section 7: Documents project notes in Obsidian vault
Section 8: Prefers lightweight (~50 token) context injection

Model Reasoning (Wrong!)

User input: "测试模型访问"
    ↓
Model reasoning: "The user wants to test the CocoIndex code search tools. 
                  Let me run a simple test query to verify the MCP tools are working."
    ↓
Tool calls: mcp_cocoindex_list_projects + mcp_cocoindex_search_code
    ↓
Final response: "CocoIndex MCP 工具正常工作..."

Root Cause Analysis

The memory system has design flaws:

1. No Relevance Filtering

ALL memories are injected into EVERY session via MemoryStore.format_for_system_prompt()
No check whether memories are relevant to the current query
Result: 32,490-char system prompt with 9 irrelevant memory sections

2. Overly Aggressive Memory Writing

Memories include task progress, bug fixes, PR details
Memories can be INCORRECT (e.g., feature recorded as bug)
This violates MEMORY_GUIDANCE, which says: "Do NOT save task progress, session outcomes, completed-work logs..."
In practice the model ignores this guidance and saves everything

3. No Forgetting Mechanism

Memories only accumulate, never expire
No TTL (Time To Live) for old memories
No cap on number of memories injected

4. Where Memories Are Stored

In ~/.hermes/memories/MEMORY.md (NOT ~/.hermes/memory.md):

$ cat ~/.hermes/memories/MEMORY.md
Implied project-cache feature: ...
§
CocoIndex POC at /Users/zhuangjs/cocoindex-poc/...
§
...

Memory content is injected via prompt_builder.py::_build_system_prompt() → self._memory_store.format_for_system_prompt("memory").
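To inspect what is being injected, the §-separated file can be split into individual sections. The following is a minimal sketch that assumes only what the excerpt above shows (entries separated by a § character); `split_memory_sections` and `load_memory_sections` are hypothetical helper names, not part of Hermes.

```python
from pathlib import Path

SECTION_DELIMITER = "§"  # separator seen in the MEMORY.md excerpt above


def split_memory_sections(text: str) -> list[str]:
    """Split raw memory text on the § delimiter, dropping empty entries."""
    return [part.strip() for part in text.split(SECTION_DELIMITER) if part.strip()]


def load_memory_sections(path: Path) -> list[str]:
    """Read a memory file and return its sections; a missing file yields []."""
    if not path.exists():
        return []
    return split_memory_sections(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    memory_file = Path.home() / ".hermes" / "memories" / "MEMORY.md"
    sections = load_memory_sections(memory_file)
    total_chars = sum(len(section) for section in sections)
    print(f"{len(sections)} sections, {total_chars} characters")
```

Running this against the real file gives a quick count of how many sections, and how many characters, every new session would inherit.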

Evidence Files

Session JSON: session_20260509_202227_358459.json (71.1 KB) — contains full conversation + system prompt
Config: ~/.hermes/config.yaml (memory section)

memory:
  memory_enabled: true
  user_profile_enabled: true
  memory_char_limit: 2200
  user_char_limit: 1375
  provider: ''  # Using built-in MemoryStore
  nudge_interval: 10
  flush_min_turns: 6

Proposed Solutions

Solution 1: Relevance-based Memory Injection (Recommended)

Add relevance scoring before injecting memories:

# Pseudocode for prompt_builder.py
# _get_current_context() and _filter_memories_by_relevance() are helpers
# proposed in this issue, not existing Hermes APIs.
def _build_system_prompt(self, system_message=None):
    # ... existing code ...

    if self._memory_store and self._memory_enabled:
        raw_memories = self._memory_store.format_for_system_prompt("memory")

        # NEW: Filter by relevance to current context
        current_context = self._get_current_context()  # e.g., last user message
        relevant_memories = self._filter_memories_by_relevance(raw_memories, current_context)

        if relevant_memories:
            prompt_parts.append(relevant_memories)

Benefits:

Reduces token usage (30-50% reduction observed in test)
Improves model accuracy (no prompt pollution)
User can still access old memories via session_search
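The relevance filter in Solution 1 could be approximated as below. This is only a sketch under stated assumptions: it scores each memory section by plain word overlap with the query, which is a crude stand-in for whatever scoring Hermes would actually use (an embedding model would be needed to match a Chinese query against English memories). The function name and the `min_overlap` and `max_sections` parameters are hypothetical.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercased word tokens; a crude stand-in for a real embedding model."""
    return set(re.findall(r"\w+", text.lower()))


def filter_memories_by_relevance(
    sections: list[str],
    query: str,
    min_overlap: int = 1,
    max_sections: int = 5,
) -> list[str]:
    """Keep sections sharing at least min_overlap tokens with the query.

    Results are ordered by overlap (highest first), ties keep original order.
    """
    query_tokens = _tokens(query)
    scored = []
    for index, section in enumerate(sections):
        overlap = len(query_tokens & _tokens(section))
        if overlap >= min_overlap:
            scored.append((overlap, index, section))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [section for _, _, section in scored[:max_sections]]


memories = [
    "CocoIndex POC Phase 1 done",
    "Prefers lightweight context injection",
]
# The unrelated query from this report shares no tokens with any memory,
# so nothing would be injected.
print(filter_memories_by_relevance(memories, "测试模型访问"))
print(filter_memories_by_relevance(memories, "resume the cocoindex poc"))
```

With this scoring, the query from the reproduction steps selects no CocoIndex sections, while a query that actually mentions the project still retrieves them.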

Solution 2: Memory TTL/Expiry

Add time-based expiry to memories:

# config.yaml
memory:
  ttl_days: 30  # Memories older than 30 days are archived
  max_memories_inject: 5  # Max 5 memories injected per session
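A sketch of how `ttl_days` and `max_memories_inject` might be applied at injection time. It assumes each memory carries a creation timestamp, which the MEMORY.md excerpt in this report does not show, so `MemoryEntry` and `select_fresh_memories` are hypothetical and the storage format would need to change first.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class MemoryEntry:
    text: str
    created_at: datetime  # assumed field; not present in the current file format


def select_fresh_memories(
    entries: list[MemoryEntry],
    now: datetime,
    ttl_days: int = 30,
    max_memories_inject: int = 5,
) -> list[MemoryEntry]:
    """Drop entries older than ttl_days, then keep the newest N."""
    cutoff = now - timedelta(days=ttl_days)
    fresh = [entry for entry in entries if entry.created_at >= cutoff]
    fresh.sort(key=lambda entry: entry.created_at, reverse=True)
    return fresh[:max_memories_inject]
```

Expired entries are only excluded from the prompt here; archiving them, as the config comment suggests, would be a separate step.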

Solution 3: Memory Categories

Allow categorizing memories:

## Persistent (always injected)
- User preferences
- Environment details

## Temporary (only if relevant)
- Task progress
- Bug fixes
- PR details
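The two categories could be modelled roughly as follows. This is an illustrative sketch, not Hermes code: `MemoryCategory`, `CategorizedMemory`, and `select_for_injection` are hypothetical names, and the relevance check is passed in as a callable so that any scoring method can be plugged in.

```python
from collections.abc import Callable
from dataclasses import dataclass
from enum import Enum


class MemoryCategory(Enum):
    PERSISTENT = "persistent"  # always injected (preferences, environment)
    TEMPORARY = "temporary"    # injected only if relevant (progress, PRs)


@dataclass
class CategorizedMemory:
    text: str
    category: MemoryCategory


def select_for_injection(
    memories: list[CategorizedMemory],
    is_relevant: Callable[[str], bool],
) -> list[str]:
    """Return persistent memories plus any temporary ones judged relevant."""
    return [
        memory.text
        for memory in memories
        if memory.category is MemoryCategory.PERSISTENT or is_relevant(memory.text)
    ]
```

Under this split, a preference such as "Prefers lightweight (~50 token) context injection" would always be injected, while the CocoIndex progress notes would appear only when the query concerns that project.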

Solution 4: Improve Memory Writing Guidance

Update MEMORY_GUIDANCE to be stricter:

MEMORY_GUIDANCE = (
    "ONLY save declarative facts that will matter 30+ days later.\n"
    "DO NOT save: task progress, bug fixes, PR details, session outcomes.\n"
    "If unsure, DON'T save it — use session_search to recall temporary info.\n"
    ...
)
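Because the report observes that the model does not always follow prompt guidance, a write-time check could back it up. The sketch below is a heuristic of my own, not what the linked PR implements: it flags candidate memories that mention PR or issue numbers or something shaped like a git SHA. The patterns and the `looks_ephemeral` name are assumptions and would need tuning against real memory entries.

```python
import re

# Heuristic patterns for notes that are unlikely to matter 30+ days later.
_EPHEMERAL_PATTERNS = [
    # "PR #123", "issue #123", "pull request 123"
    re.compile(r"\b(?:PR|issue|pull request)\s*#?\d+", re.IGNORECASE),
    # bare references such as "#21937"
    re.compile(r"#\d{2,}"),
    # 7-40 lowercase hex characters containing both a digit and a letter,
    # i.e. something shaped like a git commit SHA
    re.compile(r"\b(?=[0-9a-f]*\d)(?=[0-9a-f]*[a-f])[0-9a-f]{7,40}\b"),
]


def looks_ephemeral(candidate: str) -> bool:
    """Return True if the candidate memory matches any ephemeral pattern."""
    return any(pattern.search(candidate) for pattern in _EPHEMERAL_PATTERNS)


for note in [
    "PR #21951 submitted to NousResearch/hermes-agent",
    "Prefers lightweight (~50 token) context injection",
]:
    print(looks_ephemeral(note), note)
```

A check like this could reject or downgrade a write before it reaches MEMORY.md, independent of whether the model respected the guidance text.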

Solution 5: Add `hermes memory clean` Command

Add CLI command to review/delete old memories:

hermes memory clean
# Interactive prompt: [K]eep, [D]elete, [A]rchive
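The interactive review loop behind such a command might look like the sketch below. The `hermes memory clean` command is only proposed in this issue, so everything here is hypothetical; the prompt function is injectable so the loop can be exercised without a terminal.

```python
from collections.abc import Callable


def review_memories(
    sections: list[str],
    ask: Callable[[str], str] = input,
) -> tuple[list[str], list[str]]:
    """Ask Keep/Delete/Archive for each section.

    Returns (kept, archived); deleted sections appear in neither list.
    """
    kept: list[str] = []
    archived: list[str] = []
    for section in sections:
        choice = ""
        # Re-prompt until the answer is one of k, d, a (case-insensitive).
        while choice not in {"k", "d", "a"}:
            preview = section[:80]
            choice = ask(f"{preview}\n[K]eep, [D]elete, [A]rchive? ").strip().lower()
        if choice == "k":
            kept.append(section)
        elif choice == "a":
            archived.append(section)
    return kept, archived
```

The caller would then write the kept sections back to MEMORY.md and move the archived ones elsewhere; where archives live is left open here.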

Impact

Token usage: Reduced by 30-50% (fewer irrelevant memories injected)
Model accuracy: Improved (no prompt pollution)
User experience: Fewer "why is the model talking about X when I asked about Y" moments

Additional Notes

This is a design issue, not a crash bug (no traceback)
The model's reasoning is wrong because it's distracted by irrelevant memories
Similar issues likely affect other users with memory_enabled: true
Proposed solutions are backward-compatible (opt-in via config)

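Summary (translated from the Chinese note in the original issue):

Hermes' self-learning mechanism (the memory system) causes prompt pollution. When irrelevant memories are injected into the system prompt, the model misunderstands the user's intent.

Steps to reproduce: enable memory → complete CocoIndex-related tasks → memories are saved → ask "测试模型访问" ("test model access") in a new session → the model misreads it as "test the CocoIndex tools" → it calls the wrong tools.

Root cause: memory injection has no relevance filtering, memory writing is overly aggressive (and can be incorrect), and there is no forgetting mechanism.

Solutions: relevance-based filtering of injected memories (recommended), memory TTL expiry, memory categories, improved memory-writing guidance, and a hermes memory clean command.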
