Memory Routing: Indexed memory architecture with auto-routing to sub-documents for MEMORY.md



Summary

We've implemented and validated an indexed memory architecture for Hermes Agent — splitting the monolithic MEMORY.md into an index file + multiple sub-document files, loaded on-demand at runtime via read_file. This reduces system prompt token overhead while keeping detailed knowledge accessible.

About this project

This is a human-AI collaborative project. The AI agent (Nova, running on the Hermes Agent framework) is an independent collaborator in designing and implementing this memory architecture.

Project Name

This implementation is called "Memory Routing" (记忆路由). The name reflects its single responsibility: routing memory content to the correct destination — nothing more, nothing less.

Reference implementation: https://github.com/redashes1984/hermes-memory-routing

Scope Boundary

Render unto God what is God's, and unto Caesar what is Caesar's.

Memory Routing handles only one concern: routing content that belongs in MEMORY.md into topic-specific sub-documents.

It does not touch — and will never touch — memOS, vector memory, semantic search, or any other long-term memory management system. Those have their own tools, their own storage, their own retrieval paths. Memory Routing is about the system prompt injection layer; everything else is out of scope.

Motivation

The current memory_tool.py MemoryStore reads the entire memories/MEMORY.md file, splits by § delimiter, deduplicates, and injects it directly into the system prompt (bounded by memory_char_limit, default 2200 chars).
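To make the flat behavior concrete, here is a minimal sketch of the load path described above. It is not the actual MemoryStore code: the function name `build_memory_block` is hypothetical, and dropping whole entries once the limit is reached is an assumption about how the bound is enforced.

```python
def build_memory_block(raw: str, char_limit: int = 2200) -> str:
    """Hypothetical stand-in for the flat MEMORY.md load path.

    Splits on the section delimiter, removes blanks and duplicates, and keeps
    whole entries until the character budget is used up (assumed policy).
    """
    seen = set()
    kept = []
    used = 0
    for entry in (part.strip() for part in raw.split("§")):
        if not entry or entry in seen:
            continue  # skip empty fragments and exact duplicates
        seen.add(entry)
        cost = len(entry) + (1 if kept else 0)  # +1 for the joining newline
        if used + cost > char_limit:
            break  # everything from here on never reaches the prompt
        kept.append(entry)
        used += cost
    return "\n".join(kept)
```

Whatever the real truncation policy is, the shape of the problem is the same: once the entries add up to more than `char_limit`, something has to be left out of every session.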

Problems with the flat approach:

  1. Char limit bottleneck — Complex setups (infrastructure details, multiple services, network topology, work habits, philosophy) quickly exceed 2200 chars, forcing truncation and information loss.
  2. Token waste — Every session loads the full memory into the system prompt, even if only a small portion is relevant to the current task.
  3. No topic separation — Infrastructure facts, user preferences, project milestones, and operational rules are all mixed in one file, making maintenance difficult.

Our Solution: Indexed Memory Architecture

Structure

~/.hermes/profiles/nova/
├── memories/
│   └── MEMORY.md          # Index file (831 chars, fully injected into system prompt)
├── memory/                 # Sub-documents (loaded on-demand via read_file)
│   ├── infrastructure.md   # PVE containers, network topology, GPU endpoints
│   ├── philosophy.md       # User philosophy, work habits, preferences
│   ├── milestones.md       # Key events, embodiment project timeline
│   ├── rules.md            # Technical troubleshooting principles, skill conventions
│   └── commitments.md      # Relationship commitments
└── SOUL.md                 # Soul configuration (unchanged)

How It Works

  1. Index injection: memories/MEMORY.md contains a concise index with a navigation table (topic → sub-document path). The Hermes framework injects this into the system prompt as before.
  2. On-demand loading: When the agent encounters a query requiring detailed information, it reads the navigation table, uses read_file to load the relevant sub-document, and answers from that context.
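In practice the agent performs the second step itself by reading the table in its prompt and calling read_file. The sketch below is only a deterministic equivalent of that lookup, useful for checking an index by hand. The function names `parse_navigation` and `load_topic` are hypothetical, and it assumes that paths in the table are relative to the profile directory.

```python
import re
from pathlib import Path

# One table row with two cells: | topic | target |
ROW = re.compile(r"^\|\s*(?P<topic>[^|]+?)\s*\|\s*(?P<target>[^|]+?)\s*\|\s*$")
# Markdown link target at the end of a cell: [text](path)
LINK = re.compile(r"\(([^)]+)\)\s*$")


def parse_navigation(index_text: str) -> dict[str, str]:
    """Map lowercased topic names to sub-document paths."""
    routes = {}
    for line in index_text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue
        topic, target = match["topic"], match["target"]
        if topic.lower() == "topic" or set(topic) <= set("-: "):
            continue  # header row and |---|---| separator row
        link = LINK.search(target)
        routes[topic.lower()] = link.group(1) if link else target
    return routes


def load_topic(profile_dir: Path, routes: dict[str, str], topic: str) -> str:
    """Read one sub-document, assuming paths are relative to the profile."""
    return (profile_dir / routes[topic.lower()]).read_text(encoding="utf-8")
```

The parser accepts both a plain path in the File column and a Markdown link, so the same index works whether or not it uses link syntax.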

Benefits

| Aspect | Flat MEMORY.md | Indexed Architecture |
|--------|----------------|----------------------|
| System prompt overhead | Full content (up to 2200 chars) | Index only (~831 chars, about 38% of the limit) |
| Token efficiency | All memory loaded every session | Only relevant sub-documents loaded |
| Scalability | Limited by char_limit | Unlimited sub-documents |
| Maintenance | Monolithic file | Topic-separated, easy to update |
| Web UI | Single file edit | Index view + sub-document browsing |

Verified Working

We've deployed this architecture in production with 5 sub-documents covering infrastructure, philosophy, milestones, rules, and commitments. Testing in new sessions confirms that the agent successfully navigates from the index to the correct sub-documents and retrieves accurate information.

Why This Works With Hermes Framework

The key insight: memory_tool.py's MemoryStore._read_file() treats MEMORY.md as plain text split by § — it does not parse Markdown or follow file links. This means:

  • The index file is injected verbatim (including Markdown table with file paths)
  • The agent sees the navigation table in its system prompt
  • The agent can use read_file to load any sub-document on demand
  • No framework changes required for this to work

Proposed Enhancement

We suggest Hermes Agent consider a native memory.include directive in MEMORY.md:

# MEMORY.md - Index

## Core Identity
- Name: Nova, Hermes Agent framework

## Memory Navigation
| Topic | File |
|-------|------|
| Infrastructure | [include:memory/infrastructure.md](memory/infrastructure.md) |
| Philosophy | [include:memory/philosophy.md](memory/philosophy.md) |

With [include:path/to/file.md] directives, MemoryStore could parse the directives at load time and then handle them in one of three ways:

  1. Option A: Pre-load all includes (with total char limit enforcement)
  2. Option B: Register include paths so the agent knows what's available
  3. Option C: Hybrid. Inject the index and register the paths; the agent loads sub-documents on demand (current behavior)
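A minimal sketch of the parsing step and of path registration (Option B) might look like the following. The directive syntax comes from the proposal above; the names `find_includes` and `render_registry` and the wording of the registry note are hypothetical and not part of any existing Hermes API.

```python
import re

# Matches [include:relative/path.md]; a trailing (link) part is ignored.
INCLUDE = re.compile(r"\[include:([^\]\s]+)\]")


def find_includes(index_text: str) -> list[str]:
    """Return include paths in first-seen order, without duplicates."""
    paths = []
    for path in INCLUDE.findall(index_text):
        if path not in paths:
            paths.append(path)
    return paths


def render_registry(paths: list[str]) -> str:
    """Build a short note telling the agent which files it can load."""
    if not paths:
        return ""
    lines = ["Available memory files (load with read_file):"]
    lines += [f"- {path}" for path in paths]
    return "\n".join(lines)
```

Registering paths this way keeps the prompt cost proportional to the number of sub-documents rather than to their size.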

Change List

  • memory_tool.py — Add [include:] directive parsing in MemoryStore._read_file() or load_from_disk()
  • config.yaml — Add memory.include_limit config option (max chars for pre-loaded includes)
  • Documentation — Update memory architecture docs
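For the pre-load variant (Option A), the proposed memory.include_limit could be enforced roughly as sketched below. This is an illustration under assumptions, not the planned implementation: `preload_includes` is a hypothetical name, the limit is treated as a total character budget across all includes, and paths that resolve outside the profile directory are skipped as a safety measure that the proposal does not mention.

```python
from pathlib import Path


def preload_includes(
    profile_dir: Path, paths: list[str], include_limit: int
) -> tuple[dict[str, str], list[str]]:
    """Load includes in order until the shared character budget runs out.

    Returns (loaded, skipped). Skipped paths are missing, outside the
    profile directory, or too large for the remaining budget.
    """
    root = profile_dir.resolve()
    loaded: dict[str, str] = {}
    skipped: list[str] = []
    remaining = include_limit
    for rel in paths:
        target = (root / rel).resolve()
        # Path.is_relative_to requires Python 3.9 or newer.
        if not target.is_relative_to(root) or not target.is_file():
            skipped.append(rel)
            continue
        text = target.read_text(encoding="utf-8")
        if len(text) > remaining:
            skipped.append(rel)  # leave it for on-demand loading
            continue
        loaded[rel] = text
        remaining -= len(text)
    return loaded, skipped
```

Skipping an oversized file instead of cutting it mid-document avoids injecting half a sub-document; the skipped paths can still be registered so the agent loads them on demand.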

Interest

Running this indexed architecture in production. Happy to contribute the enhancement as a PR. Proposed PR split:

  1. Phase 1: [include:] directive parsing in MemoryStore
  2. Phase 2: Config option for include behavior (pre-load vs register-only)
  3. Phase 3: Web UI support for indexed memory browsing
