hermes - [Feature Discussion] Native multi-provider memory routing - single-provider limitation [1 participant]

hayka-pacha · 2026-05-13T02:59:38Z


Root Cause

The current workaround (the skill-based routing described under Current Workaround below) works because MCP tools remain callable regardless of which provider is active. However, it relies on agent compliance with routing rules rather than architectural enforcement: the agent must read the skill, understand the routing table, and apply it consistently across all turns. This is fragile; a missed skill load or a complex turn can lead to misrouted writes or redundant storage.

Context

Hermes Agent currently supports 8 external memory providers (Honcho, Hindsight, Mem0, Supermemory, RetainDB, OpenViking, Holographic, ByteRover), each with different strengths. However, the architecture enforces a single external provider at a time - MemoryManager.add_provider() rejects second registrations.
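To make the constraint concrete, here is a minimal sketch of a manager that accepts only one external provider. It is not the Hermes implementation: the class name `SingleProviderManager` is hypothetical, and the issue does not say how the real `MemoryManager.add_provider()` signals a rejection, so the boolean return value is an assumption.

```python
class SingleProviderManager:
    """Hypothetical stand-in for MemoryManager; not the Hermes implementation."""

    def __init__(self) -> None:
        self._external = None  # at most one external provider is held

    def add_provider(self, name: str) -> bool:
        # Assumption: a rejected registration is reported via the return value.
        if self._external is not None:
            return False
        self._external = name
        return True


manager = SingleProviderManager()
first = manager.add_provider("honcho")
second = manager.add_provider("hindsight")  # rejected: a provider is already set
```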

The Problem

After evaluating multiple providers in production, we found that each excels at a different cognitive memory type:

Provider	Best For	Strength
Honcho	User modeling, behavioral patterns	Dialectic reasoning, cross-session profiling
Hindsight	Episodic + semantic facts	91.4% LongMemEval accuracy, temporal awareness
Atomic (MCP)	Knowledge base	Stable docs, vector search, persistent atoms

No single provider covers all memory types optimally. The cognitive science consensus (ZenBrain arXiv 2604.23878, MemTier, MemFlow arXiv 2605.03312) converges on 4 memory types that map to different storage backends:

Working memory → context window / MEMORY.md (always injected)
Episodic memory → timestamped interactions, searchable by recency
Semantic memory → distilled facts, knowledge graphs, entity relationships
Procedural memory → reusable skills (already handled by Hermes' skill system)

According to the ZenBrain paper, multi-layer architectures outperform single-layer ones by 20.7% F1 on LoCoMo and by 19.5% on MemoryArena.

Current Workaround

We implemented a skill-based routing system (memory-router skill) that guides the agent to use different tools for different memory types:

# config.yaml
memory:
  provider: honcho  # Active provider for behavioral model

But Hindsight remains callable via MCP (mcp_hindsight_retain, mcp_hindsight_recall) and Atomic via its own MCP server (mcp_atomic_create_atom, mcp_atomic_semantic_search). The skill defines routing rules:

memory tool → operational facts (MEMORY.md, always injected, ~4K chars)
honcho_conclude / honcho_reasoning → behavioral model (dialectic reasoning)
mcp_hindsight_retain / mcp_hindsight_recall → episodic + semantic facts (via MCP)
mcp_atomic_create_atom / mcp_atomic_semantic_search → knowledge base (via MCP)
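The routing rules above amount to a lookup table. The sketch below writes them down as data so the mapping is explicit. The tool names come from the rules above; the category keys (`operational`, `behavioral`, and so on) and the `tools_for` helper are labels invented here for illustration.

```python
from typing import Dict, Tuple

# Tools per memory category, copied from the skill's routing rules.
# The category keys are illustrative labels, not Hermes identifiers.
ROUTES: Dict[str, Tuple[str, ...]] = {
    "operational": ("memory",),
    "behavioral": ("honcho_conclude", "honcho_reasoning"),
    "episodic_semantic": ("mcp_hindsight_retain", "mcp_hindsight_recall"),
    "knowledge": ("mcp_atomic_create_atom", "mcp_atomic_semantic_search"),
}


def tools_for(category: str) -> Tuple[str, ...]:
    """Return the tools for a category, failing loudly on unknown ones."""
    try:
        return ROUTES[category]
    except KeyError:
        raise ValueError(f"no route for memory category: {category!r}") from None
```

In the skill-based workaround this table lives only in prompt text, which is why nothing stops a write from going to the wrong tool.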

Proposal

Option A: Multi-provider in MemoryManager

Allow multiple providers to register simultaneously, each scoped to a memory type:

memory:
  providers:
    honcho:
      scope: behavioral  # user modeling, dialectic reasoning
    hindsight:
      scope: episodic     # facts, events, temporal context

prefetch() would aggregate from all providers. sync_turn() would route writes based on content classification (either LLM-based or rule-based).
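A minimal sketch of how Option A could behave, assuming one provider per scope and a rule-based classifier. Everything here is hypothetical: `ScopedMemoryManager`, `InMemoryProvider`, and `rule_based_classify` are illustrative names, the method signatures are simplified, and the toy provider stores turns in a list instead of calling a real backend.

```python
from typing import Callable, Dict, List


class InMemoryProvider:
    """Toy provider that keeps turns in a list (stand-in for a real backend)."""

    def __init__(self) -> None:
        self.items: List[str] = []

    def prefetch(self, query: str) -> List[str]:
        # Naive substring match; a real provider would do semantic retrieval.
        return [item for item in self.items if query.lower() in item.lower()]

    def sync_turn(self, user: str, assistant: str) -> None:
        self.items.append(f"{user} -> {assistant}")


class ScopedMemoryManager:
    """Hypothetical multi-provider manager: one provider per scope."""

    def __init__(self, classify: Callable[[str, str], str]) -> None:
        self.providers: Dict[str, InMemoryProvider] = {}
        self.classify = classify

    def add_provider(self, scope: str, provider: InMemoryProvider) -> None:
        if scope in self.providers:
            raise ValueError(f"scope already registered: {scope}")
        self.providers[scope] = provider

    def prefetch(self, query: str) -> Dict[str, List[str]]:
        # Aggregate results from every registered provider, keyed by scope.
        return {scope: p.prefetch(query) for scope, p in self.providers.items()}

    def sync_turn(self, user: str, assistant: str) -> None:
        # Route the write to exactly one provider, chosen by the classifier.
        scope = self.classify(user, assistant)
        self.providers[scope].sync_turn(user, assistant)


def rule_based_classify(user: str, assistant: str) -> str:
    """Toy rule: stated preferences are behavioral, everything else episodic."""
    return "behavioral" if "prefer" in user.lower() else "episodic"


manager = ScopedMemoryManager(rule_based_classify)
manager.add_provider("behavioral", InMemoryProvider())
manager.add_provider("episodic", InMemoryProvider())
manager.sync_turn("I prefer short answers", "Noted.")
manager.sync_turn("We deployed v2 on Monday", "Logged.")
```

The classifier is injected as a callable so that a rule-based version can later be swapped for an LLM-based one without touching the manager.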

Option B: Meta-provider plugin pattern

Document and support a "meta-provider" pattern where a single registered provider internally delegates to multiple backends. The MemoryProvider ABC already has the right hooks (prefetch, sync_turn, on_session_end). A community plugin could:

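class HybridMemoryProvider(MemoryProvider):
    """Sketch: one registered provider that delegates to several backends."""

    def __init__(self):
        # HonchoBackend, HindsightBackend and AtomicBackend are placeholder
        # wrappers, each expected to expose recall(query) and store(user, assistant).
        self.backends = {
            'behavioral': HonchoBackend(),
            'episodic': HindsightBackend(),
            'knowledge': AtomicBackend(),
        }

    def prefetch(self, query, **kwargs):
        # Query all backends, then fuse the per-backend results.
        # fuse() is not defined here; it is left to the plugin author.
        results = {}
        for name, backend in self.backends.items():
            results[name] = backend.recall(query)
        return self.fuse(results)

    def sync_turn(self, user, assistant, **kwargs):
        # Classify the turn, then route the write to the matching backend.
        # classify() is not defined here; it must return a key of self.backends.
        classification = self.classify(user, assistant)
        self.backends[classification].store(user, assistant)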
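The proposal leaves `fuse` undefined. One possible shape, shown as a standalone function, is to label each backend's results, drop snippets already contributed by an earlier backend, and cap how many each backend may add. The cap (`limit_per_backend`) and the output format are assumptions made for illustration, not part of the proposal.

```python
from typing import Dict, List


def fuse(results: Dict[str, List[str]], limit_per_backend: int = 3) -> str:
    """Merge per-backend recall results into one labelled context string."""
    seen = set()
    sections = []
    for name, items in results.items():
        kept = []
        for item in items:
            if item in seen:
                continue  # skip snippets an earlier backend already contributed
            seen.add(item)
            kept.append(item)
            if len(kept) == limit_per_backend:
                break
        if kept:
            sections.append(f"[{name}]\n" + "\n".join(f"- {k}" for k in kept))
    return "\n\n".join(sections)


fused = fuse({
    "behavioral": ["prefers concise replies"],
    "episodic": ["deployed v2 on Monday", "prefers concise replies"],
    "knowledge": [],
})
```

Exact-match deduplication is the simplest choice; backends that paraphrase the same fact would need a similarity check instead.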

Option C: Hook-based routing (lightest touch)

Formalize pre_llm_call and post_llm_call as first-class memory routing points. Currently these exist in the plugin system but aren't formally integrated with memory. A plugin could:

In pre_llm_call: query multiple stores, fuse results, inject via {"context": "..."}
In post_llm_call: classify content type, route writes to appropriate stores via MCP calls

This is essentially what we're doing manually with the skill, but as an architectural pattern.
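A rough sketch of the two hooks, assuming simplified signatures. Only the hook names and the `{"context": "..."}` return shape come from the description above; the arguments, the in-process `STORES` dictionary (standing in for MCP-backed stores), and the keyword rule used for classification are all hypothetical.

```python
from typing import Dict, List

# Hypothetical in-process stores standing in for MCP-backed memory services.
STORES: Dict[str, List[str]] = {"episodic": [], "knowledge": []}


def pre_llm_call(user_message: str) -> Dict[str, str]:
    """Query every store and return fused hits in the {"context": ...} shape."""
    words = set(user_message.lower().split())
    hits = [
        f"[{name}] {entry}"
        for name, entries in STORES.items()
        for entry in entries
        if words & set(entry.lower().split())  # any shared word counts as a hit
    ]
    return {"context": "\n".join(hits)}


def post_llm_call(user_message: str, assistant_message: str) -> str:
    """Classify the turn with a toy rule and append it to the chosen store."""
    target = "knowledge" if "docs" in user_message.lower() else "episodic"
    STORES[target].append(user_message)
    return target


routed = post_llm_call("The deploy docs live in the wiki", "Got it.")
injected = pre_llm_call("Where are the deploy docs?")
```

Because routing happens in the hooks, it applies on every turn regardless of whether the agent loaded the skill.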

What We Need

Is multi-provider routing on the roadmap? Or is the single-provider constraint intentional?
Has anyone built a meta-provider plugin that routes to multiple backends?
Would a PR for Option A or B be welcomed? Happy to contribute.
What's the recommended pattern for users who want to combine Honcho (behavioral) + Hindsight (episodic) + Atomic (knowledge)?

References

ZenBrain 7-layer architecture: https://arxiv.org/html/2604.23878v2
MemFlow intent-driven routing: https://arxiv.org/html/2605.03312v1
Mem0 production paper (ECAI 2025): https://arxiv.org/abs/2504.19413
True Memory (verbatim preservation): https://arxiv.org/html/2605.04897v1
Hindsight + Hermes integration: https://hindsight.vectorize.io/blog/2026/04/06/hermes-native-memory-provider
Hermes MemoryProvider plugin: plugins/memory/*/README.md
