[Feature Discussion] Native multi-provider memory routing - single-provider limitation

NousResearch/hermes-agent#24770 (fetched 2026-05-14 03:51:54)

Context

Hermes Agent currently supports 8 external memory providers (Honcho, Hindsight, Mem0, Supermemory, RetainDB, OpenViking, Holographic, ByteRover), each with different strengths. However, the architecture enforces a single external provider at a time - MemoryManager.add_provider() rejects second registrations.

The Problem

After evaluating multiple providers in production, we found that each excels at a different cognitive memory type:

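| Provider     | Best For                           | Strength                                        |
|--------------|------------------------------------|-------------------------------------------------|
| Honcho       | User modeling, behavioral patterns | Dialectic reasoning, cross-session profiling    |
| Hindsight    | Episodic + semantic facts          | 91.4% LongMemEval accuracy, temporal awareness  |
| Atomic (MCP) | Knowledge base                     | Stable docs, vector search, persistent atoms    |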

No single provider covers all memory types optimally. The cognitive science consensus (ZenBrain arXiv 2604.23878, MemTier, MemFlow arXiv 2605.03312) converges on 4 memory types that map to different storage backends:

  • Working memory → context window / MEMORY.md (always injected)
  • Episodic memory → timestamped interactions, searchable by recency
  • Semantic memory → distilled facts, knowledge graphs, entity relationships
  • Procedural memory → reusable skills (already handled by Hermes' skill system)

According to the ZenBrain paper, multi-layer architectures outperform single-layer ones by +20.7% F1 on LoCoMo and +19.5% on MemoryArena.

Current Workaround

We implemented a skill-based routing system (memory-router skill) that guides the agent to use different tools for different memory types:

# config.yaml
memory:
  provider: honcho  # Active provider for behavioral model

Hindsight nevertheless remains callable via MCP (mcp_hindsight_retain, mcp_hindsight_recall), and Atomic via its own MCP server (mcp_atomic_create_atom, mcp_atomic_semantic_search). The skill defines the following routing rules:

  • memory tool → operational facts (MEMORY.md, always injected, ~4K chars)
  • honcho_conclude / honcho_reasoning → behavioral model (dialectic reasoning)
  • mcp_hindsight_retain / mcp_hindsight_recall → episodic + semantic facts (via MCP)
  • mcp_atomic_create_atom / mcp_atomic_semantic_search → knowledge base (via MCP)

This works because MCP tools remain callable regardless of which provider is active. But it relies on agent compliance with routing rules rather than architectural enforcement. The agent must read the skill, understand the routing table, and consistently apply it across all turns. This is fragile - a missed skill load or a complex turn can lead to misrouted writes or redundant storage.
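
To make the routing rules above concrete, here is a minimal sketch of the same table expressed as data, so that a misroute fails loudly instead of depending on the agent remembering the skill. The tool names are taken from the list above; the category keys and the tools_for() helper are hypothetical names chosen for illustration and are not part of Hermes or the memory-router skill.

```python
# Routing table from the memory-router skill, expressed as data.
# Tool names come from the issue text; the category keys and
# tools_for() are hypothetical names used only for this sketch.
ROUTES = {
    "operational": ("memory",),
    "behavioral": ("honcho_conclude", "honcho_reasoning"),
    "episodic_semantic": ("mcp_hindsight_retain", "mcp_hindsight_recall"),
    "knowledge": ("mcp_atomic_create_atom", "mcp_atomic_semantic_search"),
}


def tools_for(category: str) -> tuple[str, ...]:
    """Return the tools allowed for a memory category, raising on an unknown one."""
    try:
        return ROUTES[category]
    except KeyError:
        raise ValueError(f"no route for memory category {category!r}") from None
```

A table like this only documents the rules; enforcing them would still require the architecture to consult it on every write, which is what the proposals below are about.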

Proposal

Option A: Multi-provider in MemoryManager

Allow multiple providers to register simultaneously, each scoped to a memory type:

memory:
  providers:
    honcho:
      scope: behavioral  # user modeling, dialectic reasoning
    hindsight:
      scope: episodic     # facts, events, temporal context

prefetch() would aggregate from all providers. sync_turn() would route writes based on content classification (either LLM-based or rule-based).
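
A rough sketch of what Option A could look like, under stated assumptions: the ScopedProvider interface, the ScopedMemoryManager class, and keyword_classifier() are all hypothetical and do not reflect the real MemoryManager internals, which this issue does not show. The sketch rejects a second provider per scope rather than a second provider overall, aggregates prefetch() results by scope, and routes sync_turn() writes with a toy rule-based classifier.

```python
from typing import Callable, Protocol


class ScopedProvider(Protocol):
    """Hypothetical minimal provider interface for this sketch."""

    def prefetch(self, query: str) -> str: ...
    def sync_turn(self, user: str, assistant: str) -> None: ...


class ScopedMemoryManager:
    """Sketch of a manager holding one provider per memory scope."""

    def __init__(self, classify: Callable[[str, str], str]) -> None:
        self._providers: dict[str, ScopedProvider] = {}
        self._classify = classify

    def add_provider(self, scope: str, provider: ScopedProvider) -> None:
        # Reject a second provider for the same scope only.
        if scope in self._providers:
            raise ValueError(f"scope {scope!r} already has a provider")
        self._providers[scope] = provider

    def prefetch(self, query: str) -> dict[str, str]:
        # Aggregate recall from every registered provider, keyed by scope.
        return {scope: p.prefetch(query) for scope, p in self._providers.items()}

    def sync_turn(self, user: str, assistant: str) -> str:
        # Route the write to the provider whose scope matches the classification.
        scope = self._classify(user, assistant)
        if scope not in self._providers:
            raise KeyError(f"no provider registered for scope {scope!r}")
        self._providers[scope].sync_turn(user, assistant)
        return scope


def keyword_classifier(user: str, assistant: str) -> str:
    """Toy rule: stated preferences are behavioral, everything else episodic."""
    text = user.lower()
    return "behavioral" if ("i prefer" in text or "i like" in text) else "episodic"
```

An LLM-based classifier could be passed in place of keyword_classifier() without changing the manager, since it only needs to return a scope name.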

Option B: Meta-provider plugin pattern

Document and support a "meta-provider" pattern where a single registered provider internally delegates to multiple backends. The MemoryProvider ABC already has the right hooks (prefetch, sync_turn, on_session_end). A community plugin could look like this:

class HybridMemoryProvider(MemoryProvider):
    """Meta-provider that delegates to one backend per memory type."""

    def __init__(self):
        # The backend classes are placeholders for thin wrappers around each service.
        self.backends = {
            'behavioral': HonchoBackend(),
            'episodic': HindsightBackend(),
            'knowledge': AtomicBackend(),
        }

    def prefetch(self, query, **kwargs):
        # Query all backends, then fuse the per-backend results.
        # fuse() is left to the plugin author.
        results = {}
        for name, backend in self.backends.items():
            results[name] = backend.recall(query)
        return self.fuse(results)

    def sync_turn(self, user, assistant, **kwargs):
        # Classify the turn and route it to the matching backend.
        # classify() must return one of the keys in self.backends.
        classification = self.classify(user, assistant)
        self.backends[classification].store(user, assistant)
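
The sketch above leaves fuse() and the handling of an unexpected classification open. One possible shape for both, purely as an assumption: fuse() takes a list of text snippets per backend, drops empty results and case-insensitive duplicates, and returns one labeled context string, while route() falls back to a default backend when the classifier returns an unknown label. The function names, the list-of-strings result shape, and the "episodic" default are all hypothetical.

```python
def fuse(results: dict[str, list[str]]) -> str:
    """Merge per-backend recall results into one labeled context string.

    Empty backends are skipped and snippets already seen in an earlier
    backend (compared case-insensitively) are dropped.
    """
    seen: set[str] = set()
    sections: list[str] = []
    for name, snippets in results.items():
        unique = []
        for snippet in snippets:
            text = snippet.strip()
            key = text.lower()
            if key and key not in seen:
                seen.add(key)
                unique.append(text)
        if unique:
            body = "\n".join(f"- {text}" for text in unique)
            sections.append(f"[{name}]\n{body}")
    return "\n\n".join(sections)


def route(classification: str, backends: dict[str, object], default: str = "episodic") -> object:
    """Pick the backend for a classification, falling back to a default scope."""
    if classification in backends:
        return backends[classification]
    return backends[default]
```

Falling back to a default keeps a write from being lost when the classifier is unsure, at the cost of occasionally storing content in the wrong backend; raising instead would be the stricter choice.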

Option C: Hook-based routing (lightest touch)

Formalize pre_llm_call and post_llm_call as first-class memory routing points. Currently these exist in the plugin system but aren't formally integrated with memory. A plugin could:

  1. In pre_llm_call: query multiple stores, fuse results, inject via {"context": "..."}
  2. In post_llm_call: classify content type, route writes to appropriate stores via MCP calls

This is essentially what we're doing manually with the skill, but as an architectural pattern.
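
As an illustration of Option C, here is a minimal sketch of the two hooks as methods on a plugin object. The hook signatures are assumptions, since this issue does not show how Hermes actually calls pre_llm_call and post_llm_call; only the {"context": "..."} return shape comes from the description above. The recallers and writers are plain callables standing in for MCP calls, and MemoryRoutingHooks is a hypothetical name.

```python
from typing import Callable

Recall = Callable[[str], str]    # query -> recalled text ("" if nothing found)
Store = Callable[[str], None]    # text to persist


class MemoryRoutingHooks:
    """Sketch of hook-based memory routing; signatures are assumed, not Hermes' own."""

    def __init__(
        self,
        recallers: dict[str, Recall],
        writers: dict[str, Store],
        classify: Callable[[str, str], str],
    ) -> None:
        self.recallers = recallers
        self.writers = writers
        self.classify = classify

    def pre_llm_call(self, user_message: str) -> dict[str, str]:
        # Query every store and keep only the non-empty results.
        parts = []
        for name, recall in self.recallers.items():
            found = recall(user_message)
            if found:
                parts.append(f"[{name}] {found}")
        return {"context": "\n".join(parts)}

    def post_llm_call(self, user_message: str, assistant_message: str) -> str:
        # Classify the finished turn and write it to the matching store.
        kind = self.classify(user_message, assistant_message)
        self.writers[kind](f"{user_message}\n{assistant_message}")
        return kind
```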

What We Need

  1. Is multi-provider routing on the roadmap? Or is the single-provider constraint intentional?
  2. Has anyone built a meta-provider plugin that routes to multiple backends?
  3. Would a PR for Option A or B be welcomed? Happy to contribute.
  4. What's the recommended pattern for users who want to combine Honcho (behavioral) + Hindsight (episodic) + Atomic (knowledge)?

References
