RFC: Evaluate Memvid as a Pluggable Single-File Memory Backend for Hermes



Summary

This issue proposes evaluating Memvid — a single-file AI memory system — as an optional pluggable memory backend for Hermes Agent. After a deep technical survey, Memvid's architecture appears to address several long-standing Hermes memory pain points while remaining compatible with resource-constrained deployments.


Hermes Memory Pain Points (Observed in Issues)

| Issue # | Problem | Impact |
| --- | --- | --- |
| #17251 | Compaction demotion: long sessions lose context after compaction | Context loss |
| #17013 | Model switch loss: memory state lost when switching models | State inconsistency |
| #20849 | Truncation overwrite: aggressive truncation destroys useful history | Data loss |
| #16670 | Compression fallback marker after incomplete chunked read | Long session fragility |
| #23717 | RFC: Pluggable SessionDB Provider (PostgreSQL/MySQL) | Need for alternative backends |

Root theme: Hermes' current memory system (text files + session_search) lacks structured indexing, semantic retrieval, and crash-safe persistence.


What is Memvid?

Memvid is a single-file AI memory layer (.mv2 format) written in Rust. It stores documents, indices, and a WAL in one append-only file — no sidecar files, no server process.

Key stats:

  • ⭐ 15.4k stars, Apache-2.0 license
  • 98.5% Rust, active development (last commit: 2026-05-06)
  • Setup: 145ms (vs Pinecone 7.4s, ChromaDB 2min)
  • Search latency: 24ms (vs Pinecone 267ms, LanceDB 506ms)
  • Storage: 4.9 MB for 1,000 docs

Why Memvid Fits Hermes

1. Single-File Portability

hermes.mv2  ←  data + BM25 index + vector index + time index + WAL
  • Copy, sync, git-commit friendly
  • No .wal, .shm, .lock sidecars
  • Crash recovery < 250ms via embedded WAL

2. Multi-Modal Search (Not Just Vectors)

| Index | Engine | Use Case |
| --- | --- | --- |
| Lexical (BM25) | Tantivy | Exact code/keyword search, zero embedding cost |
| Vector (HNSW) | ONNX local / API | Semantic similarity when needed |
| SimHash | 64-bit LSH | Deduplication |
| Time Index | B-tree | "What did we discuss last Tuesday?" |
| Logic Mesh | Rules / LLM | Entity graph (Alice → works_at → Anthropic) |
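
As an illustration of the SimHash row, the sketch below computes a 64-bit fingerprint and compares two texts by Hamming distance. The tokenizer (lowercased whitespace split), the per-token hash (BLAKE2b truncated to 8 bytes), and the function names are assumptions for this example; Memvid's own implementation may differ.

```python
import hashlib


def simhash64(text: str) -> int:
    """64-bit SimHash over whitespace tokens (illustrative only)."""
    weights = [0] * 64
    for token in text.lower().split():
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        token_hash = int.from_bytes(digest, "big")
        for bit in range(64):
            # Each token votes +1 or -1 on every bit position.
            weights[bit] += 1 if (token_hash >> bit) & 1 else -1
    fingerprint = 0
    for bit, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << bit
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")
```

Two memories whose fingerprints differ in only a few bits are likely near-duplicates; the threshold to use is a tuning decision.
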

3. Optional Embeddings = Zero-Cost Baseline

  • Pure lexical search (--mode lex) requires no embedding model, so it works on servers with 3.8 GB of RAM
  • Built-in BGE-small (384-dim, 75MB) for semantic search if desired
  • Ollama local models supported for fully offline operation
  • Product Quantization compresses vectors 16×
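
A quick back-of-the-envelope calculation shows what the 16× figure means for 384-dimensional vectors. It assumes uncompressed vectors are stored as float32 and ignores codebook and index overhead; the sub-vector layout is one hypothetical configuration that happens to produce that ratio.

```python
DIM = 384              # BGE-small dimensionality quoted above
BYTES_PER_FLOAT32 = 4  # assumption: raw vectors are float32
COMPRESSION = 16       # compression factor quoted above

raw_bytes = DIM * BYTES_PER_FLOAT32   # bytes per uncompressed vector
pq_bytes = raw_bytes // COMPRESSION   # bytes per quantized vector

# One hypothetical PQ layout with this size: one 8-bit code per sub-vector.
subvectors = pq_bytes
dims_per_subvector = DIM // subvectors


def index_size_mb(n_vectors: int, bytes_per_vector: int) -> float:
    """Vector payload size in MB (10^6 bytes), excluding codebooks and graph links."""
    return n_vectors * bytes_per_vector / 1e6


print(raw_bytes, pq_bytes, dims_per_subvector)
print(index_size_mb(1_000_000, raw_bytes), index_size_mb(1_000_000, pq_bytes))
```

Under these assumptions, one million vectors shrink from about 1.5 GB of raw payload to under 100 MB.
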

4. Unique Features Hermes Lacks

Time-Travel Queries:

memvid find knowledge.mv2 --query "config" --as-of-frame 100
memvid timeline knowledge.mv2 --since 2024-01-01
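
The idea behind these two queries can be sketched over a plain append-only list. This is a conceptual model only: it assumes frame ids are 0-based append positions, that "as of frame n" means frames with id ≤ n, and that frames are appended in timestamp order. None of this is taken from Memvid's internals.

```python
from bisect import bisect_left
from datetime import datetime
from typing import List, Tuple

Frame = Tuple[datetime, str]  # (timestamp, text), appended in time order


def as_of_frame(frames: List[Frame], frame_id: int) -> List[Frame]:
    """View of the store as it looked right after frame `frame_id` was written."""
    return frames[: frame_id + 1]


def since(frames: List[Frame], start: datetime) -> List[Frame]:
    """All frames with timestamp >= start, found by binary search."""
    timestamps = [ts for ts, _ in frames]
    return frames[bisect_left(timestamps, start):]
```

Because nothing is overwritten, a historical view is just a prefix of the log, and a time-range query is a binary search over sorted timestamps.
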

Entity State (O(1) Lookup):

memvid state memory.mv2 "Alice"
# → {employer: "Anthropic", role: "Senior Engineer"}

Memory Cards (Structured Knowledge):

{"entity": "John", "slot": "job_title", "value": "Senior Engineer", "kind": "fact"}
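
One way to see how cards like this enable constant-time entity lookups is to fold them into a nested dictionary. The sketch below assumes one JSON card per line and a last-write-wins rule per (entity, slot) pair; both are assumptions for illustration, and `build_entity_state` is a hypothetical helper, not a Memvid API.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable


def build_entity_state(card_lines: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Fold JSON memory cards into {entity: {slot: value}}."""
    state: Dict[str, Dict[str, str]] = defaultdict(dict)
    for line in card_lines:
        card = json.loads(line)
        # Assumption: a later card for the same entity and slot replaces the earlier one.
        state[card["entity"]][card["slot"]] = card["value"]
    return dict(state)


cards = [
    '{"entity": "John", "slot": "job_title", "value": "Engineer", "kind": "fact"}',
    '{"entity": "John", "slot": "job_title", "value": "Senior Engineer", "kind": "fact"}',
]
state = build_entity_state(cards)
print(state["John"])  # average-case O(1) dictionary lookup by entity name
```
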

Adaptive Retrieval: cuts the result list where relevance scores drop sharply (a "score cliff") instead of returning a fixed top-k.
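
A minimal sketch of score-cliff detection follows. The rule used here (stop at the first result whose score falls below half of the previous one) and the parameter names are assumptions chosen for clarity; Memvid's actual heuristic is not described in this issue.

```python
from typing import List


def cut_at_score_cliff(
    scores: List[float],
    min_k: int = 1,
    max_k: int = 20,
    drop_ratio: float = 0.5,
) -> int:
    """Return how many results to keep, given scores sorted in descending order."""
    limit = min(len(scores), max_k)
    for i in range(max(min_k, 1), limit):
        previous, current = scores[i - 1], scores[i]
        if previous > 0 and current < previous * drop_ratio:
            return i  # cliff found: keep results 0..i-1
    return limit


print(cut_at_score_cliff([0.92, 0.90, 0.88, 0.31, 0.30]))
```

With a fixed top-5, the two weak matches at the end would be injected into the context; the cliff rule keeps only the three strong ones.
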

5. Resource-Friendly for Self-Hosted Setups

| Config | RAM | Notes |
| --- | --- | --- |
| Lexical only | ~0 MB extra | BM25 is CPU-only |
| BGE-small | +75 MB | Built-in, auto-download |
| Ollama + all-minilm | Local | Minimal GPU/CPU |
| Full vector + PQ | Configurable | 16× compression |
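
To show why the lexical-only configuration needs no model at all, here is a plain-Python sketch of BM25 scoring. It uses a naive whitespace tokenizer and common default parameters (`k1=1.5`, `b=0.75`), and assumes a non-empty document list; it illustrates the formula only and is not how Tantivy is implemented.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each document against the query with BM25 (no embeddings involved)."""
    tokenized = [doc.lower().split() for doc in docs]
    avg_len = sum(len(tokens) for tokens in tokenized) / len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            idf = math.log(1 + (len(docs) - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores
```

Everything here is counting and arithmetic, which is why this mode adds essentially no memory beyond the index itself.
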

Proposed Integration Points

Option A: Memory Tool Backend (Minimal)

Replace memory tool's text-file backend with .mv2:

# Instead of appending to MEMORY.md
memvid put hermes.mv2 --text "User prefers concise responses"
# Retrieval
memvid find hermes.mv2 --query "user preference" --k 5
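
On the Hermes side, Option A could be wired in behind a small interface. The sketch below is hypothetical: `MemoryBackend` and `MemvidCliBackend` are invented names, not existing Hermes or Memvid APIs. It shells out to the `memvid` CLI using only the subcommands and flags shown above, and returns the CLI's output as raw text because its format is not specified here.

```python
import subprocess
from typing import Callable, List, Protocol


class MemoryBackend(Protocol):
    """Hypothetical interface a pluggable Hermes memory backend would satisfy."""

    def put(self, text: str) -> None: ...
    def find(self, query: str, k: int = 5) -> str: ...


Runner = Callable[[List[str]], str]


def run_cli(argv: List[str]) -> str:
    """Run a command and return its stdout; raises if the command fails."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


class MemvidCliBackend:
    """Backend that delegates to the memvid CLI (assumed to be on PATH)."""

    def __init__(self, path: str, runner: Runner = run_cli) -> None:
        self.path = path
        self.runner = runner  # injectable so the backend can be tested without memvid

    def put(self, text: str) -> None:
        self.runner(["memvid", "put", self.path, "--text", text])

    def find(self, query: str, k: int = 5) -> str:
        return self.runner(["memvid", "find", self.path, "--query", query, "--k", str(k)])
```

Passing arguments as a list avoids shell quoting problems when memory text contains quotes or newlines. A proof of concept would more likely use the Python SDK than a subprocess per call.
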

Option B: Session History Archive

Periodically archive session_search data into Memvid for:

  • Semantic search across all past sessions
  • Time-travel to specific conversation points
  • Entity extraction ("What projects has this user worked on?")

Option C: Full RAG Layer (Ambitious)

Use Memvid as the retrieval backend for Hermes' context assembly:

  • Hybrid BM25 + vector retrieval
  • Entity-aware context injection
  • Deduplication across sessions
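
For the hybrid retrieval bullet, one common way to merge a BM25 ranking with a vector ranking is reciprocal rank fusion (RRF). This sketch is an assumption about how Hermes could combine the two lists; it is not a description of how Memvid fuses results internally, and the constant `k=60` is the conventional default rather than a tuned value.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document ids into one fused ranking."""
    fused: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for deterministic output.
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))


bm25_ranking = ["a", "b", "c"]
vector_ranking = ["b", "d", "a"]
print(reciprocal_rank_fusion([bm25_ranking, vector_ranking]))
```

RRF only needs ranks, so it sidesteps the problem that BM25 scores and cosine similarities are on different scales.
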

Open Questions

  1. Rust dependency: Memvid's core is written in Rust, while Hermes is Python/Node. A Python SDK exists; is the FFI overhead acceptable?
  2. Concurrency: Memvid uses file-level locking. How does this scale with Hermes' multi-turn async model?
  3. Migration path: Can existing MEMORY.md / USER.md be imported into .mv2?
  4. Cloud sync: Memvid Cloud (optional) vs Hermes' existing sync mechanisms.
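
For the migration question, a first step could be splitting the existing Markdown memory files into individual entries. The sketch below assumes one memory per non-empty line, with Markdown headings skipped and bullet markers stripped; the real layout of MEMORY.md / USER.md may differ, and `extract_entries` is a hypothetical helper.

```python
from typing import List


def extract_entries(markdown_text: str) -> List[str]:
    """Split a Markdown memory file into one plain-text entry per line."""
    entries: List[str] = []
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and headings
        entry = stripped.lstrip("-*• ").strip()  # drop leading bullet markers
        if entry:
            entries.append(entry)
    return entries


sample = "# Preferences\n\n- User prefers concise responses\n* Uses vim\n"
print(extract_entries(sample))
```

Each extracted entry could then be written with `memvid put hermes.mv2 --text "<entry>"`, the command shown under Option A.
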

Benchmark Data (Wikipedia 39K Docs)

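| System | Top-1 Accuracy | P95 Search | Cold Start |
| --- | --- | --- | --- |
| Memvid | 92.72% | 17.4ms | 0.5ms |
| LanceDB | 84.24% | 28ms | 150ms |
| Qdrant | 84.24% | 300ms* | |
| Weaviate | 80.68% | | |
| Chroma | 78.24% | 61ms* | |

* Only one latency figure is given for this system; whether it is P95 search or cold start is not stated.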

In this benchmark, Memvid is the only listed system that exceeds 90% top-1 accuracy, and it does so with sub-20ms P95 search latency.


Next Steps

Seeking maintainer/community feedback on:

  1. Is a pluggable memory backend aligned with Hermes' roadmap?
  2. Which integration option (A/B/C) is most viable?
  3. Should we proceed with a proof-of-concept PR?

/cc @teknium1 and memory-system contributors — would love your take on whether this complements or conflicts with existing memory work.
