hermes - RFC: Evaluate Memvid as a Pluggable Single-File Memory Backend for Hermes

StepCodex · 2026-05-11T15:56:20Z

Summary

This issue proposes evaluating Memvid (https://github.com/memvid/memvid), a single-file AI memory system, as an optional pluggable memory backend for Hermes Agent. Based on a technical survey, Memvid's architecture appears to address several long-standing Hermes memory pain points while remaining compatible with resource-constrained deployments.

Hermes Memory Pain Points (Observed in Issues)

| Issue | Problem | Impact |
|-------|---------|--------|
| #17251 | Compaction demotion: long sessions lose context after compaction | Context loss |
| #17013 | Model switch loss: memory state lost when switching models | State inconsistency |
| #20849 | Truncation overwrite: aggressive truncation destroys useful history | Data loss |
| #16670 | Compression fallback marker after incomplete chunked read | Long session fragility |
| #23717 | RFC: Pluggable SessionDB Provider (PostgreSQL/MySQL) | Need for alternative backends |

Root theme: Hermes' current memory system (text files + session_search) lacks structured indexing, semantic retrieval, and crash-safe persistence.

What is Memvid?

Memvid is a single-file AI memory layer (.mv2 format) written in Rust. It stores documents, indices, and a write-ahead log (WAL) in one append-only file, with no sidecar files and no server process.

Key stats:

- ⭐ 15.4k stars, Apache-2.0 license
- 98.5% Rust, active development (last commit: 2026-05-06)
- Setup: 145 ms (vs. Pinecone 7.4 s, ChromaDB 2 min)
- Search latency: 24 ms (vs. Pinecone 267 ms, LanceDB 506 ms)
- Storage: 4.9 MB for 1,000 docs
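
To make the comparison concrete, the ratios implied by these figures can be worked out directly. The sketch below only restates the numbers quoted above (taken as given, not re-measured). It assumes decimal megabytes and reads "2 min" as 120 s.

```python
# Figures quoted in the RFC; taken as given, not re-measured here.
setup_ms = {"Memvid": 145, "Pinecone": 7_400, "ChromaDB": 120_000}
search_ms = {"Memvid": 24, "Pinecone": 267, "LanceDB": 506}


def ratios_vs_memvid(timings_ms):
    """Return how many times slower each other system is than Memvid."""
    base = timings_ms["Memvid"]
    return {
        name: round(ms / base, 1)
        for name, ms in timings_ms.items()
        if name != "Memvid"
    }


setup_ratios = ratios_vs_memvid(setup_ms)
search_ratios = ratios_vs_memvid(search_ms)

# 4.9 MB for 1,000 docs, assuming 1 MB = 1,000,000 bytes.
bytes_per_doc = 4.9 * 1_000_000 / 1_000

print(setup_ratios)
print(search_ratios)
print(f"{bytes_per_doc:.0f} bytes per document")
```

The per-document figure depends on document size, which the quoted stats do not specify, so it is only a rough order of magnitude.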

Why Memvid Fits Hermes

1. Single-File Portability

hermes.mv2  ←  data + BM25 index + vector index + time index + WAL

- Easy to copy, sync, or commit to git
- No .wal, .shm, or .lock sidecar files
- Crash recovery in under 250 ms via the embedded WAL
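
The idea behind crash recovery in an append-only file can be illustrated with a toy log. This is a minimal sketch and not the .mv2 layout: the record format (length, CRC32, payload) and the function names are assumptions made for illustration only.

```python
import os
import struct
import tempfile
import zlib

# Toy record header: payload length and CRC32 of the payload (assumed format).
HEADER = struct.Struct("<II")


def append_record(path, payload: bytes) -> None:
    """Append one length-prefixed, checksummed record and flush it to disk."""
    with open(path, "ab") as f:
        f.write(HEADER.pack(len(payload), zlib.crc32(payload)) + payload)
        f.flush()
        os.fsync(f.fileno())


def recover(path):
    """Return all intact records and truncate any torn tail left by a crash."""
    with open(path, "rb") as f:
        data = f.read()
    records, pos, good_end = [], 0, 0
    while pos + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, pos)
        start = pos + HEADER.size
        end = start + length
        if end > len(data) or zlib.crc32(data[start:end]) != crc:
            break  # torn or corrupt record: stop replaying here
        records.append(data[start:end])
        pos = good_end = end
    if good_end < len(data):
        with open(path, "r+b") as f:
            f.truncate(good_end)  # drop the incomplete tail
    return records


path = os.path.join(tempfile.mkdtemp(), "toy.log")
append_record(path, b"first")
append_record(path, b"second")
with open(path, "ab") as f:  # simulate a crash in the middle of a write
    f.write(HEADER.pack(100, 0) + b"partial")
recovered = recover(path)
```

Because everything lives in one file, recovery is a single sequential scan; how Memvid actually bounds recovery time is not described in this RFC.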

2. Multi-Modal Search (Not Just Vectors)

| Index | Engine | Use Case |
|-------|--------|----------|
| Lexical (BM25) | Tantivy | Exact code/keyword search; zero embedding cost |
| Vector (HNSW) | ONNX local / API | Semantic similarity when needed |
| SimHash | 64-bit LSH | Deduplication |
| Time Index | B-tree | "What did we discuss last Tuesday?" |
| Logic Mesh | Rules / LLM | Entity graph (Alice → works_at → Anthropic) |
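
To show why the lexical index needs no embedding model, here is a minimal sketch of Okapi BM25 scoring in plain Python. It is illustrative only: Tantivy has its own implementation and tokenizer, and the whitespace tokenization and the `k1`/`b` defaults below are assumptions.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "compaction drops context in long sessions",
    "switching models loses memory state",
    "truncation overwrites useful history",
]
scores = bm25_scores("memory state", docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

Scoring uses only term counts and document lengths, which is why this mode runs on CPU without any model download.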

3. Optional Embeddings = Zero-Cost Baseline

- Pure lexical search (--mode lex) requires no embedding model and works on servers with 3.8 GB of RAM
- Built-in BGE-small (384-dim, 75 MB) for semantic search if desired
- Ollama local models supported for fully offline operation
- Product Quantization compresses vectors 16×
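
The 16× figure can be sanity-checked with simple arithmetic. The sketch below assumes float32 storage and one hypothetical product-quantization layout (96 subvectors with 8-bit codes); the RFC states only the compression ratio, so both are assumptions, and codebook overhead is ignored.

```python
DIM = 384               # BGE-small embedding size quoted above
BYTES_PER_FLOAT32 = 4   # assumption: raw vectors are stored as float32
SUBVECTORS = 96         # assumption: one hypothetical PQ layout
BITS_PER_CODE = 8       # 256 centroids per subvector codebook

raw_bytes = DIM * BYTES_PER_FLOAT32           # size of one uncompressed vector
pq_bytes = SUBVECTORS * BITS_PER_CODE // 8    # size of one PQ-encoded vector
compression = raw_bytes / pq_bytes
dims_per_subvector = DIM // SUBVECTORS

print(f"raw={raw_bytes} B, pq={pq_bytes} B, ratio={compression:.0f}x")
```

Under these assumptions each 4-dimensional slice of the vector is replaced by a one-byte centroid index.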

4. Unique Features Hermes Lacks

Time-Travel Queries:

memvid find knowledge.mv2 --query "config" --as-of-frame 100
memvid timeline knowledge.mv2 --since 2024-01-01
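
Conceptually, an "as of" query restricts the search to the frames that existed at a given point in the append-only log. The sketch below illustrates that idea in memory; the `Frame` type and function names are hypothetical and do not reflect Memvid's internals.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    frame_id: int    # position in the append-only log
    timestamp: str   # ISO 8601 date, so string order matches time order
    text: str


FRAMES = [
    Frame(1, "2023-12-30", "config: verbose replies"),
    Frame(2, "2024-01-05", "config: concise replies"),
    Frame(3, "2024-02-01", "project kickoff notes"),
]


def find_as_of(frames, query, as_of_frame):
    """Search only the frames that existed at or before `as_of_frame`."""
    ids = [f.frame_id for f in frames]  # sorted, because the log is append-only
    visible = frames[: bisect_right(ids, as_of_frame)]
    return [f for f in visible if query in f.text]


def timeline_since(frames, since):
    """Return frames whose timestamp is on or after `since`."""
    return [f for f in frames if f.timestamp >= since]


old_view = find_as_of(FRAMES, "config", as_of_frame=1)
new_view = find_as_of(FRAMES, "config", as_of_frame=3)
recent = timeline_since(FRAMES, "2024-01-01")
```

Because frames are only ever appended, an older view is just a prefix of the log, which is what makes this kind of query cheap.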

Entity State (O(1) Lookup):

memvid state memory.mv2 "Alice"
# → {employer: "Anthropic", role: "Senior Engineer"}

Memory Cards (Structured Knowledge):

{"entity": "John", "slot": "job_title", "value": "Senior Engineer", "kind": "fact"}
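
One way to picture how cards of this shape could back the entity state lookup is to fold them into a nested dictionary. This is a sketch under the assumption that later cards overwrite earlier ones for the same slot; the RFC does not describe Memvid's actual conflict rules, and the sample cards are made up.

```python
from collections import defaultdict

# Hypothetical cards in the shape shown above.
cards = [
    {"entity": "John", "slot": "job_title", "value": "Engineer", "kind": "fact"},
    {"entity": "John", "slot": "job_title", "value": "Senior Engineer", "kind": "fact"},
    {"entity": "Alice", "slot": "employer", "value": "Anthropic", "kind": "fact"},
    {"entity": "Alice", "slot": "role", "value": "Senior Engineer", "kind": "fact"},
]


def build_state(cards):
    """Fold cards into entity -> {slot: value}; later cards overwrite earlier ones."""
    state = defaultdict(dict)
    for card in cards:
        state[card["entity"]][card["slot"]] = card["value"]
    return dict(state)


state = build_state(cards)
alice = state["Alice"]  # average-case O(1) dictionary lookup
```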

Adaptive Retrieval: instead of returning a fixed top-k, Memvid detects sharp drops ("cliffs") in the ranked scores and cuts the result list there.
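
A score-cliff cutoff can be sketched in a few lines. The rule below (stop when a score falls under half of the previous one) and its parameter names are assumptions chosen for illustration; the RFC does not specify the heuristic Memvid uses.

```python
def adaptive_cutoff(scores, min_results=1, drop_ratio=0.5):
    """Return how many top results to keep, given scores sorted descending.

    Results are kept until a score falls below `drop_ratio` times the
    previous score (a "cliff"); at least `min_results` are always kept.
    """
    keep = min(max(min_results, 1), len(scores))
    for i in range(keep, len(scores)):
        if scores[i] < scores[i - 1] * drop_ratio:
            break  # cliff found: everything after this point is dropped
        keep = i + 1
    return keep


clustered = [9.1, 8.7, 8.2, 2.1, 1.9]   # three strong hits, then a cliff
flat = [5.0, 4.9, 4.8]                  # no cliff: keep everything
n_clustered = adaptive_cutoff(clustered)
n_flat = adaptive_cutoff(flat)
```

Compared with a fixed top-k, this returns fewer results when only a handful are clearly relevant and more when relevance degrades gradually.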

5. Resource-Friendly for Self-Hosted Setups

| Config | RAM | Notes |
|--------|-----|-------|
| Lexical only | ~0 MB extra | BM25 is CPU-only |
| BGE-small | +75 MB | Built-in, auto-download |
| Ollama + all-minilm | Local | Minimal GPU/CPU |
| Full vector + PQ | Configurable | 16× compression |

Proposed Integration Points

Option A: Memory Tool Backend (Minimal)

Replace the memory tool's text-file backend with a .mv2 file:

# Instead of appending to MEMORY.md
memvid put hermes.mv2 --text "User prefers concise responses"
# Retrieval
memvid find hermes.mv2 --query "user preference" --k 5
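
On the Hermes side, Option A could sit behind a small backend interface so the text-file store and a Memvid store are interchangeable. The sketch below is hypothetical: `MemoryBackend` and `TextFileBackend` are illustrative names, not Hermes APIs, and the `memvid` argument lists simply mirror the commands shown above without verifying them against the CLI.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class MemoryBackend(Protocol):
    """Hypothetical interface; Hermes' real memory tool API may differ."""

    def put(self, text: str) -> None: ...
    def find(self, query: str, k: int = 5) -> list[str]: ...


class TextFileBackend:
    """Baseline resembling the current approach: append lines to a text file."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def put(self, text: str) -> None:
        with self.path.open("a", encoding="utf-8") as f:
            f.write(text.replace("\n", " ") + "\n")

    def find(self, query: str, k: int = 5) -> list[str]:
        if not self.path.exists():
            return []
        words = query.lower().split()
        lines = self.path.read_text(encoding="utf-8").splitlines()
        # Naive substring match on any query word, in insertion order.
        hits = [ln for ln in lines if any(w in ln.lower() for w in words)]
        return hits[:k]


def memvid_put_cmd(store: str, text: str) -> list[str]:
    """argv for the `memvid put` call shown above (flags copied from the RFC)."""
    return ["memvid", "put", store, "--text", text]


def memvid_find_cmd(store: str, query: str, k: int = 5) -> list[str]:
    """argv for the `memvid find` call shown above (flags copied from the RFC)."""
    return ["memvid", "find", store, "--query", query, "--k", str(k)]


backend: MemoryBackend = TextFileBackend(Path(tempfile.mkdtemp()) / "MEMORY.md")
backend.put("User prefers concise responses")
hits = backend.find("user preference")
```

A Memvid-backed class would implement the same two methods, for example by passing these argument lists to `subprocess.run` or by calling the Python SDK, so callers would not need to change.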

Option B: Session History Archive

Periodically archive session_search data into Memvid for:

- Semantic search across all past sessions
- Time-travel to specific conversation points
- Entity extraction ("What projects has this user worked on?")

Option C: Full RAG Layer (Ambitious)

Use Memvid as the retrieval backend for Hermes' context assembly:

- Hybrid BM25 + vector retrieval
- Entity-aware context injection
- Deduplication across sessions
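
For the hybrid retrieval item, one common way to combine a lexical ranking with a vector ranking is reciprocal rank fusion (RRF). The RFC does not say how Memvid merges the two, so RRF is used here only as a stand-in technique, and the session ids are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))


bm25_ranking = ["s3", "s1", "s7"]     # hypothetical lexical results
vector_ranking = ["s1", "s9", "s3"]   # hypothetical semantic results
fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
```

RRF needs only rank positions, so the BM25 and cosine scores never have to be put on a common scale.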

Open Questions

1. Rust dependency: Memvid's core is Rust, while Hermes is Python/Node. A Python SDK exists; is the FFI overhead acceptable?
2. Concurrency: Memvid uses file-level locking. How does this scale with Hermes' multi-turn async model?
3. Migration path: Can existing MEMORY.md / USER.md be imported into .mv2?
4. Cloud sync: Memvid Cloud (optional) vs. Hermes' existing sync mechanisms.
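
On the concurrency question, one possible approach inside a single Hermes process is to serialize access to the store with an `asyncio.Lock` and push blocking file I/O onto a worker thread. This is a sketch with a hypothetical `SerializedStore` class and an in-memory list standing in for the file; it does not address locking across multiple processes.

```python
import asyncio


class SerializedStore:
    """Wraps a blocking single-writer store so async callers take turns."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self._entries = []  # stand-in for the on-disk file

    def _blocking_put(self, text):
        self._entries.append(text)  # a real backend would do file I/O here

    async def put(self, text):
        async with self._lock:  # only one writer at a time
            await asyncio.to_thread(self._blocking_put, text)

    async def snapshot(self):
        async with self._lock:
            return list(self._entries)


async def demo():
    store = SerializedStore()
    # Five concurrent turns all try to write at once.
    await asyncio.gather(*(store.put(f"turn {i}") for i in range(5)))
    return await store.snapshot()


entries = asyncio.run(demo())
```

Whether this is sufficient depends on how many Hermes processes share one .mv2 file, which a proof of concept would need to measure.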

Benchmark Data (Wikipedia 39K Docs)

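| System | Top-1 Accuracy | P95 Search Latency | Cold Start |
|--------|----------------|--------------------|------------|
| Memvid | 92.72% | 17.4 ms | 0.5 ms |
| LanceDB | 84.24% | 28 ms | 150 ms |
| Qdrant | 84.24% | n/a | 300 ms |
| Weaviate | 80.68% | n/a | n/a |
| Chroma | 78.24% | 61 ms | n/a |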

Among the systems listed, Memvid is the only one above 90% top-1 accuracy, and its P95 search latency is under 20 ms.

References

Memvid GitHub: https://github.com/memvid/memvid
Documentation: https://docs.memvid.com/
Architecture: https://docs.memvid.com/architecture/overview
Benchmarks: https://docs.memvid.com/introduction/benchmarks
Memory Cards: https://docs.memvid.com/concepts/memory-cards

Next Steps

Seeking maintainer/community feedback on:

- Is a pluggable memory backend aligned with Hermes' roadmap?
- Which integration option (A/B/C) is most viable?
- Should we proceed with a proof-of-concept PR?

/cc @teknium1 and memory-system contributors — would love your take on whether this complements or conflicts with existing memory work.
