hermes - feat(memory): Add vector/semantic search + memory lifecycle management to built-in memory tool


Problem

Hermes' built-in memory tool is currently FTS5-only — it matches exact keywords but cannot retrieve semantically similar content when phrasing differs. Two practical pain points:

  1. Missed recall. A memory stored as "user prefers concise responses" won't surface for a search like "keep answers short" or "be brief".
  2. No lifecycle. There is no automated dedup, evolution, TTL, or compaction; every manual memory(action=add) just appends to a buffer limited to ~2KB.

What OMEGA Memory does differently

omega-memory/omega-memory is Apache 2.0 licensed, local-first, and built on SQLite + ONNX embeddings. Key gaps in the Hermes built-in by comparison:

| Capability | Hermes built-in (current) | OMEGA |
| --- | --- | --- |
| Semantic search | ❌ FTS5 only | ✅ SQLite-vec + FTS5 hybrid |
| Memory lifecycle | ❌ Manual | ✅ Auto-dedup, TTL, evolution, compaction |
| Cross-session checkpoint | | ✅ checkpoint/resume_task |
| Session-start briefing | | ✅ welcome tool |
| Memory compression | | ✅ cluster-and-summarize |

Proposal

1. Optional vector embeddings (opt-in, non-breaking)

  • config.yaml: new memory section with vector_store: true/false (default: false)
  • ONNX model: bge-small-en-v1.5 (384-dim, ~30MB, ~300MB RSS on first query — same pattern as existing Hermes tool caches)
  • Storage: ~/.hermes/memories/ gets vectors.db (sqlite-vec) alongside existing MEMORY.md/USER.md
  • Hybrid retrieval: vector cosine + FTS5 BM25 → type-weighted merge → dedup
  • Memory types: tag entries as decision | lesson | preference | fact | summary
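
The hybrid retrieval bullet above can be sketched roughly as follows. This is a minimal illustration, not the proposed implementation: `Candidate`, `TYPE_WEIGHTS`, `hybrid_merge`, and the `alpha` blend factor are hypothetical names and values, and both score lists are assumed to be normalized to [0, 1] already. Raw FTS5 `bm25()` values would need flipping and rescaling first, since SQLite gives better matches numerically lower scores.

```python
from dataclasses import dataclass

# Hypothetical per-type weights; the proposal names the types but not the values.
TYPE_WEIGHTS = {
    "decision": 1.2,
    "lesson": 1.1,
    "preference": 1.1,
    "fact": 1.0,
    "summary": 0.8,
}


@dataclass(frozen=True)
class Candidate:
    memory_id: str
    memory_type: str
    score: float  # assumed already normalized to [0, 1] by its retriever


def hybrid_merge(vector_hits, keyword_hits, alpha=0.6, top_k=5):
    """Blend vector and keyword scores, apply type weights, dedup by memory_id."""
    merged = {}
    for weight, hits in ((alpha, vector_hits), (1.0 - alpha, keyword_hits)):
        for hit in hits:
            type_weight = TYPE_WEIGHTS.get(hit.memory_type, 1.0)
            contribution = weight * hit.score * type_weight
            # A memory found by both retrievers accumulates both contributions.
            merged[hit.memory_id] = merged.get(hit.memory_id, 0.0) + contribution
    ranked = sorted(merged.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```

Keying the merge on `memory_id` handles the final dedup step for free: an entry returned by both the vector and the keyword retriever shows up once, with a combined score.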

2. Memory lifecycle automation

  • Dedup on write: SHA256 + vector similarity threshold (configurable, default 0.85)
  • Evolution: similar memories (similarity 0.55-0.85) append insights to the existing entry instead of creating a duplicate
  • TTL: session summaries expire after 1 day; lessons and preferences are permanent
  • Consolidation tool: memory(action=consolidate) — dedup + stale prune + compact
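
The write-time rules above could look something like the sketch below. The thresholds and the 1-day TTL come from the proposal; everything else is an assumption made for illustration: the function names, the dict layout of `existing`, and the choice to normalize case and whitespace before hashing.

```python
import hashlib
import math
import time

DUPLICATE_THRESHOLD = 0.85  # default from the proposal
EVOLVE_THRESHOLD = 0.55     # lower bound of the "evolve" band
TTL_SECONDS = {"summary": 24 * 60 * 60}  # types not listed never expire


def content_hash(text):
    """SHA-256 of case- and whitespace-normalized text (normalization is an assumption)."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_write(text, vector, existing):
    """Return (decision, matched_entry) with decision in duplicate / evolve / insert.

    `existing` is a list of dicts with "hash" and "vector" keys (hypothetical layout).
    """
    new_hash = content_hash(text)
    best, best_sim = None, 0.0
    for entry in existing:
        if entry["hash"] == new_hash:
            return "duplicate", entry  # exact match, no need to compare vectors
        sim = cosine(vector, entry["vector"])
        if sim > best_sim:
            best, best_sim = entry, sim
    if best_sim >= DUPLICATE_THRESHOLD:
        return "duplicate", best
    if best_sim >= EVOLVE_THRESHOLD:
        return "evolve", best
    return "insert", None


def is_expired(memory_type, created_at, now=None):
    """True once an entry has outlived the TTL for its type."""
    ttl = TTL_SECONDS.get(memory_type)
    if ttl is None:
        return False
    now = time.time() if now is None else now
    return now - created_at > ttl
```

Checking the hash first keeps the common case cheap: an exact repeat is rejected without touching the embedding model or scanning vectors.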

3. Cross-session continuity

  • Checkpoint: memory(action=checkpoint, task_state=...) — save task context
  • Resume: memory(action=resume_task) — surface saved checkpoints
  • Welcome: auto-surface N relevant memories on session start (vector-powered)
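
As a rough sketch of the checkpoint and resume actions above, the snippet below stores task state as JSON in a SQLite table. The table name, columns, and function signatures are hypothetical; the proposal only specifies the action names and the `task_state` argument.

```python
import json
import sqlite3
import time


def open_checkpoint_db(path=":memory:"):
    """Open (or create) a checkpoint table; the schema is an illustrative assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " created_at REAL NOT NULL,"
        " task_state TEXT NOT NULL)"
    )
    return conn


def checkpoint(conn, task_state):
    """Persist a JSON-serializable task state and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO checkpoints (created_at, task_state) VALUES (?, ?)",
            (time.time(), json.dumps(task_state)),
        )
    return cur.lastrowid


def resume_task(conn, limit=3):
    """Return the most recent checkpoints, newest first."""
    rows = conn.execute(
        "SELECT id, task_state FROM checkpoints ORDER BY id DESC LIMIT ?",
        (limit,),
    ).fetchall()
    return [{"id": row_id, "task_state": json.loads(state)} for row_id, state in rows]
```

In a real deployment the path would presumably point into ~/.hermes/memories/ next to the existing files; the in-memory default here just keeps the sketch self-contained.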

4. Memory compression

  • memory(action=compact) — cluster related memories, summarize each cluster, archive originals
  • Helps stay within the 2KB system prompt budget
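
One way to picture the compact action is a greedy, single-pass clustering followed by a summary per cluster. This is a sketch under stated assumptions: the clustering strategy, the 0.55 threshold reuse, and the `summarize` callback are illustrative choices, not something the proposal prescribes, and a real version would re-embed each summary rather than reuse the seed vector.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_memories(memories, threshold=0.55):
    """Greedy clustering: each memory joins the first cluster whose seed is similar enough."""
    clusters = []
    for memory in memories:
        for cluster in clusters:
            if cosine(memory["vector"], cluster[0]["vector"]) >= threshold:
                cluster.append(memory)
                break
        else:
            clusters.append([memory])  # no cluster matched: start a new one
    return clusters


def compact(memories, summarize, threshold=0.55):
    """Return (kept, archived): one summary per multi-item cluster, originals archived."""
    kept, archived = [], []
    for cluster in cluster_memories(memories, threshold):
        if len(cluster) == 1:
            kept.append(cluster[0])  # nothing to merge, keep as-is
            continue
        kept.append({
            "text": summarize([m["text"] for m in cluster]),
            "vector": cluster[0]["vector"],  # placeholder: re-embed the summary in practice
        })
        archived.extend(cluster)
    return kept, archived
```

Passing `summarize` in as a callback keeps the clustering testable without a model; in practice it would likely be an LLM call or an extractive summarizer.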

Why built-in vs plugin

Eight memory providers already exist (plugins/memory/: Mem0, Supermemory, Hindsight, Holographic, etc.), but:

  • Default gap: most users never enable plugins. The built-in memory tool is their first interaction with memory.
  • Latency: plugins go through the MemoryProvider abstraction layer.
  • Dependency tax: plugins bring their own auth, models, and services. A built-in vector layer that follows Hermes' local-first philosophy means zero extra deps.

Implementation sketch

tools/memory_tool.py
  ├── MemoryStore (existing — MEMORY.md/USER.md)
  ├── VectorStore (new — sqlite-vec + ONNX, optional)
  │   ├── embed(text) → 384-dim vector
  │   ├── search(query, top_k=5) → hybrid
  │   └── lifecycle() — dedup, evolve, expire, compact
  └── memory_tool() handler — new action types:
      ├── search (existing) → hybrid if vector enabled
      ├── checkpoint (new)
      ├── resume (new)
      └── compact (new)
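
The handler part of the tree above might dispatch along these lines. This is a hypothetical skeleton: the `MemoryStore` stand-in, the `content` and `query` parameter names, and the choice to refuse the new actions when the vector layer is disabled are all assumptions made for illustration, not the existing memory_tool.py API.

```python
class MemoryStore:
    """Stand-in for the existing keyword store; only what the sketch needs."""

    def __init__(self):
        self.entries = []

    def add(self, text):
        self.entries.append(text)

    def search(self, query):
        # Naive keyword match, standing in for the real FTS5 lookup.
        words = query.lower().split()
        return [e for e in self.entries if any(w in e.lower() for w in words)]


def memory_tool(action, store, vector_store=None, **kwargs):
    """Dispatch one action; fall back to keyword search when no vector store is set."""
    if action == "add":
        store.add(kwargs["content"])
        return {"ok": True}
    if action == "search":
        if vector_store is not None:
            hits = vector_store.search(kwargs["query"], top_k=kwargs.get("top_k", 5))
            return {"ok": True, "results": hits}
        return {"ok": True, "results": store.search(kwargs["query"])}
    if action in {"checkpoint", "resume", "compact"}:
        if vector_store is None:
            # Assumption: the new actions live on VectorStore, as in the tree above.
            return {"ok": False, "error": f"'{action}' requires vector_store: true"}
        return {"ok": True, "results": getattr(vector_store, action)(**kwargs)}
    return {"ok": False, "error": f"unknown action '{action}'"}
```

With `vector_store=None` the existing behavior is unchanged, which is the non-breaking, opt-in property the proposal asks for.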

Prior art in Hermes

  • 8 memory provider plugins already exist with similar architectures
  • ONNX model caches already used for other tools
  • Current FTS5 dedup in memory_tool.py — extend with hash+vector
