hermes-agent: Feature: bundle findings_to_wiki (auto-populate memory + structured finding detection)

Source: NousResearch/hermes-agent#20114 (fetched 2026-05-06 06:38:41)

Summary

Bundle a lightweight findings_to_wiki memory provider that auto-populates MEMORY.md from conversation turns AND detects structured findings (analysis reports, decisions, key findings) for downstream curation.

Unlike existing providers (Honcho, Mem0, OpenViking), findings_to_wiki is not a storage backend — it's a content detection layer that feeds the built-in MEMORY.md.

What it does

Two things in sync_turn():

1. Auto-populate MEMORY.md. After every user↔assistant exchange, extracts a short fact (timestamp + topic + key insight) and writes to ~/.hermes/memories/MEMORY.md. Skips trivial turns (thanks, ok, 👍) via a skip-list.

Without this, memory is ONLY populated by explicit memory() tool calls — the agent has to remember to save. With auto-populate, every turn potentially adds a fact.
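
The fact-extraction step could look roughly like the sketch below. The issue does not show the prototype's extract_fact, so the skip-list contents, the 160-character cap, and the "first non-empty line as key insight" heuristic are all assumptions made for illustration.

```python
from datetime import datetime, timezone

# Hypothetical skip-list; the issue only names "thanks", "ok", and 👍 as examples.
TRIVIAL_TURNS = {"thanks", "thank you", "ok", "okay", "👍"}


def is_trivial(user_content: str) -> bool:
    """Return True when the user turn is a bare acknowledgement."""
    normalized = user_content.strip().lower().rstrip(".!")
    return normalized in TRIVIAL_TURNS


def extract_fact(user_content: str, assistant_content: str, max_len: int = 160):
    """Build a one-line fact (timestamp + topic + insight), or None to skip."""
    if is_trivial(user_content) or not assistant_content.strip():
        return None
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M")
    # Collapse whitespace in the user turn and use its start as the topic.
    topic = " ".join(user_content.split())[:60]
    # Assumption: the first non-empty line of the reply stands in for the key insight.
    insight = next(line.strip() for line in assistant_content.splitlines() if line.strip())
    return f"[{timestamp}] {topic}: {insight}"[:max_len]
```

A real implementation would likely need a better insight heuristic; the point here is only that the skip-list check runs before anything is written.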

2. Auto-detect structured findings. Patterns matched via configurable regex:

  • Prism analyses (## Findings, ## Conservation Law, ## Findings Table)
  • ADRs (## Decision, ## Status, ## Context)
  • Research reports (## Key Findings, ## Verification, ## Recommendations)

When 2+ markers match (min 100 chars threshold), the full analysis is saved for downstream curation (LLM Wiki, other KB).
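
The detection rule (at least two heading markers, at least 100 characters) can be sketched as follows. The pattern table is a hypothetical one built from the headings listed above; the actual schema of memory.findings_to_wiki.patterns is not shown in the issue, and the type names "prism", "adr", and "research" are invented labels.

```python
import re

# Hypothetical pattern table; each regex must match a whole heading line.
PATTERNS = {
    "prism": [r"^## Findings\s*$", r"^## Conservation Law\s*$", r"^## Findings Table\s*$"],
    "adr": [r"^## Decision\s*$", r"^## Status\s*$", r"^## Context\s*$"],
    "research": [r"^## Key Findings\s*$", r"^## Verification\s*$", r"^## Recommendations\s*$"],
}


def detect_pattern(text, patterns=PATTERNS, min_markers=2, min_chars=100):
    """Return the first finding type with at least min_markers heading hits."""
    if len(text) < min_chars:
        return None
    for ftype, regexes in patterns.items():
        # Count distinct markers, not total occurrences.
        hits = sum(1 for rx in regexes if re.search(rx, text, re.MULTILINE))
        if hits >= min_markers:
            return ftype
    return None
```

Anchoring each regex to a full line keeps a casual mention such as "we made a decision" from counting as a marker, which is the false-positive case the threshold is meant to avoid.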

Why now

The MemoryProvider ABC (agent/memory_provider.py) and its hooks (sync_turn, on_session_end) have been stable for a long time. Currently no bundled provider uses these hooks for content detection — they all focus on external storage.

The mem_to_file provider (the minimal built-in) dumps raw conversation history. findings_to_wiki is a natural upgrade: it writes to MEMORY.md directly (which is already injected into system prompt) and detects structure where mem_to_file sees blobs.

Proposed implementation

~150 lines of Python in plugins/memory/findings_to_wiki/__init__.py:

# Import path inferred from agent/memory_provider.py (see "Why now").
from agent.memory_provider import MemoryProvider

# extract_fact, append_to_memory, detect_pattern, save_finding, and config
# are prototype helpers; the issue does not show their bodies.

class FindingsToWikiProvider(MemoryProvider):
    def sync_turn(self, user_content, assistant_content, *, session_id=""):
        # 1. Save short fact to MEMORY.md
        fact = extract_fact(user_content, assistant_content)
        if fact:
            append_to_memory(fact)  # atomic write, dedup, trim to limit

        # 2. Detect structured findings via regex (skip short replies)
        if len(assistant_content) > 100:
            ftype = detect_pattern(assistant_content, config.patterns)
            if ftype:
                save_finding(assistant_content, ftype)  # to configurable output dir

    def on_session_end(self, messages):
        # Flush last turn as [end-of-session] fact
        ...

Key design decisions:

  • Atomic writes via tempfile + os.replace (already in mem_to_file)
  • Pattern configurable via memory.findings_to_wiki.patterns in config.yaml
  • Threshold: 2+ matches, min 100 chars → avoids false positives on casual mentions
  • sync_turn receives only text (no tool results) → sufficient for content detection
  • on_session_end receives full message history → can flush remaining context
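
As an illustration of the atomic-write decision, here is a minimal sketch of an append helper using tempfile + os.replace. The signature, the exact-line dedup rule, and the 200-line limit are assumptions; the issue only says "atomic write, dedup, trim to limit" without giving details.

```python
import os
import tempfile
from pathlib import Path


def append_to_memory(fact: str, path: Path, max_lines: int = 200) -> bool:
    """Append fact to the memory file atomically; return False for duplicates."""
    path.parent.mkdir(parents=True, exist_ok=True)
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    if fact in lines:
        return False  # exact-line dedup (assumed rule)
    lines.append(fact)
    lines = lines[-max_lines:]  # keep only the most recent entries
    # The temp file lives in the target directory so os.replace never crosses filesystems.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write("\n".join(lines) + "\n")
        os.replace(tmp_name, path)  # readers see either the old or the new file
    except BaseException:
        os.unlink(tmp_name)
        raise
    return True
```

Because the whole file is rewritten and swapped in one step, a crash mid-write leaves the previous MEMORY.md untouched rather than a half-written one.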

Comparison with existing providers

| Aspect | Honcho/Mem0 | findings_to_wiki |
|---|---|---|
| Storage | External DB | MEMORY.md (built-in) |
| Scope | Cross-session user modeling | Conversation auto-logging |
| LLM cost | ~$0.01-0.10/turn | $0 (regex-only) |
| API needed | Honcho/Mem0 cloud or self-host | None |
| Setup | API key + pip install | 0 deps |

Next steps

Happy to submit a PR with the cleaned-up implementation. The current prototype (323 lines, includes wiki-specific code) needs ~150 lines trimmed — happy to do that and make it config-friendly.


Context: I built this for my own setup after getting tired of manually calling memory(). The finding detection was a side effect that turned out more useful than expected — it catches structured analysis that would otherwise be lost between sessions.

Future work: associative session search via co-occurrence graph

Beyond auto-populating memory, there's a complementary problem: finding sessions by meaning, not by keywords.

The gap

session_search (FTS5) is exact-match. It finds "RSI" if you search "RSI". But it cannot find RSI-related sessions when you search:

| What user types | What they mean | FTS5 result | What's missing |
|---|---|---|---|
| перекупленность (overbought) | "find sessions where we discussed RSI > 70" | ❌ 0 results | RSI sessions tagged with different vocabulary |
| деплой сломался (the deploy broke) | "find similar incidents" | ❌ depends on exact wording | Past incident patterns use different terms |
| ELK stack | "logs, monitoring, dashboards" | ❌ unless docs literally say ELK | Conceptual search across documents |
| клиент жалуется (the client is complaining) | "contract problems, deadlines, payment" | ❌ partial | Fragmented across CRM, email, meeting notes |

How co-occurrence PPR solves this

If session A mentions "RSI" + "перекупленность" (overbought) + "индикатор" (indicator), and session B mentions "индикатор" + "MACD" + "сигнал" (signal), then:

  • Search перекупленность → FTS5 finds only session A
  • PPR on the co-occurrence graph: перекупленность → индикатор → MACD → session B is also ranked

The graph captures structural associations that no keyword search can match — not because of synonyms, but because terms that appear together in context get linked.
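
The two-session example can be reproduced with a small sketch. This is not the hipporag-lite.py prototype: it uses plain dictionaries instead of scipy sparse matrices, and treating each session as a node linked to its own terms is an assumption about how sessions get ranked.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(sessions):
    """Link every pair of nodes (terms plus the session id) that share a session."""
    graph = defaultdict(set)
    for session_id, terms in sessions.items():
        nodes = set(terms) | {session_id}
        for a, b in combinations(nodes, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def personalized_pagerank(graph, seed, damping=0.85, iterations=30):
    """Power iteration that restarts at `seed` with probability 1 - damping."""
    if seed not in graph:
        return {}
    scores = {node: 0.0 for node in graph}
    scores[seed] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        nxt[seed] = 1.0 - damping
        for node, mass in scores.items():
            if mass:
                # Spread this node's mass evenly over its neighbors.
                share = damping * mass / len(graph[node])
                for neighbor in graph[node]:
                    nxt[neighbor] += share
        scores = nxt
    return scores


sessions = {
    "session_A": ["RSI", "перекупленность", "индикатор"],  # overbought, indicator
    "session_B": ["индикатор", "MACD", "сигнал"],  # indicator, signal
}
graph = build_graph(sessions)
scores = personalized_pagerank(graph, seed="перекупленность")
ranked = sorted(sessions, key=scores.get, reverse=True)
```

Session B never contains the query term, yet it receives a non-zero score through the shared "индикатор" node, while session A still ranks first.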

Prototype exists

Our hipporag-lite.py (~75 lines of core PPR logic on top of scipy sparse matrices) builds a co-occurrence graph from sessions and runs Personalized PageRank (PPR) for retrieval. Key metrics:

| Metric | Value |
|---|---|
| PPR matrix size | 11K×11K (11K nodes), 608K edges (column-stochastic, 2.3MB on disk) |
| Query time | ~18ms (30 PPR iterations on sparse matrix) |
| Index time | ~60s full rebuild (not yet incremental) |
| Dependencies | scipy + numpy (already in most environments) |
| LLM cost | $0 |

The approach is lightweight enough to run as an optional enhancement to session_search — no API calls, no GPU, no external services.

What this would look like upstream

A config flag on session_search:

session_search:
  fallback: ppr          # after FTS5 returns < N results, run PPR
  ppr_damping: 0.85
  ppr_iterations: 30

Or as a separate MemoryProvider that provides associative retrieval alongside the existing FTS5 index. Happy to outline the architecture in a dedicated RFC if there's interest.
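
The fallback behavior implied by that flag could be sketched like this. The function name, the two callables, and the default threshold of 3 are hypothetical; the config leaves N unspecified, and nothing here reflects the actual session_search API.

```python
from typing import Callable, List


def search_with_fallback(
    query: str,
    fts_search: Callable[[str], List[str]],
    ppr_search: Callable[[str], List[str]],
    min_results: int = 3,
) -> List[str]:
    """Return FTS5 hits, topped up with PPR hits when FTS5 finds too few."""
    results = list(fts_search(query))
    if len(results) >= min_results:
        return results  # keyword search was enough; skip PPR entirely
    seen = set(results)
    for session_id in ppr_search(query):
        if session_id not in seen:
            results.append(session_id)
            seen.add(session_id)
    return results
```

Keeping FTS5 hits first preserves exact matches at the top, with associative results appended only when the keyword search comes up short.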

Update 2026-05-05: working prototype as a Hermes Agent tool

The PPR search is now running as a registered Hermes Agent tool (assoc_search):

  • Source: tools/assoc_search_tool.py (280 lines)
  • Auto-discovered by the tool registry (no monkey-patching)
  • Lazy imports (scipy loaded only on call, not at agent startup)
  • Uses the existing ppr_matrix.npz built by the indexing pipeline
  • Falls back gracefully when matrix unavailable
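
The lazy-import and graceful-fallback points above might be combined roughly as follows. This is a sketch, not the contents of tools/assoc_search_tool.py: it assumes ppr_matrix.npz was written with scipy.sparse.save_npz, and the `rank` callable and the returned dictionary shape are placeholders for the tool's real ranking step and output.

```python
from pathlib import Path


def load_ppr_matrix(path: Path):
    """Return the sparse matrix, or None when the file or scipy is unavailable."""
    if not path.exists():
        return None
    try:
        # Imported inside the function so scipy loads on first use, not at startup.
        from scipy import sparse
    except ImportError:
        return None
    return sparse.load_npz(path)


def assoc_search(query: str, matrix_path: Path, rank=None) -> dict:
    """Run the (hypothetical) rank step, or report that PPR is unavailable."""
    matrix = load_ppr_matrix(matrix_path)
    if matrix is None:
        # Degrade to an empty result instead of raising inside the agent loop.
        return {"available": False, "results": []}
    return {"available": True, "results": rank(matrix, query) if rank else []}
```

Checking for the file before importing scipy means an environment without an index never pays the import cost at all.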

The tool is registered under the search toolset alongside session_search. The agent can call it explicitly or via a workflow rule: "if session_search returns <3 results, call assoc_search with the same query."

A complete implementation would upstream this as an optional extension to session_search_tool.py by adding a ppr_fallback: true config flag that transparently runs PPR after FTS5.

Extended analysis

TL;DR

The proposed findings_to_wiki memory provider can be implemented by creating a Python class that inherits from MemoryProvider and overrides the sync_turn method to auto-populate MEMORY.md and detect structured findings via regex.

Guidance

  • Implement the FindingsToWikiProvider class with the sync_turn method that extracts a short fact from user and assistant content and appends it to MEMORY.md.
  • Use regex patterns to detect structured findings in assistant content and save them to a configurable output directory.
  • Configure the memory.findings_to_wiki.patterns setting in config.yaml to define the regex patterns for finding detection.
  • Consider implementing the on_session_end method to flush the last turn as an end-of-session fact.

Example

class FindingsToWikiProvider(MemoryProvider):
    def sync_turn(self, user_content, assistant_content, *, session_id=""):
        fact = extract_fact(user_content, assistant_content)
        if fact:
            append_to_memory(fact)
        if len(assistant_content) > 100:
            ftype = detect_pattern(assistant_content, config.patterns)
            if ftype:
                save_finding(assistant_content, ftype)

Notes

The implementation is proposed for plugins/memory/findings_to_wiki/__init__.py at roughly 150 lines of Python. sync_turn receives only text (no tool results), which the issue considers sufficient for content detection.

Recommendation

Implement the FindingsToWikiProvider class and configure the regex patterns in config.yaml. This auto-populates MEMORY.md and detects structured findings without requiring external storage or APIs.
