hermes - 💡(How to fix) Fix Feature: bundle findings_to_wiki — auto-populate memory + structured finding detection [1 participants]

hermes · 2026-05-05 08:34:58


GitHub stats

NousResearch/hermes-agent#20114 • Fetched 2026-05-06 06:38:41


Author

NikolayGusev-astra

Participants

NikolayGusev-astra

Timeline (top)

labeled ×4, renamed ×1

Bundle a lightweight findings_to_wiki memory provider that auto-populates MEMORY.md from conversation turns AND detects structured findings (analysis reports, decisions, key findings) for downstream curation.



Summary

Unlike existing providers (Honcho, Mem0, OpenViking), findings_to_wiki is not a storage backend — it's a content detection layer that feeds the built-in MEMORY.md.

What it does

Two things in sync_turn():

1. Auto-populate MEMORY.md. After every user↔assistant exchange, extracts a short fact (timestamp + topic + key insight) and writes to ~/.hermes/memories/MEMORY.md. Skips trivial turns (thanks, ok, 👍) via a skip-list.

Without this, memory is ONLY populated by explicit memory() tool calls — the agent has to remember to save. With auto-populate, every turn potentially adds a fact.

2. Auto-detect structured findings. Patterns matched via configurable regex:

Prism analyses (## Findings, ## Conservation Law, ## Findings Table)
ADRs (## Decision, ## Status, ## Context)
Research reports (## Key Findings, ## Verification, ## Recommendations)

When two or more markers match and the response is at least 100 characters long, the full analysis is saved for downstream curation (LLM Wiki, other KB).
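The detection rule above can be sketched roughly as follows. This is an illustration, not the prototype's code: the `PATTERNS` table, the finding-type names, the `detect_pattern` signature, and the exact regexes are assumptions built from the heading markers listed above.

```python
import re

# Hypothetical pattern table: finding type -> heading markers taken from the list above.
PATTERNS = {
    "prism_analysis": [r"^## Findings\s*$", r"^## Conservation Law\s*$", r"^## Findings Table\s*$"],
    "adr": [r"^## Decision\s*$", r"^## Status\s*$", r"^## Context\s*$"],
    "research_report": [r"^## Key Findings\s*$", r"^## Verification\s*$", r"^## Recommendations\s*$"],
}

MIN_CHARS = 100    # minimum response length before any matching is attempted
MIN_MARKERS = 2    # distinct markers that must match within one finding type


def detect_pattern(text, patterns=PATTERNS):
    """Return the first finding type with at least MIN_MARKERS matching markers, else None."""
    if len(text) < MIN_CHARS:
        return None
    for ftype, markers in patterns.items():
        hits = sum(1 for marker in markers if re.search(marker, text, flags=re.MULTILINE))
        if hits >= MIN_MARKERS:
            return ftype
    return None
```

Anchoring each marker to a full heading line means a casual mention of "## Decision" inside a sentence does not count, which is the false-positive case the two-marker threshold is meant to avoid.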

Why now

The MemoryProvider ABC (agent/memory_provider.py) and its hooks (sync_turn, on_session_end) have been stable for a long time. Currently no bundled provider uses these hooks for content detection — they all focus on external storage.

The mem_to_file provider (the minimal built-in) dumps raw conversation history. findings_to_wiki is a natural upgrade: it writes directly to MEMORY.md (which is already injected into the system prompt) and detects structure where mem_to_file sees blobs.

Proposed implementation

~150 lines of Python in plugins/memory/findings_to_wiki/__init__.py:

class FindingsToWikiProvider(MemoryProvider):
    def sync_turn(self, user_content, assistant_content, *, session_id=""):
        # 1. Save short fact to MEMORY.md
        fact = extract_fact(user_content, assistant_content)
        if fact:
            append_to_memory(fact)  # atomic write, dedup, trim to limit
        
        # 2. Detect structured findings via regex
        if len(assistant_content) > 100:
            ftype = detect_pattern(assistant_content, config.patterns)
            if ftype:
                save_finding(assistant_content, ftype)  # to configurable output dir
    
    def on_session_end(self, messages):
        # Flush last turn as [end-of-session] fact
        ...
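The `extract_fact` helper called above is not shown in the issue. A minimal sketch of what it might look like, assuming a fact is a single line of timestamp, topic, and key insight, and that trivial turns are filtered with a skip-list; `SKIP_LIST`, `MAX_LEN`, `_first_sentence`, and the output format are all hypothetical:

```python
from datetime import datetime, timezone

# Hypothetical skip-list of trivial user turns (the issue names "thanks", "ok", "👍").
SKIP_LIST = {"thanks", "thank you", "ok", "okay", "👍"}
MAX_LEN = 120  # assumed per-field cap, not taken from the issue


def _first_sentence(text, limit=MAX_LEN):
    """Collapse whitespace and keep the first sentence, truncated to `limit` characters."""
    flat = " ".join(text.split())
    head = flat.split(". ", 1)[0]
    return head[:limit].rstrip()


def extract_fact(user_content, assistant_content, now=None):
    """Return a one-line fact, or None when the turn is trivial or empty."""
    normalized = user_content.strip().lower().rstrip("!.")
    if not normalized or normalized in SKIP_LIST:
        return None
    if not assistant_content.strip():
        return None
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%d %H:%M")
    topic = _first_sentence(user_content)
    insight = _first_sentence(assistant_content)
    return f"[{stamp}] {topic} | {insight}"
```

Passing `now` explicitly keeps the function deterministic in tests; in the provider it would default to the current UTC time.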

Key design decisions:

Atomic writes via tempfile + os.replace (already in mem_to_file)
Pattern configurable via memory.findings_to_wiki.patterns in config.yaml
Threshold: 2+ matches, min 100 chars → avoids false positives on casual mentions
sync_turn receives only text (no tool results) → sufficient for content detection
on_session_end receives full message history → can flush remaining context
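The atomic-write decision above (tempfile + os.replace, with dedup and trimming) might look roughly like this. It is a sketch under assumptions: the `append_to_memory` signature with an explicit `path`, the line-based dedup, and the `max_lines` limit are illustrative, since the issue only says "atomic write, dedup, trim to limit".

```python
import os
import tempfile
from pathlib import Path


def append_to_memory(fact, path, max_lines=200):
    """Append `fact` to the file at `path` atomically.

    Exact duplicate lines are skipped and only the newest `max_lines` lines are kept.
    Returns True if the fact was written, False if it was a duplicate.
    """
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    if fact in lines:
        return False
    lines = (lines + [fact])[-max_lines:]
    # Write to a temp file in the same directory, then swap it in with os.replace,
    # so a reader never observes a half-written file.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write("\n".join(lines) + "\n")
        os.replace(tmp_name, path)
    except BaseException:
        os.unlink(tmp_name)
        raise
    return True
```

The temp file is created next to the target because `os.replace` is only atomic when source and destination are on the same filesystem.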

Comparison with existing providers

Aspect	Honcho/Mem0	findings_to_wiki
Storage	External DB	MEMORY.md (built-in)
Scope	Cross-session user modeling	Conversation auto-logging
LLM cost	~$0.01-0.10/turn	$0 (regex-only)
API needed	Honcho/Mem0 cloud or self-host	None
Setup	API key + pip install	0 deps

Next steps

Happy to submit a PR with the cleaned-up implementation. The current prototype (323 lines, includes wiki-specific code) needs ~150 lines trimmed — happy to do that and make it config-friendly.

Context: I built this for my own setup after getting tired of manually calling memory(). The finding detection was a side effect that turned out more useful than expected — it catches structured analysis that would otherwise be lost between sessions.

Future work: associative session search via co-occurrence graph

Beyond auto-populating memory, there's a complementary problem: finding sessions by meaning, not by keywords.

The gap

session_search (FTS5) is exact-match. It finds "RSI" if you search "RSI". But it cannot find RSI-related sessions when you search:

What user types	What they mean	FTS5 result	What's missing
`перекупленность` (overbought)	"find sessions where RSI > 70 was discussed"	❌ 0 results	RSI sessions tagged with different vocabulary
`деплой сломался` (the deploy broke)	"find similar incidents"	❌ depends on exact wording	Past incident patterns use different terms
`ELK stack`	"logs, monitoring, dashboards"	❌ unless docs literally say ELK	Conceptual search across documents
`клиент жалуется` (the customer is complaining)	"contract problems, deadlines, payment"	❌ partial	Fragmented across CRM, email, meeting notes

How co-occurrence PPR solves this

If session A mentions "RSI" + "перекупленность" (overbought) + "индикатор" (indicator), and session B mentions "индикатор" + "MACD" + "сигнал" (signal), then:

Search перекупленность → FTS5 finds only session A
PPR on the co-occurrence graph: перекупленность → индикатор → MACD → session B also ranked

The graph captures structural associations that no keyword search can match — not because of synonyms, but because terms that appear together in context get linked.
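The two-session example can be reproduced with a toy implementation. This is a dependency-free sketch using plain dictionaries; the prototype described in the issue uses scipy sparse matrices instead. The function names are hypothetical, and the terms are written in English for readability.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(sessions):
    """Link every pair of terms that appear in the same session (undirected, weighted)."""
    graph = defaultdict(lambda: defaultdict(float))
    for terms in sessions.values():
        for a, b in combinations(sorted(set(terms)), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def personalized_pagerank(graph, seeds, damping=0.85, iterations=30):
    """Power iteration; with probability 1 - damping the walk restarts at a seed term."""
    seeds = sorted({s for s in seeds if s in graph})
    if not seeds:
        return {}
    restart = {s: 1.0 / len(seeds) for s in seeds}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {node: (1.0 - damping) * restart.get(node, 0.0) for node in graph}
        for node, mass in rank.items():
            total = sum(graph[node].values())
            for neighbor, weight in graph[node].items():
                nxt[neighbor] += damping * mass * weight / total
        rank = nxt
    return rank


def score_sessions(sessions, term_rank):
    """Score each session by the summed rank of the distinct terms it contains."""
    return {sid: sum(term_rank.get(t, 0.0) for t in set(terms))
            for sid, terms in sessions.items()}


SESSIONS = {
    "A": ["RSI", "overbought", "indicator"],
    "B": ["indicator", "MACD", "signal"],
}
graph = build_cooccurrence(SESSIONS)
term_rank = personalized_pagerank(graph, ["overbought"])
scores = score_sessions(SESSIONS, term_rank)
keyword_hits = [sid for sid, terms in SESSIONS.items() if "overbought" in terms]
```

Here `keyword_hits` contains only session A, while `scores` gives session B a non-zero score through the shared "indicator" node, with A still ranked first.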

Prototype exists

Our hipporag-lite.py (~75 lines of core PPR logic on top of scipy sparse matrices) builds a co-occurrence graph from sessions and runs Personalized PageRank (PPR) for retrieval. Key metrics:

Metric	Value
PPR matrix size	11K×11K (11K nodes), 608K edges; column-stochastic, 2.3MB on disk
Query time	~18ms (30 PPR iterations on sparse matrix)
Index time	~60s full rebuild (not yet incremental)
Dependencies	scipy + numpy (already in most environments)
LLM cost	$0

The approach is lightweight enough to run as an optional enhancement to session_search — no API calls, no GPU, no external services.

What this would look like upstream

A config flag on session_search:

session_search:
  fallback: ppr          # after FTS5 returns < N results, run PPR
  ppr_damping: 0.85
  ppr_iterations: 30

Or as a separate MemoryProvider that provides associative retrieval alongside the existing FTS5 index. Happy to outline the architecture in a dedicated RFC if there's interest.

Update 2026-05-05: working prototype as a Hermes Agent tool

The PPR search is now running as a registered Hermes Agent tool (assoc_search):

Source: tools/assoc_search_tool.py (280 lines)
Auto-discovered by the tool registry (no monkey-patching)
Lazy imports (scipy loaded only on call, not at agent startup)
Uses the existing ppr_matrix.npz built by the indexing pipeline
Falls back gracefully when matrix unavailable
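The lazy-import and graceful-fallback points above could be implemented along these lines. This is a sketch, not the contents of tools/assoc_search_tool.py: the function names, the `rank_fn` parameter, and the empty-list fallback are assumptions. The only scipy call used is `scipy.sparse.load_npz`.

```python
from pathlib import Path


def load_ppr_matrix(path="ppr_matrix.npz"):
    """Return the sparse PPR matrix, or None when the file or scipy is unavailable."""
    matrix_path = Path(path)
    if not matrix_path.is_file():
        return None  # index has not been built yet
    try:
        # Lazy import: scipy is loaded only when the tool is actually called.
        from scipy import sparse
    except ImportError:
        return None
    try:
        return sparse.load_npz(str(matrix_path))
    except Exception:
        # Any load failure is treated as "matrix unavailable" in this sketch.
        return None


def assoc_search(query, rank_fn, path="ppr_matrix.npz"):
    """Run `rank_fn(matrix, query)` if the matrix loads; otherwise return no results."""
    matrix = load_ppr_matrix(path)
    if matrix is None:
        return []
    return rank_fn(matrix, query)
```

Checking for the file before importing scipy means an agent without a built index never pays the import cost at all.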

The tool is registered under the search toolset alongside session_search. The agent can call it explicitly or via a workflow rule: "if session_search returns <3 results, call assoc_search with the same query."

A complete implementation would upstream this as an optional extension to session_search_tool.py, adding a ppr_fallback: true config flag that transparently runs PPR after FTS5.
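The fallback behaviour described here can be sketched as a small wrapper. The two search backends are passed in as callables because their real signatures are not given in the issue; `search_with_fallback`, `min_results`, and the choice to return session IDs are assumptions for illustration.

```python
def search_with_fallback(query, fts_search, ppr_search, min_results=3, ppr_fallback=True):
    """Run the keyword search first; if it returns fewer than `min_results` hits,
    append associative (PPR) hits that were not already found."""
    hits = list(fts_search(query))
    if not ppr_fallback or len(hits) >= min_results:
        return hits
    seen = set(hits)
    for session_id in ppr_search(query):
        if session_id not in seen:
            seen.add(session_id)
            hits.append(session_id)
    return hits
```

Keyword hits stay at the front of the list, so exact matches are never displaced by associative ones.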

extent analysis

TL;DR

The proposed findings_to_wiki memory provider can be implemented by creating a Python class that inherits from MemoryProvider and overrides the sync_turn method to auto-populate MEMORY.md and detect structured findings via regex.

Guidance

Implement the FindingsToWikiProvider class with the sync_turn method that extracts a short fact from user and assistant content and appends it to MEMORY.md.
Use regex patterns to detect structured findings in assistant content and save them to a configurable output directory.
Configure the memory.findings_to_wiki.patterns setting in config.yaml to define the regex patterns for finding detection.
Consider implementing the on_session_end method to flush the last turn as an end-of-session fact.

Example

class FindingsToWikiProvider(MemoryProvider):
    def sync_turn(self, user_content, assistant_content, *, session_id=""):
        fact = extract_fact(user_content, assistant_content)
        if fact:
            append_to_memory(fact)
        if len(assistant_content) > 100:
            ftype = detect_pattern(assistant_content, config.patterns)
            if ftype:
                save_finding(assistant_content, ftype)

Notes

The implementation would live in plugins/memory/findings_to_wiki/__init__.py and come to around 150 lines of Python. sync_turn receives only text (no tool results), which is sufficient for content detection.

Recommendation

Implement the FindingsToWikiProvider class and configure the regex patterns in config.yaml. This auto-populates MEMORY.md and detects structured findings without requiring external storage or APIs.
