hermes: Verify retrieval fix on fresh, unrelated queries

hermes, 2026-05-29 14:53:11

Summary

Live verification after the #103 junk-classification fix and #104 relevance-ranking fix shows the system is behaving correctly on fresh, unrelated queries.

What was tested

I intentionally avoided the same old repeat queries and used different prompts to check whether retrieval was still overfitting to previous topics.

Fresh query set

Portfolio site / readme-roulette
GCP billing / trial
HP Linux Mint SSH
SimpleFin Chase cards
Nonsense query: quantum blockchain NFT parrot recipes

Observed behavior

1) Topic-relevant queries now return topic-relevant context

Portfolio / readme-roulette returned the deployed web app context.
GCP billing / trial returned profile-level GCP context.
HP Linux Mint SSH returned the wiped media-server / local-network context.
SimpleFin Chase cards returned the card / transaction context.

2) Irrelevant queries degrade gracefully

The nonsense query did not fall back to a stale default topic. Instead, it reported these diagnostics:

fallback_reason: "no_topic_relevance_candidates"
lexical: 0
relevance_signals: []
current_state_only: true
current_state_only_penalty: 0.35
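
The graceful fallback described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual hermes code: the `Memory` record, the `tokenize` helper, and the `retrieve` function are hypothetical. Only the diagnostic field names (`fallback_reason`, `lexical`, `relevance_signals`) and the value "no_topic_relevance_candidates" are taken from the report.

```python
import re
from dataclasses import dataclass


@dataclass
class Memory:
    # Hypothetical record shape; the real store is not shown in the report.
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase alphanumeric tokens; a stand-in for the real tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, memories: list[Memory]) -> dict:
    """Return lexically matching memories, or an explicit empty fallback."""
    query_terms = tokenize(query)
    scored = []
    for memory in memories:
        overlap = len(query_terms & tokenize(memory.text))
        if overlap > 0:
            scored.append((overlap, memory))

    if not scored:
        # Nothing is topically relevant: say so instead of returning
        # whatever was written most recently.
        return {
            "results": [],
            "lexical": 0,
            "relevance_signals": [],
            "fallback_reason": "no_topic_relevance_candidates",
        }

    # Sort on the overlap count only; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return {
        "results": [memory.text for _, memory in scored],
        "lexical": scored[0][0],
        "relevance_signals": ["lexical_overlap"],
        "fallback_reason": None,
    }
```

With a store holding only a VPN fact and a GCP fact, the nonsense query would yield an empty result list plus the fallback reason, while "GCP billing / trial" would return only the GCP fact.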

3) Ranking is no longer dominated by the most recent write

Before the fix, unrelated queries repeatedly surfaced the same recent VPN / JioHotstar facts. That failure mode did not appear in this test run.

Signals seen in the new retrieval path

These diagnostics were visible in the latest results:

lexical_hits
lexical_overlap
keyword_hits
relevance_signals
current_state_bias
current_state_only_penalty
fallback_reason

The key behavior change is that lexical relevance now acts as a primary ranking signal, rather than recency and current-state bias dominating every result.
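
One way to picture lexical relevance as the primary signal is the scoring sketch below. It is an assumption-laden illustration, not the implementation behind #104: the `Candidate` fields, the weights, and the way the penalty is applied are hypothetical. The 0.35 default mirrors the `current_state_only_penalty` value seen in the diagnostics, but how the real system uses that value is not stated in the report.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical candidate shape for illustration only.
    text: str
    lexical_overlap: float  # 0..1, share of query terms found in the memory
    recency: float          # 0..1, where 1.0 is the most recent write


def score(
    candidate: Candidate,
    lexical_weight: float = 0.8,              # assumed weight
    recency_weight: float = 0.2,              # assumed weight
    current_state_only_penalty: float = 0.35,  # value from the diagnostics
) -> float:
    """Weight lexical overlap above recency; penalize recency-only matches."""
    base = (
        lexical_weight * candidate.lexical_overlap
        + recency_weight * candidate.recency
    )
    if candidate.lexical_overlap == 0:
        # The candidate is supported by current-state/recency alone,
        # so it is pushed down rather than allowed to win by default.
        base -= current_state_only_penalty
    return base


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates from highest to lowest score."""
    return sorted(candidates, key=score, reverse=True)
```

Under these assumed weights, an old memory that matches the query lexically outranks a very recent memory with no lexical overlap, which is the behavior the test run observed.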

Verification result

Junk classification stays fixed: spam-like input remains quarantined / forensic-only.
Retrieval ranking fix appears stable on fresh, unrelated queries.
Graceful fallback for no-match queries is working.

Remaining note

The profile still contains some overrepresented VPN-era facts from the earlier test session, but that is a profiling / consolidation issue rather than a ranking failure.

Why this matters

This is the difference between a memory system and a glorified last-write cache. The system now appears to answer the actual query instead of whatever it saw most recently.

Suggested follow-up

If further tightening is needed, the next thing to inspect is profile consolidation / topic clustering so repeated historical topics do not remain overrepresented in the static profile view.
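
As a starting point for that inspection, a rough check for overrepresented topics might look like the sketch below. It assumes each profile fact already carries a topic label, which the report does not confirm; the `overrepresented_topics` function and the 0.4 threshold are hypothetical.

```python
from collections import Counter


def overrepresented_topics(
    fact_topics: list[str], max_share: float = 0.4
) -> dict[str, float]:
    """Return topics whose share of profile facts exceeds max_share."""
    counts = Counter(fact_topics)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        topic: count / total
        for topic, count in counts.items()
        if count / total > max_share
    }
```

For example, a profile with six VPN-labeled facts out of ten would flag the VPN topic at a 0.6 share, marking it as a candidate for consolidation.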
