openclaw - ✅(Solved) Fix [Feature]: expose vectorScore and textScore in hybrid search results [1 pull requests, 1 participants]

GitHub stats
openclaw/openclaw#68166 · Fetched 2026-04-18 05:53:40
Comments: 0 · Participants: 1 · Timeline: 3 · Reactions: 0
Timeline (top): cross-referenced ×1, labeled ×1, referenced ×1

mergeHybridResults computes a per-result vectorScore (cosine similarity) and textScore (BM25) but discards them when building the final result object; only the weighted combined score survives. Plugins hooking after_tool_call for memory_search receive the full result payload, including score, but have no way to determine whether relevance came from semantic similarity or keyword matching.

Root Cause

There is currently no way to benchmark the quality of memory retrieval across embedding models, providers, and query patterns over time. The combined score alone does not show whether a result ranked high because of strong vector similarity, strong keyword overlap, or a mix. That distinction matters when evaluating whether to change embedding models, adjust hybrid weights, or diagnose why an agent is retrieving irrelevant context.
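As a rough illustration of the benchmarking this would enable, the sketch below averages the two component scores over a batch of results. The names ScoredResult, ComponentSummary, and summarizeComponents are hypothetical and not part of openclaw; only score, vectorScore, and textScore come from the issue.

```typescript
// Hypothetical result shape: only score, vectorScore, and textScore are taken from the issue.
interface ScoredResult {
  score: number;
  vectorScore?: number;
  textScore?: number;
}

// Hypothetical summary type for comparing runs (for example, two embedding models).
interface ComponentSummary {
  counted: number; // results that carried both component scores
  skipped: number; // results without component scores (e.g. hybrid search inactive)
  meanVectorScore: number;
  meanTextScore: number;
}

function summarizeComponents(results: ScoredResult[]): ComponentSummary {
  let counted = 0;
  let skipped = 0;
  let vectorSum = 0;
  let textSum = 0;

  for (const r of results) {
    // The fields are optional, so results lacking either one are skipped rather than treated as zero.
    if (r.vectorScore === undefined || r.textScore === undefined) {
      skipped += 1;
      continue;
    }
    counted += 1;
    vectorSum += r.vectorScore;
    textSum += r.textScore;
  }

  return {
    counted,
    skipped,
    meanVectorScore: counted > 0 ? vectorSum / counted : 0,
    meanTextScore: counted > 0 ? textSum / counted : 0,
  };
}
```

Running the same queries against two embedding models and comparing the two summaries would show whether a change in ranking came from the vector side or the keyword side.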

Fix Action

Fix / Workaround

A manual patch was considered, but it would be overwritten on every version upgrade.

Affected: users · Severity: blocks evaluation · Frequency: every memory search · Consequence: manual workaround

PR fix notes

PR #68286: feat(memory-core): expose vectorScore and textScore in hybrid search results

Description (problem / solution / changelog)

Summary

  • Problem: mergeHybridResults computes vectorScore and textScore per result but drops them when building the return object. Only the weighted combined score survives.
  • Why it matters: operators running multi-agent setups need to compare retrieval quality across embedding models and weight configs. The combined score hides which component drove the ranking.
  • What changed: two fields (vectorScore, textScore) now pass through the merge result in hybrid.ts. MemorySearchResult in memory-host-sdk adds them as optional fields.
  • What did NOT change: scoring math, temporal decay, MMR re-ranking, minScore filtering. The combined score field stays the primary ranking signal.
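The pass-through described above can be sketched roughly as follows. This is a simplified stand-in, not the code in hybrid.ts: the hit shapes, the function name mergeHybridSketch, and the weight parameters are assumptions for illustration, and temporal decay, MMR re-ranking, and minScore filtering are left out.

```typescript
// Hypothetical input shapes; the real types in hybrid.ts may differ.
interface VectorHit { id: string; snippet: string; vectorScore: number; }
interface KeywordHit { id: string; snippet: string; textScore: number; }

interface MergedResult {
  id: string;
  snippet: string;
  score: number;       // weighted combined score, still the primary ranking signal
  vectorScore: number; // component score, now carried through
  textScore: number;   // component score, now carried through
}

function mergeHybridSketch(
  vector: VectorHit[],
  keyword: KeywordHit[],
  vectorWeight: number,
  textWeight: number,
): MergedResult[] {
  const byId: Record<string, { id: string; snippet: string; vectorScore: number; textScore: number }> = {};

  // Vector-only hits start with a text component of 0.
  for (const hit of vector) {
    byId[hit.id] = { id: hit.id, snippet: hit.snippet, vectorScore: hit.vectorScore, textScore: 0 };
  }
  // Keyword hits either join an existing entry (overlapping id) or start with a vector component of 0.
  for (const hit of keyword) {
    const existing = byId[hit.id];
    if (existing) {
      existing.textScore = hit.textScore;
    } else {
      byId[hit.id] = { id: hit.id, snippet: hit.snippet, vectorScore: 0, textScore: hit.textScore };
    }
  }

  return Object.keys(byId)
    .map((id) => {
      const entry = byId[id];
      return {
        id: entry.id,
        snippet: entry.snippet,
        score: vectorWeight * entry.vectorScore + textWeight * entry.textScore,
        // Keeping these two fields is the whole change; dropping them reproduces the old behavior.
        vectorScore: entry.vectorScore,
        textScore: entry.textScore,
      };
    })
    .sort((a, b) => b.score - a.score);
}
```

Because the combined score is computed exactly as before, ranking is unaffected; the two extra fields only expose inputs that were already available at merge time.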

Change Type

  • Feature

Scope

  • Memory / storage
  • API / contracts

Linked Issue/PR

  • Closes #68166

Root Cause

N/A

Regression Test Plan

  • Coverage level:
    • Unit test
  • Target test: extensions/memory-core/src/memory/hybrid.test.ts
  • Scenario: existing mergeHybridResults tests now assert vectorScore and textScore on each result. The cases cover disjoint ids (one component is 0) and overlapping ids (both components contribute).
  • Why smallest reliable: the merge function is pure. Unit tests cover the full branch space; no live gateway required.
  • Existing test that already covers this: hybrid.test.ts covered the combined score; six new assertions extend those same tests.
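The kind of per-result assertion described in the test plan could look roughly like the helper below. It is framework-free and does not reproduce the assertions in hybrid.test.ts; the names ScoredResult, ExpectedComponents, and findComponentMismatches are hypothetical.

```typescript
// Hypothetical shapes for illustration only.
interface ScoredResult { id: string; score: number; vectorScore?: number; textScore?: number; }
interface ExpectedComponents { vectorScore: number; textScore: number; }

// Returns a list of human-readable mismatches; an empty list means every expectation held.
function findComponentMismatches(
  results: ScoredResult[],
  expected: Record<string, ExpectedComponents>,
  epsilon: number = 1e-9,
): string[] {
  const problems: string[] = [];

  for (const id of Object.keys(expected)) {
    const want = expected[id];
    const got = results.filter((r) => r.id === id)[0];
    if (!got) {
      problems.push(`${id}: missing from results`);
      continue;
    }
    // A missing field counts as a failure, since the point is that both are exposed.
    if (got.vectorScore === undefined || Math.abs(got.vectorScore - want.vectorScore) > epsilon) {
      problems.push(`${id}: vectorScore ${got.vectorScore} != ${want.vectorScore}`);
    }
    if (got.textScore === undefined || Math.abs(got.textScore - want.textScore) > epsilon) {
      problems.push(`${id}: textScore ${got.textScore} != ${want.textScore}`);
    }
  }
  return problems;
}
```

A disjoint-id case would expect one component to be 0 for each result, while an overlapping-id case would expect both components to be non-zero for the shared id.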

User-visible / Behavior Changes

  • MemorySearchResult adds optional vectorScore and textScore fields.
  • memory_search results include component scores when hybrid search is active. Consumers that don't read these fields see no difference.
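A consumer that does read the new fields might use them along these lines. This is a sketch under assumptions: the type MemorySearchResultLike, the function dominantSignal, the weight parameters, and the 0.5 "mixed" threshold are all hypothetical, and the caller is assumed to know the hybrid weights in use.

```typescript
// Hypothetical subset of the result type; only these three field names come from the PR.
interface MemorySearchResultLike {
  score: number;
  vectorScore?: number;
  textScore?: number;
}

type Signal = "vector" | "text" | "mixed" | "unknown";

// Classifies which weighted component contributed most to the combined score.
function dominantSignal(
  result: MemorySearchResultLike,
  vectorWeight: number,
  textWeight: number,
  mixedRatio: number = 0.5, // hypothetical threshold: smaller share >= 50% of larger counts as mixed
): Signal {
  // The fields are optional, so they may be absent when hybrid search is not active.
  if (result.vectorScore === undefined || result.textScore === undefined) {
    return "unknown";
  }
  const vectorPart = vectorWeight * result.vectorScore;
  const textPart = textWeight * result.textScore;
  const larger = Math.max(vectorPart, textPart);
  const smaller = Math.min(vectorPart, textPart);

  if (larger === 0) return "unknown"; // neither component contributed
  if (smaller >= mixedRatio * larger) return "mixed";
  return vectorPart > textPart ? "vector" : "text";
}
```

Consumers that ignore the fields are unaffected, which matches the backward-compatibility claim: the classification is purely additive on top of the existing score.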

Diagram

N/A

Security Impact

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: Ubuntu 24.04 (Linux 6.14)
  • Runtime: Node 22.22.0, pnpm 10.32.1
  • Model/provider: N/A (unit tests only)

Steps

  1. pnpm install
  2. node scripts/test-extension.mjs memory-core

Expected

hybrid.test.ts passes with vectorScore/textScore assertions.

Actual

46 passed, 1 pre-existing failure (dreaming-phases fixture paths, unrelated, same on main).

Evidence

  • Failing test/log before + passing after

Human Verification

  • Verified: full memory-core suite (47 files, 431 tests passing).
  • Edge cases: overlapping ids, disjoint ids, keyword-only results.
  • Not verified: live gateway end-to-end with a real embedding provider.

Review Conversations

  • N/A -- initial submission.

Compatibility / Migration

  • Backward compatible? Yes, new fields are optional.
  • Config/env changes? No
  • Migration needed? No

Disclosure

Claude Code helped me straighten out my word salad for this PR.

Changed files

  • extensions/memory-core/src/memory/hybrid.test.ts (modified, +6/-0)
  • extensions/memory-core/src/memory/hybrid.ts (modified, +4/-0)
  • packages/memory-host-sdk/src/host/types.ts (modified, +2/-0)
  • src/memory-host-sdk/host/types.ts (modified, +2/-0)

Issue body

Summary

mergeHybridResults computes a per-result vectorScore (cosine similarity) and textScore (BM25) but discards them when building the final result object; only the weighted combined score survives. Plugins hooking after_tool_call for memory_search receive the full result payload, including score, but have no way to determine whether relevance came from semantic similarity or keyword matching.

Problem to solve

There is currently no way to benchmark the quality of memory retrieval across embedding models, providers, and query patterns over time. The combined score alone does not show whether a result ranked high because of strong vector similarity, strong keyword overlap, or a mix. That distinction matters when evaluating whether to change embedding models, adjust hybrid weights, or diagnose why an agent is retrieving irrelevant context.

The data already exists at merge time; it's two fields that get dropped in mergeHybridResults before the result is returned.

Proposed solution

Carry vectorScore and textScore through the merge result and add them as optional fields on MemorySearchResult.

Alternatives considered

A manual patch was considered, but it would be overwritten on every version upgrade.

Impact

Affected: users · Severity: blocks evaluation · Frequency: every memory search · Consequence: manual workaround

Evidence/examples

No response

Additional information

No response

extent analysis

TL;DR

Carry vectorScore and textScore through the merge result and add them as optional fields on MemorySearchResult to enable benchmarking of memory retrieval quality.

Guidance

  • Modify the mergeHybridResults function to include vectorScore and textScore in the final result object.
  • Update the MemorySearchResult data structure to accommodate the additional fields.
  • Consider adding documentation to explain the meaning and usage of the new fields.
  • Evaluate the impact of adding these fields on existing plugins and workflows.

Example

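// Example of an updated MemorySearchResult with optional component scores
interface MemorySearchResult {
  // ... existing fields ...
  vectorScore?: number; // cosine-similarity component, present when hybrid search is active
  textScore?: number;   // BM25 keyword component, present when hybrid search is active
}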

Notes

Per the issue, vectorScore and textScore are already computed at merge time, so no additional calculation is needed; the change only passes them through to the result.

Recommendation

Adopt the change from PR #68286: carry vectorScore and textScore through the merge result and add them as optional fields on MemorySearchResult. This enables benchmarking of memory retrieval quality without a local patch that a version upgrade would overwrite.
