openclaw: Hybrid search BM25 component penalizes multimodal (image/audio) results

openclaw/openclaw#44540 · Fetched 2026-04-08 00:45:30

Summary

When memorySearch.multimodal.enabled = true with Gemini embedding 2, image and audio files are properly indexed with valid embeddings. However, they never surface in memory_search results under the default hybrid search configuration.

Root Cause

Hybrid search computes: finalScore = vectorWeight × vectorScore + textWeight × textScore

Image/audio chunks have minimal text content (e.g., "Image file: generated/images/photo.png"), so their BM25 (text) score is near zero for any natural-language query. With the default weights (vectorWeight 0.7, textWeight 0.3), losing the text component is enough to push image results below text chunks that score on both signals.
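
To make the penalty concrete, the sketch below plugs hypothetical scores into the formula above. The score values and the helper name hybridScore are assumptions chosen for illustration, not measurements or identifiers from OpenClaw; only the 0.7/0.3 weights come from the issue.

```javascript
// Illustrative only: hybridScore and the input scores are hypothetical.
function hybridScore(vectorScore, textScore, vectorWeight, textWeight) {
  return vectorWeight * vectorScore + textWeight * textScore;
}

const vectorWeight = 0.7; // default weights quoted in the issue
const textWeight = 0.3;

// Image chunk: strong semantic match, but its stub text matches no query terms.
const imageScore = hybridScore(0.8, 0.0, vectorWeight, textWeight);
// Markdown chunk: weaker semantic match, but some keyword overlap.
const textChunkScore = hybridScore(0.6, 0.6, vectorWeight, textWeight);

console.log(imageScore.toFixed(2), textChunkScore.toFixed(2));
```

Under these assumed inputs the image chunk scores about 0.56 and the markdown chunk about 0.60, so the markdown chunk ranks first even though its vector match is weaker.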

Reproduction

  1. Configure multimodal memory search with Gemini embedding 2
  2. Index image files via extraPaths
  3. Run memory_search with a query describing image content (e.g., "lobster and dolphin underwater cartoon")
  4. Observe: only markdown text results returned, zero images

Workaround

Setting vectorWeight: 0.9, textWeight: 0.1 allows image results to surface (tested — images jump to #1 and #2 in results).
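
The following sketch compares the two weight pairs on three hypothetical chunks. It is meant to show why the workaround surfaces images and what it may cost for text-on-text queries. The chunk ids, the scores, and the helper rankIds are assumptions made up for this example; only the weight values are taken from the issue.

```javascript
// Illustrative only: chunk ids and scores are hypothetical.
const sampleChunks = [
  { id: "photo.png", vectorScore: 0.8, textScore: 0.0 },          // image, near-zero BM25
  { id: "keyword-notes.md", vectorScore: 0.55, textScore: 0.9 },  // exact keyword match
  { id: "loose-notes.md", vectorScore: 0.7, textScore: 0.0 },     // related, no keyword overlap
];

const defaultWeights = { vectorWeight: 0.7, textWeight: 0.3 };
const workaroundWeights = { vectorWeight: 0.9, textWeight: 0.1 };

// Rank chunk ids by the plain weighted sum, highest score first.
function rankIds(weights) {
  return sampleChunks
    .map((c) => ({
      id: c.id,
      score: weights.vectorWeight * c.vectorScore + weights.textWeight * c.textScore,
    }))
    .sort((a, b) => b.score - a.score)
    .map((c) => c.id);
}

console.log("default:", rankIds(defaultWeights));
console.log("workaround:", rankIds(workaroundWeights));
```

With these assumed scores the image moves to first place under 0.9/0.1, but the markdown chunk with the exact keyword match falls behind a chunk with no keyword overlap. That is the weakening of BM25 for text queries that a modality-aware fix would avoid.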

Suggested Fix

The hybrid merge function should detect when a chunk's source modality is non-text (image/audio) and skip the BM25 component for those chunks, using vector score only. Something like:

if (chunk.modality === 'image' || chunk.modality === 'audio') {
  finalScore = vectorScore;  // BM25 is meaningless for binary content
} else {
  finalScore = vectorWeight * vectorScore + textWeight * textScore;
}

This would let multimodal and text results compete fairly without requiring users to weaken BM25 for text-on-text queries.

Environment

  • OpenClaw 2026.3.11
  • Provider: gemini, model: gemini-embedding-2-preview
  • 296 indexed files (169 .md, 100 images, 27 audio)
  • Default hybrid config (no custom weights) reproduces the issue

extent analysis

Fix Plan

To fix the issue, we need to modify the hybrid merge function to handle non-text chunks (image/audio) differently. Here are the steps:

  • Update the hybridMerge function to check the chunk's modality
  • If the modality is 'image' or 'audio', use only the vector score
  • Otherwise, use the default hybrid scoring formula

Example code:

function hybridMerge(chunk, vectorScore, textScore, vectorWeight, textWeight) {
  if (chunk.modality === 'image' || chunk.modality === 'audio') {
    return vectorScore;  // BM25 is meaningless for binary content
  } else {
    return vectorWeight * vectorScore + textWeight * textScore;
  }
}
  • Replace the existing hybridMerge function with the updated one
  • No changes are needed to the indexing or search query code
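
The steps above can also be sketched as a ranking pass over a list of candidates. Everything here is an assumption for illustration: the candidate shape ({ id, modality, vectorScore, textScore }), the names scoreCandidate and rankCandidates, and the choice to treat a missing text score as 0. The actual merge code in OpenClaw may be structured differently.

```javascript
// Sketch only: names and the candidate shape are hypothetical.
const NON_TEXT_MODALITIES = new Set(["image", "audio"]);

function scoreCandidate(candidate, vectorWeight, textWeight) {
  if (NON_TEXT_MODALITIES.has(candidate.modality)) {
    return candidate.vectorScore; // skip the BM25 component for non-text chunks
  }
  // Assumption: a candidate without a text score contributes 0 for that term.
  const textScore = candidate.textScore ?? 0;
  return vectorWeight * candidate.vectorScore + textWeight * textScore;
}

function rankCandidates(candidates, vectorWeight = 0.7, textWeight = 0.3) {
  return candidates
    .map((c) => ({ ...c, finalScore: scoreCandidate(c, vectorWeight, textWeight) }))
    .sort((a, b) => b.finalScore - a.finalScore);
}

const ranked = rankCandidates([
  { id: "notes.md", modality: "text", vectorScore: 0.6, textScore: 0.6 },
  { id: "photo.png", modality: "image", vectorScore: 0.8, textScore: 0.0 },
  { id: "memo.mp3", modality: "audio", vectorScore: 0.5 },
]);
console.log(ranked.map((c) => c.id));
```

Note that a non-text chunk is scored as if its text score equalled its vector score, so it can outrank a text chunk with the same vector score and a lower BM25 score. Whether that counts as fair competition is a design choice worth checking against real queries.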

Verification

To verify the fix, follow these steps:

  • Index image and audio files using extraPaths
  • Run a memory_search query that describes image content (e.g., "lobster and dolphin underwater cartoon")
  • Check that image results are returned and ranked correctly
  • Test with different queries and modalities to ensure the fix is working as expected
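
Before running an end-to-end search, a quick scoring check along these lines may help confirm the patched logic. It repeats the hybridMerge function from the Fix Plan so that it runs on its own. The scores and the 'text' modality value are hypothetical.

```javascript
// Copy of the patched function from the Fix Plan, repeated so this check is self-contained.
function hybridMerge(chunk, vectorScore, textScore, vectorWeight, textWeight) {
  if (chunk.modality === 'image' || chunk.modality === 'audio') {
    return vectorScore; // vector score only for non-text chunks
  }
  return vectorWeight * vectorScore + textWeight * textScore;
}

// Hypothetical scores for one image chunk and one markdown chunk, default weights.
const patchedImage = hybridMerge({ modality: 'image' }, 0.8, 0.0, 0.7, 0.3);
const patchedText = hybridMerge({ modality: 'text' }, 0.6, 0.6, 0.7, 0.3);

if (patchedImage !== 0.8) {
  throw new Error('image chunks should use the vector score only');
}
if (!(patchedImage > patchedText)) {
  throw new Error('image should outrank the weaker text match');
}
console.log('modality-aware scoring checks passed');
```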

Extra Tips

  • The name hybridMerge is used here for illustration and may not match the actual function name in the OpenClaw source; locate the code that combines the vector and BM25 scores and apply the change there
  • Consider adding logging or debugging statements to verify that the updated function is being called correctly
  • If you're using a version control system, create a new branch for the fix and test it thoroughly before merging it into the main branch.
