openclaw - 💡(How to fix) Fix [Bug]: memory_search textScore=0 for session transcript chunks — FTS never runs when KNN candidate pool misses dialogue-format chunks [1 comments, 2 participants]

openclaw/openclaw#80399 · Fetched 2026-05-11 03:15:07


Environment

  • OpenClaw: 2026.5.7 (eeef486)
  • Node.js: v22.22.2
  • OS: Linux 6.12.57+deb13-amd64 (Debian, x64)
  • Memory backend: builtin (sqlite-vec)
  • Embedding provider: OpenAI text-embedding-3-large (3072 dims); also reproduced with GitHub Copilot text-embedding-3-small (1536 dims)
  • Session memory: memorySearch.experimental.sessionMemory: true, sources: ["memory", "sessions"]

Description

When experimental.sessionMemory is enabled, memory_search consistently returns textScore: 0 for session transcript chunks — even when a direct FTS5 query on the same SQLite database finds those chunks immediately with strong BM25 scores.

The result is that topic-specific queries against past sessions silently fail: with textScore at 0, the combined score stays low, and the default minScore threshold filters the results out entirely, making the feature appear broken.

Root Cause

After inspecting the SQLite index directly, the retrieval pipeline appears to run sequentially rather than in parallel:

  1. Vector KNN runs first against chunks_vec, producing a candidate pool
  2. FTS BM25 re-ranks only those candidates
  3. If session chunks don't appear in the KNN pool, FTS never sees them → textScore: 0

Session transcript chunks are stored as long User: .../Assistant: ... dialogue blobs (~900–1800 chars). In embedding space, a short factual query phrase sits at cosine distance ~0.65–0.78 from dialogue chunks containing that exact phrase verbatim. This is true even with text-embedding-3-large at 3072 dims. The chunks form tight intra-session clusters (dist ~0.08 between siblings) but those clusters are geometrically far from short query vectors.

The result: session chunks never make the KNN cutoff, FTS re-ranking never fires, textScore: 0.
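The failure mode can be reproduced with a toy model. This is a minimal sketch, not OpenClaw's code: `sequential_search`, the sample chunks, the similarity values and the `k` cutoff are all hypothetical, and the keyword score is a plain term-overlap count standing in for BM25.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    rowid: int
    text: str
    vector_sim: float  # pretend cosine similarity to the query vector


def keyword_score(query: str, text: str) -> int:
    """Count query terms present in the text (a crude stand-in for BM25)."""
    words = set(text.lower().split())
    return sum(1 for term in query.lower().split() if term in words)


def sequential_search(query: str, chunks: list[Chunk], k: int) -> dict[int, int]:
    """Stage 1: keep the top-k chunks by vector similarity.
    Stage 2: compute keyword scores for those candidates only."""
    pool = sorted(chunks, key=lambda c: c.vector_sim, reverse=True)[:k]
    return {c.rowid: keyword_score(query, c.text) for c in pool}


chunks = [
    Chunk(1, "notes on hybrid cars and commuting", 0.62),
    Chunk(2, "general memory about vehicles", 0.55),
    # Dialogue-format chunk: contains the query terms but embeds far away.
    Chunk(131, "user: what is the fuel economy of a prius c assistant: ...", 0.36),
]

scored = sequential_search("prius fuel economy", chunks, k=2)
print(scored)         # only the KNN candidates receive a keyword score
print(131 in scored)  # the dialogue chunk was never scored at all
```

Chunk 131 matches every query term, but because it falls outside the top-k vector pool the keyword stage never evaluates it.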

Reproduction

# Config required:
# agents.defaults.memorySearch.provider = "openai"
# agents.defaults.memorySearch.experimental.sessionMemory = true
# agents.defaults.memorySearch.sources = ["memory", "sessions"]

# 1. Have a conversation about a specific topic (e.g. "What is the fuel economy of a Prius C?")
# 2. In a later session, run memory_search on that topic — returns empty:
#    memory_search("Prius C fuel economy mpg") → { results: [], hits: 0 }
#    even though the session is indexed and contains the exact phrase

# 3. Direct SQLite FTS query finds it immediately:
sqlite3 ~/.openclaw/memory/main.sqlite \
  "SELECT c.path, bm25(chunks_fts) as score
     FROM chunks_fts
     JOIN chunks c ON chunks_fts.rowid = c.rowid
     WHERE chunks_fts MATCH 'prius fuel economy'
     ORDER BY score LIMIT 5"
# Returns 3 hits with strong BM25 scores (e.g. -16.38, -13.90, -12.98; in FTS5, more negative means a better match)
# from the session that discussed Prius fuel economy
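For anyone who wants to try the same query shape without touching a real index, here is a small sketch against an in-memory database. The table names mirror the query above, but the column layout (`path`, `text`) and the sample rows are assumptions, not the actual OpenClaw schema. It needs an SQLite build with FTS5 enabled.

```python
import sqlite3

# In-memory stand-in for main.sqlite; the column layout is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (path TEXT, text TEXT)")
conn.execute("CREATE VIRTUAL TABLE chunks_fts USING fts5(text)")

rows = [
    ("sessions/a.jsonl", "User: What is the fuel economy of a Prius C? Assistant: It depends on the model year."),
    ("sessions/b.jsonl", "User: How do I rotate tires? Assistant: Follow the pattern in the owner's manual."),
    ("sessions/c.jsonl", "User: Remind me about the dentist appointment. Assistant: Noted."),
    ("sessions/d.jsonl", "User: Which editor theme did I pick? Assistant: The dark one."),
]
for path, text in rows:
    cur = conn.execute("INSERT INTO chunks (path, text) VALUES (?, ?)", (path, text))
    # Keep FTS rowids aligned with chunks rowids so the JOIN below works.
    conn.execute("INSERT INTO chunks_fts (rowid, text) VALUES (?, ?)", (cur.lastrowid, text))

hits = conn.execute(
    """
    SELECT c.path, bm25(chunks_fts) AS score
      FROM chunks_fts
      JOIN chunks c ON chunks_fts.rowid = c.rowid
     WHERE chunks_fts MATCH 'prius fuel economy'
     ORDER BY score
     LIMIT 5
    """
).fetchall()

for path, score in hits:
    print(path, round(score, 2))  # more negative means a better match
```

The keyword index finds the dialogue chunk from the terms alone, with no dependence on where the chunk sits in embedding space.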

Evidence

FTS works fine — raw SQL confirms chunks are indexed and keyword-searchable:

rowid=131 source=sessions bm25=-37.46  path=44e28424.jsonl  ← Prius Q&A chunk, BEST match
rowid=132 source=sessions bm25=-33.57
rowid=133 source=sessions bm25=-29.22

Embeddings exist and are correctly normalized — all 330 chunks have unit-vector embeddings in both chunks.embedding (JSON) and chunks_vec (float32 binary):

from math import sqrt
mag = sqrt(sum(x * x for x in embedding))  # embedding = one chunk's vector; → 1.000000 for all chunks
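The same check can be run on the binary form. This sketch assumes the chunks_vec blob is a packed array of little-endian float32 values; that layout is an assumption here, and `unpack_f32` and `magnitude` are illustrative helper names.

```python
import math
import struct


def unpack_f32(blob: bytes) -> list[float]:
    """Decode a blob of packed little-endian float32 values."""
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob))


def magnitude(vec: list[float]) -> float:
    """Euclidean length of the vector."""
    return math.sqrt(sum(x * x for x in vec))


# Build a unit vector and round-trip it through the binary form.
raw = [3.0, 4.0, 0.0]
norm = magnitude(raw)
unit = [x / norm for x in raw]
blob = struct.pack(f"<{len(unit)}f", *unit)

decoded = unpack_f32(blob)
print(len(blob), round(magnitude(decoded), 6))
```

Float32 rounding shifts the magnitude slightly away from exactly 1, so a tolerance-based comparison is safer than an equality check.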

KNN using the chunk's own vector finds it at rank 1 — confirming vec table integrity:

rank 1  dist=0.0000  sim=1.0000  rowid=131  sessions/44e28424.jsonl  ← itself
rank 2  dist=0.0688  sim=0.9312  rowid=132  sessions/44e28424.jsonl  ← sibling chunk
rank 3  dist=0.1401  sim=0.8599  rowid=133  sessions/44e28424.jsonl  ← sibling chunk
rank 4  dist=0.4978  sim=0.5022  rowid=346  sessions/current_session  ← large gap
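In the listing above, dist and sim sum to 1 on every row, which is consistent with cosine distance defined as 1 minus cosine similarity. A brute-force version of that ranking can be sketched as follows. The 2-D vectors are toy values chosen to mimic a tight cluster plus one outlier, not real embeddings, and `knn` is a hypothetical helper rather than the sqlite-vec query path.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)


def knn(query: list[float], table: dict[int, list[float]], k: int) -> list[tuple[int, float]]:
    """Rank every stored vector by distance to the query and keep the k nearest."""
    ranked = sorted(
        ((rowid, cosine_distance(query, vec)) for rowid, vec in table.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:k]


# Toy vectors: 131-133 form a tight cluster, 346 sits elsewhere.
table = {
    131: [1.0, 0.0],
    132: [0.95, 0.31],
    133: [0.9, 0.44],
    346: [0.5, 0.87],
}

for rank, (rowid, dist) in enumerate(knn(table[131], table, k=4), start=1):
    print(f"rank {rank}  dist={dist:.4f}  sim={1 - dist:.4f}  rowid={rowid}")
```

Querying with a chunk's own vector always puts that chunk at distance 0, which is why this test confirms table integrity but says nothing about how a short query phrase will embed.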

But the actual query vector lands far from the chunk — calling memory_search with minScore=0 returns only ~0.25–0.36 vectorScore for session chunks, with textScore=0:

{ "vectorScore": 0.358, "textScore": 0, "score": 0.250 }
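The numbers in that result are consistent with a weighted sum of the two scores using a 0.7 vector weight and a 0.3 text weight (0.7 × 0.358 ≈ 0.25). That is an inference from this one result, not something confirmed from the source code. The sketch below works through the arithmetic; `fuse`, the weights and the `MIN_SCORE` value are assumptions for illustration.

```python
def fuse(vector_score: float, text_score: float,
         vector_weight: float = 0.7, text_weight: float = 0.3) -> float:
    """Weighted sum of the two scores. The weights are assumed, not confirmed."""
    return vector_weight * vector_score + text_weight * text_score


observed = fuse(0.358, 0.0)   # textScore is 0 because FTS never saw the chunk
with_text = fuse(0.358, 1.0)  # same chunk if it had received a full text score

MIN_SCORE = 0.35  # illustrative threshold, not necessarily the real default
print(round(observed, 3), observed >= MIN_SCORE)
print(round(with_text, 3), with_text >= MIN_SCORE)
```

Under these assumptions, the missing text component alone is enough to decide whether the chunk clears the threshold.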

The ~0.50 cosine distance gap between the Prius session cluster and other session content confirms that session chunks are geometrically isolated — a query must land very close to the right cluster to make the KNN pool.

Expected Behavior

FTS should run in parallel with vector KNN (or at minimum with a broader candidate pool), so that keyword-strong chunks can surface even when the query vector doesn't land near them. A chunk that scores -37 BM25 on the exact query terms should not return textScore=0.

This is especially important for session transcript chunks, which are inherently in a different embedding-space region than short factual queries due to their conversational format.
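One way to picture the requested behavior is to run both retrievals independently and merge their candidate sets. This is a minimal sketch under stated assumptions: `hybrid_search`, the weights and the sample scores are hypothetical, and the text scores are assumed to be already normalized to the 0..1 range.

```python
def hybrid_search(vector_hits: dict[int, float], text_hits: dict[int, float],
                  vector_weight: float = 0.7, text_weight: float = 0.3) -> list[tuple[int, float]]:
    """Merge two independently retrieved candidate sets.
    A chunk missing from one set contributes 0 for that component."""
    rowids = set(vector_hits) | set(text_hits)
    merged = [
        (rowid,
         vector_weight * vector_hits.get(rowid, 0.0)
         + text_weight * text_hits.get(rowid, 0.0))
        for rowid in rowids
    ]
    return sorted(merged, key=lambda pair: pair[1], reverse=True)


# The KNN pool misses chunk 131; the keyword search finds it on its own.
vector_hits = {1: 0.62, 2: 0.55}
text_hits = {131: 1.0, 132: 0.9}

ranked = hybrid_search(vector_hits, text_hits)
for rowid, score in ranked:
    print(rowid, round(score, 3))
```

Here a keyword-only hit gets a vector component of 0. A fuller version could look up the real vector similarity for those rows so they are not penalized twice, but even this simple union gives keyword-strong chunks a nonzero score.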

Workaround

Dreaming (memory-core) partially mitigates this — distilled prose promoted to MEMORY.md embeds well and scores correctly. But raw session indexing remains unreliable for topic-specific recall.

Related Issues

  • #48711 — broad memory recall reliability
  • #51386 — graduate sessionMemory from experimental
