openclaw - 💡(How to fix) Fix [Bug]: memory_search textScore=0 for session transcript chunks — FTS never runs when KNN candidate pool misses dialogue-format chunks [1 comments, 2 participants]

openclaw · 2026-05-10 19:20:40

openclaw/openclaw#80399 • Fetched 2026-05-11 03:15:07

Author: rqlangley

Participants: clawsweeper[bot], rqlangley

Timeline (top): closed ×1, commented ×1

When experimental.sessionMemory is enabled, memory_search consistently returns textScore: 0 for session transcript chunks — even when a direct FTS5 query on the same SQLite database finds those chunks immediately with strong BM25 scores.

The result is that topic-specific queries against past sessions silently fail. The default minScore threshold filters out results entirely, making the feature appear broken.

Root Cause

After inspecting the SQLite index directly, the architecture appears to be sequential, not parallel:

1. Vector KNN runs first against chunks_vec, producing a candidate pool
2. FTS BM25 re-ranks only those candidates
3. If session chunks don't appear in the KNN pool, FTS never sees them → textScore: 0

Session transcript chunks are stored as long User: .../Assistant: ... dialogue blobs (~900–1800 chars). In embedding space, a short factual query phrase sits at cosine distance ~0.65–0.78 from dialogue chunks containing that exact phrase verbatim. This is true even with text-embedding-3-large at 3072 dims. The chunks form tight intra-session clusters (dist ~0.08 between siblings) but those clusters are geometrically far from short query vectors.

The result: session chunks never make the KNN cutoff, so FTS re-ranking never fires and textScore stays 0.
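The suspected flow can be sketched in a few lines. This is a toy model of the hypothesis above, not OpenClaw's actual code: the `Chunk` class, `sequential_candidates`, the `knn_limit` parameter, and the competitor rows are all hypothetical, and the scores are illustrative values loosely based on the figures in this report.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    rowid: int
    vector_sim: float  # cosine similarity to the query vector (toy value)
    bm25: float        # SQLite bm25() value; more negative means a stronger match


def sequential_candidates(chunks, knn_limit):
    """Return the rowids a keyword stage would see if it only re-ranks KNN survivors."""
    pool = sorted(chunks, key=lambda c: c.vector_sim, reverse=True)[:knn_limit]
    return [c.rowid for c in pool]


chunks = [
    Chunk(rowid=131, vector_sim=0.358, bm25=-37.46),  # strong keyword match, weak vector match
    Chunk(rowid=901, vector_sim=0.61, bm25=0.0),      # hypothetical competitor, no keyword match
    Chunk(rowid=902, vector_sim=0.55, bm25=0.0),      # hypothetical competitor, no keyword match
]

seen_by_fts = sequential_candidates(chunks, knn_limit=2)
print(seen_by_fts)  # [901, 902]
```

Under this model the chunk with the best keyword match (rowid 131) is dropped before any text scoring happens, which would explain a textScore of 0 regardless of how well it matches the query terms.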

Environment

OpenClaw: 2026.5.7 (eeef486)
Node.js: v22.22.2
OS: Linux 6.12.57+deb13-amd64 (Debian, x64)
Memory backend: builtin (sqlite-vec)
Embedding provider: OpenAI text-embedding-3-large (3072 dims); also reproduced with GitHub Copilot text-embedding-3-small (1536 dims)
Session memory: memorySearch.experimental.sessionMemory: true, sources: ["memory", "sessions"]

Reproduction

# Config required:
# agents.defaults.memorySearch.provider = "openai"
# agents.defaults.memorySearch.experimental.sessionMemory = true
# agents.defaults.memorySearch.sources = ["memory", "sessions"]

# 1. Have a conversation about a specific topic (e.g. "What is the fuel economy of a Prius C?")
# 2. In a later session, run memory_search on that topic — returns empty:
#    memory_search("Prius C fuel economy mpg") → { results: [], hits: 0 }
#    even though the session is indexed and contains the exact phrase

# 3. Direct SQLite FTS query finds it immediately:
sqlite3 ~/.openclaw/memory/main.sqlite \
  "SELECT c.path, bm25(chunks_fts) as score
     FROM chunks_fts
     JOIN chunks c ON chunks_fts.rowid = c.rowid
     WHERE chunks_fts MATCH 'prius fuel economy'
     ORDER BY score LIMIT 5"
# Returns 3 hits with strong BM25 scores (e.g. -16.38, -13.90, -12.98)
# from the session that discussed Prius fuel economy
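For anyone who wants to script the FTS check rather than run it by hand, the following is a minimal sketch using Python's built-in `sqlite3` module. It assumes the local SQLite build includes FTS5, and it uses an in-memory table with a hypothetical single-column schema and made-up rows rather than the real `main.sqlite` layout, so only the `MATCH` / `bm25()` pattern carries over.

```python
import sqlite3

# In-memory stand-in for the real index; the schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE chunks_fts USING fts5(text)")
conn.executemany(
    "INSERT INTO chunks_fts(rowid, text) VALUES (?, ?)",
    [
        (1, "User: What is the fuel economy of a Prius C? "
            "Assistant: Here are the Prius C fuel economy figures."),
        (2, "User: How do I rotate my tires? "
            "Assistant: Follow the pattern in the manual."),
        (3, "User: Which editor theme is easiest on the eyes? "
            "Assistant: That depends on your display."),
    ],
)

# Space-separated terms in an FTS5 MATCH expression are combined with an implicit AND.
# bm25() returns more negative values for better matches, so ascending order is best-first.
rows = conn.execute(
    "SELECT rowid, bm25(chunks_fts) AS score FROM chunks_fts "
    "WHERE chunks_fts MATCH ? ORDER BY score LIMIT 5",
    ("prius fuel economy",),
).fetchall()

for rowid, score in rows:
    print(rowid, score)
```

Only the dialogue row that contains all three terms is returned, which mirrors the observation that keyword search has no trouble with dialogue-format text.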

Evidence

FTS works fine — raw SQL confirms chunks are indexed and keyword-searchable:

rowid=131 source=sessions bm25=-37.46  path=44e28424.jsonl  ← Prius Q&A chunk, BEST match
rowid=132 source=sessions bm25=-33.57
rowid=133 source=sessions bm25=-29.22

Embeddings exist and are correctly normalized — all 330 chunks have unit-vector embeddings in both chunks.embedding (JSON) and chunks_vec (float32 binary):

mag = sqrt(sum(x*x for x in embedding))  # → 1.000000 for all chunks
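A runnable version of that check might look like the sketch below. It assumes, as described above, that `chunks.embedding` holds a JSON array of floats; the `magnitude` helper and the two-element sample vector are hypothetical stand-ins for a real row.

```python
import json
import math


def magnitude(embedding_json: str) -> float:
    """L2 norm of an embedding stored as a JSON array of floats."""
    embedding = json.loads(embedding_json)
    return math.sqrt(sum(x * x for x in embedding))


# Toy unit vector standing in for a value read from chunks.embedding.
sample = json.dumps([0.6, 0.8])
print(round(magnitude(sample), 6))  # 1.0
```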

KNN using the chunk's own vector finds it at rank 1 — confirming vec table integrity:

rank 1  dist=0.0000  sim=1.0000  rowid=131  sessions/44e28424.jsonl  ← itself
rank 2  dist=0.0688  sim=0.9312  rowid=132  sessions/44e28424.jsonl  ← sibling chunk
rank 3  dist=0.1401  sim=0.8599  rowid=133  sessions/44e28424.jsonl  ← sibling chunk
rank 4  dist=0.4978  sim=0.5022  rowid=346  sessions/current_session  ← large gap

But the actual query vector lands far from the chunk — calling memory_search with minScore=0 returns only ~0.25–0.36 vectorScore for session chunks, with textScore=0:

{ "vectorScore": 0.358, "textScore": 0, "score": 0.250 }
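The three numbers in that result are consistent with a simple weighted sum. The sketch below assumes weights of 0.7 for the vector score and 0.3 for the text score; those weights are inferred from the arithmetic here, not confirmed from the OpenClaw source, and `hybrid_score` is a hypothetical helper.

```python
def hybrid_score(vector_score, text_score, vector_weight=0.7, text_weight=0.3):
    """Weighted sum of a vector score and a text score, both assumed to lie in [0, 1]."""
    return vector_weight * vector_score + text_weight * text_score


# Reported case: the keyword side contributes nothing.
print(round(hybrid_score(0.358, 0.0), 2))  # 0.25

# Hypothetical upper bound if the same chunk had received the maximum text score.
print(round(hybrid_score(0.358, 1.0), 2))  # 0.55
```

If the weights are roughly right, a missing text score alone is enough to move a chunk from a comfortable score to one that a default threshold would filter out.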

The ~0.50 cosine distance gap between the Prius session cluster and other session content confirms that session chunks are geometrically isolated — a query must land very close to the right cluster to make the KNN pool.

Expected Behavior

FTS should run in parallel with vector KNN (or at minimum with a broader candidate pool), so that keyword-strong chunks can surface even when the query vector doesn't land near them. A chunk that scores -37 BM25 on the exact query terms should not return textScore=0.

This is especially important for session transcript chunks, which are inherently in a different embedding-space region than short factual queries due to their conversational format.
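One way to get the requested behaviour is to run both retrievers independently and merge by rowid. The following is a minimal sketch under stated assumptions: `merge_hybrid` is a hypothetical function, both inputs are assumed to be already normalised to [0, 1], the 0.7 / 0.3 weights are assumed, and the rowids other than 131 are invented for illustration.

```python
def merge_hybrid(vector_hits, keyword_hits, vector_weight=0.7, text_weight=0.3):
    """Union two independent result sets by rowid; a side that missed a chunk scores 0."""
    merged = {}
    for rowid, score in vector_hits.items():
        merged[rowid] = {"vectorScore": score, "textScore": 0.0}
    for rowid, score in keyword_hits.items():
        entry = merged.setdefault(rowid, {"vectorScore": 0.0, "textScore": 0.0})
        entry["textScore"] = score
    for entry in merged.values():
        entry["score"] = (
            vector_weight * entry["vectorScore"] + text_weight * entry["textScore"]
        )
    # Best combined score first.
    return dict(sorted(merged.items(), key=lambda kv: kv[1]["score"], reverse=True))


vector_hits = {901: 0.61, 902: 0.55}  # hypothetical KNN pool that misses the session chunk
keyword_hits = {131: 1.0}             # session chunk found by keyword search only

merged = merge_hybrid(vector_hits, keyword_hits)
print(list(merged))                 # [901, 902, 131]
print(merged[131]["textScore"])     # 1.0
```

With a union, the keyword-only chunk appears in the results with a non-zero textScore even though it never entered the KNN pool. Where it ranks then depends on the weights and on minScore rather than on the vector stage alone.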

Workaround

Dreaming (memory-core) partially mitigates this — distilled prose promoted to MEMORY.md embeds well and scores correctly. But raw session indexing remains unreliable for topic-specific recall.

Related Issues

#48711 — broad memory recall reliability
#51386 — graduate sessionMemory from experimental
