openclaw - [Feature]: batched memory embedding should batch over files




Summary

Running openclaw memory search foo blocks until memory indexing is complete.

I have a custom provider that can use the free tier of Mistral embeddings. It exposes an OpenAI-compatible batch API.

It seems that openclaw doesn't batch across files.

As the evidence below shows, batches are split at file boundaries, so a batch never contains items from more than one file.

This wastes batching capacity: most files contain only one or two items, while most batch API providers support up to 50k requests per batch job.

The hardcoded 2s polling interval is also rather aggressive, since batching is by its nature slower, low priority, and low cost at most providers. An exponential backoff, capped at perhaps 5 minutes between checks, would suffice.
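The backoff suggested above could look roughly like the following. This is a minimal sketch, not openclaw's actual code: the names backoff_delays and wait_for_batch, the doubling factor, and the max_polls limit are all hypothetical. The terminal status names follow the OpenAI Batch API and are an assumption for any other provider.

```python
import time
from typing import Callable, Iterator


def backoff_delays(initial_s: float = 2.0, factor: float = 2.0,
                   cap_s: float = 300.0) -> Iterator[float]:
    """Yield poll delays that grow geometrically and never exceed cap_s."""
    delay = initial_s
    while True:
        yield min(delay, cap_s)
        delay = min(delay * factor, cap_s)


def wait_for_batch(get_status: Callable[[], str],
                   sleep: Callable[[float], None] = time.sleep,
                   max_polls: int = 100) -> str:
    """Poll get_status() until the batch reaches a terminal state.

    get_status is a hypothetical callback that returns the provider's
    current status string for one batch job.
    """
    # Assumed terminal states, taken from the OpenAI Batch API.
    terminal = {"completed", "failed", "expired", "cancelled"}
    delays = backoff_delays()
    for _ in range(max_polls):
        status = get_status()
        if status in terminal:
            return status
        sleep(next(delays))  # 2s, 4s, 8s, ... capped at 300s
    raise TimeoutError("batch did not reach a terminal state")
```

Starting at the current 2s keeps small batches responsive, while the 5-minute cap bounds how stale a long-running job's status can get.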

Problem to solve

Memory search is not available until embedding is finished, and the current batching wastes the provider's batch capacity.

Proposed solution

Identify where chunks are grouped into batches and move that grouping out to the loop that iterates over files, so that a single batch can contain chunks from many files.
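As an illustration of the idea, here is a minimal sketch of grouping chunks into batches across file boundaries. It does not reflect openclaw's real data structures: Chunk, batch_across_files, and the max_requests parameter are hypothetical names, and the 50,000 default simply mirrors the per-job limit mentioned in the summary.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Chunk:
    file_path: str  # source file, kept so results can be mapped back
    index: int      # position of the chunk within its file
    text: str


def batch_across_files(files: Iterable[tuple[str, list[str]]],
                       max_requests: int = 50_000) -> Iterator[list[Chunk]]:
    """Yield batches of up to max_requests chunks, ignoring file boundaries."""
    if max_requests < 1:
        raise ValueError("max_requests must be at least 1")
    batch: list[Chunk] = []
    for path, texts in files:
        for i, text in enumerate(texts):
            batch.append(Chunk(path, i, text))
            if len(batch) == max_requests:
                yield batch  # batch is full; start a new one
                batch = []
    if batch:
        yield batch  # final, partially filled batch
```

Because every chunk carries its file path and index, the embeddings returned for a mixed batch can still be written back per file. With an OpenAI-compatible batch API, those two fields could be encoded into each request's custom_id.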

Alternatives considered

No response

Impact

Indexing costs more than necessary if the provider charges per batch job.

Evidence/examples

openclaw memory status:

Memory Search (main)
Provider: openai (requested: openai)
Model: mistral/mistral-embed
Sources: memory, sessions
Indexed: 0/1441 files · 0 chunks
Dirty: yes
Store: ~/.openclaw/memory/main.sqlite
Workspace: ~/.openclaw/workspace
Dreaming: off
Embeddings: ready
By source:
  memory · 0/832 files · 0 chunks
  sessions · 0/609 files · 0 chunks
Vector store: ready
Semantic vectors: ready
Vector path: ~/.npm-global/lib/node_modules/openclaw/node_modules/sqlite-vec-linux-x64/vec0.so
FTS: ready
Embedding cache: enabled (0 entries)
Cache cap: 50000
Batch: enabled (failures 0/2)
Recall store: 4184 entries · 2 promoted · 4183 concept-tagged · 132 spaced · scripts=4180 latin, 3 mixed
Recall path: ~/.openclaw/workspace/memory/.dreams/short-term-recall.json
Recall updated: 2026-05-11T03:00:09.767Z
Dreaming artifacts: diary present · 6 corpus files · ingestion state present
Dream corpus: ~/.openclaw/workspace/memory/.dreams/session-corpus
Dream ingestion: ~/.openclaw/workspace/memory/.dreams/session-ingestion.json
Dream diary: ~/.openclaw/workspace/DREAMS.md

openclaw memory index --force --verbose:

...
memory Index (main)
Provider: openai (requested: openai)
Model: mistral/mistral-embed
Sources: memory (MEMORY.md + ~/.openclaw/workspace/memory/*.md), sessions (~/.openclaw/agents/main/sessions/*.jsonl)

21:28:40 [memory] sync: indexing memory files
21:28:40 [memory] embeddings: openai batch submit
21:28:40 [memory] embeddings: openai batch created
21:28:40 [memory] openai batch batch_745ebeb0e7b84478a4bbf652 validating; waiting 2000ms
21:28:42 [memory] openai batch batch_745ebeb0e7b84478a4bbf652 validating; waiting 2000ms
21:28:44 [memory] openai batch batch_745ebeb0e7b84478a4bbf652 in_progress; waiting 2000ms
21:28:47 [memory] embeddings: openai batch submit
21:28:47 [memory] embeddings: openai batch created
21:28:47 [memory] openai batch batch_e5dec2ec999d4c44b4a96102 validating; waiting 2000ms
21:28:49 [memory] openai batch batch_e5dec2ec999d4c44b4a96102 validating; waiting 2000ms
21:28:51 [memory] openai batch batch_e5dec2ec999d4c44b4a96102 validating; waiting 2000ms
21:28:53 [memory] openai batch batch_e5dec2ec999d4c44b4a96102 validating; waiting 2000ms
21:28:55 [memory] openai batch batch_e5dec2ec999d4c44b4a96102 in_progress; waiting 2000ms
21:28:57 [memory] embeddings: openai batch submit
21:28:57 [memory] embeddings: openai batch created
21:28:57 [memory] openai batch batch_1f08116ef8634c52bf317bc0 validating; waiting 2000ms
Indexing memory files (batch)... 2/831 · elapsed 0:18 · eta 124:57 0%^C

Additional information

I don't have access to any premium models. I tried to implement this feature myself using gemma4 and opencode, but did not succeed.

That should be a hint that the code is overly complex and should be simplified.
