openclaw - ✅(Solved) Fix [Bug]: memorySearch.remote.batch.concurrency = 1 has no effect — indexing floods Ollama with concurrent requests [1 pull requests, 1 participants]

GitHub stats
openclaw/openclaw#66822 · Fetched 2026-04-15 06:24:17
Comments: 0 · Participants: 1 · Timeline: 4 · Reactions: 0
Timeline (top): labeled ×2, cross-referenced ×1, referenced ×1

agents.defaults.memorySearch.remote.batch.concurrency = 1 has no effect — OpenClaw sends 30+ concurrent embedding requests during indexing, causing exec sessions to be killed before any chunk is indexed.

Root Cause

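getIndexConcurrency() ignored the configured agents.defaults.memorySearch.remote.batch.concurrency whenever batch mode was disabled (for example with Ollama, which lacks the batch API) and fell back to a hardcoded concurrency of 4. Indexing therefore flooded Ollama with concurrent embedding requests regardless of the setting.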

Fix / Workaround

  • When batch.enabled = false was set (with concurrency still = 1), behavior was unchanged — batch requests still fired concurrently. This suggests the config path may not be fully wired to the batch execution layer.
  • No regression period identified — this issue appears to have existed since batch concurrency config was introduced.
  • Workaround: None that preserves indexing — users must accept that openclaw memory index will hang indefinitely and never complete on single-model Ollama setups.
  • The 44s latency per request is not a model problem — it's queuing contention from OpenClaw's parallel requests all waiting for OLLAMA_MAX_LOADED_MODELS=1.
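
The queuing effect in the last bullet can be illustrated with a small sketch. It assumes a server that handles one request at a time with a fixed service time. The helpers `observedLatencies` and `meanLatency` are hypothetical, and the inputs are illustrative numbers, not measurements from the issue.

```typescript
// Hypothetical model: a server that processes one request at a time.
// If `queued` requests arrive together, the k-th one waits for the k-1
// requests ahead of it, so its observed latency is k * serviceSeconds.
function observedLatencies(serviceSeconds: number, queued: number): number[] {
  return Array.from({ length: queued }, (_, i) => (i + 1) * serviceSeconds);
}

function meanLatency(latencies: number[]): number {
  if (latencies.length === 0) return 0;
  return latencies.reduce((sum, value) => sum + value, 0) / latencies.length;
}

// Illustrative inputs only: 2 s of real work per request.
const flooded = observedLatencies(2, 30); // 30 requests sent at once
const serialized = observedLatencies(2, 1); // one request in flight
console.log(meanLatency(flooded), meanLatency(serialized));
```

Under these made-up inputs the mean observed latency is 31 s when all requests are sent at once and 2 s when they are sent one at a time, although the work per request is identical. That is the sense in which a high per-request latency can come from contention rather than from the model.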

PR fix notes

PR #66931: fix(memory): respect user batch.concurrency even when batch mode is disabled

Description (problem / solution / changelog)

Summary

  • getIndexConcurrency() ignored the user's configured agents.defaults.memorySearch.remote.batch.concurrency when batch mode was disabled (e.g. with Ollama which lacks the batch API)
  • Fell back to a hardcoded concurrency of 4, causing 30+ concurrent embedding requests to flood Ollama, exhausting resources and resulting in SIGKILLs
  • Now checks the raw user config first and respects it regardless of whether batch mode is enabled
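
A minimal sketch of the behavior this summary describes, not the actual patch. The `BatchConfig` type, the `resolveIndexConcurrency` name, and the validation rule are assumptions made for illustration. Only the fallback value of 4 and the idea of consulting the user's value regardless of `enabled` come from the PR notes.

```typescript
// Hypothetical shape of the user's batch settings; not OpenClaw's real type.
interface BatchConfig {
  enabled?: boolean;
  concurrency?: number;
}

// Fallback value mentioned in the PR summary.
const DEFAULT_INDEX_CONCURRENCY = 4;

// Prefer the user's configured concurrency whenever it is a positive integer,
// whether or not batch mode is enabled. Fall back to the default otherwise.
function resolveIndexConcurrency(batch: BatchConfig | undefined): number {
  const configured = batch?.concurrency;
  if (
    typeof configured === "number" &&
    Number.isInteger(configured) &&
    configured >= 1
  ) {
    return configured;
  }
  return DEFAULT_INDEX_CONCURRENCY;
}
```

With this shape, `resolveIndexConcurrency({ enabled: false, concurrency: 1 })` returns 1, which is the case the issue reports as broken. Rejecting non-integer or non-positive values is a design choice of the sketch, not something the PR notes specify.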

Fixes #66822

Test plan

  • Set agents.defaults.memorySearch.remote.batch.concurrency = 1 with Ollama provider
  • Run openclaw memory index --verbose --force on a workspace with 50+ memory files
  • Verify only 1 embedding request is sent at a time (no concurrent flooding)
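
One way to check the last step is to estimate request overlap from the Ollama server log. The sketch below parses `[GIN]` lines of the form shown in this issue and computes the peak number of overlapping requests. It assumes the logged timestamp marks request completion and that the duration column uses Go-style units. The function names are hypothetical, and timestamps have one-second resolution, so very short back-to-back requests can look like overlap.

```typescript
interface RequestSpan {
  startMs: number;
  endMs: number;
}

// Convert a Go-style duration such as "15.19s", "250ms" or "1m2.5s" to ms.
function parseGoDuration(text: string): number {
  const unitMs: Record<string, number> = {
    h: 3600000, m: 60000, s: 1000, ms: 1, "µs": 0.001, us: 0.001, ns: 0.000001,
  };
  const pattern = /([\d.]+)(ms|µs|us|ns|h|m|s)/g;
  let total = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    total += parseFloat(match[1]) * unitMs[match[2]];
  }
  return total;
}

// Assumption: the timestamp is the completion time, so start = end - latency.
function parseGinLine(line: string): RequestSpan | null {
  const m =
    /^\[GIN\] (\d{4})\/(\d{2})\/(\d{2}) - (\d{2}):(\d{2}):(\d{2}) \|\s*\d{3}\s*\|\s*([^|]+?)\s*\|/.exec(line);
  if (!m) return null; // non-request lines (e.g. "time=... msg=...") are skipped
  const endMs = Date.UTC(+m[1], +m[2] - 1, +m[3], +m[4], +m[5], +m[6]);
  return { startMs: endMs - parseGoDuration(m[7]), endMs };
}

// Sweep over start/end events and track the maximum number in flight.
function peakConcurrency(lines: string[]): number {
  const events: Array<[number, number]> = [];
  for (const line of lines) {
    const span = parseGinLine(line);
    if (span) events.push([span.startMs, 1], [span.endMs, -1]);
  }
  // At equal times, process ends (-1) before starts (+1).
  events.sort((a, b) => a[0] - b[0] || a[1] - b[1]);
  let active = 0;
  let peak = 0;
  for (const [, delta] of events) {
    active += delta;
    peak = Math.max(peak, active);
  }
  return peak;
}
```

Fed the three request lines from the Ollama log in this issue, `peakConcurrency` reports 3 overlapping requests. With the fix applied and concurrency set to 1, the expectation would be a peak of 1, subject to the timestamp-resolution caveat above.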

🤖 Generated with Claude Code

Changed files

  • extensions/memory-core/src/memory/manager-embedding-ops.ts (modified, +7/-1)

Code Example

"agents": {
  "defaults": {
    "memorySearch": {
      "provider": "openai",
      "model": "qwen3-embedding:4b",
      "remote": {
        "baseUrl": "http://127.0.0.1:11434/v1",
        "apiKey": "ollama-local",
        "batch": {
          "enabled": true,
          "concurrency": 1
        }
      }
    }
  }
}

---

Ollama server log (/opt/homebrew/var/log/ollama.log) during openclaw memory index --verbose:

[GIN] 2026/04/14 - 16:03:22 | 400 | 15.19s | 127.0.0.1 | POST "/v1/embeddings"
[GIN] 2026/04/14 - 16:03:22 | 400 |  5.89s | 127.0.0.1 | POST "/v1/embeddings"
[GIN] 2026/04/14 - 16:03:22 | 400 | 20.98s | 127.0.0.1 | POST "/v1/embeddings"
time=... msg="aborting embedding request due to client closing the connection"

Verbose CLI output (openclaw memory index --verbose --force):

[memory] sync: indexing memory files
[memory] embeddings: batch start
[memory] embeddings: batch start
[memory] embeddings: batch start
[memory] embeddings: batch start
[memory] embeddings: batch start
[memory] embeddings: batch start
[memory] embeddings: batch start
[memory] embeddings: batch start
[memory] embeddings: batch start
... (30+ total)
Process exited with signal SIGKILL

openclaw memory status after:
Indexed: 0/51 files · 0 chunks
Dirty: yes
Batch: disabled (failures 0/2)

Bug type

Crash (process/app exits or hangs)

Beta release blocker

No

Steps to reproduce

  1. Set agents.defaults.memorySearch.remote.batch.concurrency = 1 in openclaw.json
  2. Run openclaw memory index --verbose --force on a workspace with 50+ memory files
  3. Observe: 30+ "batch start" messages appear within seconds, then process is SIGKILLed after ~22 minutes with 0 chunks indexed

Expected behavior

With concurrency = 1, OpenClaw should send embedding requests one at a time. Each request to Ollama with OLLAMA_MAX_LOADED_MODELS=1 takes ~44s (serialized). At 1 request at a time, a small batch should complete without timeout.
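
To make "one at a time" concrete, here is a minimal sketch of a concurrency-limited map. It is a generic pattern, not OpenClaw's implementation; `mapWithConcurrency` and its parameters are hypothetical names. With `limit = 1` the worker is never called again until the previous call has settled.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight.
// Results are returned in input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each runner pulls the next unclaimed index until none remain.
  // Claiming an index is synchronous, so two runners never share one.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

In an indexing loop, `items` would be the chunks to embed and `worker` the call that posts one embedding request. If any worker rejects, the returned promise rejects as well; retry and timeout handling are deliberately left out of the sketch.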

Actual behavior

Verbose output shows [memory] embeddings: batch start repeated 30+ times in rapid succession (within seconds). Exec session is SIGKILLed after ~22 minutes. No chunks are indexed. The concurrency setting appears to have no effect.

OpenClaw version

2026.4.14

Operating system

macOS 26.3.1

Install method

npm

Model

ollama/qwen3-embedding:4b

Provider / routing chain

openclaw -> ollama (local, /v1 endpoint) -> qwen3-embedding:4b

Additional provider/model setup details

  • Provider: Ollama via OpenAI-compatible endpoint (/v1/embeddings)
  • Embedding model: qwen3-embedding:4b (also tested with 8b, same behavior)
  • OLLAMA_MAX_LOADED_MODELS=1: single model at a time in VRAM
  • OLLAMA_NUM_PARALLEL=1: Ollama's server-side parallel cap (1)
  • Config model: openclaw 2026.4.14 (323493f)

Impact and severity

  • Affected users: Any OpenClaw user with memory indexing enabled and an Ollama backend that has OLLAMA_MAX_LOADED_MODELS=1 (common for VRAM-constrained setups — e.g., Apple Silicon Mac with multiple large models)
  • Severity: Blocks workflow — memory indexing cannot complete on affected setups
  • Frequency: Always — every openclaw memory index run exhibits this behavior
  • Consequence: Users cannot reindex memory. The vector store becomes stale. Search quality degrades over time as new memory files go unindexed.

extent analysis

TL;DR

The concurrency setting was not applied when batch mode was disabled. PR #66931 addresses this by making getIndexConcurrency() respect the configured batch.concurrency regardless of whether batch mode is enabled.

Guidance

  1. Verify configuration: Confirm that agents.defaults.memorySearch.remote.batch.concurrency is set to 1 in openclaw.json.
  2. Inspect the concurrency resolution: According to PR #66931, getIndexConcurrency() in extensions/memory-core/src/memory/manager-embedding-ops.ts ignored the configured value when batch mode was disabled and fell back to a hardcoded concurrency of 4. Check whether your build includes that change.
  3. Test with verbose output: Run openclaw memory index --verbose --force, watch how quickly the [memory] embeddings: batch start lines appear, and compare against the Ollama server log to see whether requests overlap.
  4. Consider a temporary workaround: The issue reports no workaround that preserves indexing. If the fix is not yet available, the remaining options are indexing fewer memory files per run or using a backend that tolerates concurrent embedding requests.

Example

This page does not reproduce the diff. Per the PR notes, the change is confined to extensions/memory-core/src/memory/manager-embedding-ops.ts (+7/-1).

Notes

The root cause is a mismatch between the configured concurrency and the value the indexing path actually used. Per PR #66931, getIndexConcurrency() fell back to a hardcoded value of 4 when batch mode was disabled instead of reading the user's setting; the PR changes it to check the raw user config first.

Recommendation

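Use a build that includes PR #66931 once it is available. Until then, lowering batch.concurrency has no effect while batch mode is disabled, so the practical options are indexing fewer files per run or switching to a backend that tolerates concurrent embedding requests. The current behavior blocks indexing and degrades search quality as new memory files go unindexed.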
