openclaw - ✅(Solved) Fix [Performance] memory_search cold start ~5-7s — needs module pre-warming [1 pull requests, 1 participants]

bthaha123-Carl · 2026-04-23T03:28:43Z

[openclaw] PR 70596: perf memory : prewarm explicit local embeddings on gateway startup - Repository: openclaw/openclaw - Author: taosiyuan163 - State: open |… # PR #70596: perf(memory): prewarm explicit local embeddings on gateway startup - Repository: openclaw/openclaw - Author: taosiyuan163 - State: open | merged: False - Link: https://github.com/openclaw/openclaw/pull/70596 ## Description (problem / solution / changelog) ## Summary - Problem: builtin `memory_search` with explicit `memorySearch.provider = "local"` still pays model/context initialization on the first post-startup search because gateway startup only warmed QMD. - Why it matters: the first memory-backed reply after a gateway restart absorbs the local embedding cold start instead of the post-attach startup path. - What changed: extend `startGatewayMemoryBackend` to prewarm builtin local managers by calling `probeEmbeddingAvailability()` after manager creation, and add gateway startup tests for explicit-local, skip, and failure logging paths. - What did NOT change (scope boundary): this does not prewarm `provider = "auto"` or remote embedding providers, so startup avoids new remote network calls or auth/cost side effects. ## Change Type (select all) - [x] Bug fix - [ ] Feature - [ ] Refactor required for the fix - [ ] Docs - [ ] Security hardening - [ ] Chore/infra ## Scope (select all touched areas) - [x] Gateway / orchestration - [ ] Skills / tool execution - [ ] Auth / tokens - [x] Memory / storage - [ ] Integrations - [ ] API / contracts - [ ] UI / DX - [ ] CI/CD / infra ## Linked Issue/PR - Closes # - Related #70460 - [ ] This PR fixes a bug or regression ## Root Cause (if applicable) - Root cause: post-attach startup only initialized QMD memory, so builtin local memory waited until the first real search to create the manager and load the local embedding context. - Missing detection / guardrail: gateway startup coverage did not assert builtin explicit-local prewarm behavior or failure logging. - Contributing context (if known): `purpose: "status"` managers bypass the builtin cache, so a status-only probe would not have helped the first real search. ## Regression Test Plan (if applicable) - Coverage level that should have caught this: - [ ] Unit test - [x] Seam / integration test - [ ] End-to-end test - [ ] Existing coverage already sufficient - Target test or file: `src/gateway/server-startup-memory.test.ts` - Scenario the test should lock in: explicit local builtin memory runs `probeEmbeddingAvailability()` at startup, non-local builtin configs skip startup work, and probe failures degrade to warnings. - Why this is the smallest reliable guardrail: it exercises the gateway startup decision point without requiring real local model assets. - Existing test that already covers this (if any): existing QMD startup tests in the same file. - If no new test is added, why not: N/A ## User-visible / Behavior Changes - Gateways using builtin memory with `agents.*.memorySearch.provider = "local"` now shift local embedding cold-start work into post-attach startup, so the first real `memory_search` request avoids paying that initialization cost. - No behavior change for QMD, `provider = "auto"`, or remote memory embedding providers. ## Diagram (if applicable) ```text Before: gateway boot -> first builtin local memory_search -> init local embedding context -> delayed first result After: gateway boot -> post-attach startup prewarm -> first builtin local memory_search -> reuse warmed context ``` ## Security Impact (required) - New permissions/capabilities? (`Yes/No`) No - Secrets/tokens handling changed? (`Yes/No`) No - New/changed network calls? (`Yes/No`) No - Command/tool execution surface changed? (`Yes/No`) No - Data access scope changed? (`Yes/No`) No - If any `Yes`, explain risk + mitigation: ## Repro + Verification ### Environment - OS: macOS - Runtime/container: Node.js via `pnpm` - Model/provider: builtin memory with explicit local embedding provider - Integration/channel (if any): N/A - Relevant config (redacted): `memory.backend=builtin`, `agents.defaults.memorySearch.provider=local` ### Steps 1. Start the gateway with builtin memory and an explicit local memory provider. 2. Let post-attach startup run. 3. Trigger the first `memory_search` request after startup. ### Expected - The first real search reuses a manager whose local embedding provider has already been prewarmed. - Non-local builtin configs still skip startup warmup. - Failed prewarm attempts only log warnings. ### Actual - Before this change, only QMD was warmed at startup, so builtin local memory paid the cold-start cost on the first real search. ## Evidence - [x] Failing test/log before + passing after - [ ] Trace/log snippets - [ ] Screenshot/recording - [ ] Perf numbers (if relevant) ## Human Verification (required) - Verified scenarios: `pnpm test src/gateway/server-startup-memory.test.ts`; `pnpm build` - Edge cases checked: builtin non-local skip path, builtin local probe fai

openclaw2026-04-23 03:28:43

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

openclaw/openclaw#70460•Fetched 2026-04-24 05:57:45

View on GitHub

Comments

Participants

Timeline

Reactions

Author

bthaha123-Carl

Participants

bthaha123-Carl

Timeline (top)

cross-referenced ×1

Root Cause

Root Cause Analysis:

Fix Action

Fix / Workaround

Current Workaround

PR fix notes

PR #70596: perf(memory): prewarm explicit local embeddings on gateway startup

Repository: openclaw/openclaw
Author: taosiyuan163
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/70596

Description (problem / solution / changelog)

Summary

Problem: builtin memory_search with explicit memorySearch.provider = "local" still pays model/context initialization on the first post-startup search because gateway startup only warmed QMD.
Why it matters: the first memory-backed reply after a gateway restart absorbs the local embedding cold start instead of the post-attach startup path.
What changed: extend startGatewayMemoryBackend to prewarm builtin local managers by calling probeEmbeddingAvailability() after manager creation, and add gateway startup tests for explicit-local, skip, and failure logging paths.
What did NOT change (scope boundary): this does not prewarm provider = "auto" or remote embedding providers, so startup avoids new remote network calls or auth/cost side effects.

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Closes #
Related #70460
This PR fixes a bug or regression

Root Cause (if applicable)

Root cause: post-attach startup only initialized QMD memory, so builtin local memory waited until the first real search to create the manager and load the local embedding context.
Missing detection / guardrail: gateway startup coverage did not assert builtin explicit-local prewarm behavior or failure logging.
Contributing context (if known): purpose: "status" managers bypass the builtin cache, so a status-only probe would not have helped the first real search.

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- Unit test
- Seam / integration test
- End-to-end test
- Existing coverage already sufficient
Target test or file: src/gateway/server-startup-memory.test.ts
Scenario the test should lock in: explicit local builtin memory runs probeEmbeddingAvailability() at startup, non-local builtin configs skip startup work, and probe failures degrade to warnings.
Why this is the smallest reliable guardrail: it exercises the gateway startup decision point without requiring real local model assets.
Existing test that already covers this (if any): existing QMD startup tests in the same file.
If no new test is added, why not: N/A

User-visible / Behavior Changes

Gateways using builtin memory with agents.*.memorySearch.provider = "local" now shift local embedding cold-start work into post-attach startup, so the first real memory_search request avoids paying that initialization cost.
No behavior change for QMD, provider = "auto", or remote memory embedding providers.

Diagram (if applicable)

Before:
gateway boot -> first builtin local memory_search -> init local embedding context -> delayed first result

After:
gateway boot -> post-attach startup prewarm -> first builtin local memory_search -> reuse warmed context

Security Impact (required)

New permissions/capabilities? (Yes/No) No
Secrets/tokens handling changed? (Yes/No) No
New/changed network calls? (Yes/No) No
Command/tool execution surface changed? (Yes/No) No
Data access scope changed? (Yes/No) No
If any Yes, explain risk + mitigation:

Repro + Verification

Environment

OS: macOS
Runtime/container: Node.js via pnpm
Model/provider: builtin memory with explicit local embedding provider
Integration/channel (if any): N/A
Relevant config (redacted): memory.backend=builtin, agents.defaults.memorySearch.provider=local

Steps

Start the gateway with builtin memory and an explicit local memory provider.
Let post-attach startup run.
Trigger the first memory_search request after startup.

Expected

The first real search reuses a manager whose local embedding provider has already been prewarmed.
Non-local builtin configs still skip startup warmup.
Failed prewarm attempts only log warnings.

Actual

Before this change, only QMD was warmed at startup, so builtin local memory paid the cold-start cost on the first real search.

Evidence

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Human Verification (required)

Verified scenarios: pnpm test src/gateway/server-startup-memory.test.ts; pnpm build
Edge cases checked: builtin non-local skip path, builtin local probe failure logging, existing QMD startup path
What you did not verify: end-to-end timing against a real local model cache/download on a fresh gateway boot

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

Backward compatible? (Yes/No) Yes
Config/env changes? (Yes/No) No
Migration needed? (Yes/No) No
If yes, exact upgrade steps:

Risks and Mitigations

Risk: explicit local memory configs now do extra startup work and could log a warning earlier in the boot cycle.
Mitigation: the prewarm runs only for memory.backend=builtin plus explicit provider=local, after post-attach startup, and failures degrade to warnings without blocking startup.

Changed files

src/gateway/server-startup-memory.test.ts (modified, +63/-5)
src/gateway/server-startup-memory.ts (modified, +31/-4)

RAW_BUFFERClick to expand / collapse

Problem Description

memory_search tool has a significant cold start latency (~5-7 seconds) on first invocation after gateway starts. Subsequent calls are fast (~50ms).

Root Cause Analysis:

Each memory_search call goes through a dynamic import chain:

createMemorySearchTool
- loadMemoryToolRuntime() [has module-level Promise cache]
  - tools.runtime-DFHmgPiY.js
    - memory-D1KN22vK.js
      - getMemorySearchManager()
        
        MemoryIndexManager.get()
        
        manager-runtime.js
        
        manager-ChtEZeo0.js [embedding provider loading]
        
        memory-core-host-engine-foundation-CdUASPXi.js
        
        ONNX model / embedding model initialization

While loadMemoryToolRuntime() itself is cached (memoryToolRuntimePromise), the downstream embedding provider initialization (manager-runtime.js → manager-ChtEZeo0.js) has no caching and re-initializes on every cold gateway restart.

Additionally, the default memorySearch config enables hybrid search (0.7 vector + 0.3 FTS), which runs two search pipelines on every query.

Expected Behavior

Either:

Module pre-warming: memory-core should pre-load its modules and embedding providers on gateway startup, so the first memory_search call is as fast as subsequent ones.
Or: Expose a memory.warmup() or gateway.on('ready') hook that allows plugins/tasks to trigger pre-initialization.

Current Workaround

Creating a periodic cron task to invoke memory_search every 10 minutes has limited effectiveness because isolated sessions are fresh Node.js processes and the real bottleneck is in the gateway process module loading.

Environment

OpenClaw version: 2026.4.x
Node.js environment (Windows)
Memory backend: builtin (SQLite + local embedding)
Config: memorySearch.hybrid.enabled = true (default)

Impact

Current cold start: ~5700ms
After pre-warm / cached: ~50ms
This affects every new gateway session, making the first memory-dependent response significantly slower.

extent analysis

TL;DR

Implementing a pre-warming mechanism on gateway startup is likely to significantly reduce the cold start latency of the memory_search tool.

Guidance

Investigate adding a startup hook in memory-core to call getMemorySearchManager() with purpose: 'status' to trigger lazy initialization before the first real query.
Consider exposing a gateway.on('ready') or registerStartupTask() API to allow external plugins to register pre-warming tasks.
Evaluate the feasibility of caching the result of MemoryIndexManager.get() at the module level to reuse the initialized manager in repeated calls.
Review the current configuration and consider disabling hybrid search if only vector search is needed, as this could reduce cold start work by ~40%.

Example

// Potential example of pre-warming on gateway startup
gateway.on('ready', () => {
  getMemorySearchManager('status'); // Trigger lazy initialization
});

Notes

The effectiveness of these suggestions may depend on the specific implementation details of the memory_search tool and the gateway. Additionally, the OpenClaw version (2026.4.x) and Node.js environment may have implications for the available APIs and caching mechanisms.

Recommendation

Apply workaround: Implement a pre-warming mechanism on gateway startup, as this is likely to have the most significant impact on reducing cold start latency.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #mixed precision #training loop #device allocation #model download

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

openclaw - ✅(Solved) Fix [Performance] memory_search cold start ~5-7s — needs module pre-warming [1 pull requests, 1 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Fix / Workaround

Current Workaround

PR fix notes

PR #70596: perf(memory): prewarm explicit local embeddings on gateway startup

Description (problem / solution / changelog)

Summary

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Root Cause (if applicable)

Regression Test Plan (if applicable)

User-visible / Behavior Changes

Diagram (if applicable)

Security Impact (required)

Repro + Verification

Environment

Steps

Expected

Actual

Evidence

Human Verification (required)

Review Conversations

Compatibility / Migration

Risks and Mitigations

Changed files

Problem Description

Expected Behavior

Current Workaround

Environment

Suggested Solutions (Priority Order)

Impact

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING