openclaw - ✅(Solved) Fix [Performance] memory_search cold start ~5-7s — needs module pre-warming [1 pull requests, 1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#70460Fetched 2026-04-24 05:57:45
View on GitHub
Comments
0
Participants
1
Timeline
1
Reactions
0
Participants
Timeline (top)
cross-referenced ×1

Root Cause

Root Cause Analysis:

Fix Action

Fix / Workaround

Current Workaround

PR fix notes

PR #70596: perf(memory): prewarm explicit local embeddings on gateway startup

Description (problem / solution / changelog)

Summary

  • Problem: builtin memory_search with explicit memorySearch.provider = "local" still pays model/context initialization on the first post-startup search because gateway startup only warmed QMD.
  • Why it matters: the first memory-backed reply after a gateway restart absorbs the local embedding cold start instead of the post-attach startup path.
  • What changed: extend startGatewayMemoryBackend to prewarm builtin local managers by calling probeEmbeddingAvailability() after manager creation, and add gateway startup tests for explicit-local, skip, and failure logging paths.
  • What did NOT change (scope boundary): this does not prewarm provider = "auto" or remote embedding providers, so startup avoids new remote network calls or auth/cost side effects.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #70460
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: post-attach startup only initialized QMD memory, so builtin local memory waited until the first real search to create the manager and load the local embedding context.
  • Missing detection / guardrail: gateway startup coverage did not assert builtin explicit-local prewarm behavior or failure logging.
  • Contributing context (if known): purpose: "status" managers bypass the builtin cache, so a status-only probe would not have helped the first real search.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/gateway/server-startup-memory.test.ts
  • Scenario the test should lock in: explicit local builtin memory runs probeEmbeddingAvailability() at startup, non-local builtin configs skip startup work, and probe failures degrade to warnings.
  • Why this is the smallest reliable guardrail: it exercises the gateway startup decision point without requiring real local model assets.
  • Existing test that already covers this (if any): existing QMD startup tests in the same file.
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

  • Gateways using builtin memory with agents.*.memorySearch.provider = "local" now shift local embedding cold-start work into post-attach startup, so the first real memory_search request avoids paying that initialization cost.
  • No behavior change for QMD, provider = "auto", or remote memory embedding providers.

Diagram (if applicable)

Before:
gateway boot -> first builtin local memory_search -> init local embedding context -> delayed first result

After:
gateway boot -> post-attach startup prewarm -> first builtin local memory_search -> reuse warmed context

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node.js via pnpm
  • Model/provider: builtin memory with explicit local embedding provider
  • Integration/channel (if any): N/A
  • Relevant config (redacted): memory.backend=builtin, agents.defaults.memorySearch.provider=local

Steps

  1. Start the gateway with builtin memory and an explicit local memory provider.
  2. Let post-attach startup run.
  3. Trigger the first memory_search request after startup.

Expected

  • The first real search reuses a manager whose local embedding provider has already been prewarmed.
  • Non-local builtin configs still skip startup warmup.
  • Failed prewarm attempts only log warnings.

Actual

  • Before this change, only QMD was warmed at startup, so builtin local memory paid the cold-start cost on the first real search.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: pnpm test src/gateway/server-startup-memory.test.ts; pnpm build
  • Edge cases checked: builtin non-local skip path, builtin local probe failure logging, existing QMD startup path
  • What you did not verify: end-to-end timing against a real local model cache/download on a fresh gateway boot

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps:

Risks and Mitigations

  • Risk: explicit local memory configs now do extra startup work and could log a warning earlier in the boot cycle.
  • Mitigation: the prewarm runs only for memory.backend=builtin plus explicit provider=local, after post-attach startup, and failures degrade to warnings without blocking startup.

Changed files

  • src/gateway/server-startup-memory.test.ts (modified, +63/-5)
  • src/gateway/server-startup-memory.ts (modified, +31/-4)
RAW_BUFFERClick to expand / collapse

Problem Description

memory_search tool has a significant cold start latency (~5-7 seconds) on first invocation after gateway starts. Subsequent calls are fast (~50ms).

Root Cause Analysis:

Each memory_search call goes through a dynamic import chain:

  • createMemorySearchTool
    • loadMemoryToolRuntime() [has module-level Promise cache]
      • tools.runtime-DFHmgPiY.js
        • memory-D1KN22vK.js
          • getMemorySearchManager()
            • MemoryIndexManager.get()
              • manager-runtime.js
                • manager-ChtEZeo0.js [embedding provider loading]
                  • memory-core-host-engine-foundation-CdUASPXi.js
                    • ONNX model / embedding model initialization

While loadMemoryToolRuntime() itself is cached (memoryToolRuntimePromise), the downstream embedding provider initialization (manager-runtime.js → manager-ChtEZeo0.js) has no caching and re-initializes on every cold gateway restart.

Additionally, the default memorySearch config enables hybrid search (0.7 vector + 0.3 FTS), which runs two search pipelines on every query.

Expected Behavior

Either:

  1. Module pre-warming: memory-core should pre-load its modules and embedding providers on gateway startup, so the first memory_search call is as fast as subsequent ones.
  2. Or: Expose a memory.warmup() or gateway.on('ready') hook that allows plugins/tasks to trigger pre-initialization.

Current Workaround

Creating a periodic cron task to invoke memory_search every 10 minutes has limited effectiveness because isolated sessions are fresh Node.js processes and the real bottleneck is in the gateway process module loading.

Environment

  • OpenClaw version: 2026.4.x
  • Node.js environment (Windows)
  • Memory backend: builtin (SQLite + local embedding)
  • Config: memorySearch.hybrid.enabled = true (default)

Suggested Solutions (Priority Order)

  1. Pre-warm on gateway startup: Add a startup hook in memory-core that calls getMemorySearchManager() with purpose: 'status' to trigger lazy initialization before first real query.
  2. Expose startup event: Provide a gateway.on('ready') or registerStartupTask() API so external plugins can register pre-warming tasks.
  3. Hybrid search opt-out: Document and make it easier to disable hybrid search for users who only need vector search, reducing cold start work by ~40%.
  4. Module-level caching: Cache the result of MemoryIndexManager.get() at the module level so repeated calls within the same gateway process reuse the initialized manager.

Impact

  • Current cold start: ~5700ms
  • After pre-warm / cached: ~50ms
  • This affects every new gateway session, making the first memory-dependent response significantly slower.

extent analysis

TL;DR

Implementing a pre-warming mechanism on gateway startup is likely to significantly reduce the cold start latency of the memory_search tool.

Guidance

  • Investigate adding a startup hook in memory-core to call getMemorySearchManager() with purpose: 'status' to trigger lazy initialization before the first real query.
  • Consider exposing a gateway.on('ready') or registerStartupTask() API to allow external plugins to register pre-warming tasks.
  • Evaluate the feasibility of caching the result of MemoryIndexManager.get() at the module level to reuse the initialized manager in repeated calls.
  • Review the current configuration and consider disabling hybrid search if only vector search is needed, as this could reduce cold start work by ~40%.

Example

// Potential example of pre-warming on gateway startup
gateway.on('ready', () => {
  getMemorySearchManager('status'); // Trigger lazy initialization
});

Notes

The effectiveness of these suggestions may depend on the specific implementation details of the memory_search tool and the gateway. Additionally, the OpenClaw version (2026.4.x) and Node.js environment may have implications for the available APIs and caching mechanisms.

Recommendation

Apply workaround: Implement a pre-warming mechanism on gateway startup, as this is likely to have the most significant impact on reducing cold start latency.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - ✅(Solved) Fix [Performance] memory_search cold start ~5-7s — needs module pre-warming [1 pull requests, 1 participants]