openclaw - 💡(How to fix) Fix Active-memory embedded run startup overhead: 44-51s before LLM call, 100% timeout rate [2 comments, 3 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#76043Fetched 2026-05-03 04:42:59
View on GitHub
Comments
2
Participants
3
Timeline
5
Reactions
2
Timeline (top)
commented ×2closed ×1cross-referenced ×1unsubscribed ×1

Active-memory plugin has 0% success rate due to consistent 45-second timeouts. Investigation reveals the bottleneck is NOT the model or memory search, but OpenClaw's embedded run framework's startup overhead.

Root Cause

Every embedded run (main session + sub-agents) incurs ~44-51 seconds of initialization before the LLM is even called:

Fix Action

Workaround

Disabled active-memory globally (config.enabled: false) while keeping plugin enabled for future re-activation.

Code Example

08:00:23 start → 08:01:17 timeout (54s elapsed, 0 chars returned)
08:07:53 start → 08:08:40 timeout (47s elapsed, 0 chars returned)
08:16:43 start → 08:17:31 timeout (48s elapsed, 0 chars returned)

---

startup: 15873ms (runtime-plugins 3160ms, model-resolution 3076ms, auth 4740ms, attempt-dispatch 4895ms)
prep: 33758ms (workspace-sandbox 16ms, core-plugin-tools 10129ms, system-prompt 7933ms, stream-setup 7389ms)
TOTAL: 49631ms — exceeds 45s lane timeout, LLM never runs

---

startup: 12-16s
prep: 29-37s
Same 44-51s total on every embedded run

---

Also 41-53s startup+prep, but no hard 45s timeout, so they complete
(but still show terrible latency: ~3.5 minutes from inbound message to reply)
RAW_BUFFERClick to expand / collapse

OpenClaw Active-Memory Bug Report: Embedded Run Startup Overhead

Summary

Active-memory plugin has 0% success rate due to consistent 45-second timeouts. Investigation reveals the bottleneck is NOT the model or memory search, but OpenClaw's embedded run framework's startup overhead.

Impact

  • Active-memory has never returned context in production (multiple gateways, consistent across sessions)
  • 45s+ latency added to every message while it times out
  • Circuit breaker missing, so failures cascade

Root Cause

Every embedded run (main session + sub-agents) incurs ~44-51 seconds of initialization before the LLM is even called:

Startup Phase (12-16s)

  • model-resolution: 3.0-3.5s (local config parsing)
  • auth: 4.5-5.1s (credential resolution from env/profile)
  • attempt-dispatch: 4.4-4.9s (internal routing)

Prep Phase (29-37s)

  • core-plugin-tools: 9.0-10.4s (loading all 9 plugin tools)
  • system-prompt: 7.5-11.3s (building prompt from bootstrap files)
  • session-resource-loader: 3.8-5.6s (loading session state)
  • bundle-tools: 1.6-1.8s (bundling tool definitions)
  • stream-setup: 7.3-8.6s (initializing API stream)

Total: 41-53 seconds before any LLM inference happens.

For comparison: actual network round-trip to Anthropic API = 150ms (TLS 43ms + TTFB 145ms).

Evidence

Test Environment

  • OpenClaw 2026.4.29
  • Anthropic claude-opus-4-6 (main model)
  • 9 plugins active (active-memory, discord, slack, telegram, memory-core, memory-wiki, google-meet, diagnostics-prometheus, acpx)
  • 73 skill directories

Timings (journalctl PID 410217, May 2 08:00-08:18 UTC)

Active-memory runs (all timeout at 45s lane deadline):

08:00:23 start → 08:01:17 timeout (54s elapsed, 0 chars returned)
08:07:53 start → 08:08:40 timeout (47s elapsed, 0 chars returned)
08:16:43 start → 08:17:31 timeout (48s elapsed, 0 chars returned)

Trace breakdown (typical run):

startup: 15873ms (runtime-plugins 3160ms, model-resolution 3076ms, auth 4740ms, attempt-dispatch 4895ms)
prep: 33758ms (workspace-sandbox 16ms, core-plugin-tools 10129ms, system-prompt 7933ms, stream-setup 7389ms)
TOTAL: 49631ms — exceeds 45s lane timeout, LLM never runs

Comparison: Old Gateway (PID 397720)

Identical timings:

startup: 12-16s
prep: 29-37s
Same 44-51s total on every embedded run

Non-active-memory embedded runs (main Telegram session):

Also 41-53s startup+prep, but no hard 45s timeout, so they complete
(but still show terrible latency: ~3.5 minutes from inbound message to reply)

Configuration Notes

Active-memory config as deployed:

  • timeoutMs: 15000 (15s) — but lane timeout is 45s, plugin timeout is ignored
  • queryMode: "recent" — reasonable for business context
  • model unset → inherits claude-opus-4-6 (expensive + no latency benefit given boot overhead)
  • allowedChatTypes: ["direct"] — only direct messages, never used in practice

What Config Changes CAN'T Fix

  • Switching to haiku won't help if 44s is burned before the API call
  • Raising timeoutMs to 60s just adds 50-60s latency per message (worse than disabled)
  • Disabling plugins would break other functionality
  • Caching bootstrap files doesn't address model-resolution, auth, or attempt-dispatch overhead

Hypotheses for Investigation

  1. Auth stage (4.5s): Should be env var lookup (~1ms). Is it making an API call to validate credentials? Or loading a large auth context?

  2. Model-resolution (3s): Should be local config lookup (~1ms). Is it querying provider endpoints or doing some other initialization?

  3. Core-plugin-tools (10s): Active-memory only needs 3 memory tools. Why load all 9 plugins' tools? Is there a tool-filtering mechanism for embedded runs?

  4. System-prompt (8s): Building prompt for ~10KB of bootstrap files. Why 8 seconds? Is it doing file I/O, parsing, or context-building in a loop?

  5. Stream-setup (8s): TLS handshake to Anthropic is 43ms. What accounts for the additional 7.9s?

Workaround

Disabled active-memory globally (config.enabled: false) while keeping plugin enabled for future re-activation.

Next Steps

Would appreciate guidance on:

  1. Is this expected performance for embedded runs in 2026.4.29?
  2. Are there optimizations planned for the startup/prep phases?
  3. Is there a way to reduce tool loading scope for sub-agents that only need 3 specific tools?
  4. Should we expect improvements in the next release, or is this architectural?

Gateway: ubuntu-s-2vcpu-2gb-90gb-intel-lon1-01 (8 CPU, 32GB RAM, 533GB disk)
Time: 2026-05-02 08:00-08:18 UTC
Timezone: Europe/London
OpenClaw version: 2026.4.29 (a448042)

extent analysis

TL;DR

The most likely fix for the Active-Memory plugin's 45-second timeout issue is to optimize the startup and prep phases of the embedded run framework, focusing on reducing the overhead of auth, model-resolution, core-plugin-tools, system-prompt, and stream-setup.

Guidance

  1. Investigate auth stage: Verify if the 4.5s auth stage is due to an API call or loading a large auth context, and consider optimizing or caching credential resolution.
  2. Optimize model-resolution: Check if model-resolution is querying provider endpoints or doing unnecessary initialization, and simplify the local config lookup.
  3. Filter core-plugin-tools: Explore the possibility of loading only the required 3 memory tools for active-memory, instead of all 9 plugins' tools, to reduce the 10s overhead.
  4. Improve system-prompt and stream-setup: Analyze the system-prompt and stream-setup phases to identify performance bottlenecks, such as file I/O, parsing, or context-building, and optimize these processes.
  5. Consider circuit breaker implementation: Implement a circuit breaker to prevent cascading failures and reduce the impact of timeouts on the system.

Example

No code snippet is provided as the issue is more related to performance optimization and configuration rather than a specific code fix.

Notes

The provided information suggests that the issue is not specific to the Active-Memory plugin, but rather a general performance problem with the embedded run framework. Optimizing the startup and prep phases may require changes to the OpenClaw framework or configuration.

Recommendation

Apply a workaround by disabling the Active-Memory plugin globally until the performance issues are addressed, as done in the provided workaround (config.enabled: false). This will prevent the 45-second timeouts and allow for further investigation and optimization of the embedded run framework.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - 💡(How to fix) Fix Active-memory embedded run startup overhead: 44-51s before LLM call, 100% timeout rate [2 comments, 3 participants]