openclaw - ✅(Solved) Fix [Windows] Severe Gateway slowdown (~78s per message) - Root Cause: Plugin tool factory calls uncached [3 pull requests, 2 comments, 3 participants]

GitHub stats
openclaw/openclaw#75956 (fetched 2026-05-03 04:43:56)
Comments: 2 · Participants: 3 · Timeline events: 10 · Reactions: 2
Timeline (top): cross-referenced ×5, commented ×2, assigned ×1, closed ×1

OpenClaw on native Windows suffers severe Gateway slowdowns: even a simple message like 你好 ("hello") takes ~90 seconds in total, with ~78 seconds spent in the core-plugin-tools stage. The root cause has been identified: plugin tool factory calls (entry.factory()) run on every request with no caching at all.

PR fix notes

PR #75974: fix(plugins): cache tool factory results to avoid repeated calls

Description (problem / solution / changelog)

Summary

resolvePluginTools() calls entry.factory() on every request with no caching, causing ~78s delay per message on Windows with 57 bundled plugins.

The root cause was identified in #75956:

  • core-plugin-tools stage takes ~78580ms
  • entry.factory(params.context) executed for ALL plugin entries on EVERY request
  • No caching of factory results

Fix

Add pluginToolFactoryCache Map with pluginId as key:

  1. Check cache before calling factory
  2. Cache successful factory results
  3. Cache hit shows durationMs=0 in factory timings
  4. Add resetPluginToolFactoryCache() for tests
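
The four steps above could look roughly like the following sketch. Only `pluginToolFactoryCache` and `resetPluginToolFactoryCache()` are named in the PR; the function name `resolveWithFactoryCache`, the `{ pluginId, factory }` entry shape, and the timing record format are assumptions made for illustration, and the actual code in src/plugins/tools.ts may differ.

```javascript
// Sketch only: entry shape and timing format are assumed, not taken from the PR.
const pluginToolFactoryCache = new Map();

// Test helper: start each test with an empty cache.
function resetPluginToolFactoryCache() {
  pluginToolFactoryCache.clear();
}

function resolveWithFactoryCache(entries, context) {
  const tools = [];
  const timings = [];
  for (const entry of entries) {
    if (pluginToolFactoryCache.has(entry.pluginId)) {
      // Cache hit: reuse the earlier result and report zero duration.
      tools.push(pluginToolFactoryCache.get(entry.pluginId));
      timings.push({ pluginId: entry.pluginId, durationMs: 0, cached: true });
      continue;
    }
    const start = Date.now();
    let result;
    try {
      result = entry.factory(context);
    } catch (err) {
      // Failed factories are not cached, so a later request can retry them.
      timings.push({ pluginId: entry.pluginId, durationMs: Date.now() - start, failed: true });
      continue;
    }
    pluginToolFactoryCache.set(entry.pluginId, result);
    tools.push(result);
    timings.push({ pluginId: entry.pluginId, durationMs: Date.now() - start, cached: false });
  }
  return { tools, timings };
}
```

Calling `resolveWithFactoryCache` twice with the same entries runs each factory once and returns the identical tool object on the second call.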

Test Coverage

Added test case verifying:

  • Factory is called only once across multiple resolvePluginTools() calls
  • Second call uses cached result
  • Cached tool is same object (identical reference)

Performance Impact

  • First request: same cost (factory still runs)
  • Second+ requests: near-zero cost for cached plugins
  • Expected improvement: ~78s → ~0s for core-plugin-tools stage on subsequent messages

Design Notes

Cache Key: pluginId

  • Factory results are typically stable across requests
  • Most factory results depend only on config (not sessionKey, sessionId)
  • Dynamic factories (e.g., lobster with its sandboxed check) still run, with a caveat:
    • The first call caches the result under pluginId
    • If a later context would change the factory's output (e.g., ctx.sandboxed), the cached result no longer matches what a fresh call would return
    • Future enhancement: consider a context hash in the cache key for dynamic factories
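
The context-hash idea mentioned above might be sketched as follows. Which context fields matter is an assumption here (`sandboxed` is taken from the note above), and `contextKeyFor` and `cachedFactoryCall` are hypothetical names, so treat this as an illustration and not the project's implementation.

```javascript
// Hypothetical helper: derive a cache key from pluginId plus the context
// fields a factory is assumed to depend on.
function contextKeyFor(pluginId, context, fields = ["sandboxed"]) {
  const relevant = fields.map((field) => [field, context[field] ?? null]);
  return `${pluginId}:${JSON.stringify(relevant)}`;
}

const contextAwareCache = new Map();

// Runs the factory once per distinct (pluginId, relevant context) pair.
function cachedFactoryCall(entry, context) {
  const key = contextKeyFor(entry.pluginId, context);
  if (!contextAwareCache.has(key)) {
    contextAwareCache.set(key, entry.factory(context));
  }
  return contextAwareCache.get(key);
}
```

With this key, a sandboxed and a non-sandboxed request for the same plugin get separate cache entries, while repeated requests in the same context still share one.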

Fixes #75956

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/plugins/tools.optional.test.ts (modified, +36/-1)
  • src/plugins/tools.ts (modified, +56/-12)

PR #76067: feat: plan plugin tools from static descriptors

Description (problem / solution / changelog)

What changed

This moves plugin tool prompt-time planning toward static descriptors instead of runtime factory discovery.

  • Adds manifest-backed toolMetadata.<tool>.descriptor support for plugin-owned tools.
  • Plans descriptor-backed plugin tools without loading plugin runtime during reply/tool prep.
  • Loads the owning plugin runtime only when a descriptor-backed tool is actually executed.
  • Migrates xAI and memory-core tools to static descriptors.
  • Preserves xAI per-tool disablement through descriptor availability.
  • Preserves memory tool visibility through active-agent memory request facts.
  • Keeps runtime api.registerTool(...) as the execution binding, not the prompt-time discovery source.
  • Removes the plugin tool factory-result cache.
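
As a rough illustration of planning from static descriptors, the sketch below builds the tool list from manifest data and defers runtime loading to execution time. The `toolMetadata.<tool>.descriptor` path comes from the PR description; everything else (`planToolsFromDescriptors`, `loadRuntime`, the `id` field, and the runtime's `tools` map) is hypothetical.

```javascript
// Sketch only: manifest fields other than toolMetadata.<tool>.descriptor are assumed.
function planToolsFromDescriptors(manifests, loadRuntime) {
  const planned = [];
  for (const manifest of manifests) {
    for (const [toolName, meta] of Object.entries(manifest.toolMetadata ?? {})) {
      if (!meta.descriptor) continue; // tools without a descriptor are skipped here
      planned.push({
        name: toolName,
        descriptor: meta.descriptor,
        // The plugin runtime is loaded only when the tool is actually executed.
        execute(args) {
          const runtime = loadRuntime(manifest.id);
          return runtime.tools[toolName].execute(args);
        },
      });
    }
  }
  return planned;
}
```

Planning touches only manifest data, so its cost no longer depends on how expensive each plugin's runtime is to load.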

Why

The cache route improves repeated runtime probing, but it does not remove the first-hit runtime discovery/materialization cost.

Fresh benchmark against current origin/main:

origin/main cache baseline:
memory enabled cold:   19547.88ms
memory disabled cold:  21533.90ms

descriptor branch:
memory enabled cold:     249.60ms
memory disabled cold:    218.38ms

That is a 98.7% to 99.0% cold-path reduction for prompt-time plugin tool planning.
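
The percentages can be checked directly from the four timings above; this small calculation is only a verification aid and uses no project code.

```javascript
// Cold-path timings (ms) copied from the benchmark above.
const baseline = { memoryEnabled: 19547.88, memoryDisabled: 21533.90 };
const descriptor = { memoryEnabled: 249.60, memoryDisabled: 218.38 };

// Percentage reduction = (1 - after / before) * 100, rounded to one decimal.
function reductionPercent(before, after) {
  return Math.round((1 - after / before) * 1000) / 10;
}

const enabledReduction = reductionPercent(baseline.memoryEnabled, descriptor.memoryEnabled);
const disabledReduction = reductionPercent(baseline.memoryDisabled, descriptor.memoryDisabled);
console.log(enabledReduction, disabledReduction); // 98.7 99
```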

Warm medians are close, with the cache path slightly lower, but descriptors remove the worst user-visible first-hit cost instead of hiding it behind memoization.

Docs

Updated:

  • docs/plugins/manifest.md
  • docs/plugins/building-plugins.md
  • docs/tools/index.md

These now describe tool descriptors, availability expressions, and the split between static tool shape and runtime execution binding.

Verification

pnpm test src/tools/availability.test.ts src/tools/planner.test.ts src/plugins/tools.optional.test.ts src/plugins/manifest-registry.test.ts src/plugins/tool-descriptors.test.ts src/plugins/contracts/plugin-tool-contracts.test.ts -- --typecheck --reporter=dot
pnpm exec oxfmt --check --threads=1 <changed files>
git diff --check origin/main...HEAD
pnpm build
git merge-tree --write-tree --merge-base=$(git merge-base HEAD origin/main) HEAD origin/main

All passed.

Refs #75956.

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • docs/plugins/building-plugins.md (modified, +5/-0)
  • docs/tools/index.md (modified, +6/-0)
  • src/plugins/tool-descriptor-cache.ts (added, +146/-0)
  • src/plugins/tool-factory-cache.ts (removed, +0/-102)
  • src/plugins/tools.optional.test.ts (modified, +64/-108)
  • src/plugins/tools.ts (modified, +327/-63)

PR #76079: refactor: cache plugin tool descriptors

Description (problem / solution / changelog)

Summary

  • Problem: #76067 moved plugin tool planning toward manifest-shipped static descriptors, duplicating runtime tool schema/description in plugin manifests and leaving stale history on the PR branch.
  • Why it matters: prompt-time plugin tool planning still needs to avoid repeated runtime factory work, but plugin runtime registration should remain the source of truth for tool shape and execution.
  • What changed: replace the plugin tool factory-result cache with a descriptor cache captured from validated api.registerTool(...) results, keyed by plugin source, contract, and request/factory context; cached wrappers reload the owning plugin only when the tool executes.
  • What did NOT change (scope boundary): no manifest toolMetadata.<tool>.descriptor contract, no xAI/memory-core descriptor migration, and no plugin-owned execution behavior moved into core.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #76067
  • Related #75956
  • This PR fixes a bug or regression

This is a clean replacement for #76067. Source work by @shakkernerd is preserved with a co-author trailer.

Root Cause (if applicable)

  • Root cause: N/A
  • Missing detection / guardrail: N/A
  • Contributing context (if known): N/A

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/plugins/tools.optional.test.ts
  • Scenario the test should lock in: descriptor cache reuse for equivalent contexts, no reuse across sandbox context changes, and cached execution for factory registrations that rely on manifest declaredNames instead of explicit opts.name.
  • Why this is the smallest reliable guardrail: resolvePluginTools is the seam that decides descriptor cache reads/writes and runtime tool execution fallback.
  • Existing test that already covers this (if any): existing optional tool and plugin contract tests cover allowlist/conflict/malformed-tool behavior.
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

Plugin tools still register via api.registerTool(...) and declare ownership in contracts.tools. Repeated matching tool-planning requests can skip plugin runtime loading after the descriptor is captured; actual execution still loads the live plugin tool.

Diagram (if applicable)

Before:
resolve tools -> load plugin runtime -> call tool factory -> return tool objects

After:
first matching request -> load runtime -> validate tool -> cache descriptor -> return tool
later matching request -> cached descriptor wrapper -> execution loads owning plugin -> live tool execute
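
The "After" flow could be sketched like this. The PR says the cache is keyed by plugin source, contract, and request/factory context; the simplified key below (pluginId plus a `sandboxed` flag), the `loadPluginTool` loader, and the validation check are assumptions for illustration only.

```javascript
// Sketch only: key format and loader signature are assumptions.
const descriptorCache = new Map();

function resolveToolWithDescriptorCache(pluginId, context, loadPluginTool) {
  const key = `${pluginId}:${JSON.stringify({ sandboxed: Boolean(context.sandboxed) })}`;
  const cached = descriptorCache.get(key);
  if (cached) {
    // Later matching request: return a wrapper without loading the runtime.
    return {
      ...cached,
      // Execution reloads the owning plugin and calls the live tool.
      execute: (args) => loadPluginTool(pluginId, context).execute(args),
    };
  }
  // First matching request: load the runtime, validate, and capture the descriptor.
  const tool = loadPluginTool(pluginId, context);
  if (!tool || typeof tool.name !== "string" || typeof tool.execute !== "function") {
    return null; // nothing valid to cache
  }
  descriptorCache.set(key, { name: tool.name, description: tool.description });
  return tool;
}
```

A second request in the same context returns the cached wrapper with no runtime load; the load happens again only when `execute` is called or when the context changes.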

Security Impact (required)

  • New permissions/capabilities? (Yes/No): No
  • Secrets/tokens handling changed? (Yes/No): No
  • New/changed network calls? (Yes/No): No
  • Command/tool execution surface changed? (Yes/No): No
  • Data access scope changed? (Yes/No): No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS local checkout
  • Runtime/container: Node/pnpm repo scripts
  • Model/provider: N/A
  • Integration/channel (if any): plugin tool resolution
  • Relevant config (redacted): test fixtures only

Steps

  1. Run focused plugin tool tests.
  2. Run core prod/test typecheck lanes.
  3. Run formatter and whitespace checks.

Expected

  • Plugin tool descriptor caching remains context-safe and executable.
  • Cached execution works for factories registered without explicit names.

Actual

  • Focused tests and typechecks passed locally on the replacement branch.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Local verification from the source branch before replacement:

pnpm test src/plugins/tools.optional.test.ts
pnpm tsgo:core
pnpm tsgo:core:test
pnpm exec oxfmt --check --threads=1 src/plugins/tool-descriptor-cache.ts src/plugins/tools.ts src/plugins/tools.optional.test.ts
git diff --check

Human Verification (required)

  • Verified scenarios: descriptor cache hit path, sandbox-context cache separation, implicit-name factory execution fallback, core prod/test typechecks.
  • Edge cases checked: plugin factories returning null, cached wrappers reloading runtime on execution, manifest declaredNames fallback.
  • What you did not verify: full pnpm check:changed on Testbox after opening this replacement PR.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No): Yes
  • Config/env changes? (Yes/No): No
  • Migration needed? (Yes/No): No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: descriptor cache could advertise context-sensitive tools in the wrong context.
    • Mitigation: cache keys include the same request/runtime context dimensions as the old factory-result cache, with regression coverage for sandbox-sensitive factories.
  • Risk: cached execution could fail for factories registered without explicit names.
    • Mitigation: runtime entry lookup falls back to manifest declaredNames, with regression coverage.
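
The second mitigation, falling back to manifest declaredNames when a factory was registered without an explicit name, might look like the sketch below. The entry shape (`optsName`, `declaredNames`) and the function name `findRuntimeEntry` are hypothetical stand-ins for whatever the real registry stores.

```javascript
// Hypothetical entry shape: { pluginId, optsName?, declaredNames? }.
function findRuntimeEntry(entries, toolName) {
  // Prefer an entry registered with an explicit name.
  const explicit = entries.find((entry) => entry.optsName === toolName);
  if (explicit) return explicit;
  // Otherwise fall back to names declared in the plugin manifest.
  return entries.find((entry) => (entry.declaredNames ?? []).includes(toolName)) ?? null;
}
```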

Changed files

  • CHANGELOG.md (modified, +1/-1)
  • docs/plugins/building-plugins.md (modified, +5/-0)
  • docs/tools/index.md (modified, +6/-0)
  • src/plugins/tool-descriptor-cache.ts (added, +156/-0)
  • src/plugins/tool-factory-cache.ts (removed, +0/-102)
  • src/plugins/tools.optional.test.ts (modified, +62/-98)
  • src/plugins/tools.ts (modified, +338/-65)

Raw issue text

Summary

OpenClaw on native Windows suffers severe Gateway slowdowns: even a simple message like 你好 ("hello") takes ~90 seconds in total, with ~78 seconds spent in the core-plugin-tools stage. The root cause has been identified: plugin tool factory calls (entry.factory()) run on every request with no caching at all.

Environment

  • OpenClaw version: 2026.4.29 (a448042)
  • OS: Windows 10.0.26100 x64
  • CPU: 16 logical cores
  • RAM: 31.92 GiB
  • Gateway mode: local loopback, port 18789
  • Bundled providers: 57 plugins loaded

Updated Root Cause Analysis (2026-05-01)

The Real Bottleneck: core-plugin-tools Stage

prep stages: runId=a569a121-... phase=stream-ready totalMs=90368
 core-plugin-tools: 78580ms ← THIS IS THE REAL PROBLEM
 bootstrap-context: 6439ms
 bundle-tools: 1707ms
 system-prompt: 2497ms

The core-plugin-tools stage is defined in selection-CwAy0mf2.js line 6398, executed via createOpenClawCodingTools() in pi-tools-hI2Hhdd_.js.

Root Cause: Plugin Tool Factory Calls Are Not Cached

File: tools-CCfW25J2.js, function resolvePluginTools() (lines 53-144)

for (const entry of registry.tools) {
  // ...
  resolved = entry.factory(params.context); // Called EVERY TIME, no caching!
  // ...
}

Every time a user sends a message:

  1. createOpenClawCodingTools() is called
  2. It calls createOpenClawTools() → resolveOpenClawPluginToolsForOptions()
  3. Which calls resolvePluginTools() → iterates ALL plugin entries
  4. Each entry.factory() is executed synchronously and without caching

The result of entry.factory() (the actual tool objects) is never cached.
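
A toy model makes the per-message cost concrete. The registry below is simulated (one entry per plugin is assumed, and the factories only count invocations); it is not OpenClaw code.

```javascript
// Simulated registry: 57 entries whose factories only count invocations.
let factoryCalls = 0;
const registry = {
  tools: Array.from({ length: 57 }, (_, i) => ({
    pluginId: `plugin-${i}`,
    factory: () => {
      factoryCalls += 1;
      return { name: `tool-${i}` };
    },
  })),
};

// Mirrors the uncached loop: every call runs every factory again.
function resolvePluginToolsUncached(context) {
  return registry.tools.map((entry) => entry.factory(context));
}

resolvePluginToolsUncached({}); // first message
resolvePluginToolsUncached({}); // second message
console.log(factoryCalls); // 114
```

Two messages trigger 114 factory calls, so whatever each factory costs is paid again in full on every message.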

Why the Previous Patch Didn't Help

An earlier patch memoized resolveProviderRuntimePlugin(), but:

  • model-resolution: 662ms (already fast with patch)
  • auth: 10110ms (acceptable)
  • core-plugin-tools: 78580ms (the real problem - unchanged by patch)

The memoization was applied to the wrong function.

Impact of 57 Bundled Provider Plugins

This Windows environment has 57 bundled provider plugins loaded (alibaba, amazon-bedrock, anthropic, deepseek, google, groq, ...). Each plugin can register multiple tools via entry.factory(). With 57 plugins and no factory result caching, every message pays the full initialization cost.

Reproduction Steps

  1. Install and start OpenClaw on native Windows.
  2. Open the local Control UI: http://127.0.0.1:18789/
  3. Send a simple message such as: 你好 ("hello")
  4. Observe: first message takes ~90 seconds total (78s in core-plugin-tools)
  5. Send a second simple message
  6. Observe: second message ALSO takes ~78 seconds in core-plugin-tools (no improvement)

Expected vs Actual

Metric             Expected              Actual
First message      < 10 seconds          ~90 seconds
Second message     < 1 second (cached)   ~78 seconds (uncached)
core-plugin-tools  < 1 second            78580ms
node.list          < 1 second            53 seconds (blocked behind same thread)

Key Files

File                             Role
selection-CwAy0mf2.js:6398       prepStages.mark("core-plugin-tools")
pi-tools-hI2Hhdd_.js:806         createOpenClawCodingTools()
openclaw-tools-DuqACH22.js:9322  resolveOpenClawPluginToolsForOptions()
tools-CCfW25J2.js:53-144         resolvePluginTools() - no caching of factory results

Suggested Fix Directions

Priority 1: Cache Plugin Tool Factory Results

The entry.factory() call results should be cached per (pluginId, toolName, contextHash):

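// Conceptual fix: memoize factory results so each factory runs at most once per key
const factoryCache = new Map();
function resolvePluginTools(params) {
  for (const entry of registry.tools) {
    // This key omits the request context; add a context hash for context-sensitive factories
    const cacheKey = `${entry.pluginId}:${entry.names.join(",")}`;
    if (!factoryCache.has(cacheKey)) {
      factoryCache.set(cacheKey, entry.factory(params.context));
    }
    const resolved = factoryCache.get(cacheKey);
    // ... continue with `resolved` as before
  }
}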

Priority 2: Add Instrumentation to createOpenClawCodingTools

Add console.time/timeEnd around resolvePluginTools() to quantify per-plugin cost:

console.time('resolvePluginTools');
const tools = resolvePluginTools(...);
console.timeEnd('resolvePluginTools');
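
The console.time pair above measures the whole call. To get the per-plugin breakdown the heading asks for, each factory call can be timed individually, as in this sketch; `timeFactories` and the `{ pluginId, factory }` entry shape are assumptions, and `performance.now()` is the global available in current Node.js releases.

```javascript
// Sketch: time each factory call individually instead of the whole loop.
function timeFactories(entries, context) {
  const timings = [];
  for (const entry of entries) {
    const start = performance.now();
    entry.factory(context);
    timings.push({ pluginId: entry.pluginId, ms: performance.now() - start });
  }
  // Slowest first, so the expensive plugins are easy to spot.
  return timings.sort((a, b) => b.ms - a.ms);
}
```

Logging the first few rows of the returned array shows which plugins dominate the core-plugin-tools stage.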

Priority 3: Investigate Lazy Tool Loading

Many of the 57 plugins provide tools that are rarely used. Consider:

  • Lazy initialization of plugin tools
  • Only resolving tools when actually called
  • Separating "tool definitions" from "tool instances"
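
One way to separate "tool definitions" from "tool instances" is to expose a cheap definition immediately and build the instance on first use. The helper name `defineLazyTool` and the `createInstance` callback are hypothetical; this is a generic sketch, not OpenClaw's API.

```javascript
// Sketch: keep a cheap definition up front and build the instance on first use.
function defineLazyTool(definition, createInstance) {
  let instance; // created at most once, on the first execute() call
  return {
    name: definition.name,
    description: definition.description,
    execute(args) {
      if (instance === undefined) {
        instance = createInstance();
      }
      return instance.execute(args);
    },
  };
}
```

Rarely used tools then cost nothing beyond their definition until a request actually calls them.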

Additional Context: Why This Affects UI Too

The Gateway is single-threaded. When core-plugin-tools blocks for 78 seconds on a message:

  • node.list RPC queues behind it
  • Web UI refreshes stall
  • All Gateway RPCs become unresponsive

This explains why both chat and node.list are slow simultaneously.

Previous Diagnostic Data

Direct provider calls are fast

PowerShell Invoke-RestMethod to DeepSeek: 941ms
Node fetch to DeepSeek: 916ms

This confirms the network path is fine - the problem is entirely local to OpenClaw's tool resolution.

Original startup stage timing

workspace: 1ms
runtime-plugins: 2ms
hooks: 1ms
model-resolution: 26297ms
auth: 48433ms
context-engine: 1ms
attempt-dispatch: 41969ms
total: 116704ms

Provider runtime plugin lookup timing

resolveProviderRuntimePlugin(deepseek) -> 16604ms (first call)
Repeated calls: ~12000ms each (no caching)

Model resolution timing

resolveModelAsync('deepseek', 'deepseek-v4-flash', ...) -> 28975ms

Platform Note

This issue manifests severely on native Windows, but the no-cache design is cross-platform code. The architectural problem exists on Linux/macOS too, but may be less noticeable due to faster file system and module loading.

OpenClaw itself prints:

Windows detected - OpenClaw runs great on WSL2!
Native Windows might be trickier.

Related

  • Original diagnostic report identified resolveProviderRuntimePlugin memoization as a potential fix, but the real bottleneck is in resolvePluginTools() factory calls.
  • Experiment 2 (memoizing resolveProviderRuntimePlugin result) showed improvement for repeated calls: call=1 ms=18087, call=2 ms=0, call=3 ms=0, but this does not address the core-plugin-tools stage.

extent analysis

TL;DR

Cache the results of entry.factory() calls in resolvePluginTools() to prevent redundant computations and improve performance.

Guidance

  1. Implement caching: Modify resolvePluginTools() to cache the results of entry.factory() calls using a data structure like a Map, with a cache key based on pluginId, toolName, and contextHash.
  2. Verify cache effectiveness: Add instrumentation (e.g., console.time and console.timeEnd) around resolvePluginTools() to measure the performance improvement after caching is implemented.
  3. Investigate lazy tool loading: Consider loading plugin tools lazily or only when they are actually needed to further reduce the performance impact of the core-plugin-tools stage.
  4. Monitor performance: Keep an eye on the performance of the core-plugin-tools stage after implementing caching and lazy loading to ensure the fixes are effective and do not introduce new issues.

Example

const factoryCache = new Map();
function resolvePluginTools(params) {
  for (const entry of registry.tools) {
    // This key omits the request context; add a context hash for context-sensitive factories
    const cacheKey = `${entry.pluginId}:${entry.names.join(",")}`;
    if (!factoryCache.has(cacheKey)) {
      factoryCache.set(cacheKey, entry.factory(params.context));
    }
    const resolved = factoryCache.get(cacheKey);
    // ... continue with `resolved` as before
  }
}

Notes

The provided caching solution is a conceptual example and might need adjustments based on the actual implementation details of resolvePluginTools() and the entry.factory() method. Additionally, the effectiveness of caching and lazy loading may vary depending on the specific use case and plugin behavior.

Recommendation

Apply the caching workaround to resolvePluginTools() to address the performance issue in the core-plugin-tools stage, as it directly targets the identified root cause of the problem.
