openclaw - ✅(Solved) Fix resolvePluginWebProviders snapshot cache never hits because callers pass fresh config objects [2 pull requests, 4 comments, 4 participants]

GitHub stats

openclaw/openclaw#73730 · Fetched 2026-04-29 06:15:49
Comments: 4 · Participants: 4 · Timeline: 12 · Reactions: 0
Timeline (top): cross-referenced ×5, commented ×4, mentioned ×1, referenced ×1

The web-fetch / web-search provider layer has a memoizeSnapshot WeakMap cache keyed by params.config. In practice the cache never hits because upstream callers construct a fresh config object on every call. Every message dispatch pays the full ~30 second web-search provider load.

Root Cause

src/plugins/web-provider-runtime-shared.ts caches via deps.snapshotCache.get(cacheOwnerConfig), where cacheOwnerConfig === params.config. The call site resolveWebSearchDefinition (in src/runtime/web-search-runtime.ts, approx.) calls resolveWebSearchRuntimeConfig(options?.config), which returns a new object each time.

Bumping the TTL via OPENCLAW_PLUGIN_DISCOVERY_CACHE_MS / OPENCLAW_PLUGIN_MANIFEST_CACHE_MS env vars does NOT help because the lookup fails at the outer WeakMap step, before TTL is consulted.
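
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not OpenClaw code: `Config`, `resolveRuntimeConfig`, and `snapshotCache` are hypothetical stand-ins chosen for illustration. It shows that a `WeakMap` compares keys by object identity, so a freshly built object with identical content never matches a stored entry.

```typescript
// Hypothetical stand-in for the real config type (assumption, not the OpenClaw definition).
type Config = { plugins: { entries: Record<string, { enabled: boolean }> } };

// Identity-keyed cache, analogous in shape to the outer WeakMap in the report.
const snapshotCache = new WeakMap<Config, string>();

// Mimics a resolver that builds a new config object on every call.
function resolveRuntimeConfig(): Config {
  return { plugins: { entries: { brave: { enabled: true } } } };
}

const first = resolveRuntimeConfig();
snapshotCache.set(first, "snapshot");

const second = resolveRuntimeConfig();

// The two objects have equal content but are different references.
const sameContent = JSON.stringify(first) === JSON.stringify(second);
const hitWithSameReference = snapshotCache.has(first);
const hitWithFreshObject = snapshotCache.has(second);

console.log({ sameContent, hitWithSameReference, hitWithFreshObject });
```

Because the lookup fails on identity before any expiry logic runs, a longer TTL cannot change the outcome in this sketch.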

Fix Action

Workaround

Not yet patched locally. A possible local patch would wrap resolvePluginWebSearchProviders / resolvePluginWebFetchProviders with a Map cache keyed by {workspaceDir, sortedOnlyPluginIds, origin, bundledAllowlistCompat}.
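
A rough sketch of what such a local wrapper might look like follows. Everything here is an assumption made for illustration: `ResolveParams`, `buildKey`, and `memoizeByParams` are hypothetical names, and the real resolver signatures are not shown in the report. The sketch ignores config changes and never evicts entries, so it would only suit a short-lived local patch.

```typescript
// Hypothetical parameter shape, based only on the field names listed in the workaround.
type ResolveParams = {
  workspaceDir: string;
  onlyPluginIds?: string[];
  origin?: string;
  bundledAllowlistCompat?: boolean;
};

// Build a string key; plugin ids are sorted so their order does not matter.
function buildKey(p: ResolveParams): string {
  return JSON.stringify([
    p.workspaceDir,
    [...(p.onlyPluginIds ?? [])].sort(),
    p.origin ?? null,
    p.bundledAllowlistCompat ?? false,
  ]);
}

// Wrap any resolver with a plain Map cache keyed by the string above.
function memoizeByParams<T>(resolve: (p: ResolveParams) => T): (p: ResolveParams) => T {
  const cache = new Map<string, T>();
  return (p) => {
    const key = buildKey(p);
    if (cache.has(key)) return cache.get(key) as T;
    const value = resolve(p);
    cache.set(key, value);
    return value;
  };
}

// Usage: the counter stands in for the expensive provider load.
let loads = 0;
const resolveProviders = memoizeByParams((p) => {
  loads += 1;
  return [`provider-for:${p.workspaceDir}`];
});

resolveProviders({ workspaceDir: "/ws", onlyPluginIds: ["b", "a"] });
resolveProviders({ workspaceDir: "/ws", onlyPluginIds: ["a", "b"] }); // same key, served from cache
```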

PR fix notes

PR #73847: fix(plugins): key web-provider snapshot cache on config-content fingerprint (#73730)

Description (problem / solution / changelog)

Fixes #73730. Refs #73729 and #73835 (sister/corroborating reports in the same lane).

Problem

`resolvePluginWebProviders` previously kept its snapshot cache as:

```ts
WeakMap<OpenClawConfig, WeakMap<NodeJS.ProcessEnv, Map<string, Entry>>>
```

— keyed on `OpenClawConfig` object identity at the outer level. As reported in #73730 with full instrumentation, callers like `resolveWebSearchRuntimeConfig` and `resolveWebFetchRuntimeConfig` build a fresh `config` object per dispatch, so the outer `WeakMap.get(cacheOwnerConfig)` always missed even though the inner `cacheKey` string was identical (`load-miss key=0afb40389a fields={ws:".../workspace",scope:"b623e8",plg:"85d4c2",...}` repeated message-after-message). Every dispatch paid the full ~30s `loadOpenClawPlugins` cycle.

Three users reported variants of the same root cause:

  • #73730 poolside-ventures: web-provider snapshot WeakMap miss (this PR)
  • #73729 poolside-ventures: capability-provider full reload per message (sister bug, same root cause shape)
  • #73835 brokemac79: idle gateway high CPU/RSS, CPU profile points to repeated `loadOpenClawPlugins` → `mirrorBundledPluginRuntimeRoot`

Fix

Switch the snapshot cache from an identity-keyed nested `WeakMap` to a flat `Map<string, Entry>` keyed entirely on `buildWebProviderSnapshotCacheKey`. The cache key is extended to include a stable content fingerprint of the resolution-relevant `config.plugins` subset (allowlist, entries enabled state, per-plugin config — exactly what `loadPluginManifestRegistryForPluginRegistry` and `loadInstalledWebProviderManifestRecords` actually consume).

Equal-content fresh config objects now produce the same cache key and hit. Genuinely different configs produce different keys and stay isolated — no false-positive collisions.

The fingerprint computation itself is memoized by config-object identity (`WeakMap<config, hashString>`), so callers that share a reference pay the hash cost only once. Callers that build a fresh config per dispatch (the original failure mode) still pay one `hashJson` per call, but `hashJson` runs in microseconds versus `loadOpenClawPlugins` running in seconds — net wall-clock win is the same ~30s saved per dispatch the issue measured.
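
To make the shape of this fix concrete, here is a minimal sketch under stated assumptions. It is not the PR's code: `fingerprintConfig`, `canonicalize`, `resolveProviders`, and the `Config` type are hypothetical, SHA-256 stands in for whatever `hashJson` actually does, and a counter replaces the real `loadOpenClawPlugins` call. It shows a flat `Map<string, …>` keyed on a content fingerprint, with the fingerprint itself memoized per object identity in a `WeakMap`.

```typescript
import { createHash } from "node:crypto";

// Hypothetical config shape (assumption): only the plugins subset is fingerprinted.
type Config = {
  plugins?: { allow?: string[]; entries?: Record<string, { enabled?: boolean }> };
};

// Sort object keys recursively so key order cannot change the serialized form.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value === null || typeof value !== "object") return value;
  const obj = value as Record<string, unknown>;
  return Object.fromEntries(Object.keys(obj).sort().map((k) => [k, canonicalize(obj[k])]));
}

// Identity-level memo: callers sharing one reference hash only once.
const fingerprintByIdentity = new WeakMap<Config, string>();

function fingerprintConfig(config: Config): string {
  const cached = fingerprintByIdentity.get(config);
  if (cached !== undefined) return cached;
  const json = JSON.stringify(canonicalize(config.plugins ?? {}));
  const hash = createHash("sha256").update(json).digest("hex").slice(0, 16);
  fingerprintByIdentity.set(config, hash);
  return hash;
}

// Flat, string-keyed snapshot cache.
const snapshotCache = new Map<string, string[]>();
let loadCount = 0; // counts simulated expensive loads

function resolveProviders(config: Config, workspaceDir: string): string[] {
  const key = `${workspaceDir}|${fingerprintConfig(config)}`;
  const hit = snapshotCache.get(key);
  if (hit !== undefined) return hit;
  loadCount += 1;
  const providers = Object.entries(config.plugins?.entries ?? {})
    .filter(([, entry]) => entry.enabled === true)
    .map(([id]) => id);
  snapshotCache.set(key, providers);
  return providers;
}

// A fresh object per call, as in the original failure mode.
const makeConfig = (enabled: boolean): Config => ({ plugins: { entries: { brave: { enabled } } } });

resolveProviders(makeConfig(true), "/ws"); // miss: first load
resolveProviders(makeConfig(true), "/ws"); // hit: equal content, new reference
resolveProviders(makeConfig(false), "/ws"); // miss: content differs
```

In this sketch equal-content configs share one entry while the differing config gets its own, which mirrors the two behaviours the PR describes.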

What changed

| File | Change |
| --- | --- |
| `web-provider-resolution-shared.ts` | Added `fingerprintWebProviderResolutionConfig` helper + extended `buildWebProviderSnapshotCacheKey` to include the fingerprint |
| `web-provider-runtime-shared.ts` | Changed `WebProviderSnapshotCache` type from `WeakMap<config, WeakMap<env, Map<key, Entry>>>` to `Map<string, Entry>`, simplified the lookup/store sites accordingly. Dropped the no-longer-needed `OpenClawConfig` type import. |
| `web-provider-runtime-shared.test.ts` | Two new regression tests |
| `CHANGELOG.md` | Unreleased Fixes line citing #73730 + #73729 + #73835 |

Tests

```
pnpm vitest run src/plugins/web-provider-runtime-shared.test.ts
→ 5 passed (3 existing + 2 new)

pnpm vitest run src/plugins/web-provider-runtime-shared.test.ts \
  src/plugins/web-provider-resolution-shared.test.ts \
  src/plugins/web-fetch-providers.runtime.test.ts \
  src/plugins/web-search-providers.runtime.test.ts
→ 32 passed (30 existing + 2 new, no regressions across the four files)
```

The two new regression tests:

  1. Fresh-but-equal-content configs hit the cache — exercises the exact #73730 path: build a new `config` object reference per call with identical content; assert `loadOpenClawPlugins` is invoked once across two calls (pre-fix: twice).
  2. Content-different configs miss the cache — invariant guard: `{ plugins: { entries: { brave: { enabled: true } } } }` and `{ plugins: { entries: { brave: { enabled: false } } } }` produce different fingerprints and both calls miss the cache.

Why this shape over the alternatives in my earlier triage comment

In #73730 I proposed two shapes: (1) identity-intern the resolved config, (2) hash config content into the cache key. This PR implements (2) because:

  • Interning would fight `OpenClawConfig` mutation, which several gateway paths perform (config reloads, cron edits)
  • The fingerprint approach has a clean invalidation story: edit any `config.plugins.entries[*]` field → hash differs → next call misses → cache repopulates with the new state
  • TTL eviction (`resolvePluginSnapshotCacheTtlMs`) was already part of the contract, so the `Map` size is already bounded

If maintainer prefers shape (1) instead, happy to rebase. The diff for shape (1) would be smaller but trickier to invalidate.
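
For readers unfamiliar with how TTL eviction keeps a string-keyed `Map` bounded, a small sketch follows. It is an illustration only: `createTtlCache` is a hypothetical helper, and the source does not show how `resolvePluginSnapshotCacheTtlMs` is actually applied. The assumption here is that expired entries are dropped on read and swept on write.

```typescript
type Entry<T> = { value: T; expiresAt: number };

// Hypothetical helper: a Map-backed cache whose entries expire after ttlMs.
function createTtlCache<T>(ttlMs: number) {
  const entries = new Map<string, Entry<T>>();
  return {
    // Return the value if present and not expired; drop it otherwise.
    get(key: string, now: number = Date.now()): T | undefined {
      const entry = entries.get(key);
      if (entry === undefined) return undefined;
      if (entry.expiresAt <= now) {
        entries.delete(key);
        return undefined;
      }
      return entry.value;
    },
    // Sweep expired entries before storing, so stale keys do not accumulate.
    set(key: string, value: T, now: number = Date.now()): void {
      for (const [k, e] of entries) {
        if (e.expiresAt <= now) entries.delete(k);
      }
      entries.set(key, { value, expiresAt: now + ttlMs });
    },
    size(): number {
      return entries.size;
    },
  };
}
```

With a sweep like this, a config edit that changes the fingerprint leaves the old entry in place only until its TTL passes and the next write removes it.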

Why this is one-shot for #73730 only

#73729 (capability-provider) and #73835 (gateway prewarm) share the same root-cause shape but live in different files. Folding all three into one PR would cross 400 LOC and 3+ separable concerns. This PR fixes #73730 with the smallest possible scope; the same content-fingerprint pattern can be applied to the capability-provider cache (`capability-provider-runtime.ts`) and the gateway prewarm path as follow-ups if the maintainer agrees with the direction here.

🦞 lobster-biscuit


Sign-Off: hclsys

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/plugins/web-provider-resolution-shared.test.ts (modified, +51/-0)
  • src/plugins/web-provider-resolution-shared.ts (modified, +63/-0)
  • src/plugins/web-provider-runtime-shared.test.ts (modified, +166/-0)
  • src/plugins/web-provider-runtime-shared.ts (modified, +66/-30)

PR #73853: [AI-assisted] fix(plugins): reduce startup provider registry reloads

Description (problem / solution / changelog)

Fixes #73835. Fixes #73729. Refs #73730. Refs #73847.

Summary

This follows @hclsys's #73847, which covers the #73730 web-provider snapshot cache path. This PR intentionally leaves that web-provider work out and focuses on the remaining repeated plugin-registry load surfaces reported in #73835 and #73729.

  • Keep gateway startup primary-model prewarm on provider-discovery entries only, with the active workspace passed through so startup metadata snapshots can be reused instead of falling through to full plugin runtime loads.
  • Thread the entry-only provider discovery mode through models.json planning and fingerprinting so cache entries remain distinct from full discovery.
  • Scope capability-provider fallback registry loads to the manifest-derived bundled owner plugins, avoiding broad image/video/music snapshot loads during tool setup.

Issue Context

#73835's CPU profile points at startup/model prewarm repeatedly reaching loadOpenClawPlugins and bundled runtime mirror refresh work. #73729 reports the related capability-provider path where image, video, and music provider listing can trigger repeated full registry loads. #73730 is covered by @hclsys's #73847, and this PR is meant to complement that teamwork effort rather than duplicate it.

AI Assistance

AI-assisted with Codex. The implementation was driven from #73835, the reporter-provided Discord guidance, and the linked #73729/#73730 discussion.

Tests

  • node scripts/test-projects.mjs src/gateway/server-startup.test.ts src/gateway/server-startup-post-attach.test.ts src/agents/models-config.providers.implicit.discovery-scope.test.ts src/plugins/provider-discovery.runtime.test.ts src/plugins/capability-provider-runtime.test.ts src/plugins/web-provider-runtime-shared.test.ts src/plugins/web-search-providers.runtime.test.ts src/plugins/web-fetch-providers.runtime.test.ts -> passed 3 Vitest shards, 8 files, 72 tests
  • corepack pnpm test:contracts:plugins -> 56 files, 751 tests passed
  • node .\node_modules\@typescript\native-preview\bin\tsgo.js -p tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test-cpu-prewarm.tsbuildinfo -> passed
  • git diff --check -> passed

Contributor Checklist Notes

Update: addressed Greptile P2 in 66e9f93f.

  • Backend/runtime change; no screenshots applicable.
  • I did not run full pnpm build && pnpm check && pnpm test; instead I ran the focused affected shards, plugin contract lane, and TypeScript test project check above.
  • Attempted local Codex review per CONTRIBUTING.md, but the Windows app execution alias failed with Access is denied for both codex review --base origin/main and codex review --uncommitted.

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/agents/models-config.plan.ts (modified, +7/-0)
  • src/agents/models-config.providers.implicit.discovery-scope.test.ts (modified, +16/-0)
  • src/agents/models-config.providers.implicit.ts (modified, +2/-0)
  • src/agents/models-config.ts (modified, +10/-0)
  • src/gateway/server-startup-post-attach.test.ts (modified, +10/-0)
  • src/gateway/server-startup-post-attach.ts (modified, +11/-0)
  • src/gateway/server-startup.test.ts (modified, +9/-0)
  • src/plugins/capability-provider-runtime.test.ts (modified, +77/-0)
  • src/plugins/capability-provider-runtime.ts (modified, +32/-3)
  • src/plugins/provider-discovery.runtime.test.ts (modified, +11/-0)

Environment

  • openclaw 2026.4.26

Reproduction

  1. Send any message
  2. Observe the load-miss key=0afb40389a event for the web-search provider load
  3. Send a second message within 5 minutes
  4. Observe the same load-miss key=0afb40389a again — with identical fields fingerprint, proving it should have hit the cache
msg 1: load-miss key=0afb40389a fields={ws:".../workspace",scope:"b623e8",plg:"85d4c2",...}
load-done key=0afb40389a elapsedMs=30827

msg 2: load-miss key=0afb40389a fields={ws:".../workspace",scope:"b623e8",plg:"85d4c2",...}
load-done key=0afb40389a elapsedMs=31291

Cache should have hit — cacheKey (string) is identical. It does not because the outer WeakMap<config, ...> lookup at resolvePluginWebProviders L93 fails: the params.config object has a different reference each call.

Suggested Fix

Replace the WeakMap<config, ...> with a Map<string, ...> keyed by a stable hash derived from config content (workspaceDir, plugins config, env snapshot, onlyPluginIds, origin), or alternatively deduplicate the cacheOwnerConfig object identity at a higher level so repeated calls share the same reference.

Impact

~30s per message for any agent that resolves web-search tools.

Related: #73729, #73728

extent analysis

TL;DR

Replace the WeakMap<config, ...> with a Map<string, ...> keyed by a stable hash derived from config content to fix the cache miss issue.

Guidance

  • Identify the resolveWebSearchRuntimeConfig function in src/runtime/web-search-runtime.ts and verify that it returns a new object each time, causing the cache lookup to fail.
  • Consider implementing a stable hash function to derive a key from the config content, such as workspaceDir, plugins config, env snapshot, onlyPluginIds, and origin.
  • Evaluate the suggested fix of replacing the WeakMap with a Map<string, ...> and assess the potential impact on the system.
  • As a temporary workaround, explore wrapping resolvePluginWebSearchProviders / resolvePluginWebFetchProviders with a Map cache keyed by a relevant set of parameters, such as {workspaceDir, sortedOnlyPluginIds, origin, bundledAllowlistCompat}.

Example

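// Example of a stable hash function.
// Assumption: the field names below are illustrative and may not match the real config shape.
import { createHash } from "node:crypto";

// Sort object keys recursively so key order cannot change the serialized form.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value === null || typeof value !== "object") return value;
  const obj = value as Record<string, unknown>;
  return Object.fromEntries(Object.keys(obj).sort().map((k) => [k, canonicalize(obj[k])]));
}

function getStableHash(config: Record<string, unknown>): string {
  // Hash only the fields that influence provider resolution.
  const relevant = {
    workspaceDir: config.workspaceDir,
    plugins: config.plugins,
    env: config.env,
    onlyPluginIds: config.onlyPluginIds,
    origin: config.origin,
  };
  return createHash("sha256").update(JSON.stringify(canonicalize(relevant))).digest("hex");
}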

Notes

The provided solution assumes that the config object contains the necessary information to derive a stable hash. Additional investigation may be required to ensure that the chosen parameters are sufficient for generating a unique key.

Recommendation

Apply the suggested fix of replacing the WeakMap with a Map<string, ...> keyed by a stable hash derived from config content, as it addresses the root cause of the issue and provides a more robust solution.
