openclaw - 💡(How to fix) Fix loadPluginMetadataSnapshot rebuilds the full installed-plugin index synchronously on every call — 25–140s event-loop freezes on multi-agent gateways

openclaw2026-05-18 15:52:13

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

On a long-running gateway serving multiple agents, the Node event loop freezes for 25–140 seconds at a time, repeatedly. A CPU profile traced it to loadPluginMetadataSnapshot() running a full, synchronous installed-plugin filesystem crawl on every call — it has no memoization, and the one intended cache (getCurrentPluginMetadataSnapshot, a single slot) structurally misses on multi-agent gateways.

Root Cause

Dominant CPU-profile stack:

loadCatalog
└ normalizeProviderModelIdWithManifest      (manifest-model-id-normalization.ts)
 └ resolveMetadataSnapshotForPolicies
  └ loadPluginMetadataSnapshot               (plugin-metadata-snapshot.ts)
   └ loadPluginRegistrySnapshotWithMetadata
    └ loadInstalledPluginIndex → buildInstalledPluginIndex
     └ resolveInstalledPluginIndexRegistry → discoverOpenClawPlugins → discoverInDirectory

loadPluginMetadataSnapshot() has no memoization — every call runs loadPluginMetadataSnapshotImpl → loadInstalledPluginIndex, and loadInstalledPluginIndex() is literally return buildInstalledPluginIndex(params) (a full rebuild despite the name). buildInstalledPluginIndex synchronously discovers every installed plugin from disk: realpathSync/statSync/lstatSync/existsSync (~170s cumulative in a 41-minute profile), openSync+readSync of every plugin manifest, hashing each. ~65% of all non-idle CPU in the profile was this single path.

There is an intended cache: resolveMetadataSnapshotForPolicies first tries getCurrentPluginMetadataSnapshot() (a single-slot snapshot the gateway sets at startup) and only falls back to loadPluginMetadataSnapshot() on a miss. But getCurrentPluginMetadataSnapshot() only returns the slot when workspaceDir + config fingerprint + policyHash all match. A gateway serving N agents with distinct workspaces/configs can match at most one of them, so for every other agent the lookup misses and falls through to the uncached full crawl — on every turn. (loadCatalog normalizes several model ids per catalog load, so one turn crawls repeatedly.)

loadPluginMetadataSnapshot is imported at 30+ call sites; none memoizes.

Code Example

loadCatalog
└ normalizeProviderModelIdWithManifest      (manifest-model-id-normalization.ts)
 └ resolveMetadataSnapshotForPolicies
  └ loadPluginMetadataSnapshot               (plugin-metadata-snapshot.ts)
   └ loadPluginRegistrySnapshotWithMetadata
    └ loadInstalledPluginIndex → buildInstalledPluginIndex
     └ resolveInstalledPluginIndexRegistry → discoverOpenClawPlugins → discoverInDirectory

RAW_BUFFERClick to expand / collapse

Summary

Environment

openclaw 2026.5.12, Node v25.5.0, Linux x64
gateway serving 8 agents with distinct workspaces; ~150 bundled plugins

Symptom

Event loop blocked 25–140s (built-in liveness monitor: eventLoopDelayMaxMs up to ~138000, eventLoopUtilization ≈ 1, cpuCoreRatio ≈ 1). One instance logged 18 such freezes in 88 minutes of normal use.
While frozen the whole gateway is unresponsive: streaming model responses are not read, so embedded runs fail with LLM idle timeout (120s): no response from model; outbound HTTP (e.g. a feishu ping) hits axios timeouts.

Root cause

Dominant CPU-profile stack:

loadCatalog
└ normalizeProviderModelIdWithManifest      (manifest-model-id-normalization.ts)
 └ resolveMetadataSnapshotForPolicies
  └ loadPluginMetadataSnapshot               (plugin-metadata-snapshot.ts)
   └ loadPluginRegistrySnapshotWithMetadata
    └ loadInstalledPluginIndex → buildInstalledPluginIndex
     └ resolveInstalledPluginIndexRegistry → discoverOpenClawPlugins → discoverInDirectory

loadPluginMetadataSnapshot is imported at 30+ call sites; none memoizes.

Impact

Any multi-agent / multi-workspace gateway: the event loop is periodically frozen for tens of seconds to over two minutes, making the whole process unresponsive and causing spurious model idle-timeouts and dropped/late outbound requests.

Proposed fix

Memoize loadPluginMetadataSnapshot with a small, process-lifetime, multi-entry cache keyed on a cheap in-memory fingerprint (resolvePluginMetadataControlPlaneFingerprint + workspaceDir/stateDir/preferPersisted + installed-index fingerprint — no disk I/O), invalidated wherever the installed-plugin index changes (the existing clearCurrentPluginMetadataSnapshotState() sites in installed-plugin-index-store.ts, plus clearCurrentPluginMetadataSnapshot()).

This collapses the repeated crawls to one per distinct config/workspace and benefits all 30+ call sites. Tested on the affected deployment: recurring freezes eliminated (one ~26s crawl after a restart, then none).

A PR with this fix follows.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#output truncation #response parsing #generation error #database connection #vector store

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

openclaw - 💡(How to fix) Fix loadPluginMetadataSnapshot rebuilds the full installed-plugin index synchronously on every call — 25–140s event-loop freezes on multi-agent gateways

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Code Example

Summary

Environment

Symptom

Root cause

Impact

Proposed fix

Still need to ship something?

TRENDING

openclaw - 💡(How to fix) Fix loadPluginMetadataSnapshot rebuilds the full installed-plugin index synchronously on every call — 25–140s event-loop freezes on multi-agent gateways

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Code Example

Summary

Environment

Symptom

Root cause

Impact

Proposed fix

Still need to ship something?

RELATED_DISCOVERY

TRENDING