openclaw: Fix Gateway latency and idle CPU from repeated plugin manifest scans during auth/provider resolution



Summary

Hi OpenClaw team, thanks for the ongoing gateway performance work.

This issue started as a repeated openai-codex auth-profile debug log every 30 seconds, but further testing shows a larger performance problem behind it.

The debug log is only the visible symptom. The important part is that the gateway repeatedly enters auth/provider/plugin resolution and performs many synchronous filesystem checks against plugin manifests and registry data. On a Discord gateway this can delay the typing indicator and delay the first model request.

Environment

  • OpenClaw: 2026.5.7
  • Runtime: Node.js 24.x
  • OS: Linux VPS
  • Install method: npm global
  • Gateway: long-running headless gateway with Discord enabled
  • Relevant profile: openai-codex:default

Original visible symptom

While the gateway was idle, this debug line appeared every 30 seconds:

[agents/auth-profiles] {"profileId":"openai-codex:default","provider":"openai-codex"} kept local oauth over external cli bootstrap-only provider

The decision itself looks correct. Since openai-codex is bootstrapOnly, OpenClaw keeps the local OAuth profile instead of replacing it with external Codex CLI credentials.

The surprising part is that the same unchanged decision was repeated while no user request or agent run was active.

Local diagnostic evidence

Latency probes showed that the 30-second auth-profile tick enters provider/plugin resolution:

loadOpenClawPlugins.enter
  stack=resolveRuntimePluginRegistry
    <- getRuntimePluginRegistryForLoadOptions
    <- resolvePluginProviders
    <- resolveProviderPluginsForHooks
    <- resolveExternalAuthProfilesWithPlugins

loadOpenClawPlugins.cacheCheck hit=HIT
[agents/auth-profiles] kept local oauth over external cli bootstrap-only provider

So even when the top-level plugin loader cache reports a HIT, the gateway still repeats work around provider/plugin resolution.

Filesystem evidence

A short strace sample of the gateway process showed heavy synchronous filesystem activity:

70123 statx
12380 access
 8380 openat
 8305 close
 6977 read
109712 total syscalls

The repeated paths were plugin roots and plugin manifests, for example:

/usr/lib/node_modules/openclaw/dist/extensions
/usr/lib/node_modules/openclaw/dist/extensions/*/openclaw.plugin.json
/root/.openclaw/plugins/flybridge/openclaw.plugin.json
/root/.openclaw/npm/node_modules

Several bundled plugin manifests were each touched more than 100 times in a 5-second sample, including:

/usr/lib/node_modules/openclaw/dist/extensions/mattermost/openclaw.plugin.json
/usr/lib/node_modules/openclaw/dist/extensions/matrix/openclaw.plugin.json
/usr/lib/node_modules/openclaw/dist/extensions/slack/openclaw.plugin.json
/usr/lib/node_modules/openclaw/dist/extensions/anthropic/openclaw.plugin.json

One local hotspot is listOpenClawPluginManifestMetadata(): building its cache key calls manifestFileFingerprint(candidate.pluginDir) for each candidate, and that function synchronously stats each manifest:

fs.statSync(path.join(pluginDir, "openclaw.plugin.json"))

Other related hot paths appear to be plugin discovery, manifest loading, and persisted registry validation.
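
To illustrate what "avoid re-statting on every cache-key computation" could look like, here is a minimal sketch of a TTL-memoized fingerprint. Everything in it is hypothetical: createFingerprintCache, StatFn, and the mtime/size fingerprint format are illustration-only names and choices, not OpenClaw's actual implementation. The stat call and the clock are injected so the idea can be shown without touching the filesystem.

```typescript
// Hypothetical sketch: memoize a per-plugin manifest fingerprint for a short TTL
// so that repeated cache-key computations do not stat the manifest every time.
type ManifestStat = { mtimeMs: number; size: number };
type StatFn = (manifestPath: string) => ManifestStat;

function createFingerprintCache(
  stat: StatFn, // stands in for a synchronous stat call (assumption)
  ttlMs: number,
  now: () => number = Date.now,
) {
  const entries = new Map<string, { value: string; expiresAt: number }>();

  return {
    fingerprint(pluginDir: string): string {
      const t = now();
      const cached = entries.get(pluginDir);
      if (cached !== undefined && t < cached.expiresAt) {
        return cached.value; // still fresh: no filesystem call
      }
      const s = stat(`${pluginDir}/openclaw.plugin.json`);
      const value = `${s.mtimeMs}:${s.size}`; // assumed fingerprint format
      entries.set(pluginDir, { value, expiresAt: t + ttlMs });
      return value;
    },

    // Drop one entry, or everything, e.g. after a plugin install or removal.
    invalidate(pluginDir?: string): void {
      if (pluginDir === undefined) {
        entries.clear();
      } else {
        entries.delete(pluginDir);
      }
    },
  };
}
```

In a Node.js process the injected stat could be something like (p) => fs.statSync(p), since fs.Stats exposes mtimeMs and size. The trade-off is that a manifest edit may go unnoticed for up to ttlMs unless invalidate() is called explicitly.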

Impact before local workaround

On this system, simple Discord ping messages were very slow:

  • Typing indicator could take 15+ seconds to appear.
  • Simple ping -> pong replies could take 30-60 seconds.
  • Gateway CPU often sat around 20-60% of one vCPU while idle or near-idle.
  • The model request itself was not the only delay. A large part happened before the model request started.

Local workaround result

A local workaround was applied to confirm the root cause. The workaround is not proposed as-is for upstream, but it is useful evidence.

Changes tested locally:

  • Disabled broad DEBUG=* logging.
  • Gated local latency probe logs behind an explicit env flag.
  • Added short TTL/coalescing caches around plugin manifest metadata, plugin discovery, plugin registry snapshot, and plugin manifest loading.
  • Added a local NODE_OPTIONS --require filesystem-cache shim for OpenClaw plugin paths.
  • Set Discord typing mode to instant for this deployment.
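
The "short TTL/coalescing" item above can be sketched roughly as follows. This is not the code that was run locally; coalesce and its parameters are hypothetical names, and the loader passed in stands for any expensive operation such as plugin discovery or registry snapshot loading. Concurrent callers share one in-flight load, and the result is reused until the TTL expires.

```typescript
// Hypothetical sketch: share one in-flight load among concurrent callers and
// reuse its result for a short TTL afterwards.
function coalesce<T>(
  load: () => Promise<T>, // expensive operation, e.g. a discovery scan (assumption)
  ttlMs: number,
  now: () => number = Date.now,
): () => Promise<T> {
  let inFlight: Promise<T> | undefined;
  let cached: { value: T; expiresAt: number } | undefined;

  return () => {
    if (cached !== undefined && now() < cached.expiresAt) {
      return Promise.resolve(cached.value); // fresh result: skip the load
    }
    if (inFlight !== undefined) {
      return inFlight; // a load is already running: join it
    }
    const pending = load().then(
      (value) => {
        inFlight = undefined;
        cached = { value, expiresAt: now() + ttlMs };
        return value;
      },
      (error) => {
        inFlight = undefined; // do not cache failures; the next call retries
        throw error;
      },
    );
    inFlight = pending;
    return pending;
  };
}
```

Failures are deliberately not cached in this sketch, so a transient error does not pin a bad snapshot for the whole TTL window.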

Result:

  • Claude/Flybridge Discord ping now shows typing in about 1-2 seconds.
  • Simple pong replies now arrive in under 5 seconds.
  • OpenAI/Codex-runtime channels are still a little slower, but much faster than before.
  • Gateway CPU after settling is closer to 8-10% of one vCPU in local 30-second samples.

This confirms that repeated plugin/provider/registry work was a major contributor to the latency.

What seems upstream-worthy

The local workaround includes some VPS-specific pieces that should not be copied directly into OpenClaw.

The upstream-friendly parts appear to be:

  • Avoid re-statting every plugin manifest on cache-hit paths.
  • Add short TTL or coalescing for plugin manifest metadata and discovery snapshots.
  • Cache or coalesce loadPluginRegistrySnapshotWithMetadata() validation work.
  • Avoid repeating the bootstrap-only auth-profile decision every 30 seconds when relevant auth/config state has not changed.
  • Log the bootstrap-only auth-profile decision only on state change, or keep it at debug level without triggering expensive repeated work.

Expected behavior

For an idle gateway, provider/auth polling should not repeatedly scan all bundled and local plugin manifests.

For a bootstrap-only provider with an existing valid local OAuth profile, the "local OAuth wins" decision should be reused until relevant auth/profile/config state changes.
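
One way to picture the expected behavior is a memo keyed on the state that the decision depends on. The sketch below is only an illustration under assumptions: AuthState, its fields (including the configVersion change counter), createDecisionMemo, and the log message format are all hypothetical and do not describe OpenClaw's real data model. The expensive decide callback stands for whatever currently triggers provider/plugin resolution.

```typescript
// Hypothetical sketch: reuse the auth-profile decision until the inputs change,
// and log only when a decision is actually (re)computed.
type AuthDecision = "keep-local-oauth" | "use-external-cli";

interface AuthState {
  profileId: string;
  provider: string;
  bootstrapOnly: boolean;
  hasLocalOAuth: boolean;
  configVersion: number; // assumed counter bumped on auth/profile/config changes
}

function createDecisionMemo(
  decide: (state: AuthState) => AuthDecision, // expensive path (assumption)
  log: (message: string) => void,
): (state: AuthState) => AuthDecision {
  const last = new Map<string, { signature: string; decision: AuthDecision }>();

  return (state) => {
    const signature = JSON.stringify([
      state.provider,
      state.bootstrapOnly,
      state.hasLocalOAuth,
      state.configVersion,
    ]);
    const previous = last.get(state.profileId);
    if (previous !== undefined && previous.signature === signature) {
      return previous.decision; // unchanged state: no resolution work, no log line
    }
    const decision = decide(state);
    last.set(state.profileId, { signature, decision });
    log(`auth decision for ${state.profileId}: ${decision}`);
    return decision;
  };
}
```

With a memo like this, an idle 30-second tick with unchanged state would return the stored decision without calling decide or emitting a log line.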

Related issues / context

This appears related to:

  • #80129: auth-profile debug log flood for external CLI bootstrap paths
  • #80148: stale non-bootstrapOnly external CLI credentials
  • #56960: openai-codex auth probing / retry loops causing gateway degradation
  • #78100 / #80345: plugin discovery and sync filesystem work affecting gateway responsiveness

