openclaw: Per-turn plugin re-resolution: ~28–30s pre-model stall every turn on 2026.5.22 (~1.7M fs stats/turn)

Summary

On 2026.5.22, every gateway turn has a consistent ~28–30s gap between before_dispatch and llm_input (i.e. before the model is even called). Profiling shows the gateway re-resolves the entire plugin tree on every turn — ~1.7M filesystem stats in an 18s window. Responsiveness regressed noticeably after updating to 2026.5.22.

Environment

  • openclaw 2026.5.22 (npm latest)
  • node v22.22.1, macOS 26.3 (arm64)
  • 19 active plugins, Pi runtime, one channel pinned to anthropic/claude-sonnet-4-6

Symptom

before_dispatch → llm_input gap measured at 28 / 27 / 28 / 31 / 38 / 27s across turns. Once the model is called it's fast (agent_end durationMs = 5–13s). /status and /new are also slow. So the cost is entirely pre-model, gateway-side.
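To reproduce the gap numbers from your own logs, a minimal sketch like the following may help. It assumes you have already extracted events as (name, timestamp in seconds) tuples; that tuple shape and the helper name pre_model_gaps are illustrative assumptions, not OpenClaw's actual log format or API.

```python
from statistics import mean, median


def pre_model_gaps(events):
    """Pair each before_dispatch with the next llm_input; return gaps in seconds.

    `events` is an iterable of (name, timestamp_seconds) tuples in
    chronological order. This shape is an assumption for illustration only.
    """
    gaps = []
    dispatch_ts = None
    for name, ts in events:
        if name == "before_dispatch":
            dispatch_ts = ts
        elif name == "llm_input" and dispatch_ts is not None:
            gaps.append(ts - dispatch_ts)
            dispatch_ts = None  # wait for the next turn's before_dispatch
    return gaps


# The six gaps reported in this issue, in seconds.
reported = [28, 27, 28, 31, 38, 27]
print(f"mean={mean(reported):.1f}s median={median(reported)}s max={max(reported)}s")
```

Comparing these gaps against agent_end durationMs for the same turns shows whether the time is being spent before or after the model call.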

Profiling

A sample of the gateway process taken during the gap shows the main thread is busy (not idle), running JS under libuv fs callbacks:

uv_run → uv__io_poll → uv__work_done → MakeLibuvRequestCallback<uv_fs_s>
       → node::fs::AfterNoArgs → FSReqPromise::Resolve → RunMicrotasks → [JS]

Hot frames: node::fs::LStat/Stat/Open/ExistsSync/FStat/Close, scandir/readdir/opendir/Glob, and v8::internal::JsonParser.

sudo fs_usage -w -f filesys <gateway-pid> during one turn:

  • ~1,686,175 filesystem events to node_modules in 18s
  • Concentrated on ~/.openclaw*/npm/node_modules/@openclaw/* (codex, acpx, discord, slack) and local extensions/ dirs
  • The plugin tree (~25K files) is effectively re-walked ~66× per turn
  • Project workspace files: only ~6.9K accesses (not the cause)
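The per-directory breakdown above can be approximated from a saved fs_usage capture with a short script. This is a sketch under assumptions: it treats the first whitespace-separated field starting with "/" as the path (so paths containing spaces are truncated), and the helper name count_by_prefix and the depth value are hypothetical choices, not part of any OpenClaw tooling.

```python
from collections import Counter


def count_by_prefix(lines, depth=7):
    """Count fs_usage events per path prefix, truncated to `depth` components."""
    counts = Counter()
    for line in lines:
        # Assumption: the path is the first field that starts with "/".
        path = next((tok for tok in line.split() if tok.startswith("/")), None)
        if path is None:
            continue  # events without a path (e.g. close on an fd) are skipped
        parts = path.split("/")[1:depth + 1]
        counts["/" + "/".join(parts)] += 1
    return counts


# Back-of-the-envelope figures from the numbers reported in this issue.
events = 1_686_175      # filesystem events seen in the capture
window_s = 18           # capture window in seconds
tree_files = 25_000     # approximate size of the plugin tree (rounded)

per_second = events / window_s
walks_per_turn = events / tree_files
print(f"{per_second:,.0f} events/s, ~{walks_per_turn:.0f} full-tree walks per turn")
```

With the rounded 25K file count the ratio comes out near 67, in line with the ~66× figure above. To feed the counter, redirect the fs_usage output to a file and pass the open file object as `lines`, then inspect `counts.most_common(10)`.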

Ruled out

Model latency, claude-cli, disk/swap (53GB free, 0 swap), embeddings (warm, ~500ms, not per-turn), codex runtime, sessions dir, and project workspace are all ruled out. The ~1.7M stats come from walking the OpenClaw plugin/extension tree.

Likely already fixed in beta

2026.5.24-beta.2 release notes mention "Reuses process-stable channel catalog reads and caches plugin metadata snapshots to reduce repeated file stats" and "Lazy-loads startup-idle plugin work and ACPX runtime" — which appear to directly target this. Filing per maintainer request because the fs_usage numbers are strong evidence of the regressed hot path in 2026.5.22.

The env knobs OPENCLAW_PLUGIN_DISCOVERY_CACHE_MS / OPENCLAW_PLUGIN_MANIFEST_CACHE_MS referenced in current source are not present in the 2026.5.22 build (grep: 0 references), so they aren't a workaround on stable.
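To check whether a given install references these knobs, a grep-style scan over the installed package can be sketched as follows. The helper name count_references and the file-extension filter are assumptions for illustration; point it at wherever your openclaw package is actually installed.

```python
import os


def count_references(root, needles, exts=(".js", ".mjs", ".cjs")):
    """Count occurrences of each needle in files under `root` with matching extensions."""
    totals = {needle: 0 for needle in needles}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(exts):
                continue
            try:
                with open(os.path.join(dirpath, filename),
                          encoding="utf-8", errors="ignore") as fh:
                    text = fh.read()
            except OSError:
                continue  # unreadable file: skip rather than abort the scan
            for needle in needles:
                totals[needle] += text.count(needle)
    return totals


KNOBS = (
    "OPENCLAW_PLUGIN_DISCOVERY_CACHE_MS",
    "OPENCLAW_PLUGIN_MANIFEST_CACHE_MS",
)
# Example call (the path is a placeholder for your own install location):
# print(count_references("/path/to/node_modules/openclaw", KNOBS))
```

A result of zero for both names would match the "grep: 0 references" finding on 2026.5.22, meaning setting those variables has no effect on that build.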

Questions

  1. Can you confirm this is the regression fixed by the snapshot-reuse commits in 2026.5.24-beta.x?
  2. Will the fix land in a 2026.5.2x stable, or is beta the path?
  3. For stable-only users, is downgrading to 2026.5.19 the recommended interim workaround?
