openclaw - 💡(How to fix) Fix Plugin discovery in infinite loop after upgrade — gateway pegs one CPU core continuously (v2026.4.25) [2 comments, 3 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#73088Fetched 2026-04-28 06:27:42
View on GitHub
Comments
2
Participants
3
Timeline
5
Reactions
1
Author
Timeline (top)
commented ×2closed ×1cross-referenced ×1subscribed ×1

After upgrading from 2026.2.21 to 2026.4.25, the gateway process sustains ~100% CPU on one core indefinitely while idle (no chat activity, no commands). perf and strace show the gateway is in a tight loop continuously rescanning all bundled plugin manifests on disk — the new "cold registry" cache appears to be computed but never used as a hit, causing every operation to re-discover all ~150 extension plugins from scratch.

Root Cause

After upgrading from 2026.2.21 to 2026.4.25, the gateway process sustains ~100% CPU on one core indefinitely while idle (no chat activity, no commands). perf and strace show the gateway is in a tight loop continuously rescanning all bundled plugin manifests on disk — the new "cold registry" cache appears to be computed but never used as a hit, causing every operation to re-discover all ~150 extension plugins from scratch.

Fix Action

Fix / Workaround

  • Disk I/O — iowait is 0%, read_bytes is 0 (all from page cache)
  • Telegram retry storm — telegram is connected and idle
  • Bonjour plugin (separate crash bug fixed by upgrade — bonjour no longer loads in 2026.4.25 on our config)
  • Workspace size — workspace is small (1.8 MB)
  • Memory dir scan — memory/ is only 37 MB and doesn't change on idle
  • Config reloads — confirmed by checking [reload] log entries (none during the busy period)

Code Example

82% uv_run → uv__io_poll → uv__work_done →
    node::fs::FileHandle::CloseReq::Resolve    MicrotaskQueue::PerformCheckpoint    AsyncFunctionAwaitResolveClosure    Builtins_InterpreterEntryTrampoline (deeply nested)

99% uv_run → uv__io_poll → uv__work_done →
    CompressionStream<ZlibContext>::AfterThreadPoolWork

---

# Phase 1: scan all plugin manifests (no follow, NOFOLLOW)
openat(.../dist/extensions/acpx/openclaw.plugin.json, O_RDONLY|O_NOFOLLOW) = 29
openat(.../dist/extensions/active-memory/openclaw.plugin.json, O_RDONLY|O_NOFOLLOW) = 29
openat(.../dist/extensions/alibaba/openclaw.plugin.json, O_RDONLY|O_NOFOLLOW) = 29
... (~150 entries from acpx through zalouser) ...

# Phase 2: re-resolve installs and package.json walk
openat(.../.openclaw/plugins/installs.json, O_RDONLY) = 29
openat(.../openclaw/dist/package.json, O_RDONLY) = -1 ENOENT
openat(.../openclaw/package.json, O_RDONLY) = 29
openat(/home/ubuntu/package.json, O_RDONLY) = -1 ENOENT
openat(/home/package.json, O_RDONLY) = -1 ENOENT
openat(/package.json, O_RDONLY) = -1 ENOENT

# Phase 3: same 150 plugin manifests AGAIN (with package.json siblings)
openat(.../dist/extensions/acpx/package.json, O_RDONLY|O_NOFOLLOW) = 29
openat(.../dist/extensions/acpx/openclaw.plugin.json, O_RDONLY|O_NOFOLLOW) = 29
openat(.../dist/extensions/acpx/index.js, O_RDONLY|O_NOFOLLOW) = 29
...

# Then back to Phase 1.

---

rchar:        11_663_646_159   (~11.7 GB read through syscalls)
wchar:           415_063_619   (~415 MB written through syscalls)
syscr:           3_988_801      (~4M read calls)
syscw:             116_916
read_bytes:               0    (everything served from page cache)
write_bytes:    510_619_648

---

54M    agents
500M   browser
1.1G   plugin-runtime-deps     # 10 missing every restart, see below
37M    memory
52K    plugins                 # only installs.json
... (no `registry/` directory exists)

---

[gateway] [plugins] staging bundled runtime deps before gateway startup (10 missing, 10 install specs):
  @grammyjs/runner@^2.0.3, @grammyjs/transformer-throttler@^1.2.1,
  @modelcontextprotocol/sdk@1.29.0, commander@^14.0.3, express@5.2.1,
  grammy@^1.42.0, playwright-core@1.59.1, typebox@1.1.33,
  undici@8.1.0, ws@^8.20.0
[gateway] [plugins] installed bundled runtime deps before gateway startup in 4546ms
RAW_BUFFERClick to expand / collapse

Plugin discovery in infinite loop after upgrade — gateway pegs one CPU core continuously (v2026.4.25)

Summary

After upgrading from 2026.2.21 to 2026.4.25, the gateway process sustains ~100% CPU on one core indefinitely while idle (no chat activity, no commands). perf and strace show the gateway is in a tight loop continuously rescanning all bundled plugin manifests on disk — the new "cold registry" cache appears to be computed but never used as a hit, causing every operation to re-discover all ~150 extension plugins from scratch.

Environment

  • OpenClaw: 2026.4.25 (aa36ee6), installed via npm install -g openclaw@latest
  • Node: /usr/bin/node (system Node)
  • OS: Ubuntu 24.04 LTS, ARM64 (Oracle Cloud VM.Standard.A1.Flex, 4 OCPUs, 24 GB RAM)
  • Install path: /home/ubuntu/.npm-global/lib/node_modules/openclaw
  • Active plugins (per [gateway] ready): browser, telegram — only 2
  • Bundled plugins on disk: ~150 (dist/extensions/*/openclaw.plugin.json)
  • Was working fine on: 2026.2.21-2

Symptoms

  • Gateway process steady ~100–120% CPU (one full core) while idle
  • Same behavior immediately after systemctl --user restart openclaw-gateway, with no chat traffic
  • Dashboard responses observed at ~25 s latency during the busy period
  • top shows openclaw-gatewa consistently at the top of CPU
  • Memory is fine (~300–600 MB RSS), I/O wait is 0% (cache hits the page cache, but burns CPU)

Diagnostic data

perf record (10 s sample, hottest stack)

Two hot paths dominate:

82% uv_run → uv__io_poll → uv__work_done →
    node::fs::FileHandle::CloseReq::Resolve →
    MicrotaskQueue::PerformCheckpoint →
    AsyncFunctionAwaitResolveClosure →
    Builtins_InterpreterEntryTrampoline (deeply nested)

99% uv_run → uv__io_poll → uv__work_done →
    CompressionStream<ZlibContext>::AfterThreadPoolWork

So nearly all CPU is going through fs::FileHandle::Close resolution and zlib compression callbacks running through async microtasks.

strace -e openat (steady state, head of capture)

The same sequence repeats indefinitely. One iteration looks like:

# Phase 1: scan all plugin manifests (no follow, NOFOLLOW)
openat(.../dist/extensions/acpx/openclaw.plugin.json, O_RDONLY|O_NOFOLLOW) = 29
openat(.../dist/extensions/active-memory/openclaw.plugin.json, O_RDONLY|O_NOFOLLOW) = 29
openat(.../dist/extensions/alibaba/openclaw.plugin.json, O_RDONLY|O_NOFOLLOW) = 29
... (~150 entries from acpx through zalouser) ...

# Phase 2: re-resolve installs and package.json walk
openat(.../.openclaw/plugins/installs.json, O_RDONLY) = 29
openat(.../openclaw/dist/package.json, O_RDONLY) = -1 ENOENT
openat(.../openclaw/package.json, O_RDONLY) = 29
openat(/home/ubuntu/package.json, O_RDONLY) = -1 ENOENT
openat(/home/package.json, O_RDONLY) = -1 ENOENT
openat(/package.json, O_RDONLY) = -1 ENOENT

# Phase 3: same 150 plugin manifests AGAIN (with package.json siblings)
openat(.../dist/extensions/acpx/package.json, O_RDONLY|O_NOFOLLOW) = 29
openat(.../dist/extensions/acpx/openclaw.plugin.json, O_RDONLY|O_NOFOLLOW) = 29
openat(.../dist/extensions/acpx/index.js, O_RDONLY|O_NOFOLLOW) = 29
...

# Then back to Phase 1.

The fd = 29 reused on every line confirms each file is opened, read, and closed before the next — there's no batching.

The package.json walk (/home/ubuntu/package.json, /home/package.json, /package.json — all ENOENT) repeats hundreds of times. This looks like Node's module resolution algorithm being invoked in the hot loop instead of being cached.

Process I/O counters

After ~30 minutes of idle:

rchar:        11_663_646_159   (~11.7 GB read through syscalls)
wchar:           415_063_619   (~415 MB written through syscalls)
syscr:           3_988_801      (~4M read calls)
syscw:             116_916
read_bytes:               0    (everything served from page cache)
write_bytes:    510_619_648

11.7 GB of in-memory rereads vs 510 MB of actual writes — strong signal that the same data is being re-read in a loop without caching.

Filesystem state (~/.openclaw/)

54M    agents
500M   browser
1.1G   plugin-runtime-deps     # 10 missing every restart, see below
37M    memory
52K    plugins                 # only installs.json
... (no `registry/` directory exists)

Notably absent: a registry/ directory. The "cold registry path" referenced in the changelog appears to never be persisted on this install.

Bundled-deps re-stage on every restart

Every single restart logs:

[gateway] [plugins] staging bundled runtime deps before gateway startup (10 missing, 10 install specs):
  @grammyjs/runner@^2.0.3, @grammyjs/transformer-throttler@^1.2.1,
  @modelcontextprotocol/[email protected], commander@^14.0.3, [email protected],
  grammy@^1.42.0, [email protected], [email protected],
  [email protected], ws@^8.20.0
[gateway] [plugins] installed bundled runtime deps before gateway startup in 4546ms

plugin-runtime-deps is 1.1 GB on disk and clearly contains these packages, but the cache check reports them as missing every time. This isn't the cause of the CPU loop, but it points to the same underlying class of bug — cache existence checks not finding what's already cached.

Reproduction

  1. Fresh ARM64 Ubuntu 24.04 instance.
  2. npm install -g [email protected].
  3. Configure gateway with 2 active plugins (browser, telegram).
  4. systemctl --user start openclaw-gateway.
  5. Observe: one CPU core pegged ~100% with no client traffic.

Probably reproduces on any host with a complete bundled-extensions tree — the discovery scan doesn't depend on which extensions are enabled, it scans all that exist in dist/extensions/.

Hypothesis

The new plugin registry rewrite (per changelog: "migrate the local plugin registry automatically during package install/update, keeping install metadata in the plugin index while indexing existing plugin manifests for the new cold registry path") doesn't take the cache-hit path on npm install -g upgrades. Instead, every operation that needs the plugin list falls back to the full disk scan, and something in the gateway runtime (heartbeat? memory sync? plugin reconciler? config watcher?) calls the plugin lookup continuously.

The simultaneous CompressionStream hot path suggests the same loop also touches a code path that gzip-encodes/decodes the discovered manifest list for caching or transport.

What's already been ruled out

  • Disk I/O — iowait is 0%, read_bytes is 0 (all from page cache)
  • Telegram retry storm — telegram is connected and idle
  • Bonjour plugin (separate crash bug fixed by upgrade — bonjour no longer loads in 2026.4.25 on our config)
  • Workspace size — workspace is small (1.8 MB)
  • Memory dir scan — memory/ is only 37 MB and doesn't change on idle
  • Config reloads — confirmed by checking [reload] log entries (none during the busy period)

Asks

  1. Is there a manual command to bootstrap/repair the cold registry index? openclaw doctor and similar were tried; nothing visibly initializes a registry/ directory.
  2. Is there a way to disable bundled extensions we don't use, so the scan list is small (e.g., 2 instead of 150)?
  3. Is there an env var or config flag to force the cold-registry hit path so we can confirm whether that is the bug?

Happy to provide more diagnostics — full strace, perf data, journal dumps, etc.

extent analysis

TL;DR

The issue can be mitigated by manually bootstrapping or repairing the cold registry index, which seems to be the root cause of the infinite loop and high CPU usage.

Guidance

  1. Verify the absence of the registry/ directory: Check if the registry/ directory exists in the ~/.openclaw/ path. If it doesn't exist, it might be the cause of the issue.
  2. Try to manually bootstrap the cold registry index: Although there is no clear command provided in the issue, you can try to manually create the registry/ directory and populate it with the necessary files to see if it resolves the issue.
  3. Disable unused bundled extensions: If possible, try to disable the unused bundled extensions to reduce the scan list and see if it improves the performance.
  4. Check for any environment variables or config flags: Look for any environment variables or config flags that can force the cold-registry hit path to confirm if it's the root cause of the issue.

Example

No code snippet is provided as it's not clearly supported by the issue.

Notes

The issue seems to be related to the new plugin registry rewrite, and the absence of the registry/ directory might be the root cause. However, without more information or a clear command to bootstrap the cold registry index, it's difficult to provide a definitive solution.

Recommendation

Apply a workaround by trying to manually bootstrap or repair the cold registry index, as it seems to be the most likely cause of the issue. If this doesn't work, try to disable unused bundled extensions to reduce the scan list and improve performance.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING