openclaw - 💡(How to fix) Fix [Bug]: Gateway main thread CPU-bound at ~100% on v2026.4.26 / current main; clean on v2026.4.22 (fs.stat storm in microtask queue) [5 comments, 4 participants]

GitHub stats

openclaw/openclaw#74328 · Fetched 2026-04-30 06:25:26
Comments: 5 · Participants: 4 · Timeline events: 11 · Reactions: 4
Timeline (top): commented ×5, subscribed ×3, cross-referenced ×2, renamed ×1

Bug type

Regression (worked before, now fails)

Summary

After upgrading from v2026.4.22 to v2026.4.26 (also reproduces on current main at 9bb1e59, which package.json reports as 2026.4.27), the gateway sits at ~100% CPU on its single main thread and stops responding to local probes. Same host, same ~/.openclaw, no config changes — only git checkout differs.

I've seen #74209 and the regression range overlaps, but on my machine the dominant signal in a CPU sample is a fs.stat storm in the JS microtask queue rather than bonjour. Filing separately in case the maintainer wants to triage as the same root cause or as a sibling regression.

Versions

  • macOS 26.4 (Mac Studio, ARM64), Node 22 via Homebrew
  • Bad: v2026.4.26 (be8c246), and current main at 9bb1e59 (reports as 2026.4.27)
  • Good: v2026.4.22 (00bd2cf)

Steps to reproduce

Same ~/.openclaw, just switch versions:

git checkout v2026.4.26 && pnpm install && pnpm build
openclaw gateway restart
sleep 30
ps -o stat,%cpu,etime $(pgrep -f 'dist/index.js gateway')
# → R  95-100  sustained

git checkout v2026.4.22 && pnpm install && pnpm build
openclaw gateway restart
sleep 30
ps -o stat,%cpu,etime $(pgrep -f 'dist/index.js gateway')
# → S  2.6
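
A single ps snapshot can be misleading, so the check above can be repeated over a short window. The following is a minimal sketch, not part of OpenClaw: `mean` and `sample_cpu` are hypothetical helper names, and the sketch assumes that pgrep matches exactly one gateway process. Note that how ps computes %CPU differs between platforms, so treat the mean as a rough indicator only.

```shell
# Hypothetical helpers, not part of OpenClaw.
mean() {
  # Read one number per line on stdin; print the arithmetic mean (1 decimal).
  # Blank lines (for example when the process has exited) are skipped.
  awk 'NF { sum += $1; n++ } END { if (n) printf "%.1f\n", sum / n }'
}

sample_cpu() (
  # sample_cpu <pid> [samples] [interval_s]: print the mean %CPU over the samples.
  pid=$1; samples=${2:-10}; interval=${3:-3}
  i=0
  while [ "$i" -lt "$samples" ]; do
    ps -o %cpu= -p "$pid"    # trailing "=" suppresses the header line
    sleep "$interval"
    i=$((i + 1))
  done | mean
)

# Usage, reusing the pgrep pattern from the steps above:
# sample_cpu "$(pgrep -f 'dist/index.js gateway')" 10 3
```

On a bad build the mean should stay close to the 95-100 range reported above; on 4.22 it should stay in the low single digits.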

Side-by-side, same host

| | 4.26 / main | 4.22 |
|---|---|---|
| ps STAT %CPU | R 95-100 | S 2.6 |
| eventLoopDelayMaxMs (liveness warning) | up to 314 866 ms | none reported |
| eventLoopUtilization | 0.95–1.00 | <0.10 |
| curl -m 3 http://127.0.0.1:18789/ | 3 s timeout | 3-9 ms |
| Discord WS lifetime | closes 1000/zombie every 60-90 s | stable |
| openclaw gateway status | "Connectivity probe: failed (timeout)" while runtime is "active" | OK, admin-capable |
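
The curl comparison above can be turned into a repeatable probe. This is a sketch under assumptions: `classify_exit` and `probe` are hypothetical helper names, and the URL and the 3 s limit are simply the ones used in the comparison. It relies on curl's documented exit statuses 7 (failed to connect) and 28 (operation timed out).

```shell
# Hypothetical helpers, not part of OpenClaw.
classify_exit() {
  # Map a curl exit status to a short label.
  case "$1" in
    0)  echo ok ;;
    7)  echo refused ;;    # curl: failed to connect to host
    28) echo timeout ;;    # curl: operation timed out (-m limit reached)
    *)  echo "error($1)" ;;
  esac
}

probe() (
  # -s silent, -o /dev/null discards the body, -m 3 caps the whole request
  # at 3 s, -w '%{time_total}' prints the total request time in seconds.
  url=${1:-http://127.0.0.1:18789/}
  t=$(curl -s -o /dev/null -m 3 -w '%{time_total}' "$url")
  status=$?
  echo "$(classify_exit "$status") ${t:-?}s"
)

# Usage: probe    (or: probe http://127.0.0.1:18789/)
```

A wedged gateway would be expected to report `timeout` on every call, while a healthy one reports `ok` with a time in the millisecond range.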

A representative liveness warning on main:

[diagnostic] liveness warning: reasons=event_loop_delay,event_loop_utilization
  interval=320s eventLoopDelayP99Ms=314874.8 eventLoopDelayMaxMs=314874.8
  eventLoopUtilization=0.999 cpuCoreRatio=0.585 active=0 waiting=0 queued=0

That is, the event loop was blocked for over 5 minutes (eventLoopDelayMaxMs=314874.8, roughly 315 s) while active, waiting, and queued were all 0.
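
The "over 5 minutes" figure can be checked directly from the logged fields. The sketch below assumes the key=value layout shown in the warning above; `field` and `ms_to_human` are hypothetical helper names, not OpenClaw tooling.

```shell
# Hypothetical helpers for reading a liveness warning line.
field() {
  # field <name>: print the value of the first name=value token on stdin.
  tr ' ' '\n' | awk -F= -v k="$1" '$1 == k && !seen { print $2; seen = 1 }'
}

ms_to_human() {
  # Convert milliseconds to seconds and minutes.
  awk -v ms="$1" 'BEGIN { printf "%.1f s = %.2f min\n", ms / 1000, ms / 60000 }'
}

line='  interval=320s eventLoopDelayP99Ms=314874.8 eventLoopDelayMaxMs=314874.8'
ms_to_human "$(printf '%s\n' "$line" | field eventLoopDelayMaxMs)"
# prints: 314.9 s = 5.25 min
```

The delay covers almost all of the 320 s reporting interval, which is consistent with eventLoopUtilization=0.999 in the same warning.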

CPU sample on main (5 s, idle, no incoming traffic)

/usr/bin/sample <gateway-pid> 5 — all 4074 stacks collapse to this single path:

4074 uv__io_poll
+ 4074 uv__async_io
+   4074 uv__work_done
+     4074 MakeLibuvRequestCallback<uv_fs_s>::Wrapper
+       4074 node::fs::AfterStat
+         4074 MicrotaskQueue::PerformCheckpointInternal
+           4074 MicrotaskQueue::RunMicrotasks
+             4074 Builtins_PromiseFulfillReactionJob
+               4074 AsyncFunctionAwaitResolveClosure
+                 4074 <JIT JS, unsymbolicated>

Every sample is in node::fs::AfterStat. The main thread is consumed resolving promises from a flood of fs.stat calls. The same sample on 4.22 with the same config and data spends almost the whole 5 s in kqueue waiting.
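
Because the JS frames in the sample are unsymbolicated, one possible next step is to find out which paths are being stat'ed. The sketch below assumes the paths have already been captured, one per line, with a system tracer (for example fs_usage on macOS or strace on Linux); extracting the path column is left out because the tracers' output formats differ. `top_paths` is a hypothetical helper name.

```shell
# Hypothetical helper: rank stat'ed paths by how often they appear.
top_paths() {
  # top_paths [n]: read one path per line on stdin and print
  # "<count> <path>" for the n most frequent paths (default 10).
  awk 'NF { count[$0]++ } END { for (p in count) print count[p], p }' |
    sort -rn |
    head -n "${1:-10}"
}

# Usage, assuming paths.txt holds one captured path per line:
# top_paths 20 < paths.txt
```

If a handful of paths dominate the tally, that would point at a polling or watch loop over a fixed set of files rather than a one-off directory walk.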

Effect on real usage

Channel messages arrive (Discord WS frames are received), but the session enters state=processing and ages indefinitely without a reply. I saw agent:main:discord:direct:oliver reach age=1376s queueDepth=1 before the gateway watchdog killed and restarted the process. No subprocess is involved (no docker, no acpx wrapper, no model API call); the wedge is purely in-JS work on the main thread.

Workaround

Pin to v2026.4.22. Disabling the OpenClaw Auto-Update cron is recommended so the upgrade doesn't reapply.

Related

  • #74209 — overlapping regression range, same restart churn, framed around bundled-plugins / bonjour. Possibly the same root cause from a different angle.

extent analysis

TL;DR

No fix has been identified yet. The available workaround is to pin to v2026.4.22, which does not show the fs.stat storm or the resulting ~100% CPU usage on the main thread.

Guidance

  • The issue is likely caused by a regression introduced between v2026.4.22 and v2026.4.26, resulting in an excessive number of fs.stat calls.
  • To verify the issue, run the provided ps and curl commands to check CPU usage and response times.
  • Disabling the OpenClaw Auto-Update cron can prevent the upgrade from reapplying and causing the issue again.
  • Monitoring the eventLoopUtilization and eventLoopDelayMaxMs metrics can help identify similar issues in the future.

Example

No fix snippet is available, since the regressing change has not been identified. The ps commands under "Steps to reproduce" are the relevant commands for confirming the regression.

Notes

The issue does not identify a root cause. The CPU sample points to a flood of fs.stat calls, but the JS frames that issue them are unsymbolicated, and it is still open whether this shares a root cause with #74209. Pinning to v2026.4.22 is a stopgap, not a long-term solution.

Recommendation

Apply the workaround by pinning the version to v2026.4.22 to avoid the high CPU usage issue, as it is a known good version that does not exhibit this behavior.
