openclaw - v2026.5.22: stale worker accumulation (#76171) escalated to complete memory exhaustion within 2–3 min of startup

This is not a duplicate of #76171

This report is about a regression in severity. The root cause is the same as in #76171 (missing cleanup flags in the isolated cron CLI branch of run-executor.ts, tracked since v2026.4.29), but the manifestation in v2026.5.22 is categorically worse: it has moved from occasional background accumulation to complete memory exhaustion and an unrecoverable gateway stall within minutes of startup.

Fix PR: #73919 — currently open, conflicting, bot-generated (openclaw-clownfish). No maintainer activity on #76171 in 22 days.


Severity change across versions

Version      Behavior                                     Gateway
v2026.5.12   1–2 stale R-state workers after restart      Event loop recoverable
v2026.5.22   20–60+ R-state workers within 2–3 min        recovery=none, fully unresponsive

Note: I cannot confirm whether v2026.5.12 changes caused the escalation — this is a correlation from personal observation logs, not a controlled test.

Environment

  • OS: macOS 15.3, Apple Silicon (Mac mini M4, 16GB)
  • OpenClaw: 2026.5.22 (a374c3a)
  • Node.js: v22.22.0
  • Gateway: LaunchAgent, loopback only
  • Active agents: main (heartbeat 300s), research, obsidian, obsidian-write, gws-read
  • Cron jobs: several

Reproduction

Within 2–3 minutes of openclaw gateway start:

$ ps aux | grep openclaw | grep " R " | grep -v gateway | wc -l
26

$ vm_stat | head -3
Mach Virtual Memory Statistics: (page size of 16384 bytes)
Pages free: 4531.      # ~74MB — critical
Pages active: 216409.
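
The ~74MB figure is the free page count multiplied by the page size (4531 × 16384 bytes). A minimal sketch of that arithmetic, assuming vm_stat output in the format shown above; the function name free_mb_from_vm_stat is hypothetical and not part of OpenClaw:

```shell
#!/bin/sh
# Convert the "Pages free" count from vm_stat output into megabytes (10^6 bytes).
# Reads vm_stat-style text on stdin, so it also works on saved output.
free_mb_from_vm_stat() {
  awk '
    # Header line: "... (page size of 16384 bytes)" -> take the word after "of".
    /page size of/ { for (i = 1; i <= NF; i++) if ($i == "of") page = $(i + 1) }
    # "Pages free: 4531." -> strip the trailing dot from the count.
    /^Pages free:/ { gsub(/\./, "", $3); free = $3 }
    END { if (page && free) printf "%d\n", (page * free) / 1000000 }
  '
}

# Usage on macOS:
#   vm_stat | free_mb_from_vm_stat
```

Reading from stdin keeps the helper usable against logged vm_stat snapshots as well as the live command.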

Gateway log at exhaustion point:

liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu
  eventLoopDelayP99Ms=110192.8
  eventLoopDelayMaxMs=110192.8
  eventLoopUtilization=1
  work=[active=agent:main:main(processing/embedded_run,q=1,age=137s last=embedded_run:started)]

stalled session: sessionKey=agent:main:main state=processing age=137s
  reason=active_work_without_progress
  classification=stalled_agent_run
  recovery=none

fetch timeout reached; aborting operation

The gateway is fully unresponsive at this point and does not recover without a manual process kill and restart.
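
To spot this state in a saved gateway log, the stalled-session entries can be filtered for recovery=none. The following is a sketch only: it assumes the log line layout shown above, reads from stdin because the log file location is not given in this report, and uses a hypothetical function name.

```shell
#!/bin/sh
# Print the sessionKey of every "stalled session" entry that is followed by
# recovery=none. Assumes the multi-line entry layout shown in the log excerpt.
stalled_without_recovery() {
  awk '
    /stalled session:/ {
      key = ""
      for (i = 1; i <= NF; i++)
        if ($i ~ /^sessionKey=/) key = substr($i, 12)  # drop "sessionKey="
    }
    /recovery=none/ && key != "" { print key; key = "" }
  '
}

# Usage (the log path is whatever your gateway writes to):
#   stalled_without_recovery < /path/to/gateway.log
```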

Workaround (in use)

Killing R-state workers manually restores memory. As a stopgap, a launchd script runs every 2 min and kills workers older than 3 min. The full script is in a comment on #76171. This masks the symptom; it does not fix the root cause.
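
For readers without access to that comment, the selection step of such a stopgap might look like the sketch below. This is not the script from #76171. It assumes workers can be identified the same way as in the reproduction command (command line contains "openclaw" but not "gateway", state R), and the function name stale_worker_pids is hypothetical. It matches any state starting with R (for example R+), which is slightly broader than the grep " R " used above.

```shell
#!/bin/sh
# Read `ps -axo pid=,stat=,etime=,command=` output on stdin and print the PIDs
# of running (R-state) openclaw processes, excluding the gateway, whose
# elapsed time exceeds the given number of seconds (default 180).
stale_worker_pids() {
  max_age="${1:-180}"
  awk -v max="$max_age" '
    # ps etime format is [[dd-]hh:]mm:ss; convert it to seconds.
    function etime_secs(t,    d, n, p) {
      d = 0
      if (index(t, "-")) { split(t, p, "-"); d = p[1]; t = p[2] }
      n = split(t, p, ":")
      if (n == 3) return d * 86400 + p[1] * 3600 + p[2] * 60 + p[3]
      return d * 86400 + p[1] * 60 + p[2]
    }
    $2 ~ /^R/ && /openclaw/ && !/gateway/ && etime_secs($3) > max { print $1 }
  '
}

# Usage (dry run first; replace echo with kill once the list looks right):
#   ps -axo pid=,stat=,etime=,command= | stale_worker_pids 180 |
#     while read -r pid; do echo "would kill $pid"; done
```

Keeping the filter separate from the kill step makes it possible to check what would be terminated before wiring it into a scheduled job.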

What I'm asking

  1. Does the fix in #73919 also address the v2026.5.22 escalation, or is there a separate regression?
  2. Can the conflicts in #73919 be resolved so it can be reviewed and merged?
  3. If #73919 is insufficient, is there a known v2026.5.22 change that could explain the rate increase?
