openclaw - 💡(How to fix) Fix v2026.5.22: stale worker accumulation (#76171) escalated to complete memory exhaustion within 2–3 min of startup

openclaw · 2026-05-24 13:08:12

Root Cause

This is a regression severity report. The root cause is unchanged: missing cleanup flags in the isolated cron CLI branch of run-executor.ts, tracked in #76171 since v2026.4.29. What changed in v2026.5.22 is the severity. Occasional background accumulation has become complete memory exhaustion and an unrecoverable gateway stall within minutes of startup.

This is not a duplicate of #76171

Fix PR: #73919 — currently open, conflicting, bot-generated (openclaw-clownfish). No maintainer activity on #76171 in 22 days.

Severity change across versions

| Version | Behavior | Gateway |
| --- | --- | --- |
| v2026.5.12 | 1–2 stale R-state workers after restart | Event loop recoverable |
| v2026.5.22 | 20–60+ R-state workers within 2–3 min | `recovery=none`, fully unresponsive |

Note: I cannot confirm whether changes between v2026.5.12 and v2026.5.22 caused the escalation. This is a correlation from personal observation logs, not a controlled test.

Environment

OS: macOS 15.3, Apple Silicon (Mac mini M4, 16GB)
OpenClaw: 2026.5.22 (a374c3a)
Node.js: v22.22.0
Gateway: LaunchAgent, loopback only
Active agents: main (heartbeat 300s), research, obsidian, obsidian-write, gws-read
Cron jobs: several

Reproduction

Within 2–3 minutes of openclaw gateway start:

$ ps aux | grep openclaw | grep " R " | grep -v gateway | wc -l
26

$ vm_stat | head -3
Mach Virtual Memory Statistics: (page size of 16384 bytes)
Pages free: 4531.      # ~74MB — critical
Pages active: 216409.
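
The "~74MB" annotation above follows from multiplying the free page count by the page size. A minimal Python sketch of that arithmetic, assuming the `vm_stat` output format shown in this report (the function name `free_megabytes` is illustrative, not part of any tool):

```python
import re


def free_megabytes(vm_stat_output: str) -> float:
    """Return free memory in MB (10**6 bytes) from `vm_stat` text."""
    # The page size is stated in the header line of the output.
    page_size = int(re.search(r"page size of (\d+) bytes", vm_stat_output).group(1))
    # Page counts are printed with a trailing period, e.g. "4531."
    pages_free = int(re.search(r"Pages free:\s*(\d+)\.", vm_stat_output).group(1))
    return pages_free * page_size / 1_000_000


# Sample copied from the output in this report.
sample = """Mach Virtual Memory Statistics: (page size of 16384 bytes)
Pages free: 4531.
Pages active: 216409.
"""

print(f"{free_megabytes(sample):.1f} MB free")  # 4531 * 16384 bytes
```

On a 16GB machine this leaves well under 1% of physical memory on the free list, which is consistent with the stall described below.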

Gateway log at exhaustion point:

liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu
  eventLoopDelayP99Ms=110192.8
  eventLoopDelayMaxMs=110192.8
  eventLoopUtilization=1
  work=[active=agent:main:main(processing/embedded_run,q=1,age=137s last=embedded_run:started)]

stalled session: sessionKey=agent:main:main state=processing age=137s
  reason=active_work_without_progress
  classification=stalled_agent_run
  recovery=none

fetch timeout reached; aborting operation
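
For anyone comparing their own logs against this excerpt, the fields that matter are the P99 event loop delay (in milliseconds, so about 110 seconds here) and the `classification` / `recovery` pair. A rough Python sketch that pulls those out, assuming the log line format shown above; the function name and the returned keys are my own, not an OpenClaw API:

```python
import re


def summarize_stall(log_text: str) -> dict:
    """Extract the stall-related fields from a gateway log excerpt."""
    p99 = re.search(r"eventLoopDelayP99Ms=([\d.]+)", log_text)
    classification = re.search(r"\bclassification=(\S+)", log_text)
    recovery = re.search(r"\brecovery=(\S+)", log_text)
    return {
        # Convert milliseconds to seconds for readability.
        "p99_delay_s": float(p99.group(1)) / 1000 if p99 else None,
        "classification": classification.group(1) if classification else None,
        "recovery": recovery.group(1) if recovery else None,
    }


# Sample copied from the gateway log in this report.
sample_log = """liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu
  eventLoopDelayP99Ms=110192.8
  eventLoopDelayMaxMs=110192.8
  eventLoopUtilization=1
stalled session: sessionKey=agent:main:main state=processing age=137s
  reason=active_work_without_progress
  classification=stalled_agent_run
  recovery=none
"""

summary = summarize_stall(sample_log)
unrecoverable = (
    summary["classification"] == "stalled_agent_run"
    and summary["recovery"] == "none"
)
print(summary, unrecoverable)
```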

The gateway is fully unresponsive at this point. It does not recover without manually killing the processes and restarting.

Workaround (in use)

Killing R-state workers manually restores memory. As a stopgap, a launchd script runs every 2 min and kills workers older than 3 min. The full script is in a comment on #76171. This masks the symptom; it does not fix the root cause.
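
This is not the script from #76171, just a minimal Python sketch of the selection logic such a stopgap needs. It assumes input in the shape produced by `ps -ax -o pid= -o etime= -o state= -o command=`, and the `openclaw worker` command lines in the sample are made up for illustration:

```python
def etime_to_seconds(etime: str) -> int:
    """Convert ps elapsed time ([[dd-]hh:]mm:ss) to seconds."""
    days, _, clock = etime.rpartition("-")
    parts = [int(p) for p in clock.split(":")]
    while len(parts) < 3:  # pad missing hours with zero
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return int(days or 0) * 86400 + hours * 3600 + minutes * 60 + seconds


def stale_worker_pids(ps_output: str, max_age_s: int = 180) -> list[int]:
    """Return PIDs of R-state openclaw processes older than max_age_s.

    Lines mentioning "gateway" are skipped, mirroring the
    `grep -v gateway` filter used in the reproduction above.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue
        pid, etime, state, command = fields
        if "openclaw" not in command or "gateway" in command:
            continue
        if not state.startswith("R"):
            continue
        if etime_to_seconds(etime) > max_age_s:
            pids.append(int(pid))
    return pids


# Hypothetical ps output; real command lines will differ.
sample_ps = """  101    12:41 S    openclaw gateway
  202    03:20 R    openclaw worker
  303    00:45 R    openclaw worker
  404    05:02 S    openclaw worker
  505 01:10:00 R+   openclaw worker
"""

print(stale_worker_pids(sample_ps))
```

The returned PIDs could then be passed to `os.kill(pid, signal.SIGTERM)`. The sketch stops short of that on purpose so it is safe to run as written.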

What I'm asking

Does the fix in #73919 also address the v2026.5.22 escalation, or is there a separate regression?
Can the conflicts in #73919 be resolved so it can be reviewed and merged?
If #73919 is insufficient, is there a known v2026.5.22 change that could explain the rate increase?
