openclaw - 💡(How to fix) Fix Update 4.23 → 4.24+ causes gateway paralysis: empty runtime-deps package.json + missing chokidar cascades into full event loop starvation [2 comments, 3 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#73188Fetched 2026-04-29 06:22:22
View on GitHub
Comments
2
Participants
3
Timeline
4
Reactions
0
Timeline (top)
commented ×2closed ×1subscribed ×1

Error Message

Memory broken

qmd memory unavailable; falling back to builtin: Cannot find package 'chokidar'

Active-memory timing out

active-memory ... done status=timeout

WS handshakes failing

closed before connect conn=... code=1006 handshake timeout conn=...

Pricing fetches (blocked event loop, not network)

OpenRouter pricing fetch failed (timeout 60s): TimeoutError: The operation was aborted due to timeout LiteLLM pricing fetch failed (timeout 60s): TimeoutError: The operation was aborted due to timeout

Node probe failures

remote bin probe timed out (Nigel's iMac Pro; command=system.which timeoutMs=15000 requiredBins=45 connected=yes)

Sessions stuck

stuck session: sessionId=main state=processing age=159s queueDepth=0

Embedded runs failing

embedded_run_failover_decision stage=assistant decision=surface_error failoverReason=timeout

Root Cause

Root Cause Analysis

Fix Action

Workaround

Rolled back to 4.23 via:

npm install -g [email protected]
openclaw gateway restart

Gateway immediately stable. All 30 cron jobs, 7 agents, QMD, and memory working normally.

Code Example

{
  "name": "openclaw-runtime-deps-install",
  "private": true
}

---

Cannot find package 'chokidar' imported from .../qmd-manager-FuXCtSYP.js

---

# Memory broken
qmd memory unavailable; falling back to builtin: Cannot find package 'chokidar'

# Active-memory timing out
active-memory ... done status=timeout

# WS handshakes failing
closed before connect conn=... code=1006
handshake timeout conn=...

# Pricing fetches (blocked event loop, not network)
OpenRouter pricing fetch failed (timeout 60s): TimeoutError: The operation was aborted due to timeout
LiteLLM pricing fetch failed (timeout 60s): TimeoutError: The operation was aborted due to timeout

# Node probe failures
remote bin probe timed out (Nigel's iMac Pro; command=system.which timeoutMs=15000 requiredBins=45 connected=yes)

# Sessions stuck
stuck session: sessionId=main state=processing age=159s queueDepth=0

# Embedded runs failing
embedded_run_failover_decision stage=assistant decision=surface_error failoverReason=timeout

---

cd ~/.openclaw/plugin-runtime-deps/openclaw-2026.4.25-17c3fa238f70
npm install chokidar@latest --no-save

---

npm install -g openclaw@2026.4.23
openclaw gateway restart
RAW_BUFFERClick to expand / collapse

Bug Report: Update from 4.23 → 4.24+ causes gateway paralysis on power-user setups

Environment

  • OpenClaw version: 2026.4.23 (stable) → 2026.4.24 / 2026.4.25 (broken)
  • OS: macOS 15.7.3 (Sequoia), Intel x86_64 (iMac Pro)
  • Node.js: v25.9.0
  • npm: 11.12.1
  • Install method: npm install -g openclaw

Setup Profile (power user)

  • 7 configured agents
  • 30 enabled cron jobs (all sessionTarget: "isolated", agentTurn payloads)
  • 144MB QMD semantic search index (22,668 vectors, 2,022 files, 88 memory markdown files)
  • Active-memory plugin enabled
  • Multiple provider configs (Anthropic, OpenAI, Google, OpenRouter disabled)
  • Bonjour + device-pair for macOS node

What Happened

Updated from 4.23 → 4.24 via openclaw update. Gateway failed to restart and never recovered. Later attempted 4.25 — same result. Rolled back to 4.23 — immediately stable.

Root Cause Analysis

1. Runtime-deps install produces empty package.json

After the update, the runtime-deps directory (~/.openclaw/plugin-runtime-deps/openclaw-2026.4.25-17c3fa238f70/package.json) contained:

{
  "name": "openclaw-runtime-deps-install",
  "private": true
}

No dependencies field. The .openclaw-runtime-deps.json manifest lists ~70 specs that should have been installed, but the actual package.json used for npm install was empty. chokidar (required by the QMD file watcher qmd-manager-FuXCtSYP.js) was completely missing from node_modules/.

2. Missing chokidar breaks QMD → memory cascade

Cannot find package 'chokidar' imported from .../qmd-manager-FuXCtSYP.js

QMD falls back to a degraded mode. Memory search calls start timing out (15s default). The active-memory plugin, which queries memory on every direct message, hammers the broken backend repeatedly.

3. 30 cron jobs amplify the failure

Each cron job spawns an isolated session that bootstraps memory/QMD. With QMD broken, every session startup either hangs or times out. 30 jobs firing on schedule = 30 concurrent timeout storms flooding the Node.js event loop.

4. Event loop starvation causes total gateway paralysis

Once the event loop is saturated:

  • WebSocket handshakes time outhandshake timeout (136 occurrences in one day)
  • Pricing fetches time out — OpenRouter + LiteLLM 60s timeouts (even though curl to the same URLs works in <0.2s from the same machine)
  • Remote node bin probes time outsystem.which timeoutMs=15000
  • Bonjour advertiser gets stuck — service stuck in "announcing" for 60s+
  • Discord health-monitor — keeps restarting due to disconnection
  • Sessions marked "stuck"stuck session: state=processing
  • openclaw status itself times out — can't even self-diagnose

The gateway accepts TCP connections and appears "alive" but is functionally brain-dead.

Evidence from Logs

# Memory broken
qmd memory unavailable; falling back to builtin: Cannot find package 'chokidar'

# Active-memory timing out
active-memory ... done status=timeout

# WS handshakes failing
closed before connect conn=... code=1006
handshake timeout conn=...

# Pricing fetches (blocked event loop, not network)
OpenRouter pricing fetch failed (timeout 60s): TimeoutError: The operation was aborted due to timeout
LiteLLM pricing fetch failed (timeout 60s): TimeoutError: The operation was aborted due to timeout

# Node probe failures
remote bin probe timed out (Nigel's iMac Pro; command=system.which timeoutMs=15000 requiredBins=45 connected=yes)

# Sessions stuck
stuck session: sessionId=main state=processing age=159s queueDepth=0

# Embedded runs failing
embedded_run_failover_decision stage=assistant decision=surface_error failoverReason=timeout

Verification

After manually installing chokidar into the runtime-deps folder:

cd ~/.openclaw/plugin-runtime-deps/openclaw-2026.4.25-17c3fa238f70
npm install chokidar@latest --no-save

...QMD-related errors stopped. But the accumulated event loop damage from active-memory + cron jobs meant recovery required a full rollback to 4.23.

Suggestions

  1. Runtime-deps installer should validate its own output. If the package.json has no dependencies after the install step, that's a hard failure — don't let the gateway boot on a broken dep tree.

  2. Missing chokidar should be a loud startup error, not a silent fallback. The current behavior silently degrades QMD, which cascades into everything else.

  3. Consider circuit-breaking for memory/QMD failures. If QMD fails N times on boot, disable memory-dependent features (active-memory, session memory search) rather than letting them timeout-loop indefinitely.

  4. Cron job startup should be gated on core subsystem health. If QMD/memory isn't ready, delay cron job execution rather than spawning 30 sessions that all independently discover the failure.

  5. openclaw status should have its own fast-path that doesn't depend on the event loop being healthy — if the diagnostic tool is also a victim of event loop starvation, the operator has no visibility.

Workaround

Rolled back to 4.23 via:

npm install -g [email protected]
openclaw gateway restart

Gateway immediately stable. All 30 cron jobs, 7 agents, QMD, and memory working normally.

extent analysis

TL;DR

Manually installing chokidar in the runtime-deps folder may temporarily resolve the gateway paralysis issue, but a more comprehensive fix is needed to address the underlying problems with the runtime-deps installer and event loop management.

Guidance

  1. Validate runtime-deps installation: Ensure that the package.json file in the runtime-deps directory contains the expected dependencies after installation.
  2. Implement loud startup errors: Modify the startup process to loudly fail if critical dependencies like chokidar are missing, rather than silently degrading functionality.
  3. Circuit-break for memory/QMD failures: Consider implementing a mechanism to disable memory-dependent features if QMD fails repeatedly, preventing indefinite timeout loops.
  4. Gate cron job startup on core subsystem health: Delay cron job execution until core subsystems like QMD and memory are ready to prevent concurrent timeout storms.
  5. Enhance openclaw status tool: Create a fast-path for the openclaw status command that doesn't rely on the event loop being healthy, providing operators with visibility even when the gateway is under stress.

Example

To manually install chokidar and potentially resolve the issue temporarily:

cd ~/.openclaw/plugin-runtime-deps/openclaw-2026.4.25-17c3fa238f70
npm install chokidar@latest --no-save

Notes

The provided workaround of rolling back to version 4.23 may not be a long-term solution, as it doesn't address the underlying issues. A more comprehensive fix is necessary to ensure the stability and reliability of the gateway.

Recommendation

Apply the workaround by rolling back to version 4.23 until a more comprehensive fix is available, as it immediately stabilizes the gateway and allows for normal operation of all features.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - 💡(How to fix) Fix Update 4.23 → 4.24+ causes gateway paralysis: empty runtime-deps package.json + missing chokidar cascades into full event loop starvation [2 comments, 3 participants]