[Bug]: Gateway event loop stalls 8–12s every ~30 minutes at idle (active=0, waiting=0, queued=0)

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

The OpenClaw gateway produces recurring event loop stalls of 8–12 seconds approximately every 30 minutes regardless of load, including when fully idle with zero active sessions, waiting tasks, or queued work.

Steps to reproduce

  • Run OpenClaw gateway on a single-core Linux VPS (Ubuntu 24.04, Node.js v24)
  • Configure Hindsight memory plugin (hindsight-openclaw, any embedding provider)
  • Leave the gateway idle (no active sessions)
  • Wait 30 minutes — stall fires on the diagnostic interval

Expected behavior

Gateway event loop utilization at idle should be near 0%. eventLoopDelayMaxMs should remain well below the 100ms diagnostic threshold when no work is in flight.
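The expectation above can be checked independently of the gateway's own diagnostics. The following is a minimal sketch, assuming a plain Node.js process, that samples event loop delay and utilization with the built-in `perf_hooks` APIs. `startProbe`, `isStalled`, and the 10 ms resolution are illustrative names and values, not part of OpenClaw; only the 100 ms threshold comes from the report.

```javascript
const { monitorEventLoopDelay, performance } = require("node:perf_hooks");

// 100 ms is the diagnostic threshold quoted in the report.
const DELAY_THRESHOLD_MS = 100;

// Hypothetical helper: start sampling event loop delay and utilization.
function startProbe(resolutionMs = 10) {
  const histogram = monitorEventLoopDelay({ resolution: resolutionMs });
  histogram.enable();
  let previous = performance.eventLoopUtilization();

  return {
    // Return the metrics accumulated since the last read, then reset.
    read() {
      const current = performance.eventLoopUtilization();
      const delta = performance.eventLoopUtilization(current, previous);
      previous = current;
      const sample = {
        maxMs: histogram.max / 1e6, // histogram values are in nanoseconds
        p99Ms: histogram.percentile(99) / 1e6,
        utilization: delta.utilization,
      };
      histogram.reset();
      return sample;
    },
    stop() {
      histogram.disable();
    },
  };
}

// True when the worst observed delay exceeds the threshold.
function isStalled(sample, thresholdMs = DELAY_THRESHOLD_MS) {
  return sample.maxMs > thresholdMs;
}
```

In a long-running process, calling `read()` from a 30-second `setInterval` would roughly mirror the `interval=30s` cadence of the liveness warnings. Note that a blocked event loop also delays that interval, so a stall shows up in the first sample taken after it ends.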

Actual behavior

Event loop stalls of 8,850–12,021ms fire approximately every 30 minutes. ELU runs at 37–50% at idle. The pattern persists with active=0, waiting=0, queued=0 — confirming the stall source is not user traffic or queued work.

Observed stall log (5 consecutive 30-minute intervals, all idle or near-idle):

```
2026-05-11T23:41:15 [diagnostic] liveness warning: reasons=event_loop_delay interval=30s eventLoopDelayP99Ms=60.6 eventLoopDelayMaxMs=12020.9 eventLoopUtilization=0.501 cpuCoreRatio=0.546 active=1 waiting=0 queued=0
2026-05-12T00:11:15 [diagnostic] liveness warning: reasons=event_loop_delay interval=30s eventLoopDelayP99Ms=56.0 eventLoopDelayMaxMs=10074.7 eventLoopUtilization=0.430 cpuCoreRatio=0.506 active=0 waiting=0 queued=0
2026-05-12T00:41:15 [diagnostic] liveness warning: reasons=event_loop_delay interval=30s eventLoopDelayP99Ms=65.2 eventLoopDelayMaxMs=10678.7 eventLoopUtilization=0.484 cpuCoreRatio=0.566 active=0 waiting=0 queued=0
2026-05-12T01:11:15 [diagnostic] liveness warning: reasons=event_loop_delay interval=30s eventLoopDelayP99Ms=68.1 eventLoopDelayMaxMs=9386.9 eventLoopUtilization=0.437 cpuCoreRatio=0.549 active=0 waiting=0 queued=0
2026-05-12T01:41:15 [diagnostic] liveness warning: reasons=event_loop_delay interval=30s eventLoopDelayP99Ms=65.8 eventLoopDelayMaxMs=9554.6 eventLoopUtilization=0.368 cpuCoreRatio=0.438 active=0 waiting=0 queued=0
```

User-visible consequence: responses are delayed by 3+ minutes while a message sits in the queue during a stall window.
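To confirm the cadence from a longer log capture, the liveness warnings can be parsed mechanically. This is a rough sketch, assuming each warning sits on its own line in the format shown above and that timestamps can be treated as UTC (the log does not state a time zone). `parseLivenessLine` and `gapsInMinutes` are hypothetical helper names.

```javascript
// Parse one "[diagnostic] liveness warning" line into a timestamp plus fields.
// Numeric values become numbers; anything else (e.g. "30s") stays a string.
function parseLivenessLine(line) {
  const match = line.match(/^(\S+) \[diagnostic\] liveness warning: (.*)$/);
  if (!match) return null;
  const fields = {};
  for (const pair of match[2].trim().split(/\s+/)) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue;
    const raw = pair.slice(eq + 1);
    const num = Number(raw);
    fields[pair.slice(0, eq)] = raw !== "" && Number.isFinite(num) ? num : raw;
  }
  return { timestamp: match[1], fields };
}

// Minutes between consecutive entries. Appending "Z" is an assumption that
// the timestamps are UTC; the gaps are the same for any fixed offset.
function gapsInMinutes(entries) {
  const times = entries.map((e) => Date.parse(e.timestamp + "Z"));
  return times.slice(1).map((t, i) => (t - times[i]) / 60000);
}

// Two entries copied from the log in this report.
const lines = [
  "2026-05-11T23:41:15 [diagnostic] liveness warning: reasons=event_loop_delay interval=30s eventLoopDelayP99Ms=60.6 eventLoopDelayMaxMs=12020.9 eventLoopUtilization=0.501 cpuCoreRatio=0.546 active=1 waiting=0 queued=0",
  "2026-05-12T00:11:15 [diagnostic] liveness warning: reasons=event_loop_delay interval=30s eventLoopDelayP99Ms=56.0 eventLoopDelayMaxMs=10074.7 eventLoopUtilization=0.430 cpuCoreRatio=0.506 active=0 waiting=0 queued=0",
];
const entries = lines.map(parseLivenessLine).filter(Boolean);
console.log(gapsInMinutes(entries)); // [ 30 ]
console.log(entries.filter((e) => e.fields.active === 0).length); // 1
```

Filtering on `active === 0` isolates the idle rows, which are the ones that rule out user traffic as the trigger.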

OpenClaw version

2026.4.27 (cbc2ba0)

Operating system

Ubuntu 24.04.4 LTS, Linux 6.8.0-111-generic x86_64

Install method

(/usr/lib/node_modules/openclaw), running as systemd service via openclaw-gateway.service, Node.js v24.14.1

Model

anthropic/claude-sonnet-4-6

Provider / routing chain

Direct Anthropic API via Claude Pro OAuth token. No proxy or intermediate gateway.

Additional provider/model setup details

  • 6-core AMD EPYC 2.0GHz VPS, no GPU
  • Hindsight memory plugin active (hindsight-openclaw, bank: main, BGE-small local embeddings via Python daemon on localhost:9077)
  • autoRetain: false at time of observation (disabled for testing — stalls present both before and after disabling)
  • recallBudget: mid, recallMaxTokens: 2048

Logs, screenshots, and evidence

See the five stall log entries above. All entries are from the same gateway process (PID 232406, uptime ~11 hours at time of capture). The active=0 rows at 00:11, 00:41, 01:11, and 01:41 are the cleanest signal: no sessions are active, yet the stall fires regardless.

Impact and severity

  • Affected: all users on single-instance deployments with Hindsight plugin active
  • Severity: blocks workflow — user messages queue silently for 3+ minutes during stall windows with no visible indicator
  • Frequency: every ~30 minutes, consistent across multiple hours of observation
  • Consequence: 3+ minute response delays on messages that arrive during a stall window; perceived as unresponsiveness

Additional information

  • Stalls occur with autoRetain both enabled and disabled, which rules out Hindsight retain operations as the cause.
  • Stalls persist even when the Hindsight recall operation itself completes quickly.
  • The 30-minute periodicity is consistent with an internal timer or polling interval in the gateway.
  • Server-side resource exhaustion is ruled out: CPU ~97–98% idle, disk I/O negligible, and load averages normal at the time of the stalls.
  • Last known good version: not enough information. The issue has been present since at least 2026.4.14 (the earliest version in use on this instance), so it cannot be confirmed that earlier versions were clean.
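If a periodic timer is the suspect, one way to narrow it down is to wrap `setInterval` so that slow callbacks report where they were registered. The sketch below is a generic Node.js debugging aid under that assumption, not OpenClaw code: `timed` and `instrumentIntervals` are hypothetical helpers. It only measures the synchronous part of each callback, so work that blocks the loop later, inside awaited code, would not be attributed.

```javascript
// Hypothetical helper: wrap a callback so slow synchronous runs are reported.
function timed(label, fn, thresholdMs, report) {
  return function timedCallback(...args) {
    const start = performance.now();
    try {
      return fn.apply(this, args);
    } finally {
      const elapsedMs = performance.now() - start;
      if (elapsedMs > thresholdMs) report({ label, elapsedMs });
    }
  };
}

// Patch the global setInterval so every repeating timer created afterwards is
// labelled with its period and the stack frame that registered it.
// Returns a function that restores the original setInterval.
function instrumentIntervals(thresholdMs, report) {
  const original = globalThis.setInterval;
  globalThis.setInterval = function patchedSetInterval(fn, delay, ...args) {
    // Stack line 0 is "Error", line 1 is this function, line 2 is the caller.
    const site = (new Error().stack || "").split("\n")[2] || "unknown";
    const label = `setInterval(${delay}ms) ${site.trim()}`;
    return original(timed(label, fn, thresholdMs, report), delay, ...args);
  };
  return () => {
    globalThis.setInterval = original;
  };
}
```

Installed before the application loads, for example with `instrumentIntervals(100, console.warn)`, a report whose label shows a period near 1800000 ms would be a strong candidate for the stall source. A CPU profile captured across a stall window is a reasonable alternative if no timer callback stands out.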
