openclaw: [Bug]: Gateway scheduler keeps work queued while CPU is saturated (1 pull request)

During a 60s gateway CPU sample, CPU averaged 83.66% with 42 of 60 samples at or above 100%, while diagnostic heartbeats still showed active and queued work (active=1 queued=3).

Root Cause

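The captured evidence points to event-loop starvation: the gateway process was CPU-bound in user space (avg_usr=79.42, with 42 of 60 samples at or above 100% CPU), a fetch timeout configured for 10000 ms fired after 17175 ms (timerDelayMs=7175), and heartbeats kept reporting active=1 queued=3 with no visible backoff of lower-priority session mirror events.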

Fix Action

Fixed

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

During a 60s gateway CPU sample, CPU averaged 83.66% with 42 of 60 samples at or above 100%, while diagnostic heartbeats still showed active and queued work (active=1 queued=3).

Steps to reproduce

  1. Start OpenClaw 2026.5.20 in a gateway development session.
  2. Run an agent/tool workload that keeps the gateway process CPU-bound.
  3. Capture pidstat -h -u -r -d -p <gateway-pid> 1 60 and correlate it with gateway diagnostic heartbeats.
  4. Observe sustained CPU saturation while gateway heartbeats continue to report active and queued work.
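
The summary statistics quoted in the evidence (avg_cpu, cpu_ge_100_count, max_cpu) can be derived from the raw capture with a short script. The following is a minimal Python sketch, assuming each data row uses the column order of the rows quoted in this report (time, UID, PID, %usr, %system, %guest, %wait, %CPU, ...). The function name summarize_pidstat is illustrative and is not part of OpenClaw or sysstat.

```python
from statistics import mean

# Column positions are an assumption based on the rows quoted in this report:
# time UID PID %usr %system %guest %wait %CPU CPU ...
USR_COL, SYS_COL, CPU_COL = 3, 4, 7


def summarize_pidstat(lines):
    """Summarize the CPU columns of `pidstat -h -u -r -d` data rows."""
    rows = []
    for line in lines:
        fields = line.split()
        # Skip blank lines and header lines (headers start with '#').
        if len(fields) <= CPU_COL or fields[0].startswith("#"):
            continue
        try:
            rows.append((fields[0], float(fields[USR_COL]),
                         float(fields[SYS_COL]), float(fields[CPU_COL])))
        except ValueError:
            continue  # banner or other non-data line
    if not rows:
        raise ValueError("no pidstat data rows found")
    peak = max(rows, key=lambda r: r[3])
    return {
        "rows": len(rows),
        "avg_cpu": round(mean(r[3] for r in rows), 2),
        "avg_usr": round(mean(r[1] for r in rows), 2),
        "avg_sys": round(mean(r[2] for r in rows), 2),
        "cpu_ge_90_count": sum(1 for r in rows if r[3] >= 90),
        "cpu_ge_100_count": sum(1 for r in rows if r[3] >= 100),
        "max_cpu": peak[3],
        "max_cpu_at": peak[0],
    }


# The four rows quoted in this report, not the full 60-row capture.
SAMPLE = [
    "05:07:11 1000 <gateway-pid> 100.00 0.00 0.00 0.00 100.00 2 0.00 0.00 21352908 1381816 4.20 0.00 0.00 0.00 0 openclaw",
    "05:07:40 1000 <gateway-pid> 100.00 0.00 0.00 0.00 100.00 3 0.00 0.00 21359092 1387828 4.22 0.00 0.00 0.00 0 openclaw",
    "05:07:47 1000 <gateway-pid> 177.00 13.00 0.00 0.00 190.00 3 7079.00 0.00 21237004 1265796 3.85 0.00 8.00 0.00 0 openclaw",
    "05:08:10 1000 <gateway-pid> 101.00 0.00 0.00 0.00 101.00 15 10.00 0.00 21237004 1265884 3.85 0.00 0.00 0.00 0 openclaw",
]

summary = summarize_pidstat(SAMPLE)
print(summary)
```

Run on only the four selected rows, the averages come out higher than the 60-row figures in the report (avg_cpu=83.66), because the selected rows are the saturated ones. Feeding the full capture file to the same function should be what reproduces the reported summary.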

Expected behavior

Under sustained CPU/event-loop pressure with queued work, lower-priority session mirror events should back off while terminal and run-scoped tool events still flow so user-visible tool cards can complete.
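
One way to picture the expected behavior is a small policy that defers session mirror events only while pressure is detected and always lets terminal and run-scoped tool events through. The sketch below is a hypothetical Python illustration, not OpenClaw's implementation: the event kinds, the Pressure fields, and both thresholds are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class EventKind(Enum):
    TERMINAL = "terminal"              # final events that let a tool card complete
    RUN_TOOL = "run_tool"              # run-scoped tool events
    SESSION_MIRROR = "session_mirror"  # lower-priority mirror traffic


# Kinds that must keep flowing even under pressure.
ALWAYS_DELIVER = frozenset({EventKind.TERMINAL, EventKind.RUN_TOOL})


@dataclass(frozen=True)
class Pressure:
    queued: int            # queued work items, as in the heartbeat "queued=" field
    timer_delay_ms: float  # observed timer lateness, as in "timerDelayMs="


# Hypothetical thresholds, chosen only for illustration.
MIN_QUEUED = 1
MIN_TIMER_DELAY_MS = 1000.0


def under_pressure(p: Pressure) -> bool:
    """Pressure means work is queued AND timers are firing late."""
    return p.queued >= MIN_QUEUED and p.timer_delay_ms >= MIN_TIMER_DELAY_MS


def partition_events(events, pressure):
    """Split (kind, payload) pairs into (deliver_now, deferred)."""
    deliver, deferred = [], []
    backoff = under_pressure(pressure)
    for kind, payload in events:
        if backoff and kind not in ALWAYS_DELIVER:
            deferred.append((kind, payload))
        else:
            deliver.append((kind, payload))
    return deliver, deferred


events = [
    (EventKind.SESSION_MIRROR, "mirror-1"),
    (EventKind.RUN_TOOL, "tool-start"),
    (EventKind.SESSION_MIRROR, "mirror-2"),
    (EventKind.TERMINAL, "tool-end"),
]

# Values taken from the report: queued=3 and timerDelayMs=7175.
saturated = Pressure(queued=3, timer_delay_ms=7175)
deliver, deferred = partition_events(events, saturated)
print([p for _, p in deliver], [p for _, p in deferred])
```

Requiring both signals (queued work and late timers) keeps the sketch from backing off on a busy but responsive gateway. Whether the real fix uses the same signals is not stated in the report.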

Actual behavior

During CPU saturation, gateway heartbeats kept reporting the same active and queued work (active=1 queued=3 at 05:07:27, 05:08:04, and 05:08:41), and the correlated diagnostic window showed no backoff of lower-priority events.

OpenClaw version

2026.5.20

Operating system

Linux (pidstat capture)

Install method

pnpm dev

Model

NOT_ENOUGH_INFO

Provider / routing chain

NOT_ENOUGH_INFO

Additional provider/model setup details

NOT_ENOUGH_INFO

Logs, screenshots, and evidence

pidstat summary:
rows=60
avg_cpu=83.66
avg_usr=79.42
avg_sys=4.25
cpu_ge_90_count=43
cpu_ge_100_count=42
max_cpu=190.0 at 05:07:47
avg_rd=0.00
avg_wr=90.16

Selected raw pidstat rows:
05:07:11 1000 <gateway-pid> 100.00 0.00 0.00 0.00 100.00 2 0.00 0.00 21352908 1381816 4.20 0.00 0.00 0.00 0 openclaw
05:07:40 1000 <gateway-pid> 100.00 0.00 0.00 0.00 100.00 3 0.00 0.00 21359092 1387828 4.22 0.00 0.00 0.00 0 openclaw
05:07:47 1000 <gateway-pid> 177.00 13.00 0.00 0.00 190.00 3 7079.00 0.00 21237004 1265796 3.85 0.00 8.00 0.00 0 openclaw
05:08:10 1000 <gateway-pid> 101.00 0.00 0.00 0.00 101.00 15 10.00 0.00 21237004 1265884 3.85 0.00 0.00 0.00 0 openclaw

Gateway log correlation:
2026-05-21T05:04:17.715+00:00 diagnostic heartbeat: webhooks=0/0/0 active=2 waiting=0 queued=5
2026-05-21T05:07:27.790+00:00 diagnostic heartbeat: webhooks=0/0/0 active=1 waiting=0 queued=3
2026-05-21T05:07:47.187+00:00 agent/embedded embedded run tool start: runId=[redacted run id] tool=exec toolCallId=[redacted tool call id]
2026-05-21T05:08:04.306+00:00 fetch-timeout timeoutMs=10000 elapsedMs=17175 timerDelayMs=7175 eventLoopDelayHint="timer delayed 7175ms, likely event-loop starvation"
2026-05-21T05:08:04.316+00:00 diagnostic heartbeat: webhooks=0/0/0 active=1 waiting=0 queued=3
2026-05-21T05:08:41.295+00:00 diagnostic heartbeat: webhooks=0/0/0 active=1 waiting=0 queued=3
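
The correlation above can be checked mechanically. The following Python sketch parses the heartbeat and fetch-timeout lines quoted in this report, confirms that timerDelayMs equals elapsedMs minus timeoutMs, and measures how long the queue counters stayed unchanged. The regular expressions assume the exact line layout shown here, and the helper names are illustrative.

```python
import re
from datetime import datetime

HEARTBEAT_RE = re.compile(
    r"^(?P<ts>\S+) diagnostic heartbeat: .*?"
    r"active=(?P<active>\d+) waiting=(?P<waiting>\d+) queued=(?P<queued>\d+)"
)
FETCH_TIMEOUT_RE = re.compile(
    r"^(?P<ts>\S+) fetch-timeout timeoutMs=(?P<timeout>\d+) "
    r"elapsedMs=(?P<elapsed>\d+) timerDelayMs=(?P<delay>\d+)"
)

LOG = [
    "2026-05-21T05:04:17.715+00:00 diagnostic heartbeat: webhooks=0/0/0 active=2 waiting=0 queued=5",
    "2026-05-21T05:07:27.790+00:00 diagnostic heartbeat: webhooks=0/0/0 active=1 waiting=0 queued=3",
    '2026-05-21T05:08:04.306+00:00 fetch-timeout timeoutMs=10000 elapsedMs=17175 timerDelayMs=7175 eventLoopDelayHint="timer delayed 7175ms, likely event-loop starvation"',
    "2026-05-21T05:08:04.316+00:00 diagnostic heartbeat: webhooks=0/0/0 active=1 waiting=0 queued=3",
    "2026-05-21T05:08:41.295+00:00 diagnostic heartbeat: webhooks=0/0/0 active=1 waiting=0 queued=3",
]


def parse_heartbeats(lines):
    """Return (timestamp, active, waiting, queued) tuples in log order."""
    out = []
    for line in lines:
        m = HEARTBEAT_RE.match(line)
        if m:
            out.append((datetime.fromisoformat(m["ts"]),
                        int(m["active"]), int(m["waiting"]), int(m["queued"])))
    return out


def parse_fetch_timeouts(lines):
    """Return (timeout_ms, elapsed_ms, timer_delay_ms) tuples."""
    return [(int(m["timeout"]), int(m["elapsed"]), int(m["delay"]))
            for m in map(FETCH_TIMEOUT_RE.match, lines) if m]


def unchanged_tail_seconds(heartbeats):
    """Seconds spanned by the trailing run of heartbeats with identical counters."""
    last = heartbeats[-1]
    start = last
    for hb in reversed(heartbeats):
        if hb[1:] != last[1:]:
            break
        start = hb
    return (last[0] - start[0]).total_seconds()


heartbeats = parse_heartbeats(LOG)
timeouts = parse_fetch_timeouts(LOG)
stuck_seconds = unchanged_tail_seconds(heartbeats)

for timeout_ms, elapsed_ms, delay_ms in timeouts:
    # The reported delay should equal how far the timer overshot its deadline.
    print(elapsed_ms - timeout_ms == delay_ms)
print(stuck_seconds)
```

On the quoted lines, the counters stay at active=1 queued=3 from 05:07:27.790 to 05:08:41.295, which is the window the bug report describes as showing no backoff.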

Impact and severity

Affected: gateway sessions under CPU-heavy agent/tool workloads.
Severity: Medium-high. Queued work can remain behind CPU-heavy active work while low-priority event streams continue.
Frequency: Observed in the captured 60s CPU sample.
Consequence: Chat latency increases under load, and timer-based checks can drift during event-loop pressure.
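
The timer drift mentioned above is a general property of single-threaded event loops: a timer callback cannot run until synchronous work on the loop thread yields. The sketch below reproduces the effect with Python's asyncio as an analogy only. It does not use the gateway's runtime or code, and the function name and durations are assumptions for the demonstration.

```python
import asyncio
import time


async def measure_timer_delay(timeout_s: float, block_s: float) -> float:
    """Return how late a timer fires when the loop thread is kept busy."""
    loop = asyncio.get_running_loop()
    fired = loop.create_future()
    start = time.monotonic()
    # Schedule a timer, analogous to a fetch timeout.
    loop.call_later(timeout_s, lambda: fired.set_result(time.monotonic()))
    # Simulate CPU-bound work that never yields to the event loop.
    deadline = start + block_s
    while time.monotonic() < deadline:
        pass
    fired_at = await fired  # the timer can only run once we yield here
    # Same arithmetic as timerDelayMs = elapsedMs - timeoutMs in the log.
    return (fired_at - start) - timeout_s


delay = asyncio.run(measure_timer_delay(timeout_s=0.05, block_s=0.2))
print(f"timer fired about {delay * 1000:.0f} ms late")
```

A 50 ms timer scheduled before 200 ms of blocking work cannot fire until the work finishes, so it overshoots by roughly the difference. The report's 10000 ms timeout firing after 17175 ms shows the same pattern at a larger scale.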

Additional information

The fix should preserve terminal tool events so UI tool cards do not remain stale while backing off lower-priority session mirror traffic during diagnostic queue pressure.
