[Bug] v2026.4.8 — openclaw cron list, openclaw message …, openclaw channels list enter 99% CPU busy-wait and never return

GitHub stats
openclaw/openclaw#63249 · Fetched 2026-04-09 07:56:19
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0





Summary

On ghcr.io/openclaw/openclaw:2026.4.8, several CLI subcommands enter a CPU hot loop (99% CPU, futex_wait_queue) and never produce output, even when invoked against a fully-healthy gateway. The same gateway responds correctly to other commands and to direct HTTP probes.

This is independent of the bonjour CPU loop reported separately — it reproduces with OPENCLAW_DISABLE_BONJOUR=1 set.

Environment

  • Image: ghcr.io/openclaw/openclaw:2026.4.8 (digest sha256:cfd0d7fbd222bf570f9b19293b70d161b046da05a25651c2bef6146935fb9209)
  • Node: v24.14.0 (image-bundled)
  • Host: Linux VPS, Docker bridge network
  • Env: OPENCLAW_DISABLE_BONJOUR=1, OPENCLAW_ALLOW_INSECURE_PRIVATE_WS=1, OPENCLAW_BUNDLED_PLUGINS_DIR=/app/extensions
  • Mounted state: /home/node/.openclaw (rsynced from a healthy v4.2 prod with cron/jobs.json excluded, all channels disabled, no OPENCLAW_GATEWAY_TOKEN override)

What works (gateway is healthy)

$ docker exec <container> openclaw --version
OpenClaw 2026.4.8
$ docker exec <container> openclaw doctor      # full output, exit 0
$ docker exec <container> openclaw health      # full output, exit 0, queries gateway via WS RPC successfully
$ docker exec <container> openclaw status --json   # full JSON returned
$ docker exec <container> curl -s http://127.0.0.1:18789/healthz
{"ok":true,"status":"live"}

Gateway logs confirm the WS RPC plane is responsive:

[ws] ⇄ res ✓ channels.status 85ms conn=af5ed2d0…cbeb id=27c7f66f…4a1a
[ws] ⇄ res ✓ sessions.list 71ms conn=3f4b76f5…3e5c id=2e43828e…f947

What hangs (CLI subcommand hot loops)

$ timeout 30 docker exec <container> openclaw cron list
# No output. Exit code 124 (timeout).
$ timeout 30 docker exec <container> openclaw message --help
# No output. Exit code 124.
$ timeout 30 docker exec <container> openclaw message send --help
# No output. Exit code 124.
$ timeout 30 docker exec <container> openclaw channels list
# No output. Exit code 124.

docker exec -t (TTY allocation) does not change the behavior. OPENCLAW_NO_RESPAWN=1 does not change the behavior. --json output flag does not help (still no output).

Process state during hang

docker top <container> while one of these commands is running:

PID    PPID   C    CMD
180    174    0    openclaw                 ← parent CLI invocation
202    180    64   openclaw-cron            ← subcommand subprocess, 64-99% CPU constantly

The child subprocess (openclaw-cron here; openclaw-doctor also showed up on older runs) reports futex_wait_queue in /proc/<pid>/wchan, yet the C column of docker top shows 60–99% CPU. Because /proc/<pid>/wchan is a point-in-time sample of the main thread only, the two readings are not contradictory. Together they are consistent with a Node.js process that is spinning on a never-resolved promise or busy-polling a condition.
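One way to narrow this down is to inspect each thread rather than the process as a whole. The sketch below is not part of the issue. It assumes a Linux host where the PID reported by docker top is visible under /proc and readable by the current user; thread_snapshot and cpu_deltas are hypothetical helper names.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def thread_snapshot(pid: int, proc_root: str = "/proc") -> List[Dict]:
    """Return wait channel and CPU ticks (utime + stime) for each thread of pid."""
    rows = []
    task_dir = Path(proc_root) / str(pid) / "task"
    for tdir in sorted(task_dir.iterdir(), key=lambda p: int(p.name)):
        try:
            stat = (tdir / "stat").read_text()
            wchan = (tdir / "wchan").read_text().strip()
        except OSError:
            continue  # thread exited, or we lack permission to read it
        # Field 2 (comm) is parenthesised and may contain spaces,
        # so parse the numeric fields after the last ')'.
        rest = stat.rsplit(")", 1)[1].split()
        utime, stime = int(rest[11]), int(rest[12])  # stat fields 14 and 15
        rows.append({"tid": int(tdir.name),
                     "wchan": wchan or "0",
                     "cpu_ticks": utime + stime})
    return rows


def cpu_deltas(before: List[Dict], after: List[Dict]) -> List[Tuple[int, str, int]]:
    """Return (tid, wchan, ticks used between snapshots), busiest thread first."""
    old = {r["tid"]: r["cpu_ticks"] for r in before}
    out = [(r["tid"], r["wchan"], r["cpu_ticks"] - old.get(r["tid"], 0))
           for r in after]
    return sorted(out, key=lambda t: t[2], reverse=True)
```

Taking two snapshots about a second apart for the hung child (PID 202 in the listing above) and passing them to cpu_deltas would show which thread is accruing CPU time and what each thread's wait channel reads at that moment.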

Affects

These broken commands directly block our automated infrastructure:

  • openclaw cron list — used by monitoring scripts and the gateway-watchdog cron
  • openclaw message send — used by email_triage_dispatcher.py to post triage results to Discord via subprocess.run. With v4.8, this dispatcher would silently hang every cycle.

These commands work correctly on ghcr.io/openclaw/openclaw:2026.4.2, so this is a v4.5+ regression.
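Until the regression is fixed, a caller such as email_triage_dispatcher.py could bound each CLI call so that a hang is reported instead of silently stalling the cycle. The following is a minimal sketch, not the dispatcher's actual code; run_cli and CliResult are hypothetical names, and the 30-second default mirrors the timeout 30 used in this report.

```python
import subprocess
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class CliResult:
    timed_out: bool
    returncode: Optional[int]  # None when the command was killed on timeout
    stdout: str
    stderr: str


def _as_text(data: Union[str, bytes, None]) -> str:
    # On timeout, partial output may be None or bytes even with text=True.
    if data is None:
        return ""
    if isinstance(data, bytes):
        return data.decode("utf-8", errors="replace")
    return data


def run_cli(argv: List[str], timeout_s: float = 30.0) -> CliResult:
    """Run a command and return a result even if it never exits on its own."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired as exc:
        # subprocess.run has already killed the child at this point.
        return CliResult(True, None, _as_text(exc.stdout), _as_text(exc.stderr))
    return CliResult(False, proc.returncode, proc.stdout, proc.stderr)
```

A call such as run_cli(["docker", "exec", "<container>", "openclaw", "cron", "list"]) would then come back with timed_out=True after 30 seconds instead of blocking. Note that the timeout stops the local docker client; the process inside the container may keep running and could need separate cleanup.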

What we tried

Workaround | Result
OPENCLAW_DISABLE_BONJOUR=1 (rules out the bonjour issue) | No change
OPENCLAW_NO_RESPAWN=1 (per doctor's startup-optimization hint) | No change
docker exec -t (allocate TTY) | No change
--json output flag | No change
Explicit OPENCLAW_GATEWAY_TOKEN matching gateway | No change (also caused unrelated unauthorized client=cli errors when mismatched)
Run after killing existing CLI subprocesses | Same hang on next invocation

Hypothesis

The CLI subcommand router for the cron, message, and channels command trees seems to attempt some initialization step (plugin auto-load? gateway capability negotiation? bundled-channel sidecar import?) that never resolves, while the agent/doctor/health/status/--version commands take a different code path that works fine.

The v4.7 changelog mentions "Hardened Windows file:// paths and resolved native-Jiti issues preventing onboarding and plugin installation" — it's possible the same Jiti/plugin loader path on Linux has a separate hang issue for these subcommand trees.

Reproduction script

# 1. Pull v4.8
docker pull ghcr.io/openclaw/openclaw:2026.4.8

# 2. Stage minimal state (any healthy v4.x state works)
mkdir -p /tmp/openclaw-test-state
rsync -a --exclude='cron/jobs.json' --exclude='cron/runs' \
  ~/.openclaw/ /tmp/openclaw-test-state/

# 3. Disable channels in test state to avoid prod conflicts
python3 -c "
import json
p = '/tmp/openclaw-test-state/openclaw.json'
c = json.load(open(p))
for n in ('whatsapp','telegram','discord'):
    if n in c.get('channels',{}):
        c['channels'][n]['enabled'] = False
json.dump(c, open(p,'w'), indent=2)
"
chown -R 1000:1000 /tmp/openclaw-test-state

# 4. Boot
docker run -d --name openclaw-test \
  -e OPENCLAW_DISABLE_BONJOUR=1 \
  -e OPENCLAW_ALLOW_INSECURE_PRIVATE_WS=1 \
  -p 28789:18789 \
  -v /tmp/openclaw-test-state:/home/node/.openclaw \
  ghcr.io/openclaw/openclaw:2026.4.8

# 5. Wait for ready
sleep 60

# 6. Confirm gateway healthy
docker exec openclaw-test openclaw health      # works
docker exec openclaw-test openclaw status --json   # works

# 7. Reproduce the hang
timeout 30 docker exec openclaw-test openclaw cron list
echo "exit code: $?"   # Will print 124
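Step 5 waits a fixed 60 seconds. As an alternative, readiness could be polled through the /healthz endpoint shown earlier, which returned {"ok":true,"status":"live"} on a healthy gateway. This sketch assumes the gateway is reachable from the host through the 28789 port mapping in step 4, which the issue does not confirm; wait_for_healthz is a hypothetical helper.

```python
import http.client
import json
import time
import urllib.error
import urllib.request


def wait_for_healthz(url: str, timeout_s: float = 120.0,
                     interval_s: float = 2.0) -> bool:
    """Poll url until it returns a JSON object with "ok": true, or give up."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                body = json.loads(resp.read().decode("utf-8"))
            if isinstance(body, dict) and body.get("ok") is True:
                return True
        except (urllib.error.URLError, http.client.HTTPException,
                OSError, ValueError):
            pass  # not listening yet, connection dropped, or non-JSON body
        time.sleep(interval_s)
    return False
```

With the port mapping from step 4, the call would be wait_for_healthz("http://127.0.0.1:28789/healthz"). A False result means the gateway never reported healthy within the deadline, which is worth distinguishing from the CLI hang this issue describes.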

Asks

  1. Acknowledge the regression — confirm whether this affects all Linux x86_64 v4.5+ installs or is environment-specific
  2. Workaround guidance while a fix is in flight — is there a CLI flag, env var, or alternative invocation path that bypasses the broken initialization step?
  3. Roadmap signal — is this on the v4.9 fix list?

In the meantime our workaround is to stay pinned on v2026.4.2.

extent analysis

TL;DR

No fix is known yet. The only confirmed workaround for the CPU hot loop in OpenClaw v2026.4.8 is to stay pinned on, or downgrade to, v2026.4.2, where the affected commands work. The reporter describes the hang as a regression in v4.5+.

Guidance

  • Verify the issue is not specific to the environment by attempting to reproduce it on a different Linux x86_64 setup with v2026.4.8.
  • Investigate the openclaw command's initialization steps for cron, message, and channels subcommands to identify the specific point of failure.
  • Capture more detail while a subcommand hangs, such as per-thread CPU usage and wait state of the child process. The issue currently provides only docker top output and a single wchan reading.
  • If a workaround is needed, explore alternative invocation paths or CLI flags that might bypass the broken initialization step.

Example

The issue does not expose enough of openclaw's internals to suggest a targeted code fix. The reproduction script in the issue is the best starting point for debugging.

Notes

The root cause is unknown. The issue's hypothesis, an initialization step in the cron, message, and channels command trees that never resolves, is unconfirmed and needs further investigation.

Recommendation

Pin to v2026.4.2 until a fix ships, since it is the only version the issue reports as free of the hang. Automated callers should also keep a timeout around these commands, as the issue does with timeout 30, so that a hang surfaces as an error rather than a stalled job.
