openclaw - 💡 (How to fix) Fix [Bug]: Isolated cron runs can wedge gateway


After upgrading from OpenClaw 2026.5.18 to 2026.5.27, isolated cron runs appeared to wedge the gateway: systemd still reported the service active, but openclaw status and webchat websocket handshakes timed out.

Error Message

Observed after upgrade from: OpenClaw 2026.5.18 (50a2481)

Observed on: OpenClaw 2026.5.27 (27ae826)

Gateway/service state during failure:

  • openclaw status: gateway reachable=false
  • openclaw status error: timeout
  • systemd user service: openclaw-gateway.service active (running)
  • Gateway bind: 127.0.0.1:18789
  • Webchat websocket handshakes timed out locally
  • Tailscale Serve was still configured/reachable; this did not appear to be a Tailscale failure

Resource usage during failure:

  • Gateway node process consuming about 1 full CPU core
  • RSS around 1.4G
  • systemd stats showed about 8h 49m CPU over 7h 57m wall time
  • Memory peak around 2.2G
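
As a rough check on the figures above, dividing the CPU time by the wall time gives the average number of cores the unit kept busy. The sketch below only restates the reported numbers; the function name `average_cores` is illustrative and not part of OpenClaw.

```python
from datetime import timedelta


def average_cores(cpu_time: timedelta, wall_time: timedelta) -> float:
    """Average number of CPU cores kept busy over the interval."""
    return cpu_time / wall_time


cpu = timedelta(hours=8, minutes=49)   # CPU time reported by systemd
wall = timedelta(hours=7, minutes=57)  # wall-clock time over the same period

print(f"{average_cores(cpu, wall):.2f} cores on average")  # prints "1.11 cores on average"
```

An average above one core across the whole uptime suggests sustained load rather than a short spike, which is consistent with the "about 1 full CPU core" observation.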

Likely involved isolated cron jobs:

  • LAN new device watcher, every 60 seconds
  • arpwatch LAN alert check, every 15 minutes

Recovery: systemctl --user restart openclaw-gateway.service

Then disabled the two cron jobs listed above.

After recovery:

  • Gateway reachable=true
  • Websocket latency about 57ms
  • CPU settled to roughly 0.5-2%
  • RSS around 454M
  • Tailscale Serve/webchat worked again

Impact and severity

  • Affected: Webchat/Tailscale users relying on the local OpenClaw gateway, especially with isolated cron jobs enabled.
  • Severity: High for affected users, because remote chat access breaks while systemd still reports the service as active.
  • Frequency: Observed once after upgrading to 2026.5.27; not yet deterministically reproduced.
  • Consequence: Webchat stays unreachable until the gateway is manually restarted; systemd does not recover it automatically because the process remains active.

Fix / Workaround

Temporary workaround: restart openclaw-gateway.service, then disable or slow the frequent isolated cron jobs.
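
Until the gateway recovers on its own, the manual restart can be automated with an external probe. The following is a minimal sketch, not an OpenClaw feature: it assumes the gateway sends some reply to a plain HTTP request on 127.0.0.1:18789, which the report does not confirm. The names `gateway_responds`, `restart_gateway`, and `check_once` are hypothetical.

```python
import socket
import subprocess

GATEWAY_HOST = "127.0.0.1"
GATEWAY_PORT = 18789  # gateway bind address from the report
SERVICE = "openclaw-gateway.service"


def gateway_responds(host: str = GATEWAY_HOST, port: int = GATEWAY_PORT,
                     timeout_s: float = 5.0) -> bool:
    """Return True if the peer sends any bytes back within timeout_s.

    A bare TCP connect is not enough: the kernel can complete the TCP
    handshake for a listening socket even if the process never serves it,
    so we send a request and wait for the first byte of a reply.
    ASSUMPTION: the gateway answers plain HTTP on this port.
    """
    request = f"GET / HTTP/1.1\r\nHost: {host}:{port}\r\nConnection: close\r\n\r\n"
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            sock.settimeout(timeout_s)
            sock.sendall(request.encode("ascii"))
            return len(sock.recv(1)) > 0
    except OSError:  # covers refused connections and timeouts
        return False


def restart_gateway() -> None:
    """Restart the user service, the same recovery step used in the report."""
    subprocess.run(["systemctl", "--user", "restart", SERVICE], check=True)


def check_once(probe=gateway_responds, restart=restart_gateway) -> bool:
    """Probe once and restart on failure. Returns True if a restart ran."""
    if probe():
        return False
    restart()
    return True
```

Calling `check_once()` from a periodic job (for example a systemd user timer) would restart the service whenever the probe gets no reply. If the HTTP assumption does not hold, the probe could instead run `openclaw status` with a timeout, since the report shows that command timing out during the failure.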

Bug type

Crash (process/app exits or hangs)

Beta release blocker

No

Steps to reproduce

  1. Run OpenClaw 2026.5.27 with the gateway managed by the systemd user service.
  2. Have isolated cron jobs enabled, including a frequent job such as "LAN new device watcher" every 60 seconds and another such as "arpwatch LAN alert check" every 15 minutes.
  3. Let the system run for several hours after upgrade.
  4. Observe webchat becoming unreachable.
  5. Run openclaw status and observe gateway reachable=false with a timeout while systemd still reports openclaw-gateway.service active.
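
When reproducing step 5, it may help to record what systemd believes about the unit at the moment the gateway stops answering. The sketch below reads a few properties with `systemctl --user show`; the helper names are hypothetical, and `CPUUsageNSec` and `MemoryCurrent` only carry values when resource accounting is enabled for the unit.

```python
import subprocess
from typing import Dict

PROPERTIES = ("ActiveState", "SubState", "CPUUsageNSec", "MemoryCurrent")


def parse_show_output(text: str) -> Dict[str, str]:
    """Parse the KEY=VALUE lines printed by `systemctl show`."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without '='
            result[key] = value
    return result


def snapshot(unit: str = "openclaw-gateway.service") -> Dict[str, str]:
    """Read selected properties of a systemd user unit."""
    cmd = ["systemctl", "--user", "show", unit]
    for prop in PROPERTIES:
        cmd += ["-p", prop]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_show_output(out)
```

Logging `snapshot()` alongside the result of `openclaw status` would capture the "active but unreachable" state described in the report.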

Expected behavior

The gateway should remain responsive even if an isolated cron run stalls or loops. If a cron child task wedges, OpenClaw should time out or kill that run, trip a health check, or restart and recover, instead of leaving the gateway active but unreachable.
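
The "time out or kill" behavior can be illustrated with a hard deadline on a child process. This is a generic sketch of the technique, not how OpenClaw runs isolated cron jobs; `run_with_deadline` is a hypothetical helper and the sleeping child stands in for a stalled job.

```python
import subprocess
import sys
from typing import Optional, Sequence, Tuple


def run_with_deadline(cmd: Sequence[str], timeout_s: float) -> Tuple[bool, Optional[int]]:
    """Run cmd and return (timed_out, exit_code).

    When the timeout expires, subprocess.run kills the child, waits for it,
    and raises TimeoutExpired, so a stalled job does not linger.
    """
    try:
        done = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return True, None
    return False, done.returncode


# Stand-in for a stalled cron job: a child that sleeps far past its deadline.
stalled_job = [sys.executable, "-c", "import time; time.sleep(30)"]
timed_out, code = run_with_deadline(stalled_job, timeout_s=1.0)
print("timed out:", timed_out)
```

Note that this kills only the direct child. A job that spawns its own subprocesses would need process-group or cgroup-level cleanup to be fully contained.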

Actual behavior

The gateway process remained active under systemd, but openclaw status timed out, local webchat websocket handshakes timed out, and the gateway node process consumed sustained high CPU and memory until manually restarted.

OpenClaw version

2026.5.27 (27ae826)

Operating system

Ubuntu 26.04

Install method

npm global, upgraded with openclaw update

Model

gpt-5 / Codex via OpenClaw

Provider / routing chain

webchat -> Tailscale Serve -> local OpenClaw gateway on 127.0.0.1:18789 -> Codex/OpenAI

Additional provider/model setup details

The failure did not appear to be provider/model-specific. The observable outage was at the local gateway/webchat layer: Tailscale still routed to the local gateway, but the gateway did not respond fast enough for status/websocket handshakes.

Additional information

Last known good: 2026.5.18 (50a2481)
First observed bad: 2026.5.27 (27ae826)

This looks more like a gateway/cron isolation issue than a Tailscale issue. Tailscale still routed to the local gateway, but the gateway was not responding. The gateway should remain responsive or self-recover if a cron child task stalls.
