
openclaw - [Bug]: Windows Scheduled Task gateway restart/health becomes inconsistent after ready (1 participant)

openclaw • 2026-04-09 02:15:25


openclaw/openclaw#63491 • Fetched 2026-04-09 07:53:09

Author: ufomaker

Participants: ufomaker


Fix / Workaround

I also locally patched the CLI to treat loopback HTTP /health and local 1008 policy closes as healthy enough for restart probing, which reduced one class of false negatives, but did not eliminate the post-ready instability.
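The decision rule behind that local patch can be sketched as a small pure function. This is only an illustration of the idea, not the actual OpenClaw CLI code: `ProbeResult` and `is_healthy_enough` are hypothetical names, and the rule assumes that a WebSocket close with code 1008 (policy violation) on loopback means the gateway is up and rejecting an unauthenticated probe.

```python
from dataclasses import dataclass
from typing import Optional

POLICY_VIOLATION = 1008  # WebSocket close code for a policy violation (RFC 6455)


@dataclass
class ProbeResult:
    """Hypothetical summary of one restart probe (not an OpenClaw type)."""
    http_status: Optional[int] = None    # status of GET /health, None if no response
    ws_close_code: Optional[int] = None  # close code of the WS probe, None if no close frame
    loopback: bool = True                # True when the probe targeted 127.0.0.1 / localhost


def is_healthy_enough(result: ProbeResult) -> bool:
    """Return True when the probe shows a live gateway, even if it refused the probe."""
    # A 2xx answer from /health means the HTTP server is serving requests.
    if result.http_status is not None and 200 <= result.http_status < 300:
        return True
    # A local 1008 close means something accepted the socket and applied policy,
    # so the process is alive. Only trust this on loopback.
    if result.loopback and result.ws_close_code == POLICY_VIOLATION:
        return True
    return False
```

With this rule, the `code=1008 reason=connect failed` closure from the logs would count as "alive" on loopback, while a probe that gets no response at all would still count as unhealthy.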


[Bug]: Windows Scheduled Task gateway restart/health becomes inconsistent after ready; mixes known probe false negatives with cron/session stale state and post-ready HTTP loss

Bug type

Regression (worked before, now fails)

Beta release blocker

Summary

On Windows with the gateway installed as a Scheduled Task, openclaw gateway restart can repeatedly time out with:

Timed out after 60s waiting for gateway port 18789 to become healthy
Service runtime: status=unknown
Port 18789 is already in use

This environment appears to hit more than one problem at once:

1. A known local loopback probe false negative on restart (ws ... code=1008 reason=connect failed / device-required)
2. Cron/job/session state corruption after restart (runningAtMs / stale cron session state)
3. An additional post-ready instability where the gateway can log ready (...) and even bind 18789, but /health and / later stop responding or the port becomes free again

I am filing this because the first two have close neighbors in existing issues/PRs, but I have not found a single Windows issue that covers the full combined behavior end-to-end.

OpenClaw version

OpenClaw 2026.4.8 (9ece252)

Operating system

Windows (PowerShell 5.1.22621.4249)

Install method

npm global install + openclaw gateway install / Scheduled Task

Model

bailian/qwen3.5-plus

Provider / routing chain

Ali / Bailian

Additional provider/model setup details

- Node.js upgraded to v22.22.2
- Repro observed both before and after upgrade from 2026.4.5 to 2026.4.8
- Repro observed with normal config and also with external channels/providers largely disabled during bisecting

Steps to reproduce

1. Install/run gateway on Windows via Scheduled Task
2. Have existing cron jobs in ~/.openclaw/cron/jobs.json
3. Run openclaw gateway restart
4. Observe one or more of the following sequences:

Sequence A:

- CLI waits 60s and prints timeout
- log shows local WS probe closed with 1008 / connect failed
- gateway may actually already be alive

Sequence B:

- gateway reaches:
  - starting HTTP server...
  - ready (... plugins, ...s)
  - cron: started
- but http://127.0.0.1:18789/health and / later time out or the port becomes free again

Sequence C:

- cron jobs recover from UI edits/restart into stale state
- previously seen local failures included TypeError: Cannot read properties of undefined (reading 'runningAtMs')
- stale runningAtMs / stale cron session state prevented clean recovery without manual intervention
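For Sequence C, the manual cleanup could look roughly like the sketch below. Only the `runningAtMs` field name comes from this report. The surrounding layout (a list of job objects, each with an optional `state` object) is an assumption made for illustration, and the real ~/.openclaw/cron/jobs.json schema may differ. The idea is that a run marked as started before the current gateway process came up cannot still be running.

```python
import copy
from typing import Any, Dict, List


def clear_stale_running(jobs: List[Dict[str, Any]], gateway_started_ms: int) -> List[Dict[str, Any]]:
    """Return a copy of jobs with impossible 'still running' markers removed.

    Assumed (hypothetical) job shape: {"id": ..., "state": {"runningAtMs": 123}}.
    """
    cleaned = copy.deepcopy(jobs)
    for job in cleaned:
        state = job.get("state")
        # A missing or malformed state object is what would make a reader of
        # state.runningAtMs throw, so normalise it to an empty object.
        if not isinstance(state, dict):
            job["state"] = {}
            continue
        running_at = state.get("runningAtMs")
        # A run that started before this gateway process existed cannot be live.
        if isinstance(running_at, (int, float)) and running_at < gateway_started_ms:
            del state["runningAtMs"]
    return cleaned
```

Back up the jobs file before applying anything like this, and stop the gateway first so it does not rewrite the file while you edit it.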

Expected behavior

- openclaw gateway restart should succeed when the restarted local gateway is already healthy enough to reject unauthenticated loopback probes
- Scheduled Task runtime and port ownership should stay consistent
- Cron startup should not preserve impossible stale running state
- Once the gateway logs ready (...), /health and / should remain responsive instead of later hanging or disappearing
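To compare these expectations against what the machine actually reports, it can help to check port ownership independently of the CLI. The following is a minimal sketch using only the Python standard library; `port_accepts_connections` and `describe` are illustrative helper names, and the wording of the returned states is an assumption, not OpenClaw output.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def describe(port_open: bool, health_ok: bool) -> str:
    """Map the two observations onto the states described in this report."""
    if port_open and health_ok:
        return "healthy"
    if port_open:
        return "port bound, /health unresponsive"
    if health_ok:
        return "inconsistent: /health answered but port check failed"
    return "port free"
```

For the gateway in this report, the check would be `port_accepts_connections("127.0.0.1", 18789)`. A result of "port bound, /health unresponsive" would correspond to the post-ready hang, and "port free" shortly after a ready log line would correspond to the port disappearing.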

Actual behavior

Observed across repeated runs on 2026-04-08 and 2026-04-09:

- openclaw gateway restart times out after 60s
- logs show loopback WS probe closure:
  - code=1008 reason=connect failed
  - "cause":"device-required"
- sometimes port 18789 is reported busy while runtime status is unknown
- sometimes gateway logs ready (...) and later port 18789 becomes free again
- sometimes /health is briefly reachable, then later times out
- cron previously failed with missing or stale runningAtMs-related state

Representative log lines:

2026-04-09T09:55:37.924+08:00 [gateway/ws] closed before connect ... code=1008 reason=connect failed
2026-04-09T10:02:48.014+08:00 Timed out after 60s waiting for gateway port 18789 to become healthy.
2026-04-09T10:02:48.045+08:00 Service runtime: status=unknown
2026-04-09T10:02:48.049+08:00 Gateway port 18789 status: free.
2026-04-09T10:05:23.293+08:00 [gateway] ready (0 plugins, 27.5s)
2026-04-09T10:05:28.021+08:00 [cron] cron: started
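The ordering of these lines matters, so a small timeline helper can make the gaps explicit. This sketch assumes each log line starts with an ISO 8601 timestamp followed by a space, as in the excerpt above. `parse_events` and `seconds_between` are illustrative names.

```python
from datetime import datetime
from typing import List, Optional, Tuple

Event = Tuple[datetime, str]  # (timestamp, rest of the log line)

# The representative lines quoted in the report, copied as-is.
LOG_LINES = [
    "2026-04-09T09:55:37.924+08:00 [gateway/ws] closed before connect ... code=1008 reason=connect failed",
    "2026-04-09T10:02:48.014+08:00 Timed out after 60s waiting for gateway port 18789 to become healthy.",
    "2026-04-09T10:02:48.045+08:00 Service runtime: status=unknown",
    "2026-04-09T10:02:48.049+08:00 Gateway port 18789 status: free.",
    "2026-04-09T10:05:23.293+08:00 [gateway] ready (0 plugins, 27.5s)",
    "2026-04-09T10:05:28.021+08:00 [cron] cron: started",
]


def parse_events(lines: List[str]) -> List[Event]:
    """Split each line into (timestamp, message); skip lines without a timestamp."""
    events: List[Event] = []
    for line in lines:
        stamp, _, message = line.partition(" ")
        try:
            events.append((datetime.fromisoformat(stamp), message))
        except ValueError:
            continue  # continuation line or unexpected format
    return events


def seconds_between(events: List[Event], first_marker: str, second_marker: str) -> Optional[float]:
    """Seconds from the first line containing first_marker to the next line containing second_marker."""
    start = next((ts for ts, msg in events if first_marker in msg), None)
    if start is None:
        return None
    end = next((ts for ts, msg in events if second_marker in msg and ts >= start), None)
    if end is None:
        return None
    return (end - start).total_seconds()
```

On the excerpt above, `seconds_between(parse_events(LOG_LINES), "Timed out after", "ready (")` gives 155.279 seconds. If the 27.5s in the ready line is the startup duration, that startup began roughly two minutes after the CLI had already given up, which suggests the ready line and the timeout may belong to different start attempts.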

Related issues / likely overlap

- #48771 and PR #48801: Windows/local restart false negative when loopback WS probe is closed with 1008 / connect failed / device required
- #44920: stale cron runningAtMs after restart
- #59511: local http://127.0.0.1:18789/health not usable after gateway run
- #60295: different OS, but similar “restart times out while service state/port ownership is inconsistent”

What I found during local debugging

I did substantial local debugging because the machine was stuck in production use:

- upgraded OpenClaw from 2026.4.5 to 2026.4.8
- upgraded Node.js to 22.22.2
- isolated/remediated several local issues:
  - old incompatible channel config fields after upgrade
  - untracked local plugin auto-loading
  - stale cron job/session state
- after that cleanup, the remaining issue was still reproducible:
  - gateway reaches ready (...)
  - HTTP health/UI later become unreachable or unstable

That suggests there may still be a deeper Windows gateway/runtime bug after startup, beyond the already-known restart probe issue.
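One way to pin down when the post-ready loss happens is to sample /health at a fixed interval after the ready line and record the first transition from responding to not responding. The sketch below uses only the Python standard library. `http_ok`, `watch`, and `first_loss` are illustrative names, and the sampling interval is an arbitrary choice, not something taken from OpenClaw.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable, List, Optional, Tuple

Sample = Tuple[float, bool]  # (timestamp in seconds, probe succeeded)


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET to url answers with a 2xx status within timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False  # refused, timed out, reset, or non-2xx


def watch(probe: Callable[[], bool], samples: int, interval: float,
          sleep: Callable[[float], None] = time.sleep,
          clock: Callable[[], float] = time.time) -> List[Sample]:
    """Call probe() `samples` times, `interval` seconds apart, and record each result."""
    history: List[Sample] = []
    for i in range(samples):
        history.append((clock(), probe()))
        if i < samples - 1:
            sleep(interval)
    return history


def first_loss(history: List[Sample]) -> Optional[float]:
    """Timestamp of the first sample that failed right after a successful one."""
    for (_, prev_ok), (ts, ok) in zip(history, history[1:]):
        if prev_ok and not ok:
            return ts
    return None
```

A run such as `watch(lambda: http_ok("http://127.0.0.1:18789/health"), samples=60, interval=5.0)` covers five minutes after startup. The timestamp from `first_loss` can then be matched against the gateway log to see what was logged just before /health stopped answering.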

Impact and severity

High for Windows users relying on Scheduled Task mode:

- restart automation becomes unreliable
- control UI availability becomes inconsistent
- cron jobs can be left in broken/stale state after restart cycles
- users may see a mixture of “service is up”, “service is unknown”, and “port is free” across the same debugging session

Logs, screenshots, and evidence

I can provide:

- full openclaw-2026-04-08.log / openclaw-2026-04-09.log
- openclaw gateway restart terminal output
- openclaw gateway status --json output from both healthy and unhealthy moments
- details of the stale cron/session state observed in ~/.openclaw/cron/jobs.json and session index cleanup

Additional information

If helpful, I can also open a follow-up issue with a narrower repro focused only on:

- Windows Scheduled Task + restart probe false negative
- Cron stale runningAtMs / session state after restart
- Post-ready HTTP hang / port disappearance

because on this machine they appeared stacked together.

extent analysis

TL;DR

No single fix covers this report. It stacks three separate problems: the known loopback probe false negative on restart, stale cron/session state, and HTTP instability after the gateway logs ready. The first two have related issues and a PR. The third was still reproducible on OpenClaw 2026.4.8 after local cleanup and has no confirmed cause yet.

Guidance

Review the related issues first: Check #48771 and PR #48801 for the loopback probe false negative, #44920 for stale cron runningAtMs, #59511 for the unusable local /health endpoint, and #60295 for inconsistent service state and port ownership during restart.
Check cron job and session state: After a restart, inspect ~/.openclaw/cron/jobs.json and the session index for stale runningAtMs values or stale session entries. The reporter needed manual cleanup to recover.
Adjust the restart probing: The reporter's local CLI patch treats loopback HTTP /health and local 1008 policy closes as healthy enough for restart probing. It reduced one class of false negatives but did not remove the post-ready instability.
Keep OpenClaw and Node.js current: The problem persisted after upgrading to OpenClaw 2026.4.8 and Node.js 22.22.2, so upgrading alone is not a fix. Watch later releases for changes in this area.

Example

No specific code snippet is provided due to the complexity and variability of the issue, but users can explore adjusting the restart probing logic as mentioned in the guidance section.

Notes

The report combines several problems, so a single definitive fix is unlikely. The guidance here can mitigate the known parts. The post-ready HTTP loss may need further investigation or an upstream fix in OpenClaw or its dependencies.

Recommendation

Apply workaround: Start with the workarounds and fixes from the related issues, since most of the report maps onto known problems, and treat the post-ready HTTP loss as still open.
