openclaw - [Enhancement]: Bundled Windows watchdog template: fixed 8s post-launch sleep mislabels normal recovery as failure


The Windows watchdog template that ships with OpenClaw (used to keep the silent gateway scheduled task alive between crashes) runs a static Start-Sleep -Seconds 8 after kicking off schtasks /run, then probes the gateway port exactly once. On real hardware, node + plugin init can take ~10–15s under cold load, so the probe fires before the gateway has bound its port. The watchdog then logs "still down after relaunch attempt" even when the gateway comes up cleanly shortly afterwards.

Cosmetic, but it makes the watchdog log misleading: every legitimate recovery looks like a half-failure. Caused me to debug a non-existent issue.
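The timing mismatch can be checked with a little arithmetic. The sketch below is illustrative only: Get-ProbeOutcome is a hypothetical helper, not part of the OpenClaw template, and it assumes a probe succeeds as soon as the gateway has bound its port. It compares a single probe at 8s with the 2-second polling cadence from the suggested fix, for start-up times in the reported ~10–15s range.

```powershell
# Hypothetical helper for illustration; not part of the OpenClaw template.
# Assumes a probe succeeds as soon as the gateway has finished starting.
function Get-ProbeOutcome {
    param([double] $StartupSeconds)

    # First 2s poll tick at or after the gateway is up (polls start at t = 2s).
    $firstGoodPoll = [math]::Max(2, [math]::Ceiling($StartupSeconds / 2) * 2)

    [pscustomobject]@{
        StartupSeconds  = $StartupSeconds
        # A single probe at t = 8s only succeeds if start-up finished by then.
        SingleProbeAt8s = if ($StartupSeconds -le 8) { 'up' } else { 'still down' }
        # Polling every 2s succeeds if a tick lands inside the 24s window.
        Polling2s       = if ($firstGoodPoll -le 24) { "up at ${firstGoodPoll}s" } else { 'still down' }
    }
}

# Start-up times taken from the ~10-15s range reported in this issue.
10, 12, 15 | ForEach-Object { Get-ProbeOutcome -StartupSeconds $_ }
```

For every start-up time in that range, the single probe reports a failure while the polling variant confirms recovery well inside its window.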

Why this matters

Watchdog log noise erodes signal: repeated false "still down" messages train the operator to ignore the line, including when it reports a real failure.

Suggested fix

Replace the static sleep with a poll loop. Reference patch (running well in practice for the past few hours):

# $port and Log come from the surrounding watchdog template.
$ok = $false
$elapsed = 0
# Poll up to 12 times, 2s apart (~24s), stopping at the first healthy response.
for ($i = 0; $i -lt 12 -and -not $ok; $i++) {
  Start-Sleep -Seconds 2
  $elapsed += 2
  try {
    $r = Invoke-WebRequest -Uri "http://127.0.0.1:$port/health" -TimeoutSec 1 -UseBasicParsing
    if ($r.StatusCode -eq 200) { $ok = $true }
  } catch { }  # refused or timed out: gateway not up yet, keep polling
}
if ($ok) {
  Log "recovery CONFIRMED after ~${elapsed}s"
} else {
  Log "still down after relaunch attempt (24s window exhausted)"
}

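The worst-case wait is bounded at about 24s (12 polls at 2s intervals, slightly more if probes hit the 1s timeout instead of being refused), and the log now reports the actual outcome.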
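One possible refinement, offered as a sketch rather than as the template's actual code: wrap the loop in a small function that takes the probe as a script block and measures elapsed time with a stopwatch, so the logged duration also includes time spent inside the probes. Wait-ForHealthy and its parameters are hypothetical names introduced here. The defaults mirror the reference patch (12 attempts, 2s apart).

```powershell
# Hypothetical helper (not part of the OpenClaw template): polls a probe
# script block until it returns $true or the attempt budget runs out.
function Wait-ForHealthy {
    param(
        [Parameter(Mandatory)] [scriptblock] $Probe,
        [int] $MaxAttempts = 12,
        [double] $IntervalSeconds = 2
    )

    $watch = [System.Diagnostics.Stopwatch]::StartNew()
    $ok = $false
    $attempts = 0

    while (-not $ok -and $attempts -lt $MaxAttempts) {
        # Sleep first, as in the reference patch, to give the gateway time to bind.
        if ($IntervalSeconds -gt 0) {
            Start-Sleep -Milliseconds ([int]($IntervalSeconds * 1000))
        }
        $attempts++
        try {
            $ok = [bool](& $Probe)
        } catch {
            $ok = $false  # a refused or timed-out probe counts as "not up yet"
        }
    }

    $watch.Stop()
    [pscustomobject]@{
        Ok             = $ok
        Attempts       = $attempts
        ElapsedSeconds = [math]::Round($watch.Elapsed.TotalSeconds, 1)
    }
}

# Possible use inside the watchdog ($port and Log are assumed to exist there):
#   $result = Wait-ForHealthy -Probe {
#       $r = Invoke-WebRequest -Uri "http://127.0.0.1:$port/health" -TimeoutSec 1 -UseBasicParsing
#       $r.StatusCode -eq 200
#   }
#   if ($result.Ok) { Log "recovery CONFIRMED after $($result.ElapsedSeconds)s" }
#   else            { Log "still down after relaunch attempt ($($result.ElapsedSeconds)s window exhausted)" }
```

Passing the probe in as a script block keeps the retry logic independent of the HTTP call, so the loop can be exercised with a fake probe and a zero interval without a running gateway.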


Environment

  • OpenClaw 2026.5.28
  • Windows 11 Pro 10.0.26200


Related

  • #44595 (Telegram polling watchdog, "tracks initiation, not success"): same class of pattern (single-shot probe vs. confirm), but in a different watchdog.
