openclaw - ✅(Solved) Fix Heartbeat duration `every` >24.85d overflows Node setTimeout, crashes gateway with no auto-respawn [1 pull requests, 1 participants]

openclaw • 2026-04-25 05:54:35


GitHub stats

openclaw/openclaw#71414 • Fetched 2026-04-26 05:13:06


Author

mayank6136

Participants

mayank6136

Timeline (top)

referenced ×6, cross-referenced ×5, closed ×1

Setting agents.defaults.heartbeat.every to a duration greater than ~24.85 days (Node.js's signed-32-bit setTimeout cap of 2,147,483,647 ms) causes the heartbeat scheduler to fire in a tight loop and eventually crashes the gateway with OpenClaw exited with code 1. The container wrapper does not auto-respawn the gateway after exit. The CLI silently falls back to embedded mode on the next invocation.

Error Message

[22:29:40] WARN: OpenClaw exited with code 1

Suggested fix (from the issue): reject the config with a clear error during loadConfig if the resolved ms exceeds Node's setTimeout limit.

Root Cause

Per Node.js docs, setTimeout(fn, delay) clamps delay > 2147483647 to 1 ms. The heartbeat scheduler appears to compute "next fire = now + every" and pass it directly to setTimeout, so the very-large delay gets truncated to 1 ms. The function then runs immediately, recomputes, and re-arms — a tight loop that ultimately exhausts something (event loop / promise queue / heap) and the process dies.
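The clamping rule described above can be modelled in a few lines. This is only an illustrative sketch: effectiveNodeDelay and NODE_MAX_TIMEOUT_MS are hypothetical names, not part of Node or OpenClaw, and the function simply restates the documented rule (a delay above 2147483647, below 1, or NaN becomes 1 ms).

```typescript
// Largest delay Node's setTimeout accepts: 2^31 - 1 ms (signed 32-bit max).
const NODE_MAX_TIMEOUT_MS = 2_147_483_647;
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper that mirrors the documented rule; it does not call setTimeout.
function effectiveNodeDelay(requestedMs: number): number {
  const outOfRange =
    Number.isNaN(requestedMs) || requestedMs < 1 || requestedMs > NODE_MAX_TIMEOUT_MS;
  // Out-of-range delays are replaced by 1 ms, which is what turns "365d" into a tight loop.
  return outOfRange ? 1 : Math.trunc(requestedMs);
}

console.log(effectiveNodeDelay(24 * DAY_MS));  // 2073600000 (under the cap, kept as-is)
console.log(effectiveNodeDelay(365 * DAY_MS)); // 1 (over the cap, fires almost immediately)
```

A scheduler that re-arms itself with the same oversized value therefore gets a 1 ms timer every time, which matches the flood of TimeoutOverflowWarning lines in the report.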

Fix Action

Workaround

Use a value safely under the cap: "every": "24d" (= 2,073,600,000 ms) is safe and still effectively "never" for a "durably disabled" heartbeat.

PR fix notes

PR #71478: fix(heartbeat): clamp scheduler delay to Node setTimeout cap (#71414)

Repository: openclaw/openclaw
Author: hclsys
State: closed | merged: True
Link: https://github.com/openclaw/openclaw/pull/71478

Description (problem / solution / changelog)

Closes #71414.

Bug

When agents.defaults.heartbeat.every resolves to >2_147_483_647 ms (~24.85d), scheduleNext() in src/infra/heartbeat-runner.ts called setTimeout(fn, delay) with the raw oversized delay. Node clamps any delay > 2^31-1 to 1 ms, fires the callback, and the heartbeat re-arms with the same oversized value — a tight loop that floods logs with TimeoutOverflowWarning: ... Timeout duration was set to 1. and crashes the gateway with exit code 1.

Reproduces with the reporter's recipe: { "agents": { "defaults": { "heartbeat": { "every": "365d" } } } }.

Fix

Clamp the computed delay to HEARTBEAT_MAX_TIMEOUT_MS = 2_147_483_647 ms before calling setTimeout. Worst case is now one heartbeat every ~24.85d instead of crash-loop. Warn once per process when the clamp fires, so a misconfigured 365d is still visible without flooding logs.
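A clamp with a once-per-process warning might look roughly like the sketch below. Only the constant name HEARTBEAT_MAX_TIMEOUT_MS and its value come from the PR description; clampHeartbeatDelay, the warnedAboutClamp flag, and the injectable warn parameter are assumptions made for illustration, and the merged code in src/infra/heartbeat-runner.ts and src/utils/timer-delay.ts may be structured differently.

```typescript
// Name and value taken from the PR description.
const HEARTBEAT_MAX_TIMEOUT_MS = 2_147_483_647;

// Hypothetical module-level flag so the warning is emitted once per process.
let warnedAboutClamp = false;

// Hypothetical helper: returns a delay that is safe to pass to setTimeout.
function clampHeartbeatDelay(
  delayMs: number,
  warn: (message: string) => void = console.warn,
): number {
  if (delayMs <= HEARTBEAT_MAX_TIMEOUT_MS) {
    return Math.max(0, delayMs); // in range: only guard against negatives
  }
  // Oversized (or NaN) delay: warn the first time, then clamp silently.
  if (!warnedAboutClamp) {
    warnedAboutClamp = true;
    warn(
      `heartbeat delay ${delayMs} ms exceeds the setTimeout cap; ` +
        `clamping to ${HEARTBEAT_MAX_TIMEOUT_MS} ms (~24.85d)`,
    );
  }
  return HEARTBEAT_MAX_TIMEOUT_MS;
}

// Intended use inside a scheduler (shown as a comment so the sketch arms no timer):
//   setTimeout(runHeartbeat, clampHeartbeatDelay(computedDelayMs));
```

With this shape, a misconfigured 365d still produces exactly one visible warning, and the timer itself never receives a value Node would rewrite to 1 ms.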

This is a defense-in-depth fix at the scheduler layer. loadConfig-level rejection (suggested in the issue) is a broader change with more blast radius and a separate semantic question — some users likely want every: 365d to mean "effectively never", and the clamped behaviour matches that intent better than a hard error does.
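For comparison, the loadConfig-level rejection that the PR leaves out of scope could be sketched as follows. Everything here is an assumption for illustration: parseDurationMs is a hypothetical parser for strings such as "30m" or "365d" (OpenClaw's real duration grammar may differ), and validateHeartbeatEvery is not an existing OpenClaw function.

```typescript
const NODE_MAX_TIMEOUT_MS = 2_147_483_647;

const UNIT_MS: Record<string, number> = {
  ms: 1,
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};

// Hypothetical parser: a number followed by one unit, e.g. "500ms", "30m", "365d".
function parseDurationMs(text: string): number {
  const match = /^(\d+(?:\.\d+)?)(ms|s|m|h|d)$/.exec(text.trim());
  if (!match) {
    throw new Error(`Unrecognised duration: "${text}"`);
  }
  return Number(match[1]) * UNIT_MS[match[2]];
}

// Hypothetical load-time check: fail fast instead of letting the scheduler see the value.
function validateHeartbeatEvery(every: string): number {
  const ms = parseDurationMs(every);
  if (ms > NODE_MAX_TIMEOUT_MS) {
    throw new Error(
      `agents.defaults.heartbeat.every = "${every}" resolves to ${ms} ms, ` +
        `which exceeds the setTimeout limit of ${NODE_MAX_TIMEOUT_MS} ms (~24.85d)`,
    );
  }
  return ms;
}
```

The trade-off the PR describes is visible here: this check turns every: 365d into a startup error, whereas the clamp keeps the gateway running and treats the value as "effectively never".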

Test

New src/infra/heartbeat-runner.scheduler.test.ts case: sets heartbeat.every: "365d" with fake timers, advances 60s, and asserts runSpy was never invoked. With the bug present, runSpy would have been called tens of thousands of times during the advance.

Lint clean: pnpm oxlint src/infra/heartbeat-runner.ts src/infra/heartbeat-runner.scheduler.test.ts — 0 warnings, 0 errors.

Out of scope (deliberately)

Wrapper/supervisor auto-respawn after gateway exit code 1 (mentioned in the issue) — that lives in container/wrapper code, separate concern.
CLI silent embedded-mode fallback — tracked separately at #71416.

🤖 generated with assistance from Claude Code Co-authored-by: HCL [email protected]

Changed files

CHANGELOG.md (modified, +1/-0)
src/agents/runtime-auth-refresh.ts (modified, +2/-2)
src/gateway/call.ts (modified, +2/-1)
src/gateway/client.ts (modified, +3/-2)
src/gateway/probe.ts (modified, +3/-2)
src/gateway/server-chat.ts (modified, +3/-3)
src/gateway/server-methods/agent-job.ts (modified, +4/-5)
src/gateway/server-methods/agent-wait-dedupe.ts (modified, +2/-2)
src/infra/heartbeat-runner.scheduler.test.ts (modified, +18/-0)
src/infra/heartbeat-runner.timeout-warning.test.ts (added, +70/-0)
src/infra/heartbeat-runner.ts (modified, +11/-1)
src/utils/timer-delay.test.ts (added, +34/-0)
src/utils/timer-delay.ts (added, +19/-0)

Code Example

{ "agents": { "defaults": { "heartbeat": { "every": "365d" } } } }

---

(node:41) TimeoutOverflowWarning: 23111245866 does not fit into a 32-bit signed integer.
Timeout duration was set to 1.

---

[22:29:40] WARN: OpenClaw exited with code 1


Summary

Reproduction

In openclaw.json, set:
```
{ "agents": { "defaults": { "heartbeat": { "every": "365d" } } } }
```
(or any value that resolves to >2,147,483,647 ms — i.e. anything beyond ~24d 20h)
Restart the gateway and watch container logs.
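The "~24d 20h" figure follows directly from the cap. A small sketch of the arithmetic, with formatDuration as a hypothetical helper written only for this illustration:

```typescript
const MS_PER_DAY = 86_400_000;
const MS_PER_HOUR = 3_600_000;
const MS_PER_MINUTE = 60_000;

// Hypothetical helper: splits a millisecond count into days, hours, minutes, seconds.
function formatDuration(ms: number): string {
  const days = Math.floor(ms / MS_PER_DAY);
  const hours = Math.floor((ms % MS_PER_DAY) / MS_PER_HOUR);
  const minutes = Math.floor((ms % MS_PER_HOUR) / MS_PER_MINUTE);
  const seconds = (ms % MS_PER_MINUTE) / 1_000;
  return `${days}d ${hours}h ${minutes}m ${seconds}s`;
}

// 2^31 - 1 ms is the signed 32-bit limit that setTimeout enforces.
console.log(formatDuration(2 ** 31 - 1)); // 24d 20h 31m 23.647s
```

Any every value that resolves to more than this, including the 365d used in the reproduction, triggers the overflow.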

Observed behaviour

Container logs flood with thousands of lines like:

(node:41) TimeoutOverflowWarning: 23111245866 does not fit into a 32-bit signed integer.
Timeout duration was set to 1.

Eventually the gateway exits:

[22:29:40] WARN: OpenClaw exited with code 1

After this, port 18789 is not listening; subsequent openclaw agent invocations silently fall back to embedded mode (see related issue on silent embedded fallback). docker exec ... ps -ef shows no gateway process; only the proxy node server.mjs remains.

Root cause

Expected behaviour

At least one of:

Reject the config with a clear error during loadConfig if the resolved ms exceeds Node's setTimeout limit.
Clamp internally to 2^31-1 ms with a warning.
Use a recursive long-timer pattern (re-arm every 24d until the cumulative target is reached).
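The third option, a recursive long timer, can be sketched as below. setLongTimeout and the injectable schedule parameter are hypothetical names introduced for this example; this is not what the merged PR does (it clamps instead). The sketch re-arms in chunks of the full cap rather than 24d, tracks the remaining time arithmetically rather than against the wall clock, and omits cancellation.

```typescript
const MAX_CHUNK_MS = 2_147_483_647; // largest delay setTimeout accepts

type Schedule = (callback: () => void, delayMs: number) => unknown;

// Hypothetical helper: waits totalMs by chaining timers that each fit under the cap.
function setLongTimeout(
  fn: () => void,
  totalMs: number,
  schedule: Schedule = (callback, delayMs) => setTimeout(callback, delayMs),
): void {
  const step = Math.min(Math.max(0, totalMs), MAX_CHUNK_MS);
  const remaining = totalMs - step;
  schedule(() => {
    if (remaining > 0) {
      // Target not reached yet: arm the next chunk instead of running fn.
      setLongTimeout(fn, remaining, schedule);
    } else {
      fn();
    }
  }, step);
}

// Dry run with a fake scheduler that records each delay and fires immediately.
const recordedDelays: number[] = [];
let heartbeatRuns = 0;
setLongTimeout(
  () => { heartbeatRuns += 1; },
  365 * 86_400_000,
  (callback, delayMs) => { recordedDelays.push(delayMs); callback(); },
);
console.log(recordedDelays.length, heartbeatRuns); // 15 1
```

For 365d the chain is fourteen full-cap timers plus one shorter final timer, and the callback runs exactly once at the end. A production version would also need a handle for cancelling the chain on config reload or shutdown.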

Additionally, the wrapper / supervisor should auto-respawn the gateway after exit code 1 instead of leaving the proxy alive but the gateway dead.

Workaround

Use a value safely under the cap: "every": "24d" (= 2,073,600,000 ms) is safe and still effectively "never" for a "durably disabled" heartbeat.

Environment

OpenClaw 2026.4.12 (1c0672b), running in container ghcr.io/hostinger/hvps-openclaw:latest on Linux/Docker.

extent analysis

TL;DR

Set agents.defaults.heartbeat.every to a value less than or equal to 24 days to prevent the heartbeat scheduler from crashing the gateway.

Guidance

Verify that the agents.defaults.heartbeat.every value does not exceed the Node.js setTimeout limit of 2,147,483,647 ms.
Use a value safely under the cap, such as "every": "24d", to effectively disable the heartbeat without crashing the gateway.
Consider implementing a recursive long-timer pattern to handle durations greater than the setTimeout limit.
Check the container wrapper configuration to ensure it auto-respawns the gateway after an exit code 1.

Example

{
  "agents": {
    "defaults": {
      "heartbeat": {
        "every": "24d"
      }
    }
  }
}

Notes

Before PR #71478, the heartbeat scheduler did not handle durations above the setTimeout limit, which led to a tight loop and an eventual crash. On builds without that fix, the workaround is to use a value under the cap. The merged fix clamps the delay, so a longer value now fires at most once every ~24.85d rather than at the configured interval; honouring the full interval would require changing the scheduler to support longer durations.

Recommendation

Apply the workaround by setting agents.defaults.heartbeat.every to a value less than or equal to 24 days, as this is a safe and effective way to prevent the gateway from crashing.
