openclaw - ✅ (Solved) Fix: Heartbeat duration `every` >24.85d overflows Node setTimeout, crashes gateway with no auto-respawn [1 pull request, 1 participant]

GitHub stats
openclaw/openclaw#71414 (fetched 2026-04-26 05:13:06)
Comments: 0 · Participants: 1 · Timeline events: 12 · Reactions: 0
Timeline (top): referenced ×6, cross-referenced ×5, closed ×1

Setting agents.defaults.heartbeat.every to a duration greater than ~24.85 days (Node.js's signed-32-bit setTimeout cap of 2,147,483,647 ms) causes the heartbeat scheduler to fire in a tight loop and eventually crashes the gateway with OpenClaw exited with code 1. The container wrapper does not auto-respawn the gateway after exit. The CLI silently falls back to embedded mode on the next invocation.

Error Message

[22:29:40] WARN: OpenClaw exited with code 1

Root Cause

Per the Node.js docs, setTimeout(fn, delay) resets any delay > 2147483647 to 1 ms. The heartbeat scheduler appears to compute "next fire = now + every" and pass the result directly to setTimeout, so the very large delay becomes 1 ms. The callback then runs almost immediately, recomputes the same oversized delay, and re-arms. The result is a tight loop that eventually exhausts some resource (the reporter suggests the event loop, promise queue, or heap), and the process dies.
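The arithmetic behind the ~24.85 day figure can be checked directly. The following is a minimal sketch; the helper names daysToMs and overflowsTimer are hypothetical and are not taken from the OpenClaw code base.

```typescript
// Largest delay Node's timers accept: 2^31 - 1 ms.
const MAX_TIMEOUT_MS = 2 ** 31 - 1; // 2_147_483_647

const MS_PER_DAY = 24 * 60 * 60 * 1000; // 86_400_000

// Hypothetical helper: convert a number of days to milliseconds.
function daysToMs(days: number): number {
  return days * MS_PER_DAY;
}

// Hypothetical helper: true when setTimeout would reset the delay to 1 ms.
function overflowsTimer(delayMs: number): boolean {
  return delayMs > MAX_TIMEOUT_MS;
}

console.log(MAX_TIMEOUT_MS / MS_PER_DAY); // the cap expressed in days, about 24.855
console.log(daysToMs(365), overflowsTimer(daysToMs(365))); // 31536000000 true
console.log(daysToMs(24), overflowsTimer(daysToMs(24))); // 2073600000 false
```

A 365d interval is roughly 14.7 times the cap, which is why the reporter's configuration triggers the overflow while 24d does not.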

Fix Action

Workaround

Use a value safely under the cap: "every": "24d" (2,073,600,000 ms) stays below the limit and is still effectively "never" when the goal is a durably disabled heartbeat.

PR fix notes

PR #71478: fix(heartbeat): clamp scheduler delay to Node setTimeout cap (#71414)

Description (problem / solution / changelog)

Closes #71414.

Bug

When agents.defaults.heartbeat.every resolves to >2_147_483_647 ms (~24.85d), scheduleNext() in src/infra/heartbeat-runner.ts called setTimeout(fn, delay) with the raw oversized delay. Node clamps any delay > 2^31-1 to 1 ms, fires the callback, and the heartbeat re-arms with the same oversized value — a tight loop that floods logs with TimeoutOverflowWarning: ... Timeout duration was set to 1. and crashes the gateway with exit code 1.

Reproduces with the reporter's recipe: { "agents": { "defaults": { "heartbeat": { "every": "365d" } } } }.

Fix

Clamp the computed delay to HEARTBEAT_MAX_TIMEOUT_MS = 2_147_483_647 ms before calling setTimeout. Worst case is now one heartbeat every ~24.85d instead of crash-loop. Warn once per process when the clamp fires, so a misconfigured 365d is still visible without flooding logs.

This is a defense-in-depth fix at the scheduler layer. loadConfig-level rejection (suggested in the issue) is a broader change with more blast radius and a separate semantic question — some users likely want every: 365d to mean "effectively never", and the clamped behaviour matches that intent better than a hard error does.
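The clamp-and-warn-once approach described in the PR could look roughly like the sketch below. Only the constant name HEARTBEAT_MAX_TIMEOUT_MS and the function name scheduleNext come from the PR text; clampHeartbeatDelay, the warnedAboutClamp flag, the function signatures, and the warning wording are assumptions made for illustration and may not match the code in src/infra/heartbeat-runner.ts or src/utils/timer-delay.ts.

```typescript
// Name and value as stated in the PR description.
const HEARTBEAT_MAX_TIMEOUT_MS = 2_147_483_647;

// Hypothetical module-level flag so the warning is emitted once per process.
let warnedAboutClamp = false;

// Hypothetical helper: returns a delay that setTimeout will honour as given.
function clampHeartbeatDelay(
  delayMs: number,
  warn: (message: string) => void = console.warn,
): number {
  if (delayMs <= HEARTBEAT_MAX_TIMEOUT_MS) return delayMs;
  if (!warnedAboutClamp) {
    warnedAboutClamp = true;
    warn(
      `heartbeat delay ${delayMs} ms exceeds the setTimeout cap; ` +
        `clamping to ${HEARTBEAT_MAX_TIMEOUT_MS} ms`,
    );
  }
  return HEARTBEAT_MAX_TIMEOUT_MS;
}

// Hypothetical signature for the scheduler entry point named in the PR:
// the raw delay is clamped before it reaches setTimeout.
function scheduleNext(
  run: () => void,
  delayMs: number,
): ReturnType<typeof setTimeout> {
  return setTimeout(run, clampHeartbeatDelay(delayMs));
}
```

With this shape, a 365d configuration fires at most once every ~24.85 days, and the log contains a single warning instead of one TimeoutOverflowWarning per iteration.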

Test

New src/infra/heartbeat-runner.scheduler.test.ts case: sets heartbeat.every: "365d" with fake timers, advances 60s, and asserts runSpy was never invoked. With the bug present, runSpy would have been called tens of thousands of times during the advance.
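The idea behind that test can be illustrated without a test framework by modelling virtual time by hand. This is not the PR's test, which uses fake timers and runSpy; it is an idealised sketch in which effectiveDelay and countFirings are hypothetical names and each firing is assumed to take no time.

```typescript
const MAX_TIMEOUT_MS = 2_147_483_647;

// Models how Node treats a requested delay: values above the cap
// (or below 1) are reset to 1 ms.
function effectiveDelay(requestedMs: number): number {
  return requestedMs > MAX_TIMEOUT_MS || requestedMs < 1 ? 1 : requestedMs;
}

// Counts how often a self-re-arming timer fires within windowMs of virtual time.
function countFirings(everyMs: number, windowMs: number, clamp: boolean): number {
  const requested = clamp ? Math.min(everyMs, MAX_TIMEOUT_MS) : everyMs;
  const step = effectiveDelay(requested);
  let now = 0;
  let fired = 0;
  // Each iteration is one timer expiry followed by a re-arm with the same delay.
  while (now + step <= windowMs) {
    now += step;
    fired += 1;
  }
  return fired;
}

const every = 365 * 86_400_000; // "365d" in milliseconds
console.log(countFirings(every, 60_000, false)); // 60000 (buggy: fires every 1 ms)
console.log(countFirings(every, 60_000, true)); // 0 (clamped: nothing fires in 60 s)
```

The unclamped count is an upper bound for this model; a real scheduler does work on each firing, which is consistent with the "tens of thousands" figure quoted above.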

Lint clean: pnpm oxlint src/infra/heartbeat-runner.ts src/infra/heartbeat-runner.scheduler.test.ts — 0 warnings, 0 errors.

Out of scope (deliberately)

  • Wrapper/supervisor auto-respawn after gateway exit code 1 (mentioned in the issue) — that lives in container/wrapper code, separate concern.
  • CLI silent embedded-mode fallback — tracked separately at #71416.

🤖 Generated with assistance from Claude Code. Co-authored-by: HCL

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/agents/runtime-auth-refresh.ts (modified, +2/-2)
  • src/gateway/call.ts (modified, +2/-1)
  • src/gateway/client.ts (modified, +3/-2)
  • src/gateway/probe.ts (modified, +3/-2)
  • src/gateway/server-chat.ts (modified, +3/-3)
  • src/gateway/server-methods/agent-job.ts (modified, +4/-5)
  • src/gateway/server-methods/agent-wait-dedupe.ts (modified, +2/-2)
  • src/infra/heartbeat-runner.scheduler.test.ts (modified, +18/-0)
  • src/infra/heartbeat-runner.timeout-warning.test.ts (added, +70/-0)
  • src/infra/heartbeat-runner.ts (modified, +11/-1)
  • src/utils/timer-delay.test.ts (added, +34/-0)
  • src/utils/timer-delay.ts (added, +19/-0)

Code Example

{ "agents": { "defaults": { "heartbeat": { "every": "365d" } } } }

---

(node:41) TimeoutOverflowWarning: 23111245866 does not fit into a 32-bit signed integer.
Timeout duration was set to 1.

---

[22:29:40] WARN: OpenClaw exited with code 1
Raw issue body

Summary

Setting agents.defaults.heartbeat.every to a duration greater than ~24.85 days (Node.js's signed-32-bit setTimeout cap of 2,147,483,647 ms) causes the heartbeat scheduler to fire in a tight loop and eventually crashes the gateway with OpenClaw exited with code 1. The container wrapper does not auto-respawn the gateway after exit. The CLI silently falls back to embedded mode on the next invocation.

Reproduction

  1. In openclaw.json, set:
    { "agents": { "defaults": { "heartbeat": { "every": "365d" } } } }
    (or any value that resolves to >2,147,483,647 ms — i.e. anything beyond ~24d 20h)
  2. Restart the gateway and watch container logs.

Observed behaviour

Container logs flood with thousands of lines like:

(node:41) TimeoutOverflowWarning: 23111245866 does not fit into a 32-bit signed integer.
Timeout duration was set to 1.

Eventually the gateway exits:

[22:29:40] WARN: OpenClaw exited with code 1

After this, port 18789 is not listening; subsequent openclaw agent invocations silently fall back to embedded mode (see related issue on silent embedded fallback). docker exec ... ps -ef shows no gateway process; only the proxy node server.mjs remains.

Root cause

Per Node.js docs, setTimeout(fn, delay) clamps delay > 2147483647 to 1 ms. The heartbeat scheduler appears to compute "next fire = now + every" and pass it directly to setTimeout, so the very-large delay gets truncated to 1 ms. The function then runs immediately, recomputes, and re-arms — a tight loop that ultimately exhausts something (event loop / promise queue / heap) and the process dies.

Expected behaviour

At least one of:

  • Reject the config with a clear error during loadConfig if the resolved ms exceeds Node's setTimeout limit.
  • Clamp internally to 2^31-1 ms with a warning.
  • Use a recursive long-timer pattern (re-arm every 24d until the cumulative target is reached).

Additionally, the wrapper / supervisor should auto-respawn the gateway after exit code 1 instead of leaving the proxy alive but the gateway dead.
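The third option, a recursive long timer, was not the approach taken in the PR, but it can be sketched as follows. All names here (setLongTimeout, LongTimer, SetTimer, ClearTimer) are hypothetical, and this version re-arms in chunks of the full cap rather than exactly 24d. The timer functions are injectable so the chaining logic can be exercised without waiting.

```typescript
const MAX_TIMEOUT_MS = 2_147_483_647;

interface LongTimer {
  cancel(): void;
}

// Injectable timer functions (hypothetical types for this sketch).
type SetTimer = (fn: () => void, ms: number) => unknown;
type ClearTimer = (handle: unknown) => void;

// Runs fn after delayMs by chaining timers that each stay within the cap.
function setLongTimeout(
  fn: () => void,
  delayMs: number,
  setTimer: SetTimer = (f, ms) => setTimeout(f, ms),
  clearTimer: ClearTimer = (h) => clearTimeout(h as ReturnType<typeof setTimeout>),
): LongTimer {
  let remaining = delayMs;
  let handle: unknown;
  let cancelled = false;

  const arm = (): void => {
    // Take the largest chunk the platform timer can represent.
    const chunk = Math.min(remaining, MAX_TIMEOUT_MS);
    remaining -= chunk;
    handle = setTimer(() => {
      if (cancelled) return;
      if (remaining > 0) arm(); // target not reached yet: re-arm
      else fn(); // cumulative target reached
    }, chunk);
  };
  arm();

  return {
    cancel(): void {
      cancelled = true;
      clearTimer(handle);
    },
  };
}
```

A 365d delay becomes a chain of 15 timers under this scheme. The sketch tracks the remaining delay rather than wall-clock time, so it does not correct for drift or for time spent while the process is suspended.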

Workaround

Use a value safely under the cap: "every": "24d" (≈ 2,073,600,000 ms) is safe and still effectively "never" for a "durably disabled" heartbeat.

Environment

OpenClaw 2026.4.12 (1c0672b), running in container ghcr.io/hostinger/hvps-openclaw:latest on Linux/Docker.

extent analysis

TL;DR

Set agents.defaults.heartbeat.every to a value less than or equal to 24 days to prevent the heartbeat scheduler from crashing the gateway.

Guidance

  • Verify that the agents.defaults.heartbeat.every value is not exceeding the Node.js setTimeout limit of 2,147,483,647 ms.
  • Use a value safely under the cap, such as "every": "24d", to effectively disable the heartbeat without crashing the gateway.
  • Consider implementing a recursive long-timer pattern to handle durations greater than the setTimeout limit.
  • Check the container wrapper configuration to ensure it auto-respawns the gateway after an exit code 1.
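The first guidance item can be automated with a small pre-flight check on the configured value. The sketch below is an assumption-heavy illustration: the duration grammar (an integer followed by ms, s, m, h, or d) is a guess based on the "24d" and "365d" examples and may not match OpenClaw's real parser, and the function names are hypothetical. It also reflects the config-rejection option from the issue, which the PR deliberately did not implement.

```typescript
const MAX_TIMEOUT_MS = 2_147_483_647;

// Assumed unit table; OpenClaw's actual duration grammar may differ.
function unitToMs(unit: string): number {
  switch (unit) {
    case "ms": return 1;
    case "s": return 1_000;
    case "m": return 60_000;
    case "h": return 3_600_000;
    case "d": return 86_400_000;
    default: throw new Error(`unknown duration unit: ${unit}`);
  }
}

// Hypothetical parser for strings such as "24d" or "90s".
function parseDurationMs(text: string): number {
  const match = /^(\d+)(ms|s|m|h|d)$/.exec(text.trim());
  if (match === null) throw new Error(`unrecognised duration: ${text}`);
  return Number(match[1]) * unitToMs(String(match[2]));
}

// Hypothetical check: throws when the value cannot be passed to setTimeout as-is.
function assertHeartbeatEveryFits(every: string): number {
  const ms = parseDurationMs(every);
  if (ms > MAX_TIMEOUT_MS) {
    throw new RangeError(
      `agents.defaults.heartbeat.every = "${every}" resolves to ${ms} ms, ` +
        `above the setTimeout limit of ${MAX_TIMEOUT_MS} ms`,
    );
  }
  return ms;
}

console.log(assertHeartbeatEveryFits("24d")); // 2073600000
```

Calling assertHeartbeatEveryFits("365d") throws a RangeError, which is the kind of clear, early failure the issue asks for.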

Example

{
  "agents": {
    "defaults": {
      "heartbeat": {
        "every": "24d"
      }
    }
  }
}

Notes

The current implementation of the heartbeat scheduler does not handle durations greater than the setTimeout limit, leading to a tight loop and eventual crash. A workaround is to use a value under the cap, but a more robust solution would involve modifying the scheduler to handle longer durations.

Recommendation

Apply the workaround by setting agents.defaults.heartbeat.every to a value less than or equal to 24 days, as this is a safe and effective way to prevent the gateway from crashing.
