openclaw - ✅(Solved) Fix Heartbeat: non-interval wake reasons bypass interval enforcement, causing bursts and silent gaps [2 pull requests, 2 comments, 1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#62294Fetched 2026-04-08 03:06:29
View on GitHub
Comments
2
Participants
1
Timeline
7
Reactions
0
Participants
Timeline (top)
referenced ×3commented ×2cross-referenced ×2

Non-interval heartbeat wake reasons (exec-event, wake, hook, cron:*, background-task, notifications-event) bypass the per-agent interval gate entirely, causing two related problems:

  1. Burst runs — multiple heartbeats fire in rapid succession when activity resumes after idle periods
  2. Silent gaps — during true inactivity (no user messages, no exec completions), heartbeats stop entirely because the setTimeout-based interval timer appears to stall (macOS App Nap / process suspension), and no activity-driven events exist to compensate

Root Cause

  1. Burst runs — multiple heartbeats fire in rapid succession when activity resumes after idle periods
  2. Silent gaps — during true inactivity (no user messages, no exec completions), heartbeats stop entirely because the setTimeout-based interval timer appears to stall (macOS App Nap / process suspension), and no activity-driven events exist to compensate

Fix Action

Fixed

PR fix notes

PR #62308: fix(launchd): set ProcessType=Interactive to prevent macOS App Nap

Description (problem / solution / changelog)

Summary

  • Add ProcessType: Interactive to the generated launchd plist so macOS does not App Nap the gateway process
  • Prevents heartbeat timers from silently freezing during idle periods on macOS

Problem

The gateway's heartbeat interval timer uses setTimeout inside the Node.js event loop. On macOS, when the gateway has no I/O activity, App Nap suspends the process — freezing all timers. Heartbeats stop silently and only resume when user activity wakes the process.

Setting ProcessType to Interactive in the launchd plist tells macOS to treat the gateway as a foreground-quality process that needs timely execution, opting it out of App Nap.

Changes

  • src/daemon/launchd-plist.ts — add ProcessType: Interactive to the plist template
  • src/daemon/launchd.test.ts — add assertions for the new key

Test plan

  • Existing launchd install tests pass with new ProcessType assertions
  • Run openclaw gateway install on macOS and verify the generated plist at ~/Library/LaunchAgents/ai.openclaw.gateway.plist contains <key>ProcessType</key><string>Interactive</string>
  • Monitor heartbeat activity during extended idle periods to confirm timers fire on schedule

Refs: #62294

🤖 Generated with Claude Code

Changed files

  • src/daemon/launchd-plist.ts (modified, +1/-1)
  • src/daemon/launchd.test.ts (modified, +2/-0)

PR #62310: fix(heartbeat): enforce minimum interval for non-interval wake triggers

Description (problem / solution / changelog)

Summary

  • Enforce minimum interval between heartbeat runs for non-interval wake reasons (exec-event, hook, cron:*, wake, background-task, etc.)
  • Add drift detection in scheduleNext() so overdue timers fire immediately after process suspension

Problem

Non-interval heartbeat wake reasons bypass the timing gate in run():

if (isInterval && now < agent.nextDueMs) continue;

isInterval is only true for reason === "interval". All other reasons run immediately and call advanceAgentSchedule(), which pushes nextDueMs forward — effectively satisfying the interval. This causes:

  1. Burst runs — multiple events (exec completions, cron results, notifications) triggering in quick succession each fire a heartbeat, producing 15+ runs in 12 minutes
  2. Interval starvation — activity-driven heartbeats keep advancing the schedule so the interval timer never gets to fire on its own

Additionally, scheduleNext() doesn't detect when the timer is already overdue (e.g. from macOS App Nap), so stale timers schedule a 0ms setTimeout instead of firing immediately.

Changes

src/infra/heartbeat-runner.ts:

  • Add minimum-interval enforcement for targeted non-interval wakes (the requestedSessionKey || requestedAgentId path)
  • Add minimum-interval enforcement for fan-out non-interval wakes (the for...of state.agents loop)
  • Add overdue detection in scheduleNext() — if delay === 0, fire immediately instead of scheduling a timer
  • Add drift logging when timers fire >5s late

src/infra/heartbeat-runner.scheduler.test.ts:

  • Test: non-interval wake skipped when last run is within interval
  • Test: targeted non-interval wake skipped when last run is within interval
  • Test: overdue interval fires immediately (drift detection)

Test plan

  • Existing scheduler tests pass (config updates, error recovery, requests-in-flight, targeted routing, fan-out scoping)
  • New test: untargeted exec-event after recent interval run → skipped
  • New test: targeted exec-event with sessionKey after recent interval run → skipped
  • New test: 90min time jump triggers immediate heartbeat (overdue detection)
  • Manual: monitor activity-log.jsonl during active use — verify no burst of >1 heartbeat per interval

Refs: #62294 Companion to #62308 (launchd ProcessType fix)

🤖 Generated with Claude Code

Changed files

  • src/infra/heartbeat-runner.scheduler.test.ts (modified, +65/-0)
  • src/infra/heartbeat-runner.ts (modified, +42/-0)

Code Example

for (const agent of state.agents.values()) {
  if (isInterval && now < agent.nextDueMs) continue;  // ← only interval checks timing
  // ...
  advanceAgentSchedule(agent, now);  // resets nextDueMs = now + intervalMs
}

---

for (const agent of state.agents.values()) {
  if (isInterval && now < agent.nextDueMs) continue;
  // Add: enforce minimum spacing for non-interval triggers too
  if (!isInterval && typeof agent.lastRunMs === 'number' 
      && (now - agent.lastRunMs) < agent.intervalMs) continue;
  // ...
}

---

const delay = Math.max(0, nextDue - now);
// If we overslept (App Nap, suspension), fire immediately
if (delay === 0 && nextDue < now) {
  requestHeartbeatNow({ reason: "interval", coalesceMs: 0 });
  return;
}
RAW_BUFFERClick to expand / collapse

Summary

Non-interval heartbeat wake reasons (exec-event, wake, hook, cron:*, background-task, notifications-event) bypass the per-agent interval gate entirely, causing two related problems:

  1. Burst runs — multiple heartbeats fire in rapid succession when activity resumes after idle periods
  2. Silent gaps — during true inactivity (no user messages, no exec completions), heartbeats stop entirely because the setTimeout-based interval timer appears to stall (macOS App Nap / process suspension), and no activity-driven events exist to compensate

Environment

  • openclaw version: 2026.4.5
  • Platform: macOS (Darwin 24.5.0, Apple Silicon)
  • Mode: local gateway, Tailscale serve
  • Heartbeat config: defaults only (30m interval, no explicit heartbeat: block on any agent)

Observed Behavior

Over ~24 hours of activity-log.jsonl (a script that is called when the heartbeat runs) heartbeat entries:

MetricValue
Total heartbeats59
Expected at 30min intervals~47
Gaps > 45 min7
Largest gap489 min (overnight idle)
Burst example15 heartbeats in 12 minutes (23:45–23:59)

Heartbeats correlate almost perfectly with user/message activity. When the gateway has no inbound messages or exec completions, heartbeats stop. When interaction resumes, queued events drain and heartbeats burst.

Root Cause (code trace)

In heartbeat-runner run() (~line 1022 of the bundled dist):

for (const agent of state.agents.values()) {
  if (isInterval && now < agent.nextDueMs) continue;  // ← only interval checks timing
  // ...
  advanceAgentSchedule(agent, now);  // resets nextDueMs = now + intervalMs
}

isInterval is reason === "interval". All other wake reasons skip the time gate, run immediately, then call advanceAgentSchedule() which pushes nextDueMs forward — effectively "satisfying" the interval from the runner's perspective.

Meanwhile, the interval timer in scheduleNext() uses setTimeout(...).unref() (line 935). On macOS, when the gateway process is idle with no I/O, the OS can suspend the process (App Nap), delaying the timer indefinitely. There's no keepalive or system-level scheduler backing the interval.

Call sites that trigger non-interval heartbeats (all bypass the gate)

FileReasonTrigger
exec-defaultsexec-eventBackground exec notifyOnExit
server-node-eventsexec-eventNode exec completion
server-node-eventsnotifications-eventOS notification change
server (cron)cron:<id>Cron job completion
serverwakeRestart sentinel, wake mode
server (hooks)hook:wake, hook:<id>Hook triggers
runtime-internalbackground-taskTask completion/blocked

Expected Behavior

  1. Non-interval wake reasons should still respect a minimum interval between heartbeat runs (e.g., the configured interval or a floor like 60s), even if they're allowed to preempt the next scheduled run
  2. The interval timer should be resilient to process suspension — either via a system-level scheduler (launchd timer, etc.) or by checking wall-clock drift on any event loop wake and firing overdue heartbeats immediately
  3. advanceAgentSchedule on non-interval runs should not push the next interval-based run further into the future than it already was (i.e., nextDueMs = max(existing nextDueMs, now + intervalMs) rather than unconditionally resetting)

Suggested Fix

In run(), add a minimum-interval check for non-interval reasons:

for (const agent of state.agents.values()) {
  if (isInterval && now < agent.nextDueMs) continue;
  // Add: enforce minimum spacing for non-interval triggers too
  if (!isInterval && typeof agent.lastRunMs === 'number' 
      && (now - agent.lastRunMs) < agent.intervalMs) continue;
  // ...
}

And in scheduleNext(), add drift detection:

const delay = Math.max(0, nextDue - now);
// If we overslept (App Nap, suspension), fire immediately
if (delay === 0 && nextDue < now) {
  requestHeartbeatNow({ reason: "interval", coalesceMs: 0 });
  return;
}

Related

  • #52270 — interval timer re-arm after runner errors (fixed)
  • #40526 — skip wake delivery when session lane busy (related but different)
  • #42119 — heartbeat-only sessions and compaction

Reproduction

  1. Configure openclaw with default heartbeat (30m) on macOS
  2. Stop interacting with the gateway for 2+ hours
  3. Observe activity-log.jsonl (a script that is called when the heartbeat runs) — no heartbeat entries appear during the idle period
  4. Resume sending messages — observe a burst of heartbeats in quick succession
  5. Monitor the reason field in heartbeat events — non-interval reasons dominate

extent analysis

TL;DR

To fix the issue of burst runs and silent gaps in heartbeats, add a minimum-interval check for non-interval reasons and implement drift detection in the interval timer.

Guidance

  1. Enforce minimum interval: Modify the run() function to include a check for non-interval reasons, ensuring a minimum spacing between heartbeat runs.
  2. Implement drift detection: Update the scheduleNext() function to detect when the process has overslept due to App Nap or suspension, and fire the heartbeat immediately in such cases.
  3. Review related issues: Examine related issues (#52270, #40526, #42119) to ensure that the proposed fix does not introduce regressions or conflicts with existing solutions.
  4. Verify the fix: Reproduce the issue using the provided steps and verify that the modified code resolves the problems of burst runs and silent gaps.

Example

The suggested fix provides example code modifications:

// In run()
if (!isInterval && typeof agent.lastRunMs === 'number' 
    && (now - agent.lastRunMs) < agent.intervalMs) continue;

// In scheduleNext()
if (delay === 0 && nextDue < now) {
  requestHeartbeatNow({ reason: "interval", coalesceMs: 0 });
  return;
}

Notes

The fix assumes that the agent.lastRunMs property is updated correctly and that the intervalMs value is set to a reasonable minimum interval (e.g., 60s).

Recommendation

Apply the suggested workaround by modifying the run() and scheduleNext() functions as described, to address the issues of burst runs and silent gaps in heartbeats. This approach ensures that non-interval wake reasons respect a minimum interval and that the interval timer is resilient to process suspension.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING