openclaw: Heartbeat scheduler silently stops dispatching polls after session compaction/recreation

Summary

The internal heartbeat scheduler silently stopped dispatching polls to agent:main:main for ~2h52m despite an active, responsive session and a stable gateway process. Heartbeats resumed only after a config.patch write to agents.defaults.heartbeat.every (same value), suggesting the scheduler was in a stopped state and the config file write re-initialized it.

Environment

  • OpenClaw: gateway running as systemd-style process (PID stable, no restart during incident)
  • Session: agent:main:main (direct/webchat)
  • Heartbeat config: agents.defaults.heartbeat.every: "10m" — always set, unchanged
  • Model: openrouter/openrouter/primary (402 billing exhausted), failover to openrouter/openrouter/owl-alpha

Timeline (UTC)

| Time (UTC) | Event |
|---|---|
| 14:00–18:08 | Heartbeats every 10m, normal |
| 18:03–18:07 | Session compacts 3x due to context pressure |
| 18:08 | Last heartbeat on old session; compaction recovery in progress |
| 18:19 | New session created |
| 18:24–18:44 | Heartbeats resume normally in new session |
| 18:44–21:36 | Silent gap: no heartbeat polls. Session active and processing direct messages. |
| 21:36 | config.patch on agents.defaults.heartbeat.every (no-op value change) |
| 21:44 | Heartbeats resume and continue normally |
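
The gap length quoted in the summary can be checked directly from the timeline. The following is a minimal sketch in Python; the helper name `gap_between` is hypothetical, and the missed-poll count assumes the 10m cadence would simply have continued from the 18:44 poll.

```python
from datetime import datetime, timedelta

FMT = "%H:%M"
INTERVAL = timedelta(minutes=10)  # agents.defaults.heartbeat.every: "10m"


def gap_between(last_poll: str, resume: str) -> timedelta:
    """Duration between two same-day HH:MM timestamps (hypothetical helper)."""
    return datetime.strptime(resume, FMT) - datetime.strptime(last_poll, FMT)


# Last poll before the gap (18:44) to the config.patch write (21:36).
gap = gap_between("18:44", "21:36")
hours, rem = divmod(int(gap.total_seconds()), 3600)
print(f"gap: {hours}h{rem // 60:02d}m")  # gap: 2h52m

# Polls that would have been due inside the gap (18:54, 19:04, ..., 21:34),
# assuming the cadence had continued unchanged from 18:44.
missed = gap // INTERVAL
print(f"missed polls: {missed}")  # missed polls: 17
```

At a 10m interval, roughly 17 consecutive polls were skipped, which is hard to attribute to ordinary timer jitter.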

Key Facts

  1. Gateway process never restarted (PID stable throughout).
  2. Session was fully functional during the gap — processing direct user messages, running tool calls, and producing responses. Only heartbeat polls were affected.
  3. The gap began about 25 minutes into the new session (created 18:19, last poll 18:44, after 3 normal heartbeats on that session), not immediately at session creation.
  4. A config file write (even a no-op merge) re-triggered the scheduler.
  5. A manually created backup cron (10m interval, system event to main session) also did not fire during the gap, suggesting the issue extends beyond the per-session timer.
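
To make fact 4 concrete, here is a small sketch of what a no-op merge patch amounts to. Only the dotted key `agents.defaults.heartbeat.every` and the value `"10m"` come from the report. The nested on-disk shape, the helper names `nest` and `deep_merge`, and the merge semantics are assumptions for illustration, not OpenClaw's actual config.patch behavior.

```python
import copy
import json


def nest(dotted_key: str, value):
    """Expand 'a.b.c' into {'a': {'b': {'c': value}}} (hypothetical helper)."""
    result = value
    for part in reversed(dotted_key.split(".")):
        result = {part: result}
    return result


def deep_merge(base: dict, patch: dict) -> dict:
    """Return a new dict with patch merged recursively into base (assumed semantics)."""
    merged = copy.deepcopy(base)
    for key, val in patch.items():
        if isinstance(val, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], val)
        else:
            merged[key] = copy.deepcopy(val)
    return merged


config = nest("agents.defaults.heartbeat.every", "10m")
patch = nest("agents.defaults.heartbeat.every", "10m")  # same value as before
after = deep_merge(config, patch)

print(json.dumps(patch))
print(after == config)  # True: the effective config is unchanged
```

Because the merged config is identical to the original, the recovery cannot be explained by a changed value. Under these assumptions it has to come from a side effect of the write itself, such as a reload hook.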

Possible Causes

  • Heartbeat timer bound to session object that got GC'd or orphaned during compaction, with the new session inheriting a stale handle.
  • A compaction storm (3 to 4 compactions in ~5 min) may have triggered an edge case in the scheduler's session-registration logic.
  • Config hot-reload on write re-initializes the global scheduler, which would explain the self-healing behavior.
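
The first hypothesis can be illustrated with a toy model. Every class below (`Session`, `NaiveScheduler`) is hypothetical and is not taken from the OpenClaw codebase. The sketch only shows how a timer that captures a session object can fail silently once that object is retired, and it does not try to explain why three polls succeeded on the new session before the gap.

```python
class Session:
    """Hypothetical stand-in for a live agent session."""

    def __init__(self, session_id: str):
        self.session_id = session_id
        self.alive = True
        self.polls = []

    def deliver(self, message: str) -> bool:
        if not self.alive:
            return False  # dropped silently: no exception, no log line
        self.polls.append(message)
        return True


class NaiveScheduler:
    """Hypothetical scheduler whose timer captures the Session object itself."""

    def __init__(self, session: Session):
        self._target = session  # becomes stale once the session is recreated

    def tick(self) -> bool:
        return self._target.deliver("heartbeat poll")


old = Session("agent:main:main")
scheduler = NaiveScheduler(old)
scheduler.tick()  # delivered to the original session

old.alive = False                 # compaction retires the old object
new = Session("agent:main:main")  # same key, different object

delivered = scheduler.tick()
print(delivered, len(new.polls))  # False 0
```

The new session stays fully usable for direct messages in this model, which matches the observation that only heartbeat polls were affected.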

Expected Behavior

Heartbeat scheduling should survive session recreation and compaction without requiring a config write.
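
One way to get this behavior, sketched under the assumption that sessions are addressable by a stable key such as `agent:main:main`, is to have the scheduler hold the key and resolve the live session at fire time. `SessionRegistry` and `KeyedScheduler` are hypothetical names used only for illustration.

```python
class Session:
    """Hypothetical stand-in for a live agent session."""

    def __init__(self, key: str):
        self.key = key
        self.polls = []


class SessionRegistry:
    """Maps a stable session key to whichever session object is current."""

    def __init__(self):
        self._live = {}

    def register(self, session: Session) -> None:
        self._live[session.key] = session  # recreation overwrites the old entry

    def lookup(self, key: str):
        return self._live.get(key)


class KeyedScheduler:
    """Holds only the key, so it never keeps a reference to a retired session."""

    def __init__(self, registry: SessionRegistry, key: str):
        self._registry = registry
        self._key = key
        self.missed = 0  # counted so a gap is observable instead of silent

    def tick(self) -> bool:
        session = self._registry.lookup(self._key)
        if session is None:
            self.missed += 1
            return False
        session.polls.append("heartbeat poll")
        return True


registry = SessionRegistry()
scheduler = KeyedScheduler(registry, "agent:main:main")

first = Session("agent:main:main")
registry.register(first)
scheduler.tick()

second = Session("agent:main:main")  # session recreated after compaction
registry.register(second)
scheduler.tick()

print(len(first.polls), len(second.polls), scheduler.missed)  # 1 1 0
```

The `missed` counter is a deliberate design choice in this sketch: if lookup ever fails, the failure is recorded rather than dropped.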

Severity

Medium. Heartbeats are the primary in-band channel for an autonomous agent to self-report, check calendars, scan email, and surface issues to their human. A silent multi-hour gap with no logged error is a reliability concern.

Logs

No explicit error messages were logged during the gap. The session transcript simply shows no [OpenClaw heartbeat poll] messages arriving between 18:44 and 21:36. The next poll after the gap correlates with the config.patch timestamp.
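
Since nothing was logged, a gap like this is only visible by inspecting poll timestamps after the fact. The sketch below shows one way to flag such gaps and emit a warning. The function `find_gaps`, the tolerance factor, and the poll list are illustrative assumptions: the list is reconstructed from the timeline, with 18:34 and 21:54 inferred from the 10m cadence rather than taken from a real transcript.

```python
import logging
from datetime import datetime, timedelta

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("heartbeat-watchdog")  # hypothetical logger name


def find_gaps(poll_times, interval, tolerance=2.0):
    """Return (prev, next, duration) for consecutive same-day HH:MM polls
    that are more than tolerance * interval apart."""
    parsed = [datetime.strptime(t, "%H:%M") for t in poll_times]
    limit = interval * tolerance
    gaps = []
    for prev, nxt in zip(parsed, parsed[1:]):
        if nxt - prev > limit:
            gaps.append((prev.strftime("%H:%M"), nxt.strftime("%H:%M"), nxt - prev))
    return gaps


# Illustrative poll times reconstructed from the timeline (partly inferred).
polls = ["18:24", "18:34", "18:44", "21:44", "21:54"]
gaps = find_gaps(polls, timedelta(minutes=10))

for start, end, duration in gaps:
    log.warning("no heartbeat poll between %s and %s (%s)", start, end, duration)
```

Run against these timestamps, the check reports a single gap from 18:44 to 21:44, which is the kind of signal that was missing from the logs during the incident.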
