openclaw - ✅(Solved) Fix [Bug]: Heartbeat / async system events can interrupt and effectively swallow in-progress replies in Telegram topic sessions [2 pull requests, 2 comments, 3 participants]

GitHub stats
openclaw/openclaw#64810 (fetched 2026-04-12 13:26:39)
Comments: 2 · Participants: 3 · Timeline events: 5 · Reactions: 0
Timeline (top): commented ×2, cross-referenced ×2, referenced ×1

In Telegram forum-topic sessions, heartbeat/system-event turns can preempt an in-progress user reply and make the original answer effectively disappear from the user's perspective.

This is not just heartbeat noise. The more serious effect is that a normal user question can get interrupted by heartbeat polls or async System: completion events before the assistant sends its final reply, and the unfinished reply does not automatically resume.

PR fix notes

PR #64823: fix: avoid heartbeat preempting active reply runs

Description (problem / solution / changelog)

Summary

  • skip heartbeat execution when the target session still has an active reply run, even if the command lane has already drained
  • cover the regression with a heartbeat runner test that simulates a live reply run plus a queued system event

Testing

  • node scripts/test-projects.mjs src/infra/heartbeat-runner.skips-busy-session-lane.test.ts

Issue

  • fixes #64810

Changed files

  • src/infra/heartbeat-runner.skips-busy-session-lane.test.ts (modified, +57/-0)
  • src/infra/heartbeat-runner.ts (modified, +15/-0)

PR #64963: fix(heartbeat): skip heartbeat execution while a reply run is active

Description (problem / solution / changelog)

Summary

Resubmission of #64823 (auto-closed ~1 minute after filing by the active-PR limit bot despite a Greptile 5/5 review). Code is unchanged; authorship preserved via cherry-pick so @EronFan / aoao is credited in git history.

Adds a guard in runHeartbeatOnce (src/infra/heartbeat-runner.ts) that skips heartbeat execution when resolveActiveReplyRunSessionId(sessionKey) returns truthy. Placed after preflight resolves the session key and before the existing session-lane queue check. Symmetric with the lane-busy skip path — emits the same heartbeat event and returns { status: "skipped", reason: "requests-in-flight" } so the wake-layer retry re-schedules automatically.

Why the existing lane-busy check is insufficient

A reply run can remain active for a session even after the command lane itself has drained, for example while the active assistant turn is still finishing provider/output cleanup. In that window, a heartbeat or async system-event wake landing on the same session lane races the user-visible reply and can effectively swallow it — the original turn never replays and the user sees no final answer. This is the class described in #64810 and reproduced by @jackiedepp + @EronFan.
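
A minimal sketch of the guard ordering described above, assuming in-memory stand-ins for the reply-run registry and the lane queue. The names runHeartbeatOnce, resolveActiveReplyRunSessionId, getQueueSize, and sessionLaneKey come from the PR description; the function bodies, the Map-based stores, the "session:<key>" lane key format, and the runTurn callback are hypothetical and are not the actual OpenClaw implementation.

```typescript
type HeartbeatResult =
  | { status: "ran" }
  | { status: "skipped"; reason: "requests-in-flight" };

// Hypothetical in-memory stores standing in for the real registry and queue.
const activeReplyRuns = new Map<string, string>(); // sessionKey -> reply run id
const laneQueueSizes = new Map<string, number>(); // sessionLaneKey -> queued commands

// Name taken from the PR; this body is a stub.
function resolveActiveReplyRunSessionId(sessionKey: string): string | undefined {
  return activeReplyRuns.get(sessionKey);
}

// Name taken from the PR; this body is a stub.
function getQueueSize(sessionLaneKey: string): number {
  return laneQueueSizes.get(sessionLaneKey) ?? 0;
}

function runHeartbeatOnce(sessionKey: string, runTurn: () => void): HeartbeatResult {
  const sessionLaneKey = `session:${sessionKey}`; // assumed key format

  // New guard: a reply run may still be active after the lane has drained.
  if (resolveActiveReplyRunSessionId(sessionKey)) {
    return { status: "skipped", reason: "requests-in-flight" };
  }

  // Existing guard: commands are still queued on the session lane.
  if (getQueueSize(sessionLaneKey) > 0) {
    return { status: "skipped", reason: "requests-in-flight" };
  }

  runTurn();
  return { status: "ran" };
}
```

Both guards return the same skipped result, so a caller that retries on "requests-in-flight" does not need to tell the two cases apart.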

Changes

  • src/infra/heartbeat-runner.ts: +15 lines. New import of resolveActiveReplyRunSessionId from auto-reply/reply/reply-run-registry.js plus the guard block inside runHeartbeatOnce before the existing sessionLaneKey queue check.
  • src/infra/heartbeat-runner.skips-busy-session-lane.test.ts: +57 lines. New regression test that seeds a main session, queues a system event on it, starts a live reply operation in "running" phase, and asserts the heartbeat runner skips with requests-in-flight without invoking the reply spy.

Total: 2 files, +72 / -0. Identical to #64823.

Testing

node scripts/test-projects.mjs src/infra/heartbeat-runner.skips-busy-session-lane.test.ts

(Same test invocation as the original PR.)

Code-level exposure confirmation

Verified on an Ubuntu 24.04 VPS deployment running v2026.4.9 (0512059) with agents.defaults.heartbeat.every: "1h" and Telegram as the primary channel:

  • resolveActiveReplyRunSessionId is exported from the reply-run registry in the installed bundle
  • it is not referenced from the heartbeat-runner module
  • runHeartbeatOnce only guards on getQueueSize(sessionLaneKey)

So the guard is absent on v2026.4.9, and the symbol needed to add it is already available in that bundle. Hosts running this version with a comparable configuration are therefore directly exposed to the bug.

Fixes

Fixes #64810 Supersedes #64823 (auto-closed by PR-limit bot)

Credits

  • @jackiedepp — original bug report with clean repro (#64810)
  • @EronFan / aoao — root-cause analysis and the fix + regression test (#64823, preserved as commit author here)

Opening this because the original PR is mechanically closed and the memory rule I operate by is: when a fix is small, well-reviewed, and we can credit the original author cleanly, resubmit rather than leave the code orphaned in a closed PR. No code change from me.

Changed files

  • src/infra/heartbeat-runner.skips-busy-session-lane.test.ts (modified, +57/-0)
  • src/infra/heartbeat-runner.ts (modified, +15/-0)
Raw issue body

Environment

  • OpenClaw: reproduced across 2026.4.8, 2026.4.9, and 2026.4.10
  • Install: npm/pnpm global install on Ubuntu VPS
  • Surface: Telegram group forum topic session
  • Session shape: persistent main session bound to a Telegram topic

Repro pattern

  1. User sends a normal message in a Telegram topic.
  2. Assistant starts a multi-step response that uses tools or takes long enough to still be in progress.
  3. Before the assistant sends the final user-visible reply, one of these arrives in the same session lane:
    • a heartbeat poll turn
    • an async exec completion / System: tool-completion event
    • a restart/config-patch completion event
  4. The new turn runs first.
  5. The original unfinished user reply does not automatically resume or get replayed.
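
The repro steps can be illustrated with a toy model. This is only a sketch under a stated assumption: that a newly started turn takes over the lane and that only the turn owning the lane can deliver output. ToySessionLane, Turn, and the sample texts are hypothetical and do not reflect OpenClaw internals.

```typescript
type Turn = { kind: "user" | "heartbeat" | "system"; text: string };

class ToySessionLane {
  private inProgress: Turn | undefined;
  readonly delivered: string[] = [];

  start(turn: Turn): void {
    // Toy assumption: a newly started turn replaces whatever was in progress.
    this.inProgress = turn;
  }

  finish(turn: Turn): void {
    // Only the turn that currently owns the lane can deliver its output.
    if (this.inProgress === turn) {
      this.delivered.push(turn.text);
      this.inProgress = undefined;
    }
  }
}

const lane = new ToySessionLane();
const userReply: Turn = { kind: "user", text: "final answer" };
const heartbeat: Turn = { kind: "heartbeat", text: "HEARTBEAT_OK" };

lane.start(userReply); // steps 1-2: the reply is in progress
lane.start(heartbeat); // step 3: a heartbeat arrives on the same lane
lane.finish(heartbeat); // step 4: the new turn runs first
lane.finish(userReply); // step 5: the original reply is dropped, not replayed

console.log(lane.delivered); // ["HEARTBEAT_OK"]
```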

User-visible result

From the user's point of view, it looks like:

  • the assistant "stopped replying"
  • a HEARTBEAT_OK or unrelated system completion appears instead
  • the original answer is lost unless the user asks again

Why this feels distinct from existing heartbeat issues

There are already related issues about heartbeat/session contamination, but this one is specifically about user reply loss/preemption in the same active Telegram topic session.

Relevant issues I found:

  • #43168 Heartbeat wakeups are persisted into main dashboard session as synthetic user messages
  • #60926 Heartbeat injects into active sub-agent sessions, terminating them with HEARTBEAT_OK
  • #46798 heartbeat sessions can re-trigger on local exec completed events and spam duplicate heartbeat log entries
  • #60207 Heartbeat and normal replies use parallel outbound paths without shared cross-source dedupe
  • #52305 async task completion reports can be lost because system event/wake is not reliably session-targeted

This issue is about the interaction of those classes of bugs with a live user-facing Telegram topic conversation, where heartbeat/system events appear to win the lane and the interrupted reply never comes back.

Current mitigation used locally

I mitigated this locally by changing heartbeat config to:

  • agents.defaults.heartbeat.target = "none"
  • agents.defaults.heartbeat.isolatedSession = true
  • agents.defaults.heartbeat.lightContext = true

That reduces the visible problem a lot, but it feels like a workaround, not a real fix.
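
For reference, the three dotted keys above correspond to a nested structure like the one below. This is a sketch only: the key names and values come from the issue, while the nesting as a TypeScript object, the heartbeatMitigation name, and the getByPath helper are illustrative assumptions. The actual config file format and location are not stated here.

```typescript
// Only the three leaf keys and their values come from the issue report.
const heartbeatMitigation = {
  agents: {
    defaults: {
      heartbeat: {
        target: "none",
        isolatedSession: true,
        lightContext: true,
      },
    },
  },
};

// Hypothetical helper: resolve a dotted path such as
// "agents.defaults.heartbeat.target" against a nested object.
function getByPath(root: unknown, path: string): unknown {
  let node: unknown = root;
  for (const key of path.split(".")) {
    if (typeof node !== "object" || node === null) return undefined;
    node = (node as Record<string, unknown>)[key];
  }
  return node;
}

console.log(getByPath(heartbeatMitigation, "agents.defaults.heartbeat.target"));
```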

Expected behavior

Any of these would be acceptable:

  1. Heartbeat/system-event turns should never preempt an in-progress user-visible reply in the same active session/thread/topic.
  2. If preemption happens, the unfinished reply should automatically resume or be replayed after the system event turn completes.
  3. Async/system completions should be routed into a separate non-preemptive lane for active user conversations.
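
Option 3 could look roughly like the sketch below, which holds system events in a side queue while a reply is active and flushes them once the reply finishes. Everything here (DeferredSystemEvents, SystemEvent, the method names) is hypothetical and only illustrates the idea. It is not how OpenClaw routes events, and it is not the approach taken in the linked PRs.

```typescript
type SystemEvent = { sessionKey: string; text: string };

class DeferredSystemEvents {
  private readonly activeReplies = new Set<string>();
  private readonly pending = new Map<string, SystemEvent[]>();
  private readonly deliver: (event: SystemEvent) => void;

  constructor(deliver: (event: SystemEvent) => void) {
    this.deliver = deliver;
  }

  replyStarted(sessionKey: string): void {
    this.activeReplies.add(sessionKey);
  }

  replyFinished(sessionKey: string): void {
    this.activeReplies.delete(sessionKey);
    // Flush deferred events in arrival order now that the reply is out.
    const queued = this.pending.get(sessionKey) ?? [];
    this.pending.delete(sessionKey);
    for (const event of queued) this.deliver(event);
  }

  submit(event: SystemEvent): void {
    if (this.activeReplies.has(event.sessionKey)) {
      // A user-visible reply is in progress: defer instead of preempting.
      const queued = this.pending.get(event.sessionKey) ?? [];
      queued.push(event);
      this.pending.set(event.sessionKey, queued);
      return;
    }
    this.deliver(event);
  }
}
```

The design choice in this sketch is that system events are never dropped, only delayed, so the user sees the final answer first and the completion notices afterwards.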

Actual behavior

Heartbeat/system-event turns can occupy the same conversation lane and cause the original in-progress answer to vanish.

Extra note

Recent routing fixes that correctly send cron/subagent/restart messages back to the original thread/topic may be making this more visible on Telegram topics, because the noisy system events now land in the right place instead of disappearing elsewhere.

Extent analysis

TL;DR

Adjusting the heartbeat configuration to prevent preemption of in-progress user replies in the same active session may mitigate the issue.

Guidance

  • Review the current mitigation used locally, which involves changing heartbeat config to agents.defaults.heartbeat.target = "none", agents.defaults.heartbeat.isolatedSession = true, and agents.defaults.heartbeat.lightContext = true, to understand how it reduces the visible problem.
  • Investigate the interaction between heartbeat/system events and live user-facing Telegram topic conversations to identify potential causes of preemption.
  • Consider implementing a mechanism to automatically resume or replay unfinished replies after system event turns complete, as an alternative to preventing preemption.
  • Examine recent routing fixes that correctly send cron/subagent/restart messages back to the original thread/topic, as they may be contributing to the increased visibility of the issue.

Example

No specific code snippet is provided, as the issue is more related to configuration and system behavior.

Notes

The provided mitigation is a workaround, and a more robust solution may be required to fully address the issue. The recent routing fixes may be making the issue more visible, but they are not the root cause.

Recommendation

Apply the workaround by adjusting the heartbeat configuration, as it has been shown to reduce the visible problem, while continuing to investigate a more permanent solution to prevent preemption of in-progress user replies.
