openclaw - 💡(How to fix) Fix Heartbeat-spawned claude live session captures channel user inbounds, causing context-amnesiac fork replies [1 comments, 2 participants]

danielcrick · 2026-05-19T21:48:44Z


openclaw/openclaw#84332 • Fetched 2026-05-20 03:41:24

Author: danielcrick

Participants: clawsweeper[bot], danielcrick

Timeline: labeled ×8, cross-referenced ×2, closed ×1, commented ×1

OpenClaw's heartbeat mechanism (trigger=heartbeat, fires every ~15 minutes on the gateway host) spawns a second long-lived claude live session alongside an active channel session, taking the gateway's activeSessions count from 1 to 2. When this second session is alive (which is most of the time after the first heartbeat post-startup), user inbounds on the active WhatsApp / iMessage / Signal channel can route to either of the two live sessions on a request-by-request basis.

The heartbeat-spawned session boots with a clean, freshly-loaded tool set and no conversation history from the active channel thread. Inbounds that land on it therefore reply as if the entire prior conversation never happened — typically with patterns like "I don't have file access this turn", "only messaging tools available", "I'm not sure what you mean", etc.

User-facing: "oh my fucking god you have lost context again there's been another fork in the conversation" (Dan, 2026-05-19 19:01 BST, and a similar message five+ times subsequently across a single evening).

Filed by: Daniel Crick (danielcrick), 2026-05-19, drafted 19:35 BST, updated 22:50 BST with root-cause analysis
Affected version: OpenClaw 2026.5.3-1 (2eae30e), post-downgrade per #83491
Severity: High. User-visible conversation forks, lost context, apparent regressions in production WhatsApp threads. Reproduced four+ times in a single evening, including after a clean Mac restart with agent:main:main.cliSessionBindings cleared.
Status: Diagnosed. No local fix available. Workaround is to accept transient fork replies; root fix needs to happen in OpenClaw's heartbeat / live-session-router code.

Bug report — Heartbeat trigger spawns a parallel `claude live session` that captures user inbounds, causing duplicated / conflicting replies on a busy channel session

What we initially thought (incorrect)

Earlier hypothesis: cron-triggered agent:main:main runs primed a stale CLI binding (cliSessionBindings.claude-cli.sessionId) which the gateway then used as a routing fallback when the WhatsApp DM session was mid-turn. We:

1. Cleared agent:main:main.cliSessionBindings, cliSessionIds, claudeCliSessionId, sessionFile
2. Restarted the Mac (launchd-respawned gateway started clean)
3. Confirmed agent:main:main.cliSessionBindings = {} survived the reboot

The fork bug recurred within minutes of the restart, on a clean sessions.json. The stale-binding hypothesis was wrong.

We then cleared CLI bindings on 17 additional sessions (cron, subagent, test-tools — all sessions in sessions.json other than the active WhatsApp DM). Forks continued. That ruled out stale-binding fallback as the cause.

Actual root cause (confirmed via gateway log analysis)

Gateway log ~/.openclaw/logs/gateway.log shows the pattern unambiguously:

2026-05-19T21:00:01.456+01:00 [agent/cli-backend] claude live session start: provider=claude-cli model=claude-opus-4-7 activeSessions=1
2026-05-19T21:08:54.384+01:00 [agent/cli-backend] claude live session start: provider=claude-cli model=claude-opus-4-7 activeSessions=1
2026-05-19T21:11:56.755+01:00 [agent/cli-backend] claude live session reuse:  provider=claude-cli model=claude-opus-4-7
...
2026-05-19T21:30:01.471+01:00 [agent/cli-backend] claude live session start: provider=claude-cli model=claude-opus-4-7 activeSessions=2  ← HEARTBEAT
2026-05-19T21:37:45.204+01:00 [agent/cli-backend] claude live session reuse:  provider=claude-cli model=claude-opus-4-7
2026-05-19T21:45:25.378+01:00 [agent/cli-backend] claude live session start: provider=claude-cli model=claude-opus-4-7 activeSessions=2  ← HEARTBEAT
...
2026-05-19T22:15:25.396+01:00 [agent/cli-backend] claude live session start: provider=claude-cli model=claude-opus-4-7 activeSessions=2  ← HEARTBEAT

The claude live session start events with activeSessions=2 correlate exactly with trigger=heartbeat events that fire every ~15 minutes (HH:00:01, HH:15:25, HH:30:01, HH:45:25). Each one creates a parallel live claude-cli session that the gateway treats as eligible for inbound routing.
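To check a gateway log for the same pattern, a small scan along these lines may help. This is a minimal sketch that assumes the log lines have exactly the shape quoted above; the regular expression and function name are illustrative, not part of OpenClaw.

```python
import re
from datetime import datetime

# Pattern for the "claude live session start" lines quoted in this report.
START_RE = re.compile(
    r"^(?P<ts>\S+) \[agent/cli-backend\] claude live session start:"
    r".*\bactiveSessions=(?P<count>\d+)"
)


def parallel_session_starts(lines):
    """Return (timestamp, activeSessions) for each start logged with 2+ live sessions."""
    found = []
    for line in lines:
        match = START_RE.match(line)
        if match and int(match.group("count")) >= 2:
            found.append(
                (datetime.fromisoformat(match.group("ts")), int(match.group("count")))
            )
    return found


# Lines copied from the gateway log excerpt above (annotations removed).
SAMPLE = [
    "2026-05-19T21:08:54.384+01:00 [agent/cli-backend] claude live session start: provider=claude-cli model=claude-opus-4-7 activeSessions=1",
    "2026-05-19T21:11:56.755+01:00 [agent/cli-backend] claude live session reuse:  provider=claude-cli model=claude-opus-4-7",
    "2026-05-19T21:30:01.471+01:00 [agent/cli-backend] claude live session start: provider=claude-cli model=claude-opus-4-7 activeSessions=2",
    "2026-05-19T21:45:25.378+01:00 [agent/cli-backend] claude live session start: provider=claude-cli model=claude-opus-4-7 activeSessions=2",
]

hits = parallel_session_starts(SAMPLE)
for ts, count in hits:
    print(ts.strftime("%H:%M:%S"), f"activeSessions={count}")
```

Passing an open file handle for ~/.openclaw/logs/gateway.log (with the path expanded) in place of SAMPLE should work the same way, since the function only iterates over lines.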

Subsequent user inbounds (trigger=user) log as reuse=reusable, but the gateway sometimes "reuses" the heartbeat-spawned session rather than the channel-bound one. Outcome:

1. The user message is handed to a fresh heartbeat session that has none of the channel thread's conversation history.
2. That session generates a reply based only on the system prompt, MEMORY.md (whatever it had at boot), and the single user message.
3. The reply lands on the user's WhatsApp / channel thread, looking like a forked, amnesiac Paige.

The earliest [heartbeat] started log entry is 2026-05-04T14:33:11, so the heartbeat itself has been running for two weeks. Increased user-visibility of the fork pattern in the last 24 hours appears to correlate with heavier active channel use (more inbounds hitting the gateway during heartbeat-active windows).

Reproducer

Difficult to reproduce intentionally without active channel traffic. With active traffic, it reproduces readily:

1. Start a long-running active conversation on a channel (WhatsApp DM is the easiest; any channel that holds a live claude-cli session will work).
2. Send tool-heavy messages that take the channel session through tool calls. This keeps the single live session (activeSessions=1) busy on the channel binding.
3. Wait until a 15-minute heartbeat boundary fires (HH:00, HH:15, HH:30, HH:45).
4. Within 30-60 seconds after the heartbeat, send a follow-up user inbound on the channel.
5. Some percentage of these inbounds will route to the heartbeat-spawned session and reply with stale or empty context.

Today's session at +447736454506 caught this four+ times between 19:01 BST and 22:40 BST — empirically the routing-to-heartbeat-session decision appears to happen every other heartbeat cycle or so.
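When comparing a reproduction attempt against the log, it can help to check whether each inbound fell inside the 30-60 second post-heartbeat window described in the reproducer. The helper below is a sketch under that assumption. The function name and the two inbound timestamps are hypothetical; the session-start times are the ones from the gateway log excerpt.

```python
from datetime import datetime, timedelta


def in_heartbeat_window(inbound, session_starts, window=timedelta(seconds=60)):
    """True if `inbound` arrived within `window` after any heartbeat-time session start."""
    return any(timedelta(0) <= inbound - start <= window for start in session_starts)


# Heartbeat-time session starts taken from the gateway log excerpt.
starts = [
    datetime.fromisoformat("2026-05-19T21:30:01.471+01:00"),
    datetime.fromisoformat("2026-05-19T21:45:25.378+01:00"),
]

# Hypothetical inbound times, for illustration only.
soon_after = datetime.fromisoformat("2026-05-19T21:45:50+01:00")
mid_cycle = datetime.fromisoformat("2026-05-19T21:40:00+01:00")

print(in_heartbeat_window(soon_after, starts))
print(in_heartbeat_window(mid_cycle, starts))
```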

Live process snapshot evidence

PID 59742  PPID 18019  age 31m50s  claude --resume e0a54390-…  ← WhatsApp DM session
PID 60488  PPID 18019  age  1m39s  claude --resume c3f55782-…  ← heartbeat-spawned fork

PID 60488 was generating fork replies on Dan's WhatsApp DM, on a session ID that doesn't appear in any expected channel binding. Manually running kill -TERM 60488 stopped the fork replies temporarily, but later heartbeats re-spawned parallel sessions.
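A snapshot like the one above can be gathered and checked with a short script. The sketch below assumes the live sessions appear in the process table as `claude --resume <id>`, as in the snapshot. The parsing helpers are illustrative, and the third sample row is made up to show that unrelated processes are skipped.

```python
import subprocess


def parse_claude_sessions(ps_output):
    """Parse `pid ppid etime args` rows and keep only `claude --resume` processes."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        pid, ppid, etime, args = parts
        argv = args.split()
        if argv[0].endswith("claude") and "--resume" in argv:
            idx = argv.index("--resume") + 1
            session = argv[idx] if idx < len(argv) else None
            rows.append(
                {"pid": int(pid), "ppid": int(ppid), "etime": etime, "session": session}
            )
    return rows


def parents_with_multiple_sessions(rows):
    """Group rows by parent PID and keep parents that own more than one session."""
    by_parent = {}
    for row in rows:
        by_parent.setdefault(row["ppid"], []).append(row)
    return {ppid: group for ppid, group in by_parent.items() if len(group) > 1}


def live_snapshot():
    """Collect the same columns from the running system (not called in this sketch)."""
    cmd = ["ps", "-A", "-o", "pid=", "-o", "ppid=", "-o", "etime=", "-o", "args="]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


# First two rows mirror the snapshot in this report; the third is a made-up bystander.
SAMPLE_PS = """\
59742 18019    31:50 claude --resume e0a54390-…
60488 18019    01:39 claude --resume c3f55782-…
  501     1 02:00:00 /usr/sbin/cfprefsd agent
"""

sessions = parse_claude_sessions(SAMPLE_PS)
suspects = parents_with_multiple_sessions(sessions)
for ppid, group in suspects.items():
    print(ppid, [row["pid"] for row in group])
```

Two `claude --resume` children under one gateway parent is only a hint, not proof of a fork; the session IDs still need to be compared against the expected channel binding.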

Suggested fixes

In priority order:

1. Bind heartbeat-spawned sessions to a dedicated session-key that is NEVER eligible for channel inbound routing. An agent:main:heartbeat key, or similar, that the gateway explicitly excludes from reuse=reusable matching when dispatching user inbounds from channel sessions. The current "any live session can be reused" routing is too permissive.
2. Strict channel-session affinity for user inbounds. When trigger=user and the inbound source is a channel binding (whatsapp:direct, imessage:direct, etc.), the gateway should ONLY consider live sessions whose sessionKey matches the channel binding. If the channel session is mid-turn, queue the inbound on that session's input stream; do not fall back to ANY other live session, heartbeat-spawned or otherwise.
3. Diagnostic logging at the routing decision point. Emit a structured log line at the moment the gateway decides which live session to dispatch a user inbound to: source channel key, target sessionKey, reason, alternative options considered. Even without the behavioural fix, this would let operators detect mis-routings immediately rather than catching them via user complaint.
4. OPENCLAW_DISABLE_HEARTBEAT=1 env flag: an explicit opt-out for users who would rather lose heartbeat functionality than risk fork replies in production channel threads. Useful as a temporary mitigation while the proper fix is rolled out.
5. Hard-stop on duplicate concurrent processes for the same effective outbound recipient: if two live sessions exist whose deliveries would both terminate at the same WhatsApp / iMessage JID, the gateway should kill the newer one and log.
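The first three suggestions can be sketched together as one routing function. This is not OpenClaw's actual router, whose logic is unknown to the reporter. `LiveSession`, `route_user_inbound`, the log field names, and the reason strings are all hypothetical, and only the agent:main:heartbeat and whatsapp:direct keys come from this report.

```python
import json
from dataclasses import dataclass, field

HEARTBEAT_KEY = "agent:main:heartbeat"  # dedicated key name suggested in this report


@dataclass
class LiveSession:
    session_key: str
    busy: bool = False
    pending: list = field(default_factory=list)


def route_user_inbound(channel_key, sessions, message):
    """Choose the live session for a trigger=user inbound using strict channel affinity."""
    # Only a session bound to this channel is eligible; the heartbeat check is
    # redundant with the equality test but kept as an explicit guard.
    eligible = [
        s for s in sessions
        if s.session_key != HEARTBEAT_KEY and s.session_key == channel_key
    ]
    target = eligible[0] if eligible else None
    if target is None:
        reason = "no-live-channel-session"
    elif target.busy:
        target.pending.append(message)  # queue rather than fall back elsewhere
        reason = "queued-on-busy-channel-session"
    else:
        reason = "channel-affinity-match"
    # Structured record of the routing decision and the sessions passed over.
    log_line = json.dumps({
        "event": "inbound-route",
        "source": channel_key,
        "target": target.session_key if target else None,
        "reason": reason,
        "rejected": [s.session_key for s in sessions if s is not target],
    })
    return target, log_line


live = [LiveSession("whatsapp:direct", busy=True), LiveSession(HEARTBEAT_KEY)]
target, log_line = route_user_inbound("whatsapp:direct", live, "follow-up message")
print(log_line)
```

Returning the log line alongside the target keeps the decision and its audit record in one place, so a mis-route would show up in the log even if the affinity rule were later loosened.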

Workaround in place locally

- Accept that fork replies will appear at ~15-minute boundaries and ignore them as transient.
- The legitimate channel session typically replies a few seconds later with full context.
- Critical edits / publishes should be confirmed by checking the most-recently-updated reply.
- sessions.json cleanups did NOT resolve the issue (confirmed empirically); this workaround is the only remaining mitigation until an upstream fix lands.

Open questions for OpenClaw maintainers

1. Is heartbeat-spawned claude live session eligibility for channel inbound routing intentional or a regression?
2. What is the routing decision logic in agent/cli-backend for trigger=user inbounds when activeSessions > 1?
3. Would an OPENCLAW_DISABLE_HEARTBEAT env flag be acceptable as a short-term mitigation while the routing fix is being prepared?

Reference

- Previous bug filed same week: #83491 (WhatsApp runtime regression on 2026.5.12), fixed in cce0049 (PR #83647), awaiting tagged release
- Memory rule pinning the version: feedback_openclaw_version_lock.md (stay on 2026.5.3-1)
- Affected workflow: live Paige WhatsApp DM, gateway ~/.openclaw/logs/gateway.log (2026-05-19 evidence collected end-to-end)
