openclaw - Inbound routing and restart-sentinel resume reattach to status:done sessions [1 pull request]



Summary

After /new is issued, the session is marked status: "done" with endedAt set, but the entry remains in sessions.json under the original routing key. If a gateway restart occurs while a continuationMessage is queued — or any subsequent inbound arrives — the same done entry is reused instead of a new session being spawned. The restart-sentinel resume path also writes to the done session directly, refreshing updatedAt and effectively "resurrecting" it. Downstream this triggers preflight-compaction failures and channel-visible "Something went wrong" errors that persist until the session entry is manually removed.

Evidence (Telegram group, topic 17)

  • commands.log: /new issued at 2026-05-25T16:19:05Z against sessionKey=agent:openclaw-admin:telegram:group:<id>:topic:17.
  • sessions.json snapshot for same key: status: "done", endedAt: 1779726659123 (13:30:59 SP) — set 11 min after /new.
  • Gateway restart at ~13:30:42 SP between the two events.
  • Subsequent inbound (13:47:22 SP) was still routed to the same done sessionId, then failed compaction with "Preflight compaction required but failed: no real conversation messages".
  • Snapshot showed no cliSessionId and no continuationMessage field at read time (sentinel had already been consumed and cleared).

Root cause (three defects)

1. evaluateSessionFreshness ignores terminal status

evaluateSessionFreshness is reached via resolveSession at dist/session-DUinUmLM.js:188. The function itself (dist/reset-B0OJOtNI.js:35-48) consults only updatedAt, sessionStartedAt, and lastInteractionAt, so a done/failed/killed/timeout entry whose timestamps fall inside the daily/idle window is treated as fresh.

Note: isTerminalSessionStatus exists in dist/session-utils-CRKr-5AU.js:170 returning true for done|failed|killed|timeout — but it is not consulted by routing.
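
A minimal sketch of the gap described above, using the entry fields named in this report (status, endedAt, updatedAt). The functions isFreshByTimestamps and isFreshAndOpen are hypothetical stand-ins, not the bundle's actual code, and the one-hour idle window is an illustrative value.

```js
// Hypothetical stand-ins; the real functions live in the dist/ bundles cited above.
const TERMINAL_STATUSES = new Set(["done", "failed", "killed", "timeout"]);

function isTerminalSessionStatus(status) {
  return TERMINAL_STATUSES.has(status);
}

// Timestamp-only check, mirroring the behaviour this report describes.
function isFreshByTimestamps(entry, now, idleWindowMs) {
  return now - entry.updatedAt < idleWindowMs;
}

// Terminal-aware variant: a closed session is never fresh, whatever its timestamps.
function isFreshAndOpen(entry, now, idleWindowMs) {
  if (isTerminalSessionStatus(entry.status) || typeof entry.endedAt === "number") {
    return false;
  }
  return isFreshByTimestamps(entry, now, idleWindowMs);
}

// A done entry probed roughly 16 minutes after endedAt (illustrative "now").
const now = 1779727642000;
const idleWindowMs = 60 * 60 * 1000;
const doneEntry = { status: "done", endedAt: 1779726659123, updatedAt: 1779726659123 };

console.log(isFreshByTimestamps(doneEntry, now, idleWindowMs)); // true
console.log(isFreshAndOpen(doneEntry, now, idleWindowMs)); // false
```

The first check reproduces the reported symptom: the closed entry still counts as fresh. The second shows the effect of consulting status and endedAt before the timestamps.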

2. Restart-sentinel startup task re-binds to a done session without a status guard

loadRestartSentinelStartupTask in dist/server-restart-sentinel-zgIzZ1w7.js:582-720:

```js
const { cfg, entry, canonicalKey } = loadSessionEntry(sessionKey);
// …
continuationQueueId = await enqueueSessionDelivery(
  buildQueuedRestartContinuation({
    sessionKey: canonicalKey,
    continuation: payload.continuation,
    // …
  })
);
```

There is no check that entry.status !== 'done' or entry.endedAt == null before enqueueing the continuation. This both delivers an agentTurn continuation against the closed session and bumps updatedAt, which then defeats freshness checks for later inbounds.
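
The missing guard could look roughly like the sketch below. planRestartContinuation and enqueueIfOpen are hypothetical names introduced for illustration, and the enqueue callback stands in for the real delivery call. The terminal statuses are the four this report lists for isTerminalSessionStatus.

```js
// Hypothetical guard; not the bundle's actual code.
const TERMINAL = ["done", "failed", "killed", "timeout"];

function planRestartContinuation(entry) {
  if (!entry) return { action: "skip", reason: "no-entry" };
  if (TERMINAL.includes(entry.status) || typeof entry.endedAt === "number") {
    return { action: "skip", reason: "terminal-session" };
  }
  return { action: "resume" };
}

// `enqueue` stands in for the real delivery call; it only runs for open sessions,
// so a closed session's updatedAt is never touched.
function enqueueIfOpen(entry, enqueue) {
  const plan = planRestartContinuation(entry);
  return plan.action === "resume" ? enqueue() : null;
}

console.log(planRestartContinuation({ status: "done", endedAt: 1779726659123 }));
console.log(enqueueIfOpen({ status: "running" }, () => "queue-id-1"));
```

Returning a plan object rather than a boolean leaves room for the alternatives discussed in this report, such as routing the continuation to a fresh session or downgrading it to a system notice.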

3. /new only flips status; the routing key is not unlinked

See e.g. dist/get-reply-DOTqK3jN.js:2642-2644, 3338, 4516 and the this.status = "done" site in dist/server-methods-90LGtoqF.js:16227. The key continues to satisfy resolveSessionKey(scope, ctx, mainKey, storeAgentId) in dist/session-DUinUmLM.js:130.
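
One way to unlink the key is sketched below. It assumes the store is a plain object keyed by routing key, which is a guess at the sessions.json layout and not confirmed by this report. closeAndRotate and the ":archived:" key suffix are hypothetical.

```js
// Assumed store shape: { [routingKey]: { sessionId, status, endedAt, ... } }.
// Returns a new store in which the closed entry lives under an archive key,
// so a lookup by the original routing key misses.
function closeAndRotate(store, routingKey, endedAt) {
  const entry = store[routingKey];
  if (!entry) return store;
  const archivedKey = `${routingKey}:archived:${entry.sessionId}`;
  const { [routingKey]: _closed, ...rest } = store;
  return { ...rest, [archivedKey]: { ...entry, status: "done", endedAt } };
}

const before = {
  "telegram:group:example:topic:17": { sessionId: "s-1", status: "running" },
};
const after = closeAndRotate(before, "telegram:group:example:topic:17", 1779726659123);

console.log(Object.keys(after));
console.log(after["telegram:group:example:topic:17"]); // undefined
```

Because the original key no longer resolves, the next inbound would have to create a new entry instead of reusing the closed one.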

Reproduce

  1. Send /new to a Telegram topic session.
  2. While endedAt is set on that entry, trigger (or wait for) a gateway restart whose sentinel payload includes the same sessionKey with a continuation.
  3. After restart, observe the done entry's updatedAt advance and a continuation delivered against it.
  4. Send any further inbound to that topic; it reuses the same sessionId.
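
To check step 3 without reading the snapshot by hand, a small scan can flag entries whose updatedAt moved past endedAt. This is a sketch only: it assumes the parsed sessions.json is an object keyed by routing key, and findResurrectedKeys is a hypothetical helper.

```js
// Flags entries that were closed (endedAt set) but written to afterwards.
function findResurrectedKeys(store) {
  return Object.entries(store)
    .filter(([, entry]) => typeof entry.endedAt === "number" && entry.updatedAt > entry.endedAt)
    .map(([key]) => key);
}

// Illustrative snapshot; keys and values are made up for the example.
const snapshot = {
  "telegram:group:example:topic:17": {
    status: "done",
    endedAt: 1779726659123,
    updatedAt: 1779727642000,
  },
  "telegram:group:example:topic:18": { status: "running", updatedAt: 1779727642000 },
};

console.log(findResurrectedKeys(snapshot));
```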

Suggested fix

  • In resolveSession (dist/session-DUinUmLM.js), treat any entry where isTerminalSessionStatus(entry.status) or typeof entry.endedAt === 'number' as not fresh, regardless of timestamps. Force isNewSession=true and a new sessionId.
  • In loadRestartSentinelStartupTask (dist/server-restart-sentinel-zgIzZ1w7.js), before enqueueing the continuation, check the loaded entry; if terminal, either drop the continuation, route it via a freshly spawned session, or route it as a system notice only (no agentTurn resume).
  • When /new (or /reset) marks a session terminal, also rotate the routing index entry (e.g. archive under a sessionId-suffixed key) so the canonical to-keyed lookup misses on the next inbound.
  • Optionally: include the session's status snapshot in the restart-sentinel payload at write time, and refuse to resume if status was already terminal.
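
The optional last item could be sketched as follows. The payload fields statusAtWrite and endedAtAtWrite, and both function names, are hypothetical; the real sentinel payload format is not shown in this report.

```js
// Record the session's state when the sentinel is written.
function buildSentinelPayload(sessionKey, entry, continuation) {
  return {
    sessionKey,
    continuation,
    statusAtWrite: entry?.status ?? null,
    endedAtAtWrite: typeof entry?.endedAt === "number" ? entry.endedAt : null,
  };
}

// On startup, refuse to resume if the session was already closed at write time.
function wasTerminalAtWrite(payload) {
  const terminal = ["done", "failed", "killed", "timeout"];
  return terminal.includes(payload.statusAtWrite) || payload.endedAtAtWrite !== null;
}

const payload = buildSentinelPayload(
  "telegram:group:example:topic:17",
  { status: "done", endedAt: 1779726659123 },
  { kind: "agentTurn" }
);

console.log(wasTerminalAtWrite(payload)); // true
```

A write-time snapshot complements, and does not replace, a check against the live entry at resume time, since the session may also be closed between the write and the restart.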

Affected symbols

  • dist/session-DUinUmLM.js: resolveSession, resolveSessionKeyForRequest (no terminal-status guard).
  • dist/reset-B0OJOtNI.js: evaluateSessionFreshness (timestamps only).
  • dist/server-restart-sentinel-zgIzZ1w7.js: loadRestartSentinelStartupTask, recoverPendingRestartContinuationDeliveries (re-binds to a done session without status check).
  • dist/restart-sentinel-BDtpdKoy.js: buildRestartSuccessContinuation, writeRestartSentinel.
  • dist/session-utils-CRKr-5AU.js:170: isTerminalSessionStatus exists but is unused by routing.
  • dist/get-reply-DOTqK3jN.js:2642-2644, 3338, 4516: /new handling marks the session done but does not unlink or rename the routing key.
  • dist/server-methods-90LGtoqF.js:16227: this.status = "done".

Related

Filed alongside a separate issue for inbound user messages not persisting to session JSONL when the agent attempt throws. That bug + this one combine to permanently wedge the session: throw → empty jsonl → over-cap → preflight fails forever because routing won't release the key.
