openclaw - ✅(Solved) Fix heartbeat: isolatedSession: true silently reuses the same transcript file across every run [4 pull requests, 1 participant]

alexander-applyinnovations · 2026-04-11T12:18:05Z

openclaw/openclaw#64795•Fetched 2026-04-12 13:26:43

agents.defaults.heartbeat.isolatedSession: true is documented as producing a fresh session (with a new sessionId and an empty transcript) on every heartbeat run, but in practice it only rolls the sessionId in the store entry — the persisted sessionFile path is preserved via a spread, so every run keeps appending to the same physical transcript file forever. Over time the file accumulates the full history of every prior heartbeat, and each new run sees all of it in its in-context window, which is the exact opposite of isolation.

This is provable from the code alone without any production evidence.

Root Cause

Because the entry always has a sessionFile after the first-ever run, the if (candidate) branch is taken and the function never falls through to computing a fresh path from sessionId.
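
The precedence described here can be illustrated with a minimal sketch. This is not the code in src/config/sessions/paths.ts: the function name `resolvePathSketch`, the `EntrySketch` type, and the `<dir>/<sessionId>.jsonl` layout are assumptions made for illustration. Only the ordering (a persisted sessionFile wins over a path derived from sessionId) is taken from the issue.

```typescript
// Hypothetical, simplified stand-in for the session-file resolver described above.
// Only the precedence (persisted sessionFile first, sessionId-derived path second)
// comes from the issue; names and path layout are illustrative assumptions.
interface EntrySketch {
  sessionId: string;
  sessionFile?: string;
}

function resolvePathSketch(
  sessionId: string,
  entry: EntrySketch | undefined,
  dir: string,
): string {
  const candidate = entry?.sessionFile?.trim();
  if (candidate) {
    // Taken on every run after the first: the stored path wins.
    return candidate;
  }
  // Only reached when no path has been persisted yet.
  return `${dir}/${sessionId}.jsonl`;
}

// A rolled sessionId with a carried-over sessionFile still resolves to the old file.
const staleEntry: EntrySketch = { sessionId: "uuid-B", sessionFile: "/sessions/uuid-A.jsonl" };
const stalePath = resolvePathSketch(staleEntry.sessionId, staleEntry, "/sessions");

// With sessionFile cleared, the path is derived from the new sessionId.
const clearedEntry: EntrySketch = { sessionId: "uuid-B" };
const clearedPath = resolvePathSketch(clearedEntry.sessionId, clearedEntry, "/sessions");
```

In this sketch `stalePath` stays on the old file while `clearedPath` follows the new sessionId, which is the difference the fix relies on.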

Fix Action

Fix / Workaround

isolatedSession: true silently does nothing after the first run. Every existing deployment that relies on it has been running without isolation.
lightContext: true's documented token savings are optimistic by ~20x. The ~100K → ~2-5K figure only holds on the first run; every subsequent run incurs the full accumulated transcript.
Model behavior drifts toward the reinforced pattern. Any acknowledgment, summary, or tool-call sequence from an early run gets few-shot-learned by later runs and becomes sticky — even after config or prompt changes intended to alter the behavior.
Context-overflow cliff. On a long-running deployment the file eventually exceeds the model's context window. Compaction on a transcript that is mostly tool-call noise hits compaction-safeguard ("no real conversation messages to summarize") and only writes a boundary marker, giving little real relief.
No user-visible workaround. There is no chat/CLI command that resets a non-current session, so affected users have to rm the file by hand from the pod's filesystem.

PR fix notes

PR #64797: fix(agents): clear sessionFile when rolling a fresh isolated session

Repository: openclaw/openclaw
Author: alexander-applyinnovations
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/64797

Description (problem / solution / changelog)

Summary

Problem: resolveCronSession preserves the existing entry's sessionFile via ...entry whenever a new sessionId is generated (forceNew or stale). resolveSessionFilePath prefers entry.sessionFile over sessionId when deciding where to write, so every new session keeps appending to the same physical transcript file forever.
Why it matters: For heartbeats configured with isolatedSession: true (which is how the docs describe the "fresh session per run" behavior), the transcript file grew unbounded across every run, poisoning each new run with the in-context history of all prior runs — the exact opposite of isolation. lightContext: true's documented ~100K→~2–5K token savings silently regressed as the file grew.
What changed: One-line addition in src/cron/isolated-agent/session.ts — sessionFile: undefined in the existing isNewSession cleanup block (next to lastChannel, lastTo, lastThreadId, deliveryContext). The resolver now falls through to resolveSessionTranscriptPathInDir(sessionId, …) and produces a new file named after the new sessionId.
What did NOT change: Only the isNewSession cleanup block is touched. Non-forceNew reuse of fresh sessions continues to preserve sessionFile as before. Delivery routing clears, the ...entry spread of overrides, and every other field stays intact.

Change Type (select all)

Bug fix

Scope (select all touched areas)

Gateway / orchestration
Memory / storage

Linked Issue/PR

Closes #64795
Related #64196
This PR fixes a bug or regression

Root Cause (if applicable)

Root cause: The isNewSession cleanup block in resolveCronSession clears delivery-routing fields but not sessionFile. Since the returned entry is built by spreading the old entry first, the stale sessionFile is always carried forward, and downstream resolveSessionFilePath / resolveAndPersistSessionFile both prefer a persisted sessionFile over recomputing from sessionId. Net effect: the logical session rolls but the physical transcript file never rotates, defeating isolatedSession: true.
Missing detection / guardrail: No regression test existed for rotating sessionFile when forceNew or stale reset generates a new sessionId. Existing tests cover delivery-routing clears (lastChannel, deliveryContext) but not filesystem path isolation.
Contributing context: sessionFile was (correctly) added to the SessionEntry persistence layer to keep the transcript path stable across reuse of fresh sessions. The isNewSession branch inherited the same preservation semantics without carving out forceNew/stale rotation.

Regression Test Plan

Coverage level: Unit test (pure, no filesystem)
Target test file: src/cron/isolated-agent/session.test.ts
Scenarios the test locks in:
1. clears sessionFile when forceNew is true — asserts result.sessionEntry.sessionFile is undefined after forceNew: true against an entry with a populated sessionFile.
2. clears sessionFile when session is stale — same assertion on the stale-freshness branch (no forceNew, but evaluateSessionFreshness returns { fresh: false }).
3. preserves sessionFile when reusing fresh session — asserts that a reused fresh session still carries its sessionFile unchanged, so this fix doesn't regress the normal reuse path.
Why: these three cases cover every branch of resolveCronSession where the entry's sessionFile matters, and they're the smallest pure-unit guardrails that would have caught the original bug.

User-visible / Behavior Changes

For agents with isolatedSession: true, each heartbeat run will now correctly write to a new transcript file named after the current run's sessionId. The old frozen transcript file will be orphaned on disk and cleaned up by the session reaper on its next pass (or can be removed manually).

Diagram

Before:
[heartbeat t1] -> resolveCronSession(forceNew:true)
                  -> sessionId = uuid-A
                  -> entry = {...oldEntry, sessionFile: oldPath, sessionId: uuid-A}
                  -> transcript writer appends to oldPath

[heartbeat t2] -> resolveCronSession(forceNew:true)
                  -> sessionId = uuid-B (new!)
                  -> entry = {...t1Entry, sessionFile: oldPath, sessionId: uuid-B}
                  -> transcript writer appends to oldPath (same file!)

...many runs...
                  -> file contains all historical messages
                  -> model sees all prior HEARTBEAT replies as in-context history
                  -> few-shot-learns the pattern, reinforces it forever

After:
[heartbeat t1] -> resolveCronSession(forceNew:true)
                  -> sessionId = uuid-A
                  -> entry = {...oldEntry, sessionFile: undefined, sessionId: uuid-A}
                  -> resolver falls through to resolveSessionTranscriptPathInDir
                  -> transcript writer creates/appends uuid-A.jsonl

[heartbeat t2] -> resolveCronSession(forceNew:true)
                  -> sessionId = uuid-B
                  -> entry = {...t1Entry, sessionFile: undefined, sessionId: uuid-B}
                  -> resolver falls through again
                  -> transcript writer creates/appends uuid-B.jsonl  (fresh file!)
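
The before/after flow in the diagram can be reduced to a small sketch. It is a hypothetical simplification, not the real resolveCronSession: the field names come from the PR text, while `rollEntrySketch`, the `CronEntrySketch` type, and the explicit `newId` parameter are assumptions that keep the example deterministic and self-contained.

```typescript
// Hypothetical reduction of the behavior described in the PR.
// Field names follow the PR text; the function shape is an assumption.
interface CronEntrySketch {
  sessionId: string;
  sessionFile?: string;
  lastChannel?: string;
  lastTo?: string;
}

function rollEntrySketch(
  entry: CronEntrySketch | undefined,
  opts: { forceNew: boolean; fresh: boolean; newId: string },
): { sessionEntry: CronEntrySketch; isNewSession: boolean } {
  const isNewSession = !entry || opts.forceNew || !opts.fresh;
  if (!isNewSession && entry) {
    // Reuse path: the entry, including its sessionFile, is kept as-is.
    return { sessionEntry: { ...entry }, isNewSession: false };
  }
  const sessionEntry: CronEntrySketch = {
    ...entry, // carries every old field forward, including sessionFile
    sessionId: opts.newId,
    // Cleanup for a new session: delivery routing and, with the fix, sessionFile.
    lastChannel: undefined,
    lastTo: undefined,
    sessionFile: undefined,
  };
  return { sessionEntry, isNewSession: true };
}

const oldEntry: CronEntrySketch = {
  sessionId: "uuid-A",
  sessionFile: "/sessions/uuid-A.jsonl",
  lastChannel: "telegram",
};
const forced = rollEntrySketch(oldEntry, { forceNew: true, fresh: true, newId: "uuid-B" });
const staleRoll = rollEntrySketch(oldEntry, { forceNew: false, fresh: false, newId: "uuid-C" });
const reused = rollEntrySketch(oldEntry, { forceNew: false, fresh: true, newId: "unused" });
```

The three calls mirror the three scenarios in the regression test plan: forceNew, stale rotation, and reuse of a fresh session. Removing the `sessionFile: undefined` line reproduces the pre-fix behavior, where the spread carries the old path into every new session.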

Security Impact (required)

New permissions/capabilities? No
Secrets/tokens handling changed? No
New/changed network calls? No
Command/tool execution surface changed? No
Data access scope changed? No

No risk.

Repro + Verification

Environment

OS: Linux (registry.xlab.now/clankertron:2026.4.10 on Kubernetes, Talos 1.11.2)
Runtime/container: Node.js via the openclaw container image (upstream base ghcr.io/openclaw/openclaw)
Model/provider: llamacpp-deep/qwen3.5-35b-a3b (backed by llama.cpp HTTP server, ctx_size=65536)
Integration/channel: Telegram heartbeat target, 15-minute interval

Relevant config:

agents: {
  defaults: {
    heartbeat: {
      every: "15m",
      isolatedSession: true,
      lightContext: true,      // or false — both reproduce
      session: "main",         // default
      target: "telegram",
    }
  }
}

Steps (without this fix)

1. Run a heartbeat agent with isolatedSession: true for any length of time (multiple heartbeats).
2. cat /data/agents/main/sessions/sessions.json and inspect the agent:main:main:heartbeat entry.
3. Note that sessionId rolls on every run but sessionFile is stable.
4. Inspect the transcript file at the path in sessionFile.
5. Observe that grep -c '"type":"session"' <file> returns 1: only one session record even though many runs have occurred.
6. Observe that the file grows monotonically and accumulates the history of every heartbeat run.

Expected

Each forceNew heartbeat should produce a new transcript file (or at least a new session record in a rotated file), with no in-context history from prior runs.

Actual (before fix)

The same transcript file is reused across every run. The model sees ~100K tokens of accumulated prior-run history on every heartbeat. "Fresh session per run" is not achieved for any run past the first.

Actual (with this fix)

After the first heartbeat post-fix, the store entry's sessionFile is cleared on each forceNew run. resolveSessionFilePath falls through and allocates a new <sessionId>.jsonl file. Each run's transcript is isolated.

Evidence

Forensic data from a live deployment

sessions.json for agent:main:main:heartbeat on a live cluster (five hours after the pod restarted):

{
  "sessionId": "8f939e30-e4a3-479f-9eeb-2b21d4aaf57b",
  "sessionFile": "/data/agents/main/sessions/34db8152-a3ac-4e7a-8c4a-9a38d9525339.jsonl",
  "updatedAt": 1775908689677,
  "heartbeatIsolatedBaseSessionKey": "agent:main:main"
}

File on disk:

| metric | value |
| --- | --- |
| filename stem | `34db8152-a3ac-4e7a-8c4a-9a38d9525339` |
| transcript header session id | `cbc883fc-6486-40d2-b0d1-6a102861f5df` (first run ever, `2026-04-10T22:09:51Z`) |
| distinct session records in file | 1 |
| lines | 433 |
| HEARTBEAT_OK ack tokens | 101 |
| oldest message | `2026-04-10T22:09:51Z` |
| newest message | `2026-04-11T11:58:09Z` |

Three distinct UUIDs observable for a single logical session:

34db8152… — filename (set once, never rotated)
cbc883fc… — transcript-file session id (from the very first run)
8f939e30… — current sessionId in the store (latest forceNew)

Log correlation (different `sessionId`s, same `sessionFile`)

From context-overflow diagnostics in the same deployment:

07:28  sessionId=da41fec7-997f-4a56-b5b5-a56cb3a11c28  sessionFile=…/34db8152-…jsonl
10:43  sessionId=f198e48b-84e6-4695-aabc-c4fed74d7cd1  sessionFile=…/34db8152-…jsonl
10:59  sessionKey=agent:main:main:heartbeat            sessionFile=…/34db8152-…jsonl  messages=97

4+ distinct sessionId values observed across 5 hours on the same sessionKey and the same sessionFile. The other 13 non-heartbeat session entries in the same store all have their filename stem matching their sessionId — only the heartbeat entry shows the mismatch, confirming the bug is scoped to the forceNew branch.
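
The stem-versus-sessionId comparison used as the diagnostic here could be automated along the following lines. This is a sketch under assumptions: `findStemMismatches` and `StoreEntrySketch` are hypothetical names, the store is modeled as a plain key-to-entry object based on the sessions.json excerpt above, and the second entry in the sample data is invented purely for contrast.

```typescript
// Hypothetical helper: flags store entries whose transcript filename stem
// differs from their current sessionId. Not part of openclaw.
interface StoreEntrySketch {
  sessionId: string;
  sessionFile?: string;
}

function findStemMismatches(store: Record<string, StoreEntrySketch>): string[] {
  const mismatched: string[] = [];
  for (const key of Object.keys(store)) {
    const entry = store[key];
    if (!entry.sessionFile) continue; // nothing persisted yet, nothing to compare
    const base = entry.sessionFile.split("/").pop() ?? "";
    const stem = base.replace(/\.jsonl$/, "");
    if (stem !== entry.sessionId) mismatched.push(key);
  }
  return mismatched;
}

const storeSketch: Record<string, StoreEntrySketch> = {
  // Values copied from the heartbeat entry shown in this report.
  "agent:main:main:heartbeat": {
    sessionId: "8f939e30-e4a3-479f-9eeb-2b21d4aaf57b",
    sessionFile: "/data/agents/main/sessions/34db8152-a3ac-4e7a-8c4a-9a38d9525339.jsonl",
  },
  // Hypothetical healthy entry, added only for contrast.
  "agent:main:main": {
    sessionId: "11111111-2222-3333-4444-555555555555",
    sessionFile: "/data/agents/main/sessions/11111111-2222-3333-4444-555555555555.jsonl",
  },
};
const mismatches = findStemMismatches(storeSketch);
```

On this sample only the heartbeat key is flagged, matching the observation that the mismatch is confined to the heartbeat entry.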

Provable from code alone

The bug is also provable purely by code walk:

1. resolveSessionFilePath(sessionId, entry) at src/config/sessions/paths.ts:263 prefers entry.sessionFile over the sessionId-derived path.
2. resolveAndPersistSessionFile at src/config/sessions/session-file.ts:17-27 only consults fallbackSessionFile when !baseEntry.sessionFile.
3. The return entry from resolveCronSession is built by spreading ...entry (which carries sessionFile) and the isNewSession cleanup block does not include sessionFile.
4. Therefore, after the first run of any session that ever goes through resolveCronSession, the stale sessionFile is returned on every subsequent forceNew/stale rotation.

Human Verification (required)

Verified scenarios:
- Read every caller of resolveSessionFilePath and resolveAndPersistSessionFile in src/ to confirm no downstream path re-derives the file from the new sessionId after resolveCronSession returns.
- Read the heartbeat-runner flow from resolveCronSession through saveSessionStore and the downstream transcript writer to confirm no intermediate reset exists.
- Dumped sessions.json from a live cluster and checked all 14 entries — 13 non-heartbeat sessions have correctly matching sessionId↔filename, only the heartbeat entry has the mismatch, confirming the bug is scoped to forceNew only.
- Verified the file header's session id (cbc883fc) differs from the store's current sessionId (8f939e30) and the filename stem (34db8152) — three distinct UUIDs for one logical session, exactly as the code predicts.
- Verified that the same sessionFile appears in log lines with four different sessionId values over 5 hours.
- Verified that sessionFile?: string in SessionEntry type allows the undefined assignment; TypeScript is satisfied.
Edge cases checked:
- First-ever run (no existing entry) — entry is undefined, spread is a no-op, sessionFile is undefined by default, resolver computes from sessionId. Unchanged by this fix.
- Reused fresh session (!forceNew && fresh): the isNewSession cleanup block is skipped and sessionFile is preserved via the spread. Unchanged, and covered by the new test "preserves sessionFile when reusing fresh session".
- Stale rotation (!forceNew && !fresh) — now also clears sessionFile (was broken before). New test covers this.
- Non-heartbeat forceNew callers (webhook cron runs via the same resolveCronSession) — they get the same fix, which is also the documented behavior for sessionTarget: "isolated".
What I did NOT verify:
- Actual vitest run of the new tests. This machine has no node/pnpm available, so the tests were written by hand against the existing session.test.ts conventions (same resolveWithStoredEntry helper, same mock shape). I am relying on CI to confirm the suite passes. Targeted test file: src/cron/isolated-agent/session.test.ts.
- End-to-end live reproduction of the fix against our cluster — that requires a rebuild of the downstream clankertron image, which I have not done as part of this PR.

Compatibility / Migration

Backward compatible? Yes — the fix only changes behavior when isNewSession is true, which is exactly when the current behavior is wrong. No existing working path is altered.
Config/env changes? No
Migration needed? No for configuration, but operators of affected deployments will have a stale transcript file on disk that is no longer referenced. The session reaper should clean these up on its next pass. Manual cleanup is also safe (rm the orphaned file once the sessions store no longer references it).

Risks and Mitigations

Risk: Orphaned transcript files for existing deployments where the stale sessionFile path gets dropped from the store entry on the first post-fix run.
- Mitigation: The files are only orphaned on disk; the session reaper (disk-budget maintenance) will reclaim them on its next pass. No user-visible regression. Operators can also manually delete the old transcript file — it has no operational value after the fix lands.
Risk: Any code path that still reads a session entry's sessionFile and expects it to be non-empty may see undefined on the first post-fix turn before the downstream resolveAndPersistSessionFile runs.
- Mitigation: SessionEntry.sessionFile is already declared optional (sessionFile?: string) in src/config/sessions/types.ts. All call sites I inspected use optional chaining or explicit null-checks. No TypeScript errors surfaced from the change.

AI-assisted

Drafted with Claude Code (Claude Opus 4.6, 1M context)
Lightly tested — the tests were written but NOT run locally due to a missing Node toolchain on the authoring machine. Relying on CI for the targeted test run.
I understand what the code does — one-line addition to an existing conditional-clear block, plus three targeted regression tests.

🤖 Generated with Claude Code

Changed files

src/cron/isolated-agent/session.test.ts (modified, +58/-0)
src/cron/isolated-agent/session.ts (modified, +7/-0)

PR #64808: fix(agents): archive rotated heartbeat transcript on isolatedSession rotation

Repository: openclaw/openclaw
Author: alexander-applyinnovations
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/64808

Description (problem / solution / changelog)

Summary

Problem: Even with the sessionFile clear from #64797, when an isolatedSession: true heartbeat rotates to a new session the PRIOR transcript file at the old path becomes orphaned — referenced by nothing in the session store. The only mechanism that reaps it today is enforceSessionDiskBudget in config/sessions/disk-budget.ts, which runs only when the budget is exceeded. On a deployment with a 15-minute heartbeat interval, orphaned transcripts accumulate to hundreds of MB before any cleanup happens.
Why it matters: Without immediate archival, operators see unbounded disk growth in the agent sessions directory even though each logical session is now correctly isolated. The file count and disk footprint still grow monotonically per heartbeat tick.
What changed: In src/infra/heartbeat-runner.ts, capture the prior entry's (sessionId, sessionFile) pair at the isolated-session key before the store update, feed it into a new resetSessionFiles map, and archive via archiveRemovedSessionTranscripts with reason: "reset" (rename to <file>.reset.<ts>, cleaned up later by cleanupArchivedSessionTranscripts after its retention window). Existing suffix-collapse case is split into a deletedSessionFiles map so it continues to use reason: "deleted" — the two cases have different semantics and should get different retention classes.
What did NOT change (scope boundary): No changes to resolveCronSession or session.ts (that's #64797). No changes to archiveRemovedSessionTranscripts, archiveSessionTranscripts, or cleanupArchivedSessionTranscripts — reusing the existing battle-tested archival path. No new retention knobs — uses the existing maintenance.resetArchiveRetentionMs for rotation archives. The runSessionKey = isolatedSessionKey; assignment and everything downstream of the saveSessionStore call is untouched.

Change Type (select all)

Bug fix

Scope (select all touched areas)

Gateway / orchestration
Memory / storage

Linked Issue/PR

Closes #64795 (partially — the sessionFile-clear fix is #64797; this PR addresses the orphan-accumulation follow-on)
Related #64797 — this PR depends on #64797 landing first. Without the sessionFile clear, the heartbeat runner sees priorEntry.sessionFile still matching the inherited path after resolveCronSession returns, and archiving it would leave the store entry pointing at a renamed file. The dependency is purely ordering: this PR's logic is correct once #64797 is in.
This PR fixes a bug or regression

Root Cause (if applicable)

Root cause: resolveCronSession rotates to a new sessionId on every forceNew: true heartbeat run, and once #64797 lands it also rotates to a new sessionFile. The prior transcript file at the old path is then referenced by no store entry — the runtime has no code path that proactively archives it. The only cleanup today is disk-budget enforcement, which is a last-resort mechanism, not an incremental one.
Missing detection / guardrail: There's no test that verifies "on rotation, the prior transcript is cleaned up". Existing heartbeat tests verify session-key stability and delivery routing but not file lifecycle.
Contributing context: The existing archiveRemovedSessionTranscripts call in heartbeat-runner.ts:841 already handled the stale :heartbeat:heartbeat suffix-collapse case with reason: "deleted". Extending it to also archive rotation files is a natural extension of that pattern — the archival primitive is the same, only the input set and the semantics (reset vs deleted) differ.

Regression Test Plan

Coverage level: Unit test (sandbox-based, exercises the full heartbeat runner with a temp session store)
Target test file: src/infra/heartbeat-runner.isolated-key-stability.test.ts
New test: archives the prior transcript file as .reset when rotating to a fresh isolated session
Scenario the test locks in:
1. Seed a session store with an isolatedSessionKey entry whose sessionFile points at an existing transcript on disk
2. Run runHeartbeatOnce
3. Assert the prior transcript file has been renamed to <id>.jsonl.reset.<ts> (no longer at the original path)
4. Assert the store entry still exists at the same key but with a different sessionId
5. Assert the new sessionFile (if defined) is different from the old one
Why: this is the smallest reliable end-to-end test that locks in the archive-on-rotation contract and catches regressions if any path in heartbeat-runner.ts bypasses the rotation archival.

User-visible / Behavior Changes

On deployments with isolatedSession: true, rotated transcript files now get immediately archived as <file>.reset.<ts> instead of sitting on disk indefinitely. The archival is reversible within the configured retention window (maintenance.resetArchiveRetentionMs). Operators who debugged prior heartbeat runs by reading the stable transcript file will instead find the archived rotations with timestamped suffixes, in chronological order.

Diagram

Before #64797:
[heartbeat t1] → transcript appended to sessions/foo.jsonl
[heartbeat t2] → same file, sessionId rolls in store but file keeps growing
[...many runs...] → one file, 100+ HEARTBEAT_OK replies poisoning each new run

After #64797 only:
[heartbeat t1] → sessions/sid-A.jsonl (fresh)
[heartbeat t2] → sessions/sid-B.jsonl (fresh, A is orphaned)
[heartbeat t3] → sessions/sid-C.jsonl (fresh, A+B orphaned)
[...many runs...] → N orphaned files, cleaned up eventually by disk-budget sweeper

After this PR (stacked on #64797):
[heartbeat t1] → sessions/sid-A.jsonl
[heartbeat t2] → sessions/sid-B.jsonl + sid-A.jsonl.reset.<t2-ts>
[heartbeat t3] → sessions/sid-C.jsonl + sid-B.jsonl.reset.<t3-ts>
                 (sid-A archive eventually cleaned by cleanupArchivedSessionTranscripts
                  after resetArchiveRetentionMs)
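
The capture-then-archive ordering behind the last diagram can be sketched as follows. This is not the code in src/infra/heartbeat-runner.ts: `collectResetFiles`, `archiveTargetSketch`, and `RotEntrySketch` are hypothetical names, the resetSessionFiles map is modeled as a plain object from sessionId to prior sessionFile, and the timestamp format is an assumption. Only the `.reset.<ts>` suffix and the "read the prior entry before the store update" ordering come from the PR text.

```typescript
// Hypothetical sketch of archive-on-rotation planning; no files are touched.
interface RotEntrySketch {
  sessionId: string;
  sessionFile?: string;
}

function collectResetFiles(
  store: Record<string, RotEntrySketch>,
  key: string,
  next: RotEntrySketch,
  isNewSession: boolean,
): Record<string, string> {
  // sessionId -> prior transcript path that the rotation leaves unreferenced
  const resetSessionFiles: Record<string, string> = {};
  // Must be read BEFORE the store is updated, or the old path is lost.
  const priorEntryAtKey: RotEntrySketch | undefined = store[key];
  store[key] = next;
  if (
    isNewSession &&
    priorEntryAtKey?.sessionFile &&
    priorEntryAtKey.sessionId !== next.sessionId
  ) {
    resetSessionFiles[priorEntryAtKey.sessionId] = priorEntryAtKey.sessionFile;
  }
  return resetSessionFiles;
}

// Archive name in the "<file>.reset.<ts>" form; the ts format is an assumption.
function archiveTargetSketch(file: string, ts: string): string {
  return `${file}.reset.${ts}`;
}

const hbKey = "agent:main:main:heartbeat";
const rotStore: Record<string, RotEntrySketch> = {
  [hbKey]: { sessionId: "sid-A", sessionFile: "/sessions/sid-A.jsonl" },
};
const resetFiles = collectResetFiles(rotStore, hbKey, { sessionId: "sid-B" }, true);
const archivePath = archiveTargetSketch(resetFiles["sid-A"], "2026-04-11T12-00-00Z");
```

A first-ever run (no prior entry), a prior entry without a sessionFile, and a reused session (isNewSession false) all leave the returned object empty in this sketch, in line with the edge cases listed under Human Verification.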

Security Impact (required)

New permissions/capabilities? No
Secrets/tokens handling changed? No
New/changed network calls? No
Command/tool execution surface changed? No
Data access scope changed? No

No risk. The archival path is the same one already used by the suffix-collapse case and by the cron reaper — the change here is only when it runs and what input it's given. restrictToStoreDir: true is preserved.

Repro + Verification

Environment

OS: Linux
Runtime/container: Node.js via the openclaw container image
Model/provider: any
Integration/channel: any heartbeat target

Relevant config:

agents: {
  defaults: {
    heartbeat: {
      every: "15m",
      isolatedSession: true,
    }
  }
}

Steps (after #64797 lands, without this PR)

1. Run a heartbeat agent with isolatedSession: true for any number of heartbeats (more than two).
2. ls /data/agents/main/sessions/*.jsonl and observe a growing number of orphaned transcript files, one per prior heartbeat run, none referenced by the sessions.json store.
3. Wait for enforceSessionDiskBudget, which runs only when the budget is exceeded; depending on maxDiskBytes that typically takes hours, days, or weeks.

Expected

Each rotation should immediately archive the prior transcript so disk usage stays bounded by the retention window, not by the disk budget.

Actual (without this PR)

Orphaned files accumulate. On a deployment with a 15-minute interval that is ~100 orphaned files per day, each ranging from roughly 100 bytes to a few kilobytes.
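
As a rough check on the per-day figure, assuming exactly one orphaned transcript per heartbeat run and the fixed 15-minute interval from the config above (the variable names are illustrative):

```typescript
// Back-of-the-envelope count of orphaned transcripts per day, assuming one
// orphan per heartbeat run and a fixed 15-minute interval.
const heartbeatIntervalMinutes = 15;
const minutesPerDay = 24 * 60;
const orphansPerDay = minutesPerDay / heartbeatIntervalMinutes; // 96, i.e. roughly 100
```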

Actual (with this PR)

Each rotation immediately renames the prior file to <file>.reset.<ts>. After maintenance.resetArchiveRetentionMs elapses, the archive is cleaned up by cleanupArchivedSessionTranscripts on the next maintenance sweep.

Evidence

The new test "archives the prior transcript file as .reset when rotating to a fresh isolated session" asserts the end-to-end behavior: the prior transcript is gone from its original path, the archived file is present at <id>.jsonl.reset.<ts>, and the store entry has rotated to a new sessionId.
Existing test coverage (heartbeat-runner.isolated-key-stability.test.ts) verifies the suffix-collapse case still works with its own deletedSessionFiles map and reason: "deleted".

Human Verification (required)

Verified scenarios:
- Traced cronSession.store[isolatedSessionKey] before the store[isolatedSessionKey] = cronSession.sessionEntry assignment. At that point, the old entry with old sessionId and old sessionFile is still there, so priorEntryAtKey captures exactly the values we want to archive.
- Verified that referencedSessionIds (computed from Object.values(cronSession.store) AFTER the new entry is assigned) contains the NEW sessionId but not the OLD one, so archiveRemovedSessionTranscripts will not skip the archival for the old file.
- Confirmed that the existing suffix-collapse case continues to use reason: "deleted" and is gated on staleIsolatedSessionKey being set, completely independent of the new rotation logic.
- Read archiveSessionTranscriptsDetailed to confirm restrictToStoreDir: true constrains archival to the agent sessions dir via path.relative containment check — no risk of touching unrelated files.
Edge cases checked:
- First-ever heartbeat run (no prior entry): priorEntryAtKey is undefined, no addition to resetSessionFiles, nothing to archive. Works.
- Prior entry exists but has no sessionFile set: conditional priorEntryAtKey.sessionFile check skips the add. Nothing archived. Works.
- Both staleIsolatedSessionKey AND isNewSession trigger on the same run: both maps get populated, both archival calls run, each uses its own reason. The two maps are disjoint by construction (different sessionIds).
- Reused fresh session (isNewSession === false): the new conditional is skipped, no rotation archival happens, sessionFile continues to be used by the same entry. Works (the reuse path doesn't rotate the transcript).
What I did NOT verify:
- Actual pnpm vitest run of the new test. This machine has no Node toolchain available, so I wrote the test by hand against the existing withTempHeartbeatSandbox convention and trust the harness. Targeted test file: src/infra/heartbeat-runner.isolated-key-stability.test.ts.
- End-to-end verification on a live cluster. The cluster behavior can be observed after both PRs land and the image is rebuilt.

Compatibility / Migration

Backward compatible? Yes — the new behavior only kicks in when isNewSession === true, which already existed but had no archival. No existing working path is altered.
Config/env changes? No — reuses existing maintenance.resetArchiveRetentionMs / maintenance.pruneAfterMs knobs.
Migration needed? For existing deployments, orphaned transcript files from before both PRs land will still sit on disk until the first maintenance sweep after the upgrade. They're safe to rm manually if disk pressure is urgent. Everything from the first post-upgrade heartbeat onwards rotates cleanly.

Risks and Mitigations

Risk: If #64797 does not land before this PR, priorEntryAtKey.sessionFile is still inherited by the new entry via ...entry spread, and archiveRemovedSessionTranscripts would rename a file that the new entry is about to write to. On the next write, the store entry's sessionFile would point at a file that no longer exists at that path.
- Mitigation: This PR is explicitly documented as depending on #64797. Do not merge this PR before #64797. The branch is stacked on top of the #64797 branch for exactly this reason.
Risk: reason: "reset" uses maintenance.resetArchiveRetentionMs for cleanup, which may be shorter than pruneAfterMs. Operators who relied on longer retention for debugging may find archives gone sooner.
- Mitigation: "reset" is semantically correct for rotation (the entry persists, only the transcript rolls). Operators who want longer retention can tune maintenance.resetArchiveRetentionMs directly — it's an existing knob with existing docs.

AI-assisted

Drafted with Claude Code (Claude Opus 4.6, 1M context)
Lightly tested — the test was written but NOT run locally due to a missing Node toolchain on the authoring machine. Relying on CI for the targeted test run.
I understand what the code does — ~15 lines of additions to heartbeat-runner.ts plus a ~55-line end-to-end test. The archival primitive is unchanged; only the input set and the reason classification are new.

🤖 Generated with Claude Code

Changed files

src/cron/isolated-agent/session.test.ts (modified, +58/-0)
src/cron/isolated-agent/session.ts (modified, +7/-0)
src/infra/heartbeat-runner.isolated-key-stability.test.ts (modified, +73/-0)
src/infra/heartbeat-runner.ts (modified, +52/-12)

PR #64832: fix(agents): archive orphaned isolated-session transcripts after rotation

Repository: openclaw/openclaw
Author: alexander-applyinnovations
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/64832

Description (problem / solution / changelog)

Summary

Complement to #65203. That PR cleared sessionFile on rotation so each isolatedSession: true run writes to a fresh path — but the prior transcript file at the old path is now orphaned: nothing in the store references it, and cleanupArchivedSessionTranscripts only scans for the .reset.<ts> suffix. Orphans accumulate forever.

This PR renames each prior transcript to <file>.reset.<ts> as part of the same rotation transaction, so it re-enters the retention window.

Change

Two helpers in src/cron/isolated-agent/session.ts:

capturePriorIsolatedEntryForArchival — snapshot prior (sessionId, sessionFile) before persist.
archivePriorIsolatedEntryAfterRotation — rename to <file>.reset.<ts> after persist, with reason: "reset" (honors maintenance.resetArchiveRetentionMs). Uses the existing archiveRemovedSessionTranscripts primitive.

Wired into both forceNew: true call sites:

src/infra/heartbeat-runner.ts — heartbeat rotation path. Also splits the existing :heartbeat:heartbeat suffix-collapse archival into its own call with reason: "deleted" so the two archival paths get their correct retention classes (pruneAfterMs vs resetArchiveRetentionMs).
src/cron/isolated-agent/run-session-state.ts — cron createPersistCronSessionEntry closure with a once-flag (persist is called multiple times per run: pre-run, skills refresh, finalize). Covers the cron: prefix case (runSessionKey = ...:run:<id>), where the session-reaper only archives the run-key entry on retention — not the prior agentSessionKey transcript. Without this path the cron orphan is never archived.

Tests lock in: capture timing (BEFORE persist), once-flag, cron-run-key path, archival failure handling, and a heartbeat-runner sandbox test that seeds a real prior transcript and asserts the rename.
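
The capture-before-persist ordering and the once-flag can be sketched as below. This is an illustration under assumptions, not the code in run-session-state.ts: `createPersist`, its parameters, and the `Entry` type are hypothetical.

```typescript
interface Entry {
  sessionId: string;
  sessionFile?: string;
}

// Hypothetical closure factory: snapshot the prior entry first, archive at most once.
function createPersist(
  store: Record<string, Entry>,
  key: string,
  next: Entry,
  archive: (prior: Entry) => void,
): () => void {
  // Capture BEFORE the first persist overwrites the key.
  const prior: Entry | undefined = store[key];
  let archivedOnce = false;
  return () => {
    store[key] = next;
    if (!archivedOnce && prior?.sessionFile) {
      archivedOnce = true;
      archive(prior);
    }
  };
}

const store: Record<string, Entry> = { hb: { sessionId: "old", sessionFile: "old.jsonl" } };
const archivedFiles: string[] = [];
const persist = createPersist(store, "hb", { sessionId: "new" }, (p) => {
  if (p.sessionFile) archivedFiles.push(p.sessionFile);
});

// Persist runs several times per cron run (pre-run, skills refresh, finalize).
persist();
persist();
persist();
```

If the snapshot were taken after the first persist, `prior` would already be the new entry and the old transcript path would be lost, which is why the tests pin the capture timing.
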

Forensic evidence

xlab.now deployment running clankertron:2026.4.10 (pre-#65203) with a manual sessionFile clear as workaround: five heartbeat runs over seven hours each wrote to a fresh transcript, but five orphaned transcripts accumulated on disk (300 KB – 1.3 MB each). cleanupArchivedSessionTranscripts cannot see them — they don't carry the .reset.* suffix. At ~4 runs/hour × ~400 KB this is roughly 38 MB/day of unsweepable growth per deployment. This PR closes that gap at the rotation boundary.
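
The growth estimate can be reproduced from the two approximate inputs quoted above. The variable names are illustrative, and the figure uses decimal megabytes.

```typescript
const runsPerHour = 4;        // ~15-minute heartbeat ticks
const avgTranscriptKB = 400;  // rough per-run transcript size from the observation above

const kbPerDay = runsPerHour * 24 * avgTranscriptKB; // 96 runs/day * 400 KB = 38,400 KB
const mbPerDay = kbPerDay / 1000;                    // 38.4 MB/day
```
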

Scope boundary

No changes to archiveRemovedSessionTranscripts, cleanupArchivedSessionTranscripts, or any archival primitive — reuses existing machinery from a new call site.
No new retention knobs.
No changes to the pure-reuse (!isNewSession) path.
restrictToStoreDir: true preserved throughout.

Safety

Archival is wrapped in try/catch at each call site; failure logs a warning but does not fail the run.
The referencedSessionIds safety guard inside archiveRemovedSessionTranscripts prevents archiving any sessionId still pointed at by another store entry. Callers compute this set from the post-update store.
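
A minimal sketch of how such a guard might behave, assuming candidates are keyed by sessionId. `selectArchivable` is a hypothetical function, and the real archiveRemovedSessionTranscripts may be structured differently.

```typescript
interface Entry {
  sessionId: string;
  sessionFile?: string;
}

// Hypothetical filter mirroring the described guard.
function selectArchivable(
  candidates: Map<string, string>,          // sessionId -> transcript path
  postUpdateStore: Record<string, Entry>,   // store AFTER the new entry was assigned
): string[] {
  const referencedSessionIds = new Set(
    Object.values(postUpdateStore).map((e) => e.sessionId),
  );
  return Array.from(candidates.entries())
    .filter(([sessionId]) => !referencedSessionIds.has(sessionId)) // skip ids still in use
    .map(([, file]) => file);
}

const postUpdate: Record<string, Entry> = { hb: { sessionId: "new" } };
const toArchive = selectArchivable(
  new Map([
    ["old", "old.jsonl"], // no longer referenced, safe to archive
    ["new", "new.jsonl"], // still referenced, must be skipped
  ]),
  postUpdate,
);
```

Computing the set from the post-update store matters: computed before the update, the old sessionId would still appear referenced and its transcript would never be archived.
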

Prior state of this PR

Originally scoped as the root-cause fix for sessionFile persistence. #65203 landed that root-cause fix independently on 2026-04-12 (the sessionFile: undefined line in resolveCronSession). This PR has been rebased on top of #65203 and narrowed to the archival flow only — the diff no longer contains the sessionFile: undefined change, only the orphan-cleanup machinery that neither #65203 nor any current code path provides.

🤖 Generated with Claude Code

Changed files

src/cron/isolated-agent/run-session-state.test.ts (added, +345/-0)
src/cron/isolated-agent/run-session-state.ts (modified, +42/-1)
src/cron/isolated-agent/run.test-harness.ts (modified, +5/-0)
src/cron/isolated-agent/run.ts (modified, +6/-0)
src/cron/isolated-agent/session.test.ts (modified, +225/-1)
src/cron/isolated-agent/session.ts (modified, +65/-0)
src/infra/heartbeat-runner.isolated-key-stability.test.ts (modified, +115/-0)
src/infra/heartbeat-runner.ts (modified, +42/-7)

PR #64873: fix(cron): clear sessionFile on forceNew so isolated runs don't share transcripts

Repository: openclaw/openclaw
Author: mjamiv
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/64873

Description (problem / solution / changelog)

Summary

Add sessionFile: undefined to the isNewSession cleanup block in resolveCronSession so that forced-new isolated runs don't inherit and keep writing to the previous transcript file.
Add three targeted tests in src/cron/isolated-agent/session.test.ts covering forceNew, stale-session, and fresh-reuse paths.

Fixes #64795.

Root cause

resolveCronSession in src/cron/isolated-agent/session.ts rolls a new sessionId when forceNew: true (or when the stored session is stale), but it builds the returned entry by spreading the previous entry first:

const sessionEntry: SessionEntry = {
  ...entry,                 // ← spreads sessionFile from prior entry
  sessionId,                // ← overridden with the new uuid
  updatedAt: params.nowMs,
  systemSent,
  ...(isNewSession && {
    lastChannel: undefined,
    lastTo: undefined,
    lastAccountId: undefined,
    lastThreadId: undefined,
    deliveryContext: undefined,
    // sessionFile was NOT here — it survives into the new entry
  }),
};

Downstream, resolveSessionFilePath in src/config/sessions/paths.ts:263 prefers a persisted entry.sessionFile over recomputing a fresh path from sessionId:

export function resolveSessionFilePath(
  sessionId: string,
  entry?: { sessionFile?: string },
  opts?: SessionFilePathOptions,
): string {
  const sessionsDir = resolveSessionsDir(opts);
  const candidate = entry?.sessionFile?.trim();
  if (candidate) {
    try {
      return resolvePathWithinSessionsDir(sessionsDir, candidate, { agentId: opts?.agentId });
    } catch { /* … */ }
  }
  return resolveSessionTranscriptPathInDir(sessionId, sessionsDir);
}

So the returned entry ends up with a new sessionId but the old sessionFile. Every forced-new run appends to the same physical transcript file as the previous run, indefinitely.
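
The interaction can be reproduced in isolation. The sketch below uses simplified stand-ins (`rollBuggy`, `resolvePath`) for resolveCronSession and resolveSessionFilePath; the path layout and the ids are made up for the example.

```typescript
interface SessionEntry {
  sessionId: string;
  sessionFile?: string;
  updatedAt: number;
}

// Simplified stand-in for the pre-fix roll: spread first, then override sessionId only.
function rollBuggy(entry: SessionEntry, newId: string, nowMs: number): SessionEntry {
  return { ...entry, sessionId: newId, updatedAt: nowMs };
}

// Simplified stand-in for resolveSessionFilePath: a persisted sessionFile wins.
function resolvePath(sessionId: string, entry?: { sessionFile?: string }): string {
  const candidate = entry?.sessionFile?.trim();
  return candidate ? candidate : `/sessions/${sessionId}.jsonl`;
}

const firstEntry: SessionEntry = {
  sessionId: "run-1",
  sessionFile: "/sessions/run-1.jsonl",
  updatedAt: 1,
};
const secondEntry = rollBuggy(firstEntry, "run-2", 2);
const pathForSecond = resolvePath(secondEntry.sessionId, secondEntry);
// secondEntry.sessionId is "run-2", but pathForSecond is still "/sessions/run-1.jsonl".
```
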

Impact

This defeats two documented features:

agents.defaults.heartbeat.isolatedSession: true — docs/gateway/heartbeat.md promises "each heartbeat runs in a fresh session with no prior conversation history" and quotes ~100K tokens down to ~2-5K per run. Neither holds while the stale sessionFile survives — every heartbeat turn reads the full accumulated transcript on context load.
Cron sessionTarget: "isolated" — isolated cron runs take the same forceNew: true path via resolveCronSession, so the same transcript pollution affects cron jobs configured for full isolation.

The silent failure mode is particularly painful because it looks like heartbeat isolatedSession is working (new sessionId, new store entry, no forceNew warnings) while the underlying transcript file continues to accumulate every prior run's history.

Fix

One field added to the existing isNewSession cleanup block, with a comment explaining the downstream interaction with resolveSessionFilePath:

     ...(isNewSession && {
       lastChannel: undefined,
       lastTo: undefined,
       lastAccountId: undefined,
       lastThreadId: undefined,
       deliveryContext: undefined,
+      sessionFile: undefined,
     }),

With sessionFile cleared on the isNewSession branch, candidate = entry?.sessionFile?.trim() evaluates to undefined inside resolveSessionFilePath, so the if (candidate) branch is skipped and the path is computed from the new sessionId via resolveSessionTranscriptPathInDir. This is the exact intent of the existing cleanup block: strip anything that leaked from the prior session into a fresh one.
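
The effect of the added field depends on spread order: a later `sessionFile: undefined` overrides the value copied by the earlier `...entry`. A minimal sketch, with `roll` as a hypothetical simplification of the entry construction:

```typescript
interface SessionEntry {
  sessionId: string;
  sessionFile?: string | undefined;
  updatedAt: number;
}

// Hypothetical simplification of the post-fix entry construction.
function roll(
  entry: SessionEntry,
  sessionId: string,
  nowMs: number,
  isNewSession: boolean,
): SessionEntry {
  return {
    ...entry,
    sessionId,
    updatedAt: nowMs,
    // Later spread wins: on a new session the inherited path is overwritten with undefined.
    ...(isNewSession && { sessionFile: undefined }),
  };
}

const prior: SessionEntry = { sessionId: "run-1", sessionFile: "/sessions/run-1.jsonl", updatedAt: 1 };
const rolled = roll(prior, "run-2", 2, true);   // forceNew or stale
const kept = roll(prior, "run-1", 2, false);    // fresh reuse keeps the transcript
const serialized = JSON.stringify(rolled);      // undefined-valued keys are omitted from JSON
```

Because `JSON.stringify` drops keys whose value is `undefined`, a store persisted as JSON would not carry a `sessionFile` key for the rolled entry at all.
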

Test coverage

Added to the existing describe("session reuse for webhooks/cron") block in src/cron/isolated-agent/session.test.ts:

clears sessionFile when forceNew is true — covers the heartbeat isolatedSession and isolated cron paths.
clears sessionFile when session is stale — covers the freshness-expiry path for direct-style cron/webhook sessions.
preserves sessionFile when reusing a fresh session — locks in the negative: reuse must keep the transcript, because the whole point of reuse is that the transcript keeps accumulating.

Verification

✅ npx vitest run src/cron/isolated-agent/session.test.ts — 13/13 pass (the 3 new ones included).
✅ npx vitest run src/cron/isolated-agent/session.test.ts src/config/sessions/sessions.test.ts src/config/sessions/store.lock.test.ts src/config/sessions/transcript.test.ts — 41/41 pass across the four related test files, no regressions in transcript/store code.

What this does NOT change

No changes to heartbeat-runner.ts or its forceNew: true call — the fix is purely in the session-roll helper so every caller benefits (isolated cron, heartbeat, webhook).
No changes to resolveSessionFilePath — it still honors a persisted sessionFile when present, which is the correct behavior for non-isolated session reuse.
No config schema or docs changes — the documented behavior is now actually what the code does.

🤖 Generated with Claude Code

Changed files

src/cron/isolated-agent/session.test.ts (modified, +51/-0)
src/cron/isolated-agent/session.ts (modified, +5/-0)

Summary

This is provable from the code alone without any production evidence.

Docs intent

Both docs/gateway/heartbeat.md and docs/gateway/configuration-reference.md describe the same behavior. Direct quotes from the repo:

docs/gateway/heartbeat.md:42 — isolatedSession: true, // optional: fresh session each run (no conversation history)
docs/gateway/heartbeat.md:227 — isolatedSession: when true, each heartbeat runs in a fresh session with no prior conversation history. Uses the same isolation pattern as cron sessionTarget: "isolated". Dramatically reduces per-heartbeat token cost. Combine with lightContext: true for maximum savings. Delivery routing still uses the main session context.
docs/gateway/heartbeat.md:441 — Use isolatedSession: true to avoid sending full conversation history (~100K tokens down to ~2-5K per run).
docs/gateway/configuration-reference.md:1240 — when true, each heartbeat runs in a fresh session with no prior conversation history. Same isolation pattern as cron sessionTarget: "isolated". Reduces per-heartbeat token cost from ~100K to ~2-5K tokens.

The ~100K → ~2–5K promise only makes sense if each run actually starts from an empty transcript.

Code walk — why the bug is provable without runtime evidence

Step 1: heartbeat-runner.ts passes `forceNew: true` unconditionally

src/infra/heartbeat-runner.ts:808-824:

if (useIsolatedSession) {
  …
  const cronSession = resolveCronSession({
    cfg,
    sessionKey: isolatedSessionKey,
    agentId,
    nowMs: startedAt,
    forceNew: true,
  });

Where useIsolatedSession = heartbeat?.isolatedSession === true. So when our config sets isolatedSession: true, forceNew is always true.

Step 2: `resolveCronSession` generates a new sessionId but preserves the old sessionFile via spread

src/cron/isolated-agent/session.ts (abbreviated to the relevant branch):

if (!params.forceNew && entry?.sessionId) {
  // reuse-or-roll logic
} else {
  // No existing session or forced new
  sessionId = crypto.randomUUID();       // ✓ new id
  isNewSession = true;
  systemSent = false;
}

…

const sessionEntry: SessionEntry = {
  // Preserve existing per-session overrides even when rolling to a new sessionId.
  ...entry,                // ← this carries sessionFile forward
  sessionId,               // ← overridden
  updatedAt: params.nowMs,
  systemSent,
  // When starting a fresh session (forceNew / isolated), clear delivery routing…
  ...(isNewSession && {
    lastChannel: undefined,
    lastTo: undefined,
    lastAccountId: undefined,
    lastThreadId: undefined,
    deliveryContext: undefined,
    // sessionFile is intentionally NOT in this clear list
  }),
};

The isNewSession cleanup block clears delivery-routing state but not sessionFile. The returned entry has a new sessionId and the OLD sessionFile.

Step 3: `resolveSessionFilePath` prefers persisted `sessionFile` over recomputing from `sessionId`

src/config/sessions/paths.ts:263:

export function resolveSessionFilePath(
  sessionId: string,
  entry?: { sessionFile?: string },
  opts?: SessionFilePathOptions,
): string {
  const sessionsDir = resolveSessionsDir(opts);
  const candidate = entry?.sessionFile?.trim();
  if (candidate) {
    try {
      return resolvePathWithinSessionsDir(sessionsDir, candidate, { agentId: opts?.agentId });
    } catch {
      // Keep handlers alive when persisted metadata is stale/corrupt.
    }
  }
  return resolveSessionTranscriptPathInDir(sessionId, sessionsDir);
}

Because the entry always has a sessionFile after the first-ever run, the if (candidate) branch is taken and the function never falls through to computing a fresh path from sessionId.

Step 4: `resolveAndPersistSessionFile` fallback is also gated on `!baseEntry.sessionFile`

src/config/sessions/session-file.ts:17-27:

const baseEntry = params.sessionEntry ?? sessionStore[sessionKey] ?? { sessionId, updatedAt: Date.now() };
const fallbackSessionFile = params.fallbackSessionFile?.trim();
const entryForResolve =
  !baseEntry.sessionFile && fallbackSessionFile
    ? { ...baseEntry, sessionFile: fallbackSessionFile }
    : baseEntry;
const sessionFile = resolveSessionFilePath(sessionId, entryForResolve, {
  agentId: params.agentId,
  sessionsDir: params.sessionsDir,
});

fallbackSessionFile is only used when !baseEntry.sessionFile. Same gate — stale sessionFile wins.
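
The gate can be exercised on its own. `pickEntryForResolve` is a hypothetical extraction of the conditional shown above, and the file names are placeholders.

```typescript
interface Entry {
  sessionId: string;
  sessionFile?: string;
}

// Hypothetical extraction of the entryForResolve conditional.
function pickEntryForResolve(baseEntry: Entry, fallbackSessionFile?: string): Entry {
  const fallback = fallbackSessionFile?.trim();
  // The fallback is consulted only when the entry has no sessionFile of its own.
  return !baseEntry.sessionFile && fallback
    ? { ...baseEntry, sessionFile: fallback }
    : baseEntry;
}

const staleWins = pickEntryForResolve(
  { sessionId: "run-2", sessionFile: "run-1.jsonl" },
  "run-2.jsonl",
);
const fallbackUsed = pickEntryForResolve({ sessionId: "run-2" }, "run-2.jsonl");
```

With a stale `sessionFile` present, the caller-supplied fallback is ignored, so this layer cannot correct the path either.
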

Step 5: heartbeat-runner.ts' own cleanup does not rotate the current file

src/infra/heartbeat-runner.ts:825-861 — the archiveRemovedSessionTranscripts call only processes files from removedSessionFiles, which is populated exclusively from staleIsolatedSessionKey (the separate :heartbeat:heartbeat suffix-collapse case). It never touches cronSession.sessionEntry.sessionFile even though that file has, from the runner's perspective, been logically "rolled". So the old transcript stays on disk and stays in the store entry.

End-to-end consequence

After the first run that ever creates the heartbeat session entry:

sessionId rolls on every invocation ✓
sessionFile is frozen forever ✗
The transcript writer appends every new run's messages to the same physical file
The transcript reader loads the whole file on every run, so the model sees all prior runs as in-context history
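
A toy simulation makes the difference visible. It is a sketch under simplifying assumptions (one message per run, made-up paths, a hypothetical `simulate` function) and does not model the real store or transcript writer.

```typescript
interface Entry {
  sessionId: string;
  sessionFile?: string | undefined;
}

// Returns a map of transcript path -> number of runs that appended to it.
function simulate(runs: number, clearSessionFile: boolean): Map<string, number> {
  const transcripts = new Map<string, number>();
  let entry: Entry | undefined;
  for (let i = 1; i <= runs; i++) {
    const sessionId = `run-${i}`;
    const rolled: Entry = {
      ...entry,
      sessionId,
      ...(clearSessionFile && { sessionFile: undefined }),
    };
    // A persisted path wins; otherwise derive the path from the new sessionId.
    const file = rolled.sessionFile ?? `/sessions/${sessionId}.jsonl`;
    transcripts.set(file, (transcripts.get(file) ?? 0) + 1);
    entry = { ...rolled, sessionFile: file }; // the resolved path is persisted after the run
  }
  return transcripts;
}

const withoutFix = simulate(5, false); // one file, appended to by all five runs
const withFix = simulate(5, true);     // five files, one run each
```
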

"Fresh session per run" does not hold for any run past the first. lightContext: true's documented ~100K → ~2–5K token savings silently regresses as the file grows.

Forensic evidence from a live deployment

This was caught on a production heartbeat agent running 15-minute ticks with isolatedSession: true, lightContext: true.

sessions.json entry for agent:main:main:heartbeat:

{
  "sessionId": "f198e48b-84e6-4695-aabc-c4fed74d7cd1",
  "sessionFile": "/data/agents/main/sessions/34db8152-a3ac-4e7a-8c4a-9a38d9525339.jsonl",
  "updatedAt": 1775904199350
}

Three different UUIDs are observable across one logical session:

34db8152… — the filename (set when the file was first created, never rotated)
cbc883fc-6486-40d2-b0d1-6a102861f5df — the session id written into the transcript header (original first run, timestamp: "2026-04-10T22:09:51.937Z")
f198e48b… — sessionId currently in the store (generated by the most recent forceNew)

File stats after ~13 hours of 15-minute heartbeats:

| metric | value |
| --- | --- |
| lines | 390 |
| size | ~1.05 MB |
| distinct session records in the file | 1 |
| occurrences of the acknowledgment sentinel string | 87 |
| oldest transcript entry | `2026-04-10T22:09:51Z` |
| newest transcript entry | `2026-04-11T10:43:04Z` |

The model began few-shot-learning from its own ~87 prior responses on every run. Thinking traces reference the accumulated precedents as if they were protocol ("per the protocol", "the standard response") — it is following patterns set by a polluted transcript rather than the heartbeat prompt.

Impact

isolatedSession: true silently does nothing after the first run. Every existing deployment that relies on it has been running without isolation.
lightContext: true's documented token savings are optimistic by ~20x. The ~100K → ~2-5K figure only holds on the first run; every subsequent run incurs the full accumulated transcript.
Model behavior drifts toward the reinforced pattern. Any acknowledgment, summary, or tool-call sequence from an early run gets few-shot-learned by later runs and becomes sticky — even after config or prompt changes intended to alter the behavior.
Context-overflow cliff. On a long-running deployment the file eventually exceeds the model's context window. Compaction gives little real relief: on a transcript that is mostly tool-call noise it trips the safeguard "compaction-safeguard: no real conversation messages to summarize" and writes only a boundary marker.
No user-visible workaround. There is no chat/CLI command that resets a non-current session, so affected users have to rm the file by hand from the pod's filesystem.

Suggested fix

Add sessionFile: undefined to the isNewSession cleanup block in src/cron/isolated-agent/session.ts, right next to the existing delivery-routing clears. When the entry is returned with sessionFile undefined, resolveSessionFilePath correctly falls through to resolveSessionTranscriptPathInDir(sessionId, …) and a fresh transcript file is created for the new sessionId.

A PR with the one-line fix plus three regression tests (clears sessionFile when forceNew is true, clears sessionFile when session is stale, preserves sessionFile when reusing fresh session) is incoming from alexander-applyinnovations/openclaw:fix/heartbeat-isolated-session-file-rotation.

Related: #64196, the llama.cpp overflow detection fix, which ended up masking how severe the transcript accumulation is. Without that fix the same deployment was wedging silently on raw 400s instead of compacting into noise.

AI-assisted

Drafted with Claude Code (Claude Opus 4.6, 1M context), reviewed and verified by the author.
Code walk reviewed line-by-line against main at the time of filing.

extent analysis

TL;DR

The issue can be fixed by adding sessionFile: undefined to the isNewSession cleanup block in src/cron/isolated-agent/session.ts to ensure a fresh transcript file is created for each new session.

Guidance

Review the src/cron/isolated-agent/session.ts file and add sessionFile: undefined to the isNewSession cleanup block to fix the issue.
Verify that the fix works by checking that a new transcript file is created for each new session and that the old transcript file is not appended to.
Test the fix with regression tests, such as clears sessionFile when forceNew is true, clears sessionFile when session is stale, and preserves sessionFile when reusing fresh session.
Consider reviewing related issues, such as #64196, to ensure that the fix does not introduce any new problems.

Example

const sessionEntry: SessionEntry = {
  ...entry,
  sessionId,
  // ...
  ...(isNewSession && {
    // ...existing delivery-routing clears
    sessionFile: undefined, // add this line to fix the issue
  }),
};

Notes

The fix is specific to the src/cron/isolated-agent/session.ts file and may not apply to other parts of the codebase.
The issue is caused by the sessionFile not being cleared when a new session is created, resulting in the transcript file being appended to instead of rotated.

Recommendation

Apply the suggested fix by adding sessionFile: undefined to the isNewSession cleanup block in src/cron/isolated-agent/session.ts. This fix is specific to the issue described and should resolve the problem of transcript files not being rotated correctly.
