openclaw - [Bug]: large session checkpoint artifacts drive read I/O and memory pressure [1 pull request]

Large compaction checkpoint JSONL snapshots are retained by count only, allowing many 14-16 MB checkpoint files per session and increasing disk/page-cache pressure during gateway background work.

Root Cause

The live sample retained many large checkpoint snapshots because checkpoint metadata was trimmed by count only. The largest sampled files were approximately 14.4 MB to 15.8 MB each, and earlier live sampling during the same pressure investigation showed high gateway read I/O.

Fix Action

Fixed


Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

Large compaction checkpoint JSONL snapshots are retained by count only, allowing many 14-16 MB checkpoint files per session and increasing disk/page-cache pressure during gateway background work.

Steps to reproduce

  1. Run a long agent session that triggers repeated compaction checkpoints.
  2. Inspect the agent session directory for .checkpoint.<uuid>.jsonl transcript snapshots.
  3. Observe many retained checkpoint snapshots around 14-16 MB each, plus active session JSONL files of similar size.
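For step 3, the retained checkpoint bytes can be totalled per directory without opening any transcript. The following is a minimal sketch, not part of OpenClaw: `checkpointUsageByDir` and `CHECKPOINT_PATTERN` are hypothetical names, the pattern is inferred from the `.checkpoint.<uuid>.jsonl` naming in step 2, and grouping by containing directory assumes each session's snapshots sit next to its transcript.

```typescript
import { readdirSync, statSync } from "node:fs";
import { join } from "node:path";

// Assumption: snapshot names contain ".checkpoint.<uuid>" and end in ".jsonl".
const CHECKPOINT_PATTERN = /\.checkpoint\.[^/]+\.jsonl$/;

interface DirUsage {
  files: number; // number of checkpoint snapshots in the directory
  bytes: number; // their combined size on disk
}

// Walks `root` recursively and totals checkpoint snapshot sizes per directory.
// Only file metadata is read (stat); file contents are never opened.
function checkpointUsageByDir(
  root: string,
  usage: Map<string, DirUsage> = new Map<string, DirUsage>(),
): Map<string, DirUsage> {
  for (const entry of readdirSync(root, { withFileTypes: true })) {
    const full = join(root, entry.name);
    if (entry.isDirectory()) {
      checkpointUsageByDir(full, usage);
    } else if (entry.isFile() && CHECKPOINT_PATTERN.test(entry.name)) {
      const current = usage.get(root) ?? { files: 0, bytes: 0 };
      current.files += 1;
      current.bytes += statSync(full).size;
      usage.set(root, current);
    }
  }
  return usage;
}

// Example use (the path is a placeholder):
//   for (const [dir, u] of checkpointUsageByDir("/path/to/agent/sessions")) {
//     console.log(`${(u.bytes / 1e6).toFixed(1)} MB in ${u.files} checkpoints: ${dir}`);
//   }
```

A directory reporting hundreds of megabytes across a couple of dozen snapshots would match the behavior described in this report.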

Expected behavior

Checkpoint retention should enforce a total byte budget per session so repeated compactions cannot retain hundreds of megabytes of full transcript snapshots for one active session.

Actual behavior

The live sample retained many large checkpoint snapshots because checkpoint metadata was trimmed by count only. The largest sampled files were approximately 14.4 MB to 15.8 MB each, and earlier live sampling during the same pressure investigation showed high gateway read I/O.

OpenClaw version

NOT_ENOUGH_INFO

Operating system

Ubuntu/WSL2

Install method

Live development gateway

Model

NOT_ENOUGH_INFO

Provider / routing chain

NOT_ENOUGH_INFO

Additional provider/model setup details

This is gateway session storage behavior and is not provider-specific.

Logs, screenshots, and evidence

Redacted path/identity note: file contents were not read or pasted. Local usernames, agent names, session ids, and exact local paths are redacted.

find [redacted agent sessions root] -type f -name '*.jsonl' -printf '%s %p\n' | sort -n | tail -n 25

14447080 [redacted session checkpoint path]
14585352 [redacted session checkpoint path]
14599853 [redacted session checkpoint path]
14629847 [redacted session checkpoint path]
14647026 [redacted session checkpoint path]
14674312 [redacted session checkpoint path]
14708346 [redacted session checkpoint path]
14746008 [redacted session checkpoint path]
14771178 [redacted session checkpoint path]
14785596 [redacted session checkpoint path]
14811056 [redacted session checkpoint path]
14849148 [redacted session checkpoint path]
14884341 [redacted active session transcript path]
14960622 [redacted session checkpoint path]
15008210 [redacted session checkpoint path]
15081120 [redacted session checkpoint path]
15136799 [redacted session checkpoint path]
15174219 [redacted session checkpoint path]
15211794 [redacted session checkpoint path]
15436664 [redacted session checkpoint path]
15481443 [redacted session checkpoint path]
15547023 [redacted session checkpoint path]
15702680 [redacted session checkpoint path]
15847347 [redacted session checkpoint path]
15883964 [redacted active session transcript path]

Earlier live sampling during the same pressure investigation:
pidstat -d -p [redacted gateway pid] 1 5 average: kB_rd/s=40768.06 kB_wr/s=835.13
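
To put the two measurements above on the same scale, the short calculation below totals the 23 checkpoint sizes from the listing (the two active session transcripts are excluded) and converts the `pidstat` read rate to bytes per second. It is an illustrative sketch only. It assumes `pidstat` reports `kB_rd/s` in units of 1024 bytes, and it does not show that the observed reads were checkpoint reads. The listing also does not say whether all files belong to one session.

```typescript
// Checkpoint snapshot sizes in bytes, copied from the listing above.
// The two active session transcript entries are intentionally left out.
const checkpointSizes: number[] = [
  14447080, 14585352, 14599853, 14629847, 14647026, 14674312,
  14708346, 14746008, 14771178, 14785596, 14811056, 14849148,
  14960622, 15008210, 15081120, 15136799, 15174219, 15211794,
  15436664, 15481443, 15547023, 15702680, 15847347,
];

const totalBytes = checkpointSizes.reduce((sum, size) => sum + size, 0);
const meanBytes = totalBytes / checkpointSizes.length;

// Average read rate from the pidstat sample.
// Assumption: the kB unit is 1024 bytes.
const readKbPerSec = 40768.06;
const readBytesPerSec = readKbPerSec * 1024;

// How many mean-sized snapshots that read rate corresponds to per second.
const snapshotEquivalentsPerSec = readBytesPerSec / meanBytes;

console.log(`${checkpointSizes.length} checkpoint files, ${(totalBytes / 1e6).toFixed(1)} MB total`);
console.log(`mean snapshot size: ${(meanBytes / 1e6).toFixed(1)} MB`);
console.log(
  `read rate: ${(readBytesPerSec / 1e6).toFixed(1)} MB/s, ` +
    `about ${snapshotEquivalentsPerSec.toFixed(1)} mean-sized snapshots per second`,
);
```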

Code refs from the investigated checkout:

  • src/gateway/session-compaction-checkpoints.ts retained up to 25 checkpoints per session.
  • src/gateway/session-compaction-checkpoints.ts copied full pre-compaction transcript snapshots when each source file was under the per-snapshot cap.
  • packages/memory-host-sdk/src/host/session-files.ts already skips checkpoint files for dreaming transcript scans, so the remaining storage pressure is checkpoint retention itself.

Impact and severity

  • Affected: long-running gateway sessions with repeated compaction checkpoints.
  • Severity: Medium-high; retained checkpoint files can inflate the session directory and increase disk read/page-cache pressure during gateway work.
  • Frequency: Observed once in a live 2026-05-21 gateway performance investigation with many retained 14-16 MB checkpoint artifacts.
  • Consequence: More retained session data than needed, higher I/O pressure, and more memory/page-cache churn under load.

Additional information

The attached fix should keep the existing count cap but also enforce a retained checkpoint snapshot byte budget per session and remove trimmed checkpoint files.
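
The retention rule described above can be sketched as a pure function: keep the newest checkpoints while both the count cap and a per-session byte budget hold, and report everything else as trimmed. This is a minimal illustration, not the code from the attached fix. `CheckpointMeta`, `RetentionPolicy`, `trimCheckpoints`, and the `maxTotalBytes` field are hypothetical names, and the real metadata shape in `session-compaction-checkpoints.ts` may differ.

```typescript
// Hypothetical metadata shape for one retained checkpoint snapshot.
interface CheckpointMeta {
  id: string;
  createdAt: number; // epoch milliseconds
  sizeBytes: number; // size of the snapshot file on disk
}

// Hypothetical policy: the existing count cap plus an assumed byte budget.
interface RetentionPolicy {
  maxCount: number; // 25 in the investigated checkout
  maxTotalBytes: number; // per-session budget for retained snapshots
}

interface TrimResult {
  kept: CheckpointMeta[]; // newest first
  trimmed: CheckpointMeta[]; // the caller should delete these snapshot files
}

function trimCheckpoints(
  checkpoints: readonly CheckpointMeta[],
  policy: RetentionPolicy,
): TrimResult {
  // Consider the newest checkpoints first so that the oldest are trimmed first.
  const newestFirst = [...checkpoints].sort((a, b) => b.createdAt - a.createdAt);
  const kept: CheckpointMeta[] = [];
  const trimmed: CheckpointMeta[] = [];
  let keptBytes = 0;
  let limitReached = false;

  for (const checkpoint of newestFirst) {
    const fits =
      !limitReached &&
      kept.length < policy.maxCount &&
      keptBytes + checkpoint.sizeBytes <= policy.maxTotalBytes;
    if (fits) {
      kept.push(checkpoint);
      keptBytes += checkpoint.sizeBytes;
    } else {
      // Once one checkpoint does not fit, every older one is trimmed as well,
      // so the kept set is always a contiguous run of the newest snapshots.
      limitReached = true;
      trimmed.push(checkpoint);
    }
  }
  return { kept, trimmed };
}
```

With 15 MB snapshots and a hypothetical 40 MB budget, this keeps the two newest checkpoints and trims the rest, regardless of the 25-entry count cap. One design choice to note: a single snapshot larger than the whole budget is trimmed too, so a real implementation may want to always keep the most recent checkpoint. Deleting the files behind `trimmed` is left to the caller so that the selection logic stays easy to test.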
