gemini-cli: Session JSONL files accumulate dead weight and can become too large

Problem

Session files written by packages/core/src/services/chatRecordingService.ts are append-only JSONL. Over long sessions they can grow very large even though much of the file is no longer part of the latest live conversation state.

A concrete example in this repo is memory-tests/large-chat-session.json, which is about 54 MB. This is larger than the current read_file size limit and is impractical for background consumers like memory or skill extraction to inspect directly.

Why this happens

The loader reconstructs the latest session by streaming the file and applying records in order:

  • message records are keyed by id, so later records with the same id replace older versions
  • $set records update metadata, so earlier metadata updates are superseded
  • $rewindTo truncates the live message map, so records after the rewind target can become unreachable
  • tool result payloads are embedded in toolCalls[].result, and can be very large
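
The replay rules above can be sketched as follows. This is a minimal illustration, not the actual loader: the record shapes and the `replaySession` name are assumptions, and the sketch keeps the rewind target itself while dropping everything after it, which may differ from the real behavior.

```typescript
// Hypothetical record shapes, inferred from the description above.
type Message = { id: string; [key: string]: unknown };

interface LiveState {
  metadata: Record<string, unknown>;
  // A Map keeps first-insertion order, so replacing an id keeps its position.
  messages: Map<string, Message>;
}

function replaySession(lines: string[]): LiveState {
  const state: LiveState = { metadata: {}, messages: new Map() };
  for (const line of lines) {
    if (line.trim() === "") continue;
    const record = JSON.parse(line) as Record<string, unknown>;
    const patch = record["$set"];
    const rewindTarget = record["$rewindTo"];
    const id = record["id"];
    if (typeof patch === "object" && patch !== null) {
      // Later $set records overwrite earlier metadata values.
      Object.assign(state.metadata, patch);
    } else if (typeof rewindTarget === "string") {
      // Assumption: keep the target, drop every message recorded after it.
      const ids = Array.from(state.messages.keys());
      const index = ids.indexOf(rewindTarget);
      if (index >= 0) {
        ids.slice(index + 1).forEach((staleId) => state.messages.delete(staleId));
      }
    } else if (typeof id === "string") {
      // A later record with the same id replaces the older version.
      state.messages.set(id, record as Message);
    }
  }
  return state;
}
```

Every line that does not survive into the returned state is dead weight in the raw file.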

This means the raw file can contain a lot of dead weight:

  • older versions of messages that were later rewritten with tokens, tool calls, or tool results
  • superseded $set records such as frequent lastUpdated updates
  • messages made unreachable by $rewindTo
  • large tool outputs that may be useful for display/debugging but are not always needed in the primary session log

Impact

  • Resume/session scanning has to stream unnecessarily large files.
  • Background services that need session evidence cannot safely ask tools to read raw session files.
  • Long-running sessions can produce files dominated by stale records and tool output rather than the latest history.

Suggested direction

Add a session compaction or checkpointing path that rewrites JSONL into a canonical current-state form:

  • one metadata record containing the latest metadata
  • only the current live message records, in order
  • no superseded message versions
  • no stale $set records
  • no rewound-away records
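
As a rough sketch of that canonical form, assuming the live state has already been reconstructed: the function below emits one metadata line followed by the live messages in order. Writing the metadata as a single `$set` record is an assumption made here for illustration; the issue does not specify the shape of the metadata record, and `toCanonicalJsonl` is a hypothetical name.

```typescript
// Hypothetical message shape; the real record type may carry more fields.
type Message = { id: string; [key: string]: unknown };

function toCanonicalJsonl(
  metadata: Record<string, unknown>,
  messages: Message[],
): string {
  // Assumption: the single metadata record is written as one $set line.
  const lines = [JSON.stringify({ $set: metadata })];
  for (const message of messages) {
    lines.push(JSON.stringify(message));
  }
  // JSONL: one JSON value per line, with a trailing newline.
  return lines.join("\n") + "\n";
}
```

Because the output uses the same record kinds as the append-only log, an existing loader could in principle read a compacted file without changes.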

Separately, consider capping, summarizing, or externalizing large toolCalls[].result payloads so that the main session file retains its structure without embedding unbounded tool output.
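
One possible shape for the capping option is sketched below. The `capToolResults` name, the `maxChars` parameter, and the `truncated` / `originalLength` / `preview` fields are all hypothetical; only `toolCalls[].result` comes from the issue.

```typescript
// Hypothetical shapes; only toolCalls[].result is taken from the issue text.
type ToolCall = { result?: unknown; [key: string]: unknown };
type Message = { id: string; toolCalls?: ToolCall[]; [key: string]: unknown };

function capToolResults(message: Message, maxChars: number): Message {
  if (!message.toolCalls) return message;
  const toolCalls = message.toolCalls.map((call): ToolCall => {
    if (call.result === undefined) return call;
    const serialized = JSON.stringify(call.result);
    if (serialized.length <= maxChars) return call;
    // Replace the oversized payload with a small stub that records its size.
    return {
      ...call,
      result: {
        truncated: true,
        originalLength: serialized.length,
        preview: serialized.slice(0, maxChars),
      },
    };
  });
  // Return a copy so the caller's message is not mutated.
  return { ...message, toolCalls };
}
```

The stub keeps the tool call itself visible in the session log, so consumers can still see which tools ran even when the full output is stored elsewhere or discarded.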

The existing loadConversationRecord() already has most of the semantics needed to reconstruct the latest live state; compaction could reuse that logic and then write a fresh canonical JSONL file atomically.
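
The atomic write step could look roughly like the sketch below, which uses the common write-to-temp-then-rename pattern with Node's `fs` module. The `writeFileAtomic` helper is hypothetical and is not part of the existing service. The temporary file is created next to the target so that the rename stays on one filesystem.

```typescript
import * as fs from "fs";

// Hypothetical helper: replace targetPath with contents in one rename step.
function writeFileAtomic(targetPath: string, contents: string): void {
  // Same directory as the target, so the rename does not cross filesystems.
  const tmpPath = `${targetPath}.${process.pid}.tmp`;
  try {
    fs.writeFileSync(tmpPath, contents, "utf8");
    // Readers see either the old file or the new one, never a partial write.
    fs.renameSync(tmpPath, targetPath);
  } catch (err) {
    // Best-effort cleanup of the temp file before rethrowing.
    fs.rmSync(tmpPath, { force: true });
    throw err;
  }
}
```

This sketch does not fsync the file or its directory, so it protects against partially written files but makes no durability guarantee across a crash.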
