hermes - Feature: progressive background pre-compaction cache


Feature

Add progressive/background pre-compaction so Hermes can do most of the summarization work before the session reaches the hard compression threshold.

Motivation

Current context compression is synchronous: when the threshold is hit, Hermes pauses the active turn and summarizes a large transcript. This makes long sessions feel slow exactly when the user is trying to continue working.

A better model is to maintain a validated precompact cache in the background, then use it when real compression is needed.

Proposed design

After a completed turn, if the session is above a lower soft threshold:

  1. Select only the stable prefix of the transcript.
    • Exclude active tool calls.
    • Exclude the most recent protect_last_n messages (the protected tail).
    • Exclude anything that might be affected by /retry, /undo, queued/steered input, etc.
  2. Summarize that stable prefix in a low-priority background job.
  3. Store a candidate summary keyed by:
    • session id
    • last summarized message id
    • prefix hash
    • system prompt hash
    • compression prompt/model version
  4. On actual compression:
    • validate the prefix hash still matches
    • summarize only the delta since the candidate
    • merge candidate + delta + protected recent tail
    • rotate/split the session as today
  5. If validation fails, discard the candidate and fall back to normal synchronous compression.
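The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Hermes code: messages are taken to be plain dicts with an "id" field, `tool_call_pending` is a hypothetical flag marking an active tool call, and `Candidate`, `stable_prefix`, `build_candidate`, and `plan_compression` are names invented for this example. Only `protect_last_n` comes from the proposal itself.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable, Optional


def prefix_hash(messages: list) -> str:
    """Hash a message list deterministically (sorted keys, compact separators)."""
    payload = json.dumps(messages, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def stable_prefix(messages: list, protect_last_n: int) -> list:
    """Messages older than the protected tail, cut before the first message
    that still has an active tool call (hypothetical flag)."""
    cutoff = max(0, len(messages) - protect_last_n)
    prefix = []
    for msg in messages[:cutoff]:
        if msg.get("tool_call_pending"):
            break
        prefix.append(msg)
    return prefix


@dataclass(frozen=True)
class Candidate:
    """Disposable cache entry; the fields mirror the proposed cache key."""
    session_id: str
    last_message_id: str
    prefix_len: int
    prefix_hash: str
    system_prompt_hash: str
    compressor_version: str
    summary: str


def build_candidate(session_id: str, messages: list, protect_last_n: int,
                    system_prompt_hash: str, compressor_version: str,
                    summarize: Callable[[list], str]) -> Optional[Candidate]:
    """Background step: summarize the stable prefix without touching `messages`."""
    prefix = stable_prefix(messages, protect_last_n)
    if not prefix:
        return None
    return Candidate(session_id, prefix[-1]["id"], len(prefix), prefix_hash(prefix),
                     system_prompt_hash, compressor_version, summarize(prefix))


def plan_compression(session_id: str, messages: list, candidate: Optional[Candidate],
                     system_prompt_hash: str, compressor_version: str,
                     protect_last_n: int):
    """At the hard threshold, return ("incremental", delta, tail) when the
    candidate validates, else ("full", prefix, tail) as the synchronous fallback."""
    cutoff = max(0, len(messages) - protect_last_n)
    tail = messages[cutoff:]
    valid = (
        candidate is not None
        and candidate.session_id == session_id
        and candidate.system_prompt_hash == system_prompt_hash
        and candidate.compressor_version == compressor_version
        and candidate.prefix_len <= cutoff
        and prefix_hash(messages[:candidate.prefix_len]) == candidate.prefix_hash
    )
    if valid:
        # Only the messages between the candidate and the tail need summarizing.
        return "incremental", messages[candidate.prefix_len:cutoff], tail
    return "full", messages[:cutoff], tail
```

On a hit, the caller would summarize only the returned delta and merge it with `candidate.summary` and the tail. On a miss, the candidate is simply dropped and the whole prefix goes through the existing synchronous path. Hashing the full prefix again at validation time is the simplest check; a real implementation might prefer an incremental hash to avoid re-serializing long transcripts.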

Requirements / guardrails

  • Background jobs must never mutate live session messages.
  • Compression locks should still guard the final session rotation.
  • Candidate summaries must be treated as disposable cache, not canonical history.
  • Cost controls: debounce, soft threshold, max one candidate per session, optional disable.
  • Observability: report precompact hits/misses and saved synchronous summarization time.
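The cost controls and observability counters could be gathered in one small gate object, sketched below. Every name here (`PrecompactPolicy`, `PrecompactGate`, the threshold and debounce values) is an assumption made for illustration; the proposal does not specify defaults or an API. The clock is passed in as `now` so the logic stays easy to test.

```python
from dataclasses import dataclass, field


@dataclass
class PrecompactPolicy:
    enabled: bool = True                 # "optional disable"
    soft_threshold_tokens: int = 60_000  # illustrative value only
    debounce_seconds: float = 30.0       # illustrative value only


@dataclass
class PrecompactGate:
    policy: PrecompactPolicy
    in_flight: set = field(default_factory=set)       # sessions with a running job
    last_started: dict = field(default_factory=dict)  # session id -> start time
    hits: int = 0
    misses: int = 0
    saved_seconds: float = 0.0

    def should_start(self, session_id: str, token_count: int, now: float) -> bool:
        """Apply disable flag, soft threshold, one-job-per-session, and debounce."""
        if not self.policy.enabled:
            return False
        if token_count < self.policy.soft_threshold_tokens:
            return False
        if session_id in self.in_flight:
            return False
        last = self.last_started.get(session_id)
        if last is not None and now - last < self.policy.debounce_seconds:
            return False
        return True

    def mark_started(self, session_id: str, now: float) -> None:
        self.in_flight.add(session_id)
        self.last_started[session_id] = now

    def mark_finished(self, session_id: str) -> None:
        self.in_flight.discard(session_id)

    def record(self, hit: bool, saved_seconds: float = 0.0) -> None:
        """Count a precompact hit or miss at real compression time."""
        if hit:
            self.hits += 1
            self.saved_seconds += saved_seconds
        else:
            self.misses += 1
```

A caller would check `should_start` after each completed turn, call `mark_started` before launching the background job, and call `record` once real compression either uses or discards the candidate. The gate only decides whether to start work; it never touches session messages, which keeps it consistent with the first guardrail.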

Why this is separate from stale payload eviction

Payload eviction reduces how often compression is needed. Pre-compaction reduces the latency when compression is eventually needed. They are complementary.

Related issues / concepts

  • Existing compression path and session split behavior
  • LCM/lossless-claw style incremental summaries
  • KV-cache reuse for compression is related but different: this feature is model/provider-agnostic caching of validated summary work.
