hermes: [RFC] Session Memory Compact (zero-cost context compression for long sessions)

Motivation

In long-running sessions, accumulated conversation history steadily eats into the context window. Eventually the model starts losing earlier context, tool outputs get truncated, and the user experiences "doesn't remember what we were doing" failures.

Hermes already has a compressor module, but it operates at a coarse level — compressing entire blocks when a threshold is reached.

Proposal: Session Memory Compact

A lightweight, always-on compression layer that:

  1. Runs at session start: past turns are re-read and compressed before the model sees the history
  2. Semantic compression: each user/assistant turn is compressed independently with a dedicated prompt like: "Summarize what happened in this exchange in 1-2 sentences. Keep all names, file paths, decisions, and error messages intact."
  3. Append-only: compressed versions are prepended to the history; originals remain until the token budget forces eviction
  4. No latency cost: compression runs in a background thread during API response wait time
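The per-turn compression and background execution described above could look roughly like the following. This is a minimal sketch, not Hermes code: `Turn`, `compact_turn`, `compact_in_background`, and the injected `summarize` callable are all hypothetical names, and `summarize` stands in for whatever model call would actually produce the summary. Only the prompt text comes from the proposal.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Optional

# Prompt text quoted from the proposal; everything else here is hypothetical.
COMPACT_PROMPT = (
    "Summarize what happened in this exchange in 1-2 sentences. "
    "Keep all names, file paths, decisions, and error messages intact."
)


@dataclass
class Turn:
    user: str
    assistant: str
    summary: Optional[str] = None  # filled in once compression finishes


def compact_turn(turn: Turn, summarize: Callable[[str], str]) -> Turn:
    """Compress one exchange independently of every other turn."""
    text = f"{COMPACT_PROMPT}\n\nUser: {turn.user}\nAssistant: {turn.assistant}"
    turn.summary = summarize(text)
    return turn


def compact_in_background(
    turns: List[Turn],
    summarize: Callable[[str], str],
    executor: ThreadPoolExecutor,
) -> List[Future]:
    """Queue every not-yet-compressed turn on a worker pool and return at once."""
    return [
        executor.submit(compact_turn, turn, summarize)
        for turn in turns
        if turn.summary is None
    ]
```

Passing `summarize` in as a parameter keeps the sketch independent of any particular model client. The caller can continue with the main API request and collect the futures later, which is the "runs during API waits" idea in item 4.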

Key design choices

  • Per-turn granularity: each user/assistant exchange is compressed independently, not slabs of N turns at once
  • Readable output: compressed summaries retain key identifiers (files, errors, decisions) so the model can still reference them
  • Zero-config: no thresholds to tune; compression is always active
  • Works with trailing cleanup: compressed history can then be trimmed from the tail more aggressively
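One way to read "originals remain until the token budget forces eviction" is sketched below. The assumptions are mine, not the RFC's: `build_history` and `estimate_tokens` are hypothetical names, the whitespace word count is a crude stand-in for a real tokenizer, and the oldest originals are the first to be swapped for their summaries.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    original: str
    summary: Optional[str] = None  # None until compression has finished


def estimate_tokens(text: str) -> int:
    # Crude stand-in: whitespace word count, not a real tokenizer.
    return len(text.split())


def build_history(turns: List[Turn], budget: int) -> List[str]:
    """Keep originals while they fit; swap the oldest for summaries first."""
    rendered = [turn.original for turn in turns]
    total = sum(estimate_tokens(text) for text in rendered)
    for i, turn in enumerate(turns):  # assumption: evict oldest first
        if total <= budget:
            break
        if turn.summary is None:
            continue  # nothing to swap in yet; leave the original in place
        saved = estimate_tokens(rendered[i]) - estimate_tokens(turn.summary)
        if saved > 0:  # only swap when the summary is actually shorter
            rendered[i] = turn.summary
            total -= saved
    return rendered
```

If the history still exceeds the budget after every available summary has been swapped in, this sketch returns it as is; any further trimming would be left to the existing cleanup step.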

Benefits

  • Extends effective context by 30-50 percent in long sessions
  • No user-facing latency (runs during API waits)
  • Graceful degradation — the more verbose the session, the more tokens saved
  • Opens the door for longer autonomous runs (delegate_task, cron) without context collapse
