hermes - ✅ (Solved) Expose session compaction as a structured API primitive [2 pull requests, 1 comment, 1 participant]

GitHub stats
NousResearch/hermes-agent#15801 (fetched 2026-04-26 05:24:58)
Comments: 1 · Participants: 1 · Timeline: 6 · Reactions: 0
Timeline (top): labeled ×4, commented ×1, cross-referenced ×1

/compress currently looks like an interaction-layer command. CLI / gateway / ACP can trigger compact behavior, but HTTP clients do not appear to have a stable session compaction primitive.

For API clients, compact would be cleaner as a structured session operation instead of requiring clients to drive CLI slash-command behavior.

One possible shape:

POST /v1/sessions/{session_id}/compact
{
  "focus": "optional focus text"
}

Response shape, if compact creates a continuation session:

{
  "source_session_id": "old-session-id",
  "new_session_id": "continuation-session-id",
  "message": "Compacted context and started a continuation session.",
  "status": "completed"
}

The important part is new_session_id. API clients need a reliable continuation handle so they can hydrate the new session before switching local state.
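
A minimal client-side sketch of how the proposed call might be consumed, assuming the endpoint and response fields shown above. The helper names (`build_compact_request`, `parse_compact_response`, `CompactResult`) and the base URL are hypothetical; nothing here exists in hermes-agent today.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote


@dataclass
class CompactResult:
    source_session_id: str
    new_session_id: str
    status: str
    message: str = ""


def build_compact_request(base_url: str, session_id: str,
                          focus: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the proposed POST /v1/sessions/{id}/compact call."""
    payload = {"focus": focus} if focus else {}
    path = f"/v1/sessions/{quote(session_id, safe='')}/compact"
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_compact_response(body: bytes) -> CompactResult:
    """Parse the proposed response and insist on a continuation handle."""
    doc = json.loads(body)
    if not doc.get("new_session_id"):
        raise ValueError("compact response has no new_session_id")
    return CompactResult(
        source_session_id=doc["source_session_id"],
        new_session_id=doc["new_session_id"],
        status=doc.get("status", ""),
        message=doc.get("message", ""),
    )
```

A client would send the request with `urllib.request.urlopen(req)`, parse the body, hydrate the session named by `new_session_id`, and only then switch its local state away from the source session.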

Root Cause

Without a structured compact API, external clients have only brittle options:

  • send /compress through /v1/runs as a prompt;
  • drive the interactive CLI/TUI command path;
  • fake a local continuation session.

All three are poor API contracts. A session-level compact primitive would let CLI, ACP, Web UI, and third-party clients share the same Agent-side compact behavior while keeping presentation-layer slash commands separate.
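
One way to picture the shared primitive is a single agent-side function with thin adapters on top. The sketch below assumes an in-memory store and a pluggable summarizer; `SessionStore`, `compact_session`, `http_compact`, and `slash_compress` are illustrative names, not hermes-agent APIs.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Session:
    session_id: str
    messages: List[str] = field(default_factory=list)
    parent_id: Optional[str] = None


class SessionStore:
    """Hypothetical in-memory stand-in for the agent's session database."""

    def __init__(self) -> None:
        self.sessions: Dict[str, Session] = {}

    def create(self, messages: List[str], parent_id: Optional[str] = None) -> Session:
        session = Session(uuid.uuid4().hex, list(messages), parent_id)
        self.sessions[session.session_id] = session
        return session


Summarizer = Callable[[List[str], Optional[str]], str]


def compact_session(store: SessionStore, session_id: str,
                    summarize: Summarizer, focus: Optional[str] = None) -> dict:
    """Single agent-side primitive: summarize, then open a continuation session."""
    source = store.sessions[session_id]
    summary = summarize(source.messages, focus)
    child = store.create([summary], parent_id=source.session_id)
    source.messages = []  # assumption: the parent's transcript is emptied
    return {
        "source_session_id": source.session_id,
        "new_session_id": child.session_id,
        "message": "Compacted context and started a continuation session.",
        "status": "completed",
    }


def http_compact(store: SessionStore, session_id: str,
                 body: dict, summarize: Summarizer) -> dict:
    """HTTP adapter: returns the structured result unchanged."""
    return compact_session(store, session_id, summarize, body.get("focus"))


def slash_compress(store: SessionStore, session_id: str,
                   args: str, summarize: Summarizer) -> str:
    """Slash-command adapter: same primitive, presentation-layer text only."""
    result = compact_session(store, session_id, summarize, args or None)
    return f"Compressed; continuing in session {result['new_session_id']}"
```

Both adapters call the same function, so the continuation behavior cannot drift between the CLI and HTTP paths; only the formatting of the result differs.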

PR fix notes

PR #10288: feat: add busy command API and compressor info for context compression

Description (problem / solution / changelog)

Problem

When /compress or automatic context compression runs, the TUI input prompt remains active and unblocked. The user can type while compression is in-flight, which causes input to be consumed by a prompt that is about to be replaced.

Additionally, there is no visible indication of which compressor is active or what settings it uses.

Solution

run_agent.py

  • _compressing flag on AIAgent -- set to True while compression is in-flight, reset in a finally block so the flag is always cleared even on error
  • _compressor_info() method -- returns a human-readable string describing the active compressor. For the built-in compressor it returns "compressor", for smart compressors (e.g. IroninCompressor) it returns "ironin-compressor (keep>=6, drop<5)"
  • Auto-compression message updated from ⟳ compacting context… to 🗜️ ironin-compressor (keep>=6, drop<5) -- compacting 510 messages (~224,081 tokens)...
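
The flag-and-finally pattern described in the bullets above can be sketched as follows. This is not the PR's code: `AgentSketch`, `SmartCompressor`, and the `describe()` hook are hypothetical stand-ins, and only the `_compressing` and `_compressor_info` names come from the PR description.

```python
from typing import Any, List


class SmartCompressor:
    """Hypothetical compressor that can describe its own settings."""

    def __init__(self, keep: int, drop: int) -> None:
        self.keep = keep
        self.drop = drop

    def describe(self) -> str:
        return f"smart-compressor (keep>={self.keep}, drop<{self.drop})"

    def compress(self, messages: List[Any]) -> List[Any]:
        # Toy strategy for the sketch: keep only the most recent messages.
        return messages[-self.keep:]


class AgentSketch:
    def __init__(self, compressor: Any) -> None:
        self._compressor = compressor
        self._compressing = False

    def _compressor_info(self) -> str:
        # Fall back to a generic label when the compressor has no describe().
        describe = getattr(self._compressor, "describe", None)
        return describe() if callable(describe) else "compressor"

    def compress(self, messages: List[Any]) -> List[Any]:
        self._compressing = True
        try:
            return self._compressor.compress(messages)
        finally:
            # Runs on success and on error, so the flag never sticks.
            self._compressing = False
```

Resetting the flag in `finally` matters because a caller that polls `_compressing` to block input would otherwise stay blocked forever after a failed compression.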

cli.py

  • /compress wrapped in _busy_command() -- the existing context manager that sets _command_running = True (blocking input rendering) and shows a spinner + status message in the TUI status bar
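
A rough sketch of what a busy-command context manager of this kind could look like. The real `_busy_command()` in cli.py is not reproduced here; the `status` argument, the `CliSketch` class, and `accept_input` are assumptions made for illustration.

```python
from contextlib import contextmanager
from typing import Iterator, List


class CliSketch:
    def __init__(self) -> None:
        self._command_running = False
        self.status = ""
        self.accepted: List[str] = []

    @contextmanager
    def _busy_command(self, status: str) -> Iterator[None]:
        """Block input and show a status message while a long command runs."""
        self._command_running = True
        self.status = status
        try:
            yield
        finally:
            # Always unblock input, even if the wrapped command raised.
            self._command_running = False
            self.status = ""

    def accept_input(self, text: str) -> bool:
        """Reject keystrokes while a busy command is in flight."""
        if self._command_running:
            return False
        self.accepted.append(text)
        return True


cli = CliSketch()
with cli._busy_command("Compressing context..."):
    cli.accept_input("typed during compression")  # rejected
cli.accept_input("typed afterwards")  # accepted
```

Wrapping the command in a context manager keeps the block and unblock steps paired in one place, which is what makes it reusable by plugins that run other long operations.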

Output

Manual compression (/compress):

⚙️  /compress
⏳ Compressing context...
🗜️  /compress  [ironin-compressor (keep>=6, drop<5)]
🗜️  Compressing 878 messages (~545,632 tokens)...
  ✅ Compressed: 878 → 142 messages (~545,632 → ~89,210 tokens)

Auto-compression:

  🗜️  ironin-compressor (keep>=6, drop<5) -- compacting 510 messages (~224,081 tokens)...

Design

This lays the groundwork for a formal busy command API that plugins and extensions can use to block input during long operations. The _busy_command() context manager already exists in cli.py; this PR demonstrates its use and adds the _compressing flag on AIAgent for the agent loop path (which runs outside the CLI).

Closes #10281

Changed files

  • cli.py (modified, +38/-31)
  • plugins/context_engine/ironin_compressor (added, +1/-0)
  • run_agent.py (modified, +23/-6)

PR #15418: fix(gateway): resume should follow compression continuations

Description (problem / solution / changelog)

Salvage of #15403 onto current main.

Summary

Gateway /resume <title> now follows compression continuation chains, matching CLI behavior (cli.py:4753). After /compress, the titled parent ends with an empty transcript and the live messages land on a child session; gateway was reopening the empty parent.

Changes

  • gateway/run.py: route resolved target through SessionDB.resolve_resume_session_id() with debug-log fallback
  • tests/gateway/test_resume_command.py: regression test — compressed parent + child continuation, assert /resume switches to child
  • scripts/release.py: AUTHOR_MAP entry for @simbam99
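
Following a continuation chain amounts to walking parent-to-child links until no further child exists. The sketch below illustrates that idea with a plain dictionary; it is not the implementation of `SessionDB.resolve_resume_session_id()`, and the `follow_continuations` name and the mapping shape are assumptions.

```python
from typing import Dict, Optional


def follow_continuations(children: Dict[str, Optional[str]], session_id: str) -> str:
    """Return the last session in a compression continuation chain.

    `children` maps a compressed parent session id to the id of the
    continuation session created for it (or None if it has none).
    """
    seen = set()
    current = session_id
    while True:
        seen.add(current)
        nxt = children.get(current)
        # Stop at the end of the chain, or if the links loop back on themselves.
        if nxt is None or nxt in seen:
            return current
        current = nxt


# A titled parent that was compressed twice resolves to the live grandchild.
chain = {"titled-parent": "child", "child": "grandchild"}
live_session = follow_continuations(chain, "titled-parent")
```

The `seen` set is a defensive guard: a corrupted chain that points back to an earlier session terminates instead of looping forever.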

Validation

scripts/run_tests.sh tests/gateway/test_resume_command.py → 10/10 passing.

Credit: @simbam99 (original PR #15403, cherry-picked with authorship preserved).

Changed files

  • gateway/run.py (modified, +7/-1)
  • scripts/release.py (modified, +1/-0)
  • tests/gateway/test_resume_command.py (modified, +34/-0)



Current API surface

Current gateway/platforms/api_server.py appears to expose /v1/runs, /v1/chat/completions, /v1/responses, and run events, but not a compact endpoint like POST /v1/sessions/{id}/compact:

https://github.com/NousResearch/hermes-agent/blob/main/gateway/platforms/api_server.py

Related context I found:

I also searched for existing issues/PRs around compact endpoint, session compact API, compact HTTP, and v1 sessions compact; I did not find an issue specifically tracking the HTTP API contract.


extent analysis

TL;DR

Implement a new API endpoint, POST /v1/sessions/{session_id}/compact, to provide a structured session operation for compacting sessions.

Guidance

  • Review the proposed API endpoint shape and response format to ensure it meets the requirements of API clients.
  • Investigate the current API surface in gateway/platforms/api_server.py to determine the best approach for adding the new endpoint.
  • Consider the related context and previous work on compression and compaction metadata to ensure consistency with existing functionality.
  • Evaluate the potential impact on external clients and the benefits of providing a standardized compact API.

Example

POST /v1/sessions/{session_id}/compact HTTP/1.1
Content-Type: application/json

{
  "focus": "optional focus text"
}

Notes

The implementation of the new endpoint will require careful consideration of the underlying session management and compaction logic. Additionally, the response format and any potential error handling will need to be defined.
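
As a starting point for that discussion, error handling could be expressed as a pure validation step that runs before any compaction work. Everything in this sketch is an assumption: the function name, the status codes chosen (400, 404, 409), and the error payload shape are not defined by the issue or by hermes-agent.

```python
from typing import Any, Dict, Set, Tuple


def validate_compact_request(session_id: str, body: Any,
                             known_sessions: Set[str],
                             compacting: Set[str]) -> Tuple[int, Dict[str, Any]]:
    """Return (http_status, payload) for a proposed compact request."""
    if not isinstance(body, dict):
        return 400, {"status": "error", "message": "body must be a JSON object"}
    if session_id not in known_sessions:
        return 404, {"status": "error", "message": f"unknown session: {session_id}"}
    if session_id in compacting:
        # Assumed policy: one compaction per session at a time.
        return 409, {"status": "error", "message": "compaction already in progress"}
    focus = body.get("focus")
    if focus is not None and not isinstance(focus, str):
        return 400, {"status": "error", "message": "focus must be a string"}
    return 200, {"status": "accepted", "focus": focus}
```

Keeping validation separate from the compaction itself lets the same checks back an HTTP handler and any other entry point without duplicating the rules.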

Recommendation

Implement the proposed POST /v1/sessions/{session_id}/compact endpoint so that external clients get a reliable, standardized compact API. This is a new API primitive rather than a workaround: it separates presentation-layer slash commands from the underlying session management and gives clients a cleaner, more robust way to interact with the Agent.
