compression: gateway: threshold: 0.85 model: claude-sonnet-4 # cheaper model for summarization preserve_recent: 10 # keep last 10 turns verbatim agent: threshold: 0.50 auto: true preserve_tool_state: true

Feature Request: Dual-Layer Context Compression

Problem Statement

OpenClaw conversations degrade as the context window fills up. Users must manually run /compress or start new sessions, breaking flow and losing continuity. For the core use case — long-running persistent agents (the AGENTS.md + SOUL.md + MEMORY.md pattern) — this is a hard ceiling on usefulness.

A typical power-user session hits context limits within 2-4 hours of active use. At that point, the agent loses older context silently or errors out, forcing a manual restart.

Proposed Solution

Implement two independent compression layers:

Gateway Pre-Compression (~85% context usage)

Triggered automatically when context reaches ~85% of the model's window.
Summarizes older conversation turns while preserving:
- Recent turns (last N messages, configurable)
- Active tool call state and pending operations
- Key facts and decisions from the session
Uses a configurable summarization model (can be cheaper than the active model).

Agent-Level Compression (~50% context usage, configurable)

More aggressive summarization targeting ~50% reduction.
Preserves task-critical state: current goals, file paths, error context.
Generates a structured "session state" object rather than prose summary.
Can be triggered manually or automatically.

Configuration

compression:
  gateway:
    threshold: 0.85
    model: claude-sonnet-4  # cheaper model for summarization
    preserve_recent: 10     # keep last 10 turns verbatim
  agent:
    threshold: 0.50
    auto: true
    preserve_tool_state: true

Transparency

Log compression events with metadata (tokens before/after, what was compressed).
Optional user notification: "Context compressed — older messages summarized."
Never lose active tool state or pending operations.

User Impact

Power users: Indefinite conversation sessions without manual intervention.
Agent workflows: AGENTS.md patterns (memory, heartbeats, background tasks) work reliably over 8+ hour sessions.
All users: No more "context full" errors or mysterious quality degradation.

This is the single biggest UX gap between OpenClaw and Hermes Agent. Hermes ships dual-layer compression and it fundamentally changes the experience — conversations feel unlimited.

Technical Considerations

Summarization quality: Compression is lossy. Important details can be lost. The structured state approach (agent-level) mitigates this.
Tool state preservation: Active exec sessions, file edit contexts, and multi-step workflows must survive compression.
Model-aware thresholds: Different models have different context windows; thresholds should be absolute token counts or percentages.
Cost: Summarization calls add cost, but less than restarting sessions or using larger context windows.
Testing: Need integration tests that verify state preservation across compression boundaries.

Priority

HIGHEST. This is table-stakes for any agent platform targeting persistent, long-running sessions. Without it, OpenClaw's most powerful patterns (AGENTS.md, heartbeats, memory systems) hit a hard wall every few hours. Hermes Agent has shipped this and users cite it as a primary reason for switching.

extent analysis

Fix Plan

To implement dual-layer context compression, follow these steps:

Gateway Pre-Compression:
1. Introduce a new summarization model (claude-sonnet-4) for gateway pre-compression.
2. Set up a threshold (0.85) to trigger automatic compression when context reaches 85% of the model's window.
3. Implement a function to summarize older conversation turns while preserving recent turns, active tool call state, and key facts.
Agent-Level Compression:
1. Develop a more aggressive summarization algorithm targeting 50% reduction.
2. Preserve task-critical state, such as current goals, file paths, and error context.
3. Generate a structured "session state" object instead of a prose summary.
Configuration: Update the configuration file (compression.yaml) to include the following settings:

compression: gateway: threshold: 0.85 model: claude-sonnet-4 preserve_recent: 10 agent: threshold: 0.50 auto: true preserve_tool_state: true

* **Transparency**:
1. Log compression events with metadata (tokens before/after, what was compressed).
2. Implement optional user notification: "Context compressed — older messages summarized."

### Example Code
Here's an example of how the gateway pre-compression function could be implemented in Python:
```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

def gateway_pre_compression(context, threshold, model, preserve_recent):
  # Load the summarization model and tokenizer
  summarization_model = AutoModelForSeq2SeqLM.from_pretrained(model)
  tokenizer = AutoTokenizer.from_pretrained(model)

  # Calculate the number of tokens to compress
  num_tokens_to_compress = int(len(context) * (1 - threshold))

  # Summarize the older conversation turns
  input_ids = tokenizer.encode(context[:-preserve_recent], return_tensors='pt')
  output = summarization_model.generate(input_ids, max_length=128)

  # Preserve recent turns, active tool call state, and key facts
  preserved_context = context[-preserve_recent:]
  preserved_context += tokenizer.decode(output[0], skip_special_tokens=True)

  return preserved_context

Verification

To verify that the fix worked, test the following scenarios:

Compression is triggered automatically when context reaches 85% of the model's window.
Compression preserves recent turns, active tool call state, and key facts.
Agent-level compression reduces context by 50% while preserving task-critical state.
Log compression events with metadata and optional user notification.

Extra Tips

Monitor the performance of the summarization models and adjust the thresholds as needed.
Consider implementing a feedback mechanism to improve the

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

openclaw - 💡(How to fix) Fix Feature Request: Dual-Layer Context Compression [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Code Example

Feature Request: Dual-Layer Context Compression

Problem Statement

Proposed Solution

Gateway Pre-Compression (~85% context usage)

Agent-Level Compression (~50% context usage, configurable)

Configuration

Transparency

User Impact

Technical Considerations

Priority

extent analysis

Fix Plan

Verification

Extra Tips

Still need to ship something?

TRENDING

openclaw - 💡(How to fix) Fix Feature Request: Dual-Layer Context Compression [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Code Example

Feature Request: Dual-Layer Context Compression

Problem Statement

Proposed Solution

Gateway Pre-Compression (~85% context usage)

Agent-Level Compression (~50% context usage, configurable)

Configuration

Transparency

User Impact

Technical Considerations

Priority

extent analysis

Fix Plan

Verification

Extra Tips

Still need to ship something?

RELATED_DISCOVERY

TRENDING