openclaw - Feature: tiered context compaction (tool-output-first compression) [2 comments, 2 participants]

GitHub stats
openclaw/openclaw#58978 · Fetched 2026-04-08 02:30:26
Comments: 2 · Participants: 2 · Timeline: 2 (commented ×2) · Reactions: 0

Motivation

Large tool outputs from exec, web_fetch, and similar tools consume context disproportionately. A single web_fetch call can return 20–50KB of markdown, and exec outputs from build logs or database queries can be even larger. When context limits are reached, the current compaction strategy summarizes the entire conversation at once — losing conversational nuance, user intent, and decision history that is far more valuable per-token than raw tool output.

This creates a poor trade-off: the agent forgets why it was doing something in order to preserve what a tool returned verbatim.

Additionally, when compaction fails (e.g., due to API errors or rate limits), repeated automatic retries waste API calls without making progress.

Proposed Solution

1. Micro-compaction pre-pass

Add a lightweight compaction stage that runs before full context compaction triggers. This pre-pass would:

  • Identify tool result blocks (exec output, web_fetch content, read file contents) above a configurable threshold (e.g., >2KB).
  • Summarize each tool output individually, preserving key facts and discarding verbatim content.
  • Replace the original tool result with the summary + a marker like [compacted: tool output summarized, original was ~18KB].
  • Leave conversational turns (user messages, assistant reasoning) untouched.

This buys significant context headroom without sacrificing conversational coherence.
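The pre-pass described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not OpenClaw's actual implementation: the `Message` type, the `micro_compact` function, and the `summarize` callback are hypothetical names, and the 2 KB default simply mirrors the example threshold from the proposal.

```python
from dataclasses import dataclass, replace
from typing import Callable, List

# Assumed default, mirroring the ">2KB" example in the proposal.
THRESHOLD_BYTES = 2 * 1024


@dataclass(frozen=True)
class Message:
    """Hypothetical message shape; the real history format may differ."""
    role: str      # "user", "assistant", or "tool"
    content: str


def micro_compact(
    messages: List[Message],
    summarize: Callable[[str], str],
    threshold_bytes: int = THRESHOLD_BYTES,
) -> List[Message]:
    """Summarize oversized tool results; leave every other message untouched."""
    compacted: List[Message] = []
    for msg in messages:
        size = len(msg.content.encode("utf-8"))
        if msg.role == "tool" and size > threshold_bytes:
            summary = summarize(msg.content)
            marker = (
                "[compacted: tool output summarized, "
                f"original was ~{round(size / 1024)}KB]"
            )
            compacted.append(replace(msg, content=f"{summary}\n{marker}"))
        else:
            # Conversational turns and small tool results pass through as-is.
            compacted.append(msg)
    return compacted
```

The `summarize` callback is where a model call would go; passing it in keeps the pass itself free of any API dependency and easy to test with a stub.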

2. Circuit breaker for compaction failures

Add a circuit breaker that stops auto-compaction after N consecutive failures (default: 3). Behavior:

  • After N failures, pause auto-compaction for a configurable cooldown period.
  • Log a warning so the user/operator is aware.
  • Resume attempts after the cooldown, or allow manual trigger.

This prevents runaway API waste when the summarization endpoint is down or rate-limited.
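One possible shape for this breaker is sketched below. The class name, method names, logger name, and the 300-second default cooldown are all assumptions made for illustration; only the default of 3 consecutive failures comes from the proposal. A manual trigger would simply skip the `allow_auto()` check.

```python
import logging
import time
from typing import Callable, Optional

logger = logging.getLogger("compaction")  # hypothetical logger name


class CompactionCircuitBreaker:
    """Pauses auto-compaction after N consecutive failures, then resumes."""

    def __init__(
        self,
        max_failures: int = 3,            # default from the proposal
        cooldown_seconds: float = 300.0,  # assumed default, not from the issue
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow_auto(self) -> bool:
        """Return True if an automatic compaction attempt may run now."""
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self.cooldown_seconds:
            # Cooldown elapsed: close the breaker and resume attempts.
            self._opened_at = None
            self._failures = 0
            return True
        return False

    def record_success(self) -> None:
        """Reset the failure streak after a successful compaction."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure and open the breaker once the limit is reached."""
        self._failures += 1
        if self._failures >= self.max_failures and self._opened_at is None:
            self._opened_at = self._clock()
            logger.warning(
                "Auto-compaction paused for %.0fs after %d consecutive failures",
                self.cooldown_seconds,
                self._failures,
            )
```

Injecting the clock makes the cooldown testable without sleeping. This sketch resets the failure count when the cooldown ends; a stricter variant could allow a single probe attempt and reopen immediately if it fails.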

Technical Notes

  • The micro-compaction pass would operate on the message history, targeting messages with role: "tool" or tool result content blocks.
  • It could use a smaller/cheaper model for summarization (e.g., Haiku) since the task is straightforward extraction, not reasoning.
  • The threshold for triggering micro-compaction vs. full compaction could be configured independently:
    • Micro-compaction at 70% context usage
    • Full compaction at 85% context usage
  • Existing [compacted] markers in the codebase suggest partial infrastructure already exists.
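The independent thresholds could be expressed as a small tier selector. The sketch below is hypothetical: `CompactionThresholds` and `select_tier` are illustrative names, and treating each threshold as inclusive (usage at or above the value triggers the tier) is an assumption the issue does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompactionThresholds:
    """Fractions of the context window; defaults mirror the issue's example."""
    micro: float = 0.70
    full: float = 0.85


def select_tier(
    used_tokens: int,
    context_limit: int,
    thresholds: CompactionThresholds = CompactionThresholds(),
) -> str:
    """Return "none", "micro", or "full" for the current context usage."""
    if context_limit <= 0:
        raise ValueError("context_limit must be positive")
    usage = used_tokens / context_limit
    # Check the higher threshold first so full compaction takes precedence.
    if usage >= thresholds.full:
        return "full"
    if usage >= thresholds.micro:
        return "micro"
    return "none"
```

With a 200K-token window and the default thresholds, 150K tokens in use would select the micro tier and 180K would select full compaction.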

Alternatives Considered

Alternative | Why not
Aggressive truncation (just cut tool outputs at N chars) | Loses potentially important details without intelligent selection
Always stream tool outputs to files instead of inline | Adds complexity for the agent to manage file references; breaks simple workflows
Larger context windows only | Doesn't solve the fundamental cost/efficiency issue; 200K contexts are expensive
User-managed compaction (manual trigger only) | Poor UX; users shouldn't need to manage context budgets

extent analysis

TL;DR

Implement a micro-compaction pre-pass to summarize large tool outputs before full context compaction triggers, and add a circuit breaker to prevent excessive API calls during compaction failures.

Guidance

  • Identify tool result blocks above a configurable threshold (e.g., >2KB) and summarize them individually to preserve key facts and discard verbatim content.
  • Implement a circuit breaker that stops auto-compaction after N consecutive failures (default: 3) and pauses for a configurable cooldown period to prevent API waste.
  • Configure the threshold for triggering micro-compaction vs. full compaction independently, such as micro-compaction at 70% context usage and full compaction at 85% context usage.
  • Consider using a smaller/cheaper model for summarization, such as Haiku, since the task is straightforward extraction, not reasoning.

Example

The issue itself provides no code example. The proposal amounts to extending the existing compaction strategy with two additions: a micro-compaction pre-pass over tool results and a circuit breaker around automatic compaction.

Notes

The issue observes that existing [compacted] markers in the codebase suggest partial infrastructure for the micro-compaction pre-pass already exists, but it does not confirm this. The exact implementation will depend on the project's requirements and constraints.

Recommendation

Implement the proposed micro-compaction pre-pass and circuit breaker. The pre-pass reclaims context from verbatim tool output before conversational nuance has to be summarized away, and the circuit breaker stops repeated compaction retries from wasting API calls when the summarization endpoint is failing or rate-limited.
