openclaw - 💡(How to fix) Fix compaction: reserveTokens default (15000) leaves insufficient headroom for structured summarization at typical session sizes [1 comments, 2 participants]

openclaw, 2026-05-05 09:30:31

GitHub stats

openclaw/openclaw#77780 • Fetched 2026-05-06 06:21:35

Author

axonrelaybot

Participants

axonrelaybot

clawsweeper[bot]

Timeline (top)

mentioned ×3, subscribed ×3, commented ×1

The default reserveTokens: 15000 threshold causes compaction to fire when only ~15k tokens remain in the context window. At that point the summarization model is itself severely context-constrained and cannot produce a structured, task-preserving summary. Instead it falls back to a verbatim transcript tail, silently dropping all goal state, in-progress task queues, and approved work.

Error Message

Manual /compact triggered at 182,615 tokens with reserveTokens: 15000 (200k context window):

Summarization model had ~17k tokens of headroom (200,000 − 182,615 = 17,385) to produce a summary of ~182k tokens of prior context
Result: verbatim transcript replay stored as summary field — not an LLM-generated summary
All structured state (goals, in-progress tasks, approved task queues) silently dropped
Contrast: safeguard compaction that fired earlier at 123,845 tokens produced a fully structured summary (## Goal, ## Progress, ## Done, ## Pending)

Root Cause

Compaction summarization is itself a model call. At 182k/200k fill, only about 17k tokens remain (close to the 15k reserve) for the summarization model to:

Read the prior conversation context (injected as input)
Generate a structured summary (output)

15k tokens is insufficient for both at any realistic session length. The summarizer is token-starved and cannot produce structured output. It effectively echoes back the tail of the transcript instead.
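
The headroom arithmetic behind this argument can be checked with a short sketch. It assumes that automatic compaction fires once usage reaches the context window minus reserveTokens (which matches the ~92% figure quoted in this issue) and that "200k" means exactly 200,000 tokens. The function names are illustrative and are not part of OpenClaw.

```python
# Sketch only: helper names are hypothetical, not OpenClaw APIs.
CONTEXT_WINDOW = 200_000  # assumed exact size of the "200k" window


def compaction_trigger(context_window: int, reserve_tokens: int) -> int:
    """Token count at which compaction would fire (assumed: window minus reserve)."""
    return context_window - reserve_tokens


def headroom(context_window: int, used_tokens: int) -> int:
    """Tokens left in the window for the summarizer's input and output."""
    return context_window - used_tokens


print(compaction_trigger(CONTEXT_WINDOW, 15_000))  # default threshold
print(headroom(CONTEXT_WINDOW, 182_615))           # manual /compact (poor summary)
print(headroom(CONTEXT_WINDOW, 123_845))           # safeguard compaction (good summary)
```

Under these assumptions the two observed compactions ran with roughly 17k and 76k tokens of headroom respectively, which is the gap the rest of this issue is about.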

This means: the sessions most in need of a high-quality structured summary are exactly the sessions that receive the worst summary quality, because they only trigger compaction when they're already critically full.

Summary

Observed behavior

The difference is not a code path difference between manual and safeguard compaction — it is purely a function of how much headroom the summarization model has when it runs.

User impact

Every session that hits the reserveTokens: 15000 threshold (fires at ~92% fill on a 200k window) has had its compaction quality silently degraded
Approved task queues, in-flight investigations, and structured goal state are lost at exactly the moment context pressure is highest
Users experience this as "the agent forgot" — with no visible signal that compaction quality was degraded
This is not a new edge case: it is the default behavior for any user running a long or investigation-heavy session

Proposed fix

Raise the default reserveTokens threshold substantially. Based on testing:

reserveTokens: 30000 (fires at ~85% fill / ~170k tokens on 200k window) — sufficient headroom for structured summarization at typical session lengths
reserveTokens: 40000 (fires at ~80% fill / ~160k tokens) — more headroom, slightly more frequent compactions on heavy days

The structured safeguard-triggered compaction at 123k tokens confirms that ~76k tokens of remaining headroom (200,000 − 123,845 = 76,155) produces a correct structured summary. The right default is somewhere between 15k (clearly insufficient) and 76k (clearly sufficient).

A minimum viable fix: reserveTokens: 25000–30000. This alone would have prevented the quality degradation observed.
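
To compare the candidate values side by side, the sketch below tabulates where compaction would fire for each one. It assumes the trigger point is simply the context window minus reserveTokens on a 200,000-token window; the helper name is hypothetical.

```python
# Sketch only: assumes trigger = context window - reserveTokens.
def fill_at_trigger(context_window: int, reserve_tokens: int) -> tuple[int, float]:
    """Return (trigger token count, fill percentage) for a given reserve."""
    trigger = context_window - reserve_tokens
    return trigger, 100 * trigger / context_window


for reserve in (15_000, 25_000, 30_000, 40_000):
    trigger, pct = fill_at_trigger(200_000, reserve)
    print(f"reserveTokens={reserve:>6}: fires at {trigger:,} tokens ({pct:.1f}% fill)")
```

The same function can be pointed at other window sizes, since a fixed reserve corresponds to a different fill percentage on each model.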

Additional recommendation

Consider adding a warning or metric when the summarization result appears to be a verbatim transcript tail rather than a structured summary — e.g. if the summary contains raw tool results, JSON blobs, or turn-format markers (- User:, - Assistant:, - Tool result:). This would make the failure mode visible rather than silent.
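
One possible shape for such a check is sketched below. It is a heuristic only: the turn markers and the section headings are taken from this issue, while the function name, the threshold of three marker lines, and the rule "markers present and no headings" are assumptions made for illustration. It does not reflect how OpenClaw stores or inspects summaries.

```python
# Sketch only: a heuristic detector, not an OpenClaw API.
TURN_MARKERS = ("- User:", "- Assistant:", "- Tool result:")
STRUCTURED_HEADINGS = ("## Goal", "## Progress", "## Done", "## Pending")


def looks_like_transcript_tail(summary: str, min_marker_lines: int = 3) -> bool:
    """Flag a summary that has several turn-format lines and no section headings."""
    lines = [line.strip() for line in summary.splitlines()]
    marker_lines = sum(1 for line in lines if line.startswith(TURN_MARKERS))
    has_headings = any(line.startswith(STRUCTURED_HEADINGS) for line in lines)
    return marker_lines >= min_marker_lines and not has_headings


structured = "## Goal\nFix compaction\n## Progress\nRaised reserveTokens"
tail = '- User: run tests\n- Assistant: running\n- Tool result: {"ok": true}'

print(looks_like_transcript_tail(structured))
print(looks_like_transcript_tail(tail))
```

A check like this could feed a log warning or a counter. It would miss degraded summaries that use other formats, so it is better treated as a signal than as a gate.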

Environment

OpenClaw 2026.5.3-1 (2eae30e)
Context window: 200k (venice/claude-sonnet-4-6)
Default reserveTokens: 15000
Session fill at manual compaction: 182,615 tokens (~91%)
Session fill at safeguard compaction (good quality): 123,845 tokens (~62%)

extent analysis

TL;DR

Raising the default reserveTokens threshold to at least 25000-30000 can prevent compaction quality degradation by providing sufficient headroom for structured summarization.

Guidance

Increase the reserveTokens threshold to a value between 25000-30000 to ensure the summarization model has enough tokens to produce a structured summary.
Consider implementing a warning or metric to detect when the summarization result is a verbatim transcript tail rather than a structured summary.
Test different reserveTokens values to find the optimal threshold for typical session lengths.
Review the session fill percentages to determine the best threshold for preventing compaction quality degradation.

Example

No code snippet is provided as the issue is related to configuration and model performance.

Notes

The optimal reserveTokens threshold may vary depending on the specific use case and session lengths. Testing and monitoring are necessary to determine the best value.

Recommendation

As a workaround, raise reserveTokens from its default of 15000 to at least 25000-30000. This gives the summarization model more headroom to produce structured summaries, especially in sessions under high context pressure.
