openclaw - 💡(How to fix) Fix Auto-compaction preemptive check: no debug logging when route='fits'; cache-ttl pruning may mask estimate for large-context models [1 participants]

openclaw2026-04-18 14:57:36

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

openclaw/openclaw#68609•Fetched 2026-04-19 15:09:37

View on GitHub

Comments

Participants

Timeline

Reactions

Author

matthewcarano

Participants

matthewcarano

The preemptive auto-compaction check (shouldPreemptivelyCompactBeforePrompt) exhibited two issues with main-agent Opus sessions:

At the default 200K budget, it never triggered preemptive compaction. Opus sessions grew to 167K+ tokens with zero compactions, requiring manual intervention or hitting actual context overflow (timeout-compaction).
After lowering contextTokens to 150K, preemptive compaction eventually fired — confirming the mechanism works but the threshold interaction with cache-ttl pruning and large bootstrap loads creates a window where the estimate stays artificially low.

A critical observability gap compounds the issue: shouldPreemptivelyCompactBeforePrompt emits no log when route === "fits", making it impossible to distinguish "precheck ran and decided it fits" from "precheck never ran."

Error Message

Opus main-agent session climbed to 167K+ tokens (budget 200K) with zero compactions. Manual compaction required.
Log audit of gateway.log (full history): prior to the fix, every auto-compaction event was for Sonnet. Every Opus compaction event was timeout-compaction (overflow recovery).
On Apr 13, an Opus session hit context overflow and threw 15 consecutive "prompt too large" errors in rapid succession without recovery.
After lowering agents.defaults.contextTokens to 150K and starting a fresh session: auto-compaction eventually fired at ~10:33 EDT on Apr 18, confirmed in gateway.log: auto-compaction succeeded for anthropic/claude-opus-4-6; retrying prompt
The session /status initially showed "Compactions: 1" which was a carried-forward artifact from a prior session summary, not a new event. Real compaction count reached 2 after the auto-compaction fired.

Root Cause

The preemptive auto-compaction check (shouldPreemptivelyCompactBeforePrompt) exhibited two issues with main-agent Opus sessions:

At the default 200K budget, it never triggered preemptive compaction. Opus sessions grew to 167K+ tokens with zero compactions, requiring manual intervention or hitting actual context overflow (timeout-compaction).
After lowering contextTokens to 150K, preemptive compaction eventually fired — confirming the mechanism works but the threshold interaction with cache-ttl pruning and large bootstrap loads creates a window where the estimate stays artificially low.

Code Example

agents.defaults.contextTokens: 150000   (also tested at 170000, 200000)
compaction.mode: "default"
compaction.model: "anthropic/claude-sonnet-4-5"
compaction.reserveTokens: 50000
compaction.reserveTokensFloor: 30000
compaction.memoryFlush.softThresholdTokens: 15000
contextPruning.mode: "cache-ttl"   (aggressive, 30min TTL on tool results)

RAW_BUFFERClick to expand / collapse

Environment

OpenClaw version: 2026.4.15 (also reproduced on 2026.4.11)
Platform: macOS (Mac Studio, native install)
Auth mode: Anthropic OAuth (mode: "token")
Main agent model: anthropic/claude-opus-4-6
Subagent / heartbeat model: anthropic/claude-sonnet-4-5
Compaction model: anthropic/claude-sonnet-4-5
Interface: Discord (channels/threads as sessions)

Relevant config

agents.defaults.contextTokens: 150000   (also tested at 170000, 200000)
compaction.mode: "default"
compaction.model: "anthropic/claude-sonnet-4-5"
compaction.reserveTokens: 50000
compaction.reserveTokensFloor: 30000
compaction.memoryFlush.softThresholdTokens: 15000
contextPruning.mode: "cache-ttl"   (aggressive, 30min TTL on tool results)

Summary

The preemptive auto-compaction check (shouldPreemptivelyCompactBeforePrompt) exhibited two issues with main-agent Opus sessions:

At the default 200K budget, it never triggered preemptive compaction. Opus sessions grew to 167K+ tokens with zero compactions, requiring manual intervention or hitting actual context overflow (timeout-compaction).
After lowering contextTokens to 150K, preemptive compaction eventually fired — confirming the mechanism works but the threshold interaction with cache-ttl pruning and large bootstrap loads creates a window where the estimate stays artificially low.

Observed behavior

Opus main-agent session climbed to 167K+ tokens (budget 200K) with zero compactions. Manual compaction required.
Log audit of gateway.log (full history): prior to the fix, every auto-compaction event was for Sonnet. Every Opus compaction event was timeout-compaction (overflow recovery).
On Apr 13, an Opus session hit context overflow and threw 15 consecutive "prompt too large" errors in rapid succession without recovery.
After lowering agents.defaults.contextTokens to 150K and starting a fresh session: auto-compaction eventually fired at ~10:33 EDT on Apr 18, confirmed in gateway.log: auto-compaction succeeded for anthropic/claude-opus-4-6; retrying prompt
The session /status initially showed "Compactions: 1" which was a carried-forward artifact from a prior session summary, not a new event. Real compaction count reached 2 after the auto-compaction fired.

Expected behavior

Preemptive compaction check should fire reliably regardless of model, and the decision path should be observable in logs.

Source trace / hypothesis

Traced the call site in pi-embedded-runner:

Preemptive check invoked at approx line 6654/6655
Default context engine is LegacyContextEngine (no ownsCompaction property) → bypass does NOT apply
contextTokenBudget defaults to 2e5 (200K) if not explicitly passed
shouldPreemptivelyCompactBeforePrompt uses estimatePrePromptTokens (character-based: estimatedChars / 4) compared against contextTokenBudget - reserveTokens

Leading hypothesis: contextPruning.mode: "cache-ttl" aggressively removes tool-result content from messages before estimatePrePromptTokens runs. The character-based estimate sees the pruned message set and stays well below the budget threshold, so the preemptive check returns route: "fits" indefinitely at the default 200K budget. At the reduced 150K budget, the margin is tight enough that the estimate eventually catches up.

This explains why Sonnet sessions compact correctly (different token accumulation pattern, more frequent tool calls, hit threshold between prune cycles) while Opus sessions at 200K do not (larger headroom, pruning consistently masks the estimate).

Pruning-disabled test pending to confirm this hypothesis.

Observability gap

The [context-overflow-precheck] log string is emitted ONLY when shouldCompact === true or when truncation routes fire. There is no log when route === "fits" — meaning the decision to NOT compact is completely silent. This made the bug extremely difficult to diagnose.

Request: Add a debug-level log inside shouldPreemptivelyCompactBeforePrompt (or at the call site) showing estimatedPromptTokens, promptBudgetBeforeReserve, overflowTokens, and route on every evaluation. This would make token budget issues trivially diagnosable.

Additional observations

Session budget is resolved once at session start and cached for the session lifetime. Config changes to contextTokens mid-session have no effect; a fresh session is required.
Possibly related: #56072 (daily session reset archiving without memory flush). Our install exhibits .reset files appearing at ~3-4 AM daily for large sessions.

Minimal reproduction

Main agent model set to anthropic/claude-opus-4-6
contextPruning.mode: "cache-ttl" enabled (30min TTL)
Large bootstrap files (AGENTS.md, MEMORY.md etc.) totaling ~70K tokens at session start
contextTokens at default (200K) or high value
Run a long session with periodic tool use; let context accumulate
Observe: no auto-compaction log entry at default budget; session requires manual compaction or hits overflow

Willing to provide

Config (redacted)
gateway.log excerpts showing Sonnet auto-compactions and Opus timeout-compactions
Source-trace notes

extent analysis

TL;DR

Lower the agents.defaults.contextTokens budget or adjust the contextPruning.mode to ensure preemptive compaction checks trigger reliably.

Guidance

Verify the hypothesis: Test with contextPruning.mode disabled to confirm if aggressive pruning is causing the issue.
Adjust the budget: Lower agents.defaults.contextTokens to a value where preemptive compaction triggers, as seen in the issue when it was set to 150K.
Add logging for observability: Implement a debug-level log in shouldPreemptivelyCompactBeforePrompt to show estimatedPromptTokens, promptBudgetBeforeReserve, overflowTokens, and route for easier diagnosis.
Monitor session behavior: Observe sessions with large bootstrap files and periodic tool use to ensure auto-compaction triggers as expected.

Example

No code snippet is provided due to the complexity of the issue and the need for a thorough understanding of the pi-embedded-runner and LegacyContextEngine.

Notes

The provided steps are based on the information given and may require adjustments based on the specific implementation and requirements of the system. The issue seems to stem from the interaction between contextPruning.mode and the preemptive compaction check, particularly with the anthropic/claude-opus-4-6 model.

Recommendation

Apply a workaround by adjusting the agents.defaults.contextTokens budget to a lower value, such as 150K, to ensure preemptive compaction checks trigger reliably until a more permanent fix can be implemented. This is based on the observation that lowering the budget allowed the preemptive compaction to eventually fire.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

FAQ

Expected behavior

Preemptive compaction check should fire reliably regardless of model, and the decision path should be observable in logs.

#api #embedding generation #cache error #pipeline error #runtime error

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

openclaw - 💡(How to fix) Fix Auto-compaction preemptive check: no debug logging when route='fits'; cache-ttl pruning may mask estimate for large-context models [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Code Example

Environment

Relevant config

Summary

Observed behavior

Expected behavior

Source trace / hypothesis

Observability gap

Additional observations

Minimal reproduction

Willing to provide

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

FAQ

Expected behavior

Still need to ship something?

TRENDING

openclaw - 💡(How to fix) Fix Auto-compaction preemptive check: no debug logging when route='fits'; cache-ttl pruning may mask estimate for large-context models [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Code Example

Environment

Relevant config

Summary

Observed behavior

Expected behavior

Source trace / hypothesis

Observability gap

Additional observations

Minimal reproduction

Willing to provide

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

FAQ

Expected behavior

Still need to ship something?

RELATED_DISCOVERY

TRENDING