openclaw - ✅(Solved) Fix [Bug] Auto-compaction never fires when Anthropic prompt cache hit rate is ~100% [1 pull requests, 1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#66520Fetched 2026-04-15 06:25:53
View on GitHub
Comments
0
Participants
1
Timeline
2
Reactions
0
Author
Participants
Timeline (top)
cross-referenced ×1referenced ×1

Auto-compaction does not trigger automatically even when the context window is exceeded (observed at 153% = 305k/200k), as long as the Anthropic prompt cache absorbs most tokens.

Error Message

/status output at time of issue:

Model: anthropic/claude-sonnet-4-6
Tokens: 3 in / 140 out
Cache: 100% hit · 305k cached, 99 new
Context: 305k/200k (153%) · Compactions: 0

Root Cause

Root cause hypothesis

Fix Action

Workaround

Manual /compact command. Enabling compaction.notifyUser: true has no effect since compaction never triggers.

PR fix notes

PR #66716: fix: auto-compaction fires on fresh cached token counts (#66520)

Description (problem / solution / changelog)

Summary

  • Bug: runPreflightCompactionIfNeeded returned early when totalTokensFresh === true without checking the compaction threshold, so auto-compaction never triggered for sessions with fresh token counts — even at 153% of the context window
  • Root cause: The early-return optimization (skip transcript estimation when fresh data is available) accidentally bypassed the threshold comparison entirely
  • Fix: Restructure token resolution so fresh persisted totals (which include cacheRead from Anthropic prompt caching) are projected forward and checked against contextWindow - reserveTokens - softThreshold before deciding whether to compact

Details

When Anthropic's prompt cache absorbs nearly all tokens (100% hit rate), derivePromptTokens correctly computes input + cacheRead + cacheWrite (e.g. 99 + 305,000 + 0 = 305,099), and this is persisted as totalTokens with totalTokensFresh: true. However, runPreflightCompactionIfNeeded had this logic:

if (!shouldUseTranscriptFallback) {
    return entry;  // BUG: skips threshold check entirely
}

The fix moves the threshold check into both branches (fresh and stale), using the fresh persisted value directly when available.

Edge cases verified

  • 100% cache hit (Anthropic): totalTokens=305k, contextWindow=200k — compaction now fires
  • 0% cache hit: Stale tokens fall through to transcript estimation path — behavior unchanged
  • Partial cache: 180k total with 166k threshold — fires correctly
  • Below threshold: 50k total with 166k threshold — no compaction, as expected
  • Heartbeat/CLI: Still skipped regardless of token count
  • Non-Anthropic providers: No change — providers without caching have cacheRead=0, so totalTokens is just input + cacheWrite, same as before

Test plan

  • shouldRunPreflightCompaction unit tests: 100% cache hit triggers, below threshold skips, partial cache boundary correct
  • runPreflightCompactionIfNeeded integration tests: fresh tokens above threshold trigger compaction, below threshold skip, stale tokens use transcript fallback, heartbeat skips
  • All existing compaction tests pass unchanged
  • Session usage persistence tests pass (65 tests)
  • Followup runner tests pass (19 tests)
  • Preemptive compaction tests pass (9 tests)

Fixes #66520

🤖 Generated with Claude Code

Changed files

  • src/auto-reply/reply/agent-runner-memory.test.ts (modified, +298/-1)
  • src/auto-reply/reply/agent-runner-memory.ts (modified, +67/-22)
  • src/auto-reply/reply/reply-state.test.ts (modified, +51/-0)

Code Example

Model: anthropic/claude-sonnet-4-6
Tokens: 3 in / 140 out
Cache: 100% hit · 305k cached, 99 new
Context: 305k/200k (153%) · Compactions: 0
RAW_BUFFERClick to expand / collapse

Summary

Auto-compaction does not trigger automatically even when the context window is exceeded (observed at 153% = 305k/200k), as long as the Anthropic prompt cache absorbs most tokens.

Environment

  • OpenClaw version: 2026.4.12 (1c0672b)
  • Model: anthropic/claude-sonnet-4-6
  • Compaction mode: safeguard
  • reserveTokensFloor: 30000

Observed behavior

/status output at time of issue:

Model: anthropic/claude-sonnet-4-6
Tokens: 3 in / 140 out
Cache: 100% hit · 305k cached, 99 new
Context: 305k/200k (153%) · Compactions: 0

Context was at 153% of the model window (305k > 200k) with 0 compactions. Auto-compaction should have fired at the contextWindow - reserveTokens threshold (≈170k), but never did. The user had to run /compact manually.

Root cause hypothesis

Pi's compaction threshold check appears to use the new tokens counter (99 in this case) rather than the total context size (305k). When Anthropic's prompt cache absorbs nearly all tokens (100% hit rate), the incoming token count seen by Pi's runtime is near zero, so the threshold contextTokens > contextWindow - reserveTokens is never satisfied — even though the actual context far exceeds the window.

In other words: prompt cache efficiency masks the true context load from the Pi compaction engine.

Expected behavior

Auto-compaction should trigger based on total context window usage (305k/200k), not on the count of non-cached input tokens for the current turn.

Workaround

Manual /compact command. Enabling compaction.notifyUser: true has no effect since compaction never triggers.

Steps to reproduce

  1. Start a long session with anthropic/claude-sonnet-4-6 and compaction.mode: safeguard
  2. Let the session grow until prompt cache hits approach 100%
  3. Continue the session past the context window limit (200k tokens)
  4. Observe: /status shows Context > 100% with Compactions: 0
  5. Auto-compaction never fires — manual /compact required

extent analysis

TL;DR

The most likely fix is to modify the compaction threshold check to use the total context size instead of the new tokens counter.

Guidance

  • Review the compaction threshold check logic to ensure it accounts for the total context size, including cached tokens.
  • Verify that the contextWindow and reserveTokens values are correctly configured and applied in the compaction threshold calculation.
  • Consider adding logging or monitoring to track the total context size and compaction triggers to better understand the issue.
  • Test the workaround of running the manual /compact command to ensure it effectively reduces the context size.

Example

No code snippet is provided as the issue does not include specific code details.

Notes

The root cause hypothesis suggests that the prompt cache efficiency is masking the true context load from the Pi compaction engine, leading to the auto-compaction not triggering as expected.

Recommendation

Apply the workaround of running the manual /compact command until the compaction threshold check logic can be modified to use the total context size. This will ensure that the context size is managed effectively, even if the auto-compaction is not triggering as expected.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

Auto-compaction should trigger based on total context window usage (305k/200k), not on the count of non-cached input tokens for the current turn.

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - ✅(Solved) Fix [Bug] Auto-compaction never fires when Anthropic prompt cache hit rate is ~100% [1 pull requests, 1 participants]