openclaw - ✅(Solved) Fix [Bug] Auto-compaction never fires when Anthropic prompt cache hit rate is ~100% [1 pull requests, 1 participants]

Neomail2 · 2026-04-14T11:48:41Z

[openclaw] Auto-compaction does not trigger automatically even when the context window is exceeded observed at 153% = 305k/200k , as long as the Anthropic prom… Auto-compaction does not trigger automatically even when the context window is exceeded (observed at 153% = 305k/200k), as long as the Anthropic prompt cache absorbs most tokens. # PR #66716: fix: auto-compaction fires on fresh cached token counts (#66520) - Repository: openclaw/openclaw - Author: KeWang0622 - State: open | merged: False - Link: https://github.com/openclaw/openclaw/pull/66716 ## Description (problem / solution / changelog) ## Summary - **Bug**: `runPreflightCompactionIfNeeded` returned early when `totalTokensFresh === true` without checking the compaction threshold, so auto-compaction never triggered for sessions with fresh token counts — even at 153% of the context window - **Root cause**: The early-return optimization (skip transcript estimation when fresh data is available) accidentally bypassed the threshold comparison entirely - **Fix**: Restructure token resolution so fresh persisted totals (which include `cacheRead` from Anthropic prompt caching) are projected forward and checked against `contextWindow - reserveTokens - softThreshold` before deciding whether to compact ## Details When Anthropic's prompt cache absorbs nearly all tokens (100% hit rate), `derivePromptTokens` correctly computes `input + cacheRead + cacheWrite` (e.g. 99 + 305,000 + 0 = 305,099), and this is persisted as `totalTokens` with `totalTokensFresh: true`. However, `runPreflightCompactionIfNeeded` had this logic: ```typescript if (!shouldUseTranscriptFallback) { return entry; // BUG: skips threshold check entirely } ``` The fix moves the threshold check into both branches (fresh and stale), using the fresh persisted value directly when available. ## Edge cases verified - **100% cache hit (Anthropic)**: totalTokens=305k, contextWindow=200k — compaction now fires - **0% cache hit**: Stale tokens fall through to transcript estimation path — behavior unchanged - **Partial cache**: 180k total with 166k threshold — fires correctly - **Below threshold**: 50k total with 166k threshold — no compaction, as expected - **Heartbeat/CLI**: Still skipped regardless of token count - **Non-Anthropic providers**: No change — providers without caching have `cacheRead=0`, so `totalTokens` is just `input + cacheWrite`, same as before ## Test plan - [x] `shouldRunPreflightCompaction` unit tests: 100% cache hit triggers, below threshold skips, partial cache boundary correct - [x] `runPreflightCompactionIfNeeded` integration tests: fresh tokens above threshold trigger compaction, below threshold skip, stale tokens use transcript fallback, heartbeat skips - [x] All existing compaction tests pass unchanged - [x] Session usage persistence tests pass (65 tests) - [x] Followup runner tests pass (19 tests) - [x] Preemptive compaction tests pass (9 tests) Fixes #66520 🤖 Generated with [Claude Code](https://claude.com/claude-code) ## Changed files - `src/auto-reply/reply/agent-runner-memory.test.ts` (modified, +298/-1) - `src/auto-reply/reply/agent-runner-memory.ts` (modified, +67/-22) - `src/auto-reply/reply/reply-state.test.ts` (modified, +51/-0) ## Workaround Manual `/compact` command. Enabling `compaction.notifyUser: true` has no effect since compaction never triggers. ## Summary Auto-compaction does not trigger automatically even when the context window is exceeded (observed at 153% = 305k/200k), as long as the Anthropic prompt cache absorbs most tokens. ## Environment - OpenClaw version: 2026.4.12 (1c0672b) - Model: `anthropic/claude-sonnet-4-6` - Compaction mode: `safeguard` - `reserveTokensFloor`: 30000 ## Observed behavior `/status` output at time of issue: ``` Model: anthropic/claude-sonnet-4-6 Tokens: 3 in / 140 out Cache: 100% hit · 305k cached, 99 new Context: 305k/200k (153%) · Compactions: 0 ``` Context was at **153% of the model window** (305k > 200k) with **0 compactions**. Auto-compaction should have fired at the `contextWindow - reserveTokens` threshold (≈170k), but never did. The user had to run `/compact` manually. ## Root cause hypothesis Pi's compaction threshold check appears to use the **new tokens** counter (99 in this case) rather than the **total context size** (305k). When Anthropic's prompt cache absorbs nearly all tokens (100% hit rate), the incoming token count seen by Pi's runtime is near zero, so the threshold `contextTokens > contextWindow - reserveTokens` is never satisfied — even though the actual context far exceeds the window. In other words: prompt cache efficiency masks the true context load from the Pi compaction engine. ## Expected behavior Auto-compaction should trigger based on **total context window usage** (305k/200k), not on the count of non-cached input tokens for the current turn. ## Workaround Manual `/compact` command. Enabling `compaction.notifyUser: true` has no effect since compaction never triggers. ## Steps to reproduce 1. Start a

openclaw2026-04-14 11:48:41

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

openclaw/openclaw#66520•Fetched 2026-04-15 06:25:53

View on GitHub

Comments

Participants

Timeline

Reactions

Author

Neomail2

Participants

Neomail2

Timeline (top)

cross-referenced ×1referenced ×1

Auto-compaction does not trigger automatically even when the context window is exceeded (observed at 153% = 305k/200k), as long as the Anthropic prompt cache absorbs most tokens.

Error Message

/status output at time of issue:

Model: anthropic/claude-sonnet-4-6
Tokens: 3 in / 140 out
Cache: 100% hit · 305k cached, 99 new
Context: 305k/200k (153%) · Compactions: 0

Root Cause

Root cause hypothesis

Fix Action

Workaround

Manual /compact command. Enabling compaction.notifyUser: true has no effect since compaction never triggers.

PR fix notes

PR #66716: fix: auto-compaction fires on fresh cached token counts (#66520)

Repository: openclaw/openclaw
Author: KeWang0622
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/66716

Description (problem / solution / changelog)

Summary

Bug: runPreflightCompactionIfNeeded returned early when totalTokensFresh === true without checking the compaction threshold, so auto-compaction never triggered for sessions with fresh token counts — even at 153% of the context window
Root cause: The early-return optimization (skip transcript estimation when fresh data is available) accidentally bypassed the threshold comparison entirely
Fix: Restructure token resolution so fresh persisted totals (which include cacheRead from Anthropic prompt caching) are projected forward and checked against contextWindow - reserveTokens - softThreshold before deciding whether to compact

Details

When Anthropic's prompt cache absorbs nearly all tokens (100% hit rate), derivePromptTokens correctly computes input + cacheRead + cacheWrite (e.g. 99 + 305,000 + 0 = 305,099), and this is persisted as totalTokens with totalTokensFresh: true. However, runPreflightCompactionIfNeeded had this logic:

if (!shouldUseTranscriptFallback) {
    return entry;  // BUG: skips threshold check entirely
}

The fix moves the threshold check into both branches (fresh and stale), using the fresh persisted value directly when available.

Edge cases verified

100% cache hit (Anthropic): totalTokens=305k, contextWindow=200k — compaction now fires
0% cache hit: Stale tokens fall through to transcript estimation path — behavior unchanged
Partial cache: 180k total with 166k threshold — fires correctly
Below threshold: 50k total with 166k threshold — no compaction, as expected
Heartbeat/CLI: Still skipped regardless of token count
Non-Anthropic providers: No change — providers without caching have cacheRead=0, so totalTokens is just input + cacheWrite, same as before

Test plan

shouldRunPreflightCompaction unit tests: 100% cache hit triggers, below threshold skips, partial cache boundary correct
runPreflightCompactionIfNeeded integration tests: fresh tokens above threshold trigger compaction, below threshold skip, stale tokens use transcript fallback, heartbeat skips
All existing compaction tests pass unchanged
Session usage persistence tests pass (65 tests)
Followup runner tests pass (19 tests)
Preemptive compaction tests pass (9 tests)

Fixes #66520

🤖 Generated with Claude Code

Changed files

src/auto-reply/reply/agent-runner-memory.test.ts (modified, +298/-1)
src/auto-reply/reply/agent-runner-memory.ts (modified, +67/-22)
src/auto-reply/reply/reply-state.test.ts (modified, +51/-0)

Code Example

Model: anthropic/claude-sonnet-4-6
Tokens: 3 in / 140 out
Cache: 100% hit · 305k cached, 99 new
Context: 305k/200k (153%) · Compactions: 0

RAW_BUFFERClick to expand / collapse

Summary

Auto-compaction does not trigger automatically even when the context window is exceeded (observed at 153% = 305k/200k), as long as the Anthropic prompt cache absorbs most tokens.

Environment

OpenClaw version: 2026.4.12 (1c0672b)
Model: anthropic/claude-sonnet-4-6
Compaction mode: safeguard
reserveTokensFloor: 30000

Observed behavior

/status output at time of issue:

Model: anthropic/claude-sonnet-4-6
Tokens: 3 in / 140 out
Cache: 100% hit · 305k cached, 99 new
Context: 305k/200k (153%) · Compactions: 0

Context was at 153% of the model window (305k > 200k) with 0 compactions. Auto-compaction should have fired at the contextWindow - reserveTokens threshold (≈170k), but never did. The user had to run /compact manually.

Root cause hypothesis

Pi's compaction threshold check appears to use the new tokens counter (99 in this case) rather than the total context size (305k). When Anthropic's prompt cache absorbs nearly all tokens (100% hit rate), the incoming token count seen by Pi's runtime is near zero, so the threshold contextTokens > contextWindow - reserveTokens is never satisfied — even though the actual context far exceeds the window.

In other words: prompt cache efficiency masks the true context load from the Pi compaction engine.

Expected behavior

Auto-compaction should trigger based on total context window usage (305k/200k), not on the count of non-cached input tokens for the current turn.

Workaround

Manual /compact command. Enabling compaction.notifyUser: true has no effect since compaction never triggers.

Steps to reproduce

Start a long session with anthropic/claude-sonnet-4-6 and compaction.mode: safeguard
Let the session grow until prompt cache hits approach 100%
Continue the session past the context window limit (200k tokens)
Observe: /status shows Context > 100% with Compactions: 0
Auto-compaction never fires — manual /compact required

extent analysis

TL;DR

The most likely fix is to modify the compaction threshold check to use the total context size instead of the new tokens counter.

Guidance

Review the compaction threshold check logic to ensure it accounts for the total context size, including cached tokens.
Verify that the contextWindow and reserveTokens values are correctly configured and applied in the compaction threshold calculation.
Consider adding logging or monitoring to track the total context size and compaction triggers to better understand the issue.
Test the workaround of running the manual /compact command to ensure it effectively reduces the context size.

Example

No code snippet is provided as the issue does not include specific code details.

Notes

The root cause hypothesis suggests that the prompt cache efficiency is masking the true context load from the Pi compaction engine, leading to the auto-compaction not triggering as expected.

Recommendation

Apply the workaround of running the manual /compact command until the compaction threshold check logic can be modified to use the total context size. This will ensure that the context size is managed effectively, even if the auto-compaction is not triggering as expected.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

FAQ

Expected behavior

Auto-compaction should trigger based on total context window usage (305k/200k), not on the count of non-cached input tokens for the current turn.

#model loading #dependency error #configuration error #environment variable #network issue

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

openclaw - ✅(Solved) Fix [Bug] Auto-compaction never fires when Anthropic prompt cache hit rate is ~100% [1 pull requests, 1 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Root cause hypothesis

Fix Action

Workaround

PR fix notes

PR #66716: fix: auto-compaction fires on fresh cached token counts (#66520)

Description (problem / solution / changelog)

Summary

Details

Edge cases verified

Test plan

Changed files

Code Example

Summary

Environment

Observed behavior

Root cause hypothesis

Expected behavior

Workaround

Steps to reproduce

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

FAQ

Expected behavior

Still need to ship something?

RELATED_DISCOVERY

TRENDING