openclaw - ✅ (Solved) Add Anthropic native compaction and mixed prompt-cache TTL support [3 pull requests, 1 participant]

100yenadmin · 2026-04-12T09:57:44Z



GitHub stats

openclaw/openclaw#65287•Fetched 2026-04-12 13:24:55

Author: 100yenadmin
Participants: 100yenadmin
Timeline (top): cross-referenced ×3

Anthropic prompt caching is exact-prefix and cache-invalidating on prompt-prefix rewrites, so local prompt-mutating compaction can burn a freshly written cache even when lossless-claw is no longer stalling the foreground turn. OpenClaw already has a provider-native compaction precedent for OpenAI Responses; Anthropic should get the same first-class support, plus mixed prompt-cache TTLs that keep the expensive stable prefix warm without paying 1-hour write costs for high-churn conversation content.

Root Cause

Anthropic prompt caching is exact-prefix, so a local prompt rewrite after a turn invalidates a still-hot conversation cache.
Long cache retention is applied too broadly, so Anthropic-heavy coding sessions pay 1h write costs on volatile conversation content.
OpenClaw does not yet expose Anthropic native compaction or a mixed TTL policy that matches Anthropic's cache model.

Fix Action

Fixed

Fixed by PR: compaction + caches: add Anthropic native compaction and mixed cache TTLs (https://github.com/openclaw/openclaw/pull/65288)
Fixed by PR: Defer proactive compaction debt and surface LCM maintenance state (https://github.com/Martian-Engineering/lossless-claw/pull/408)
Fixed by PR: Run context-engine turn maintenance as idle-aware background work (https://github.com/openclaw/openclaw/pull/65233)

PR fix notes

PR #65288: Add Anthropic native compaction and mixed cache TTLs

Repository: openclaw/openclaw
Author: 100yenadmin
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/65288

Description (problem / solution / changelog)

Summary

Closes #65287.

This adds Anthropic-native compaction support and a mixed Anthropic cache-retention policy that uses 1h retention for the stable system/tool/workspace prefix and 5m retention for high-churn conversation content. It also round-trips Anthropic compaction blocks through request history and streaming so provider-native active-context compaction can manage the live Claude prompt without giving up durable searchable memory in lossless-claw.

What Changed

Added Anthropic run/model params:
- anthropicServerCompaction
- anthropicCompactThreshold
- anthropicCompactPauseAfter
- anthropicCompactInstructions
Injected context_management.edits with compact_20260112 when Anthropic native compaction is enabled.
Added the required Anthropic beta features only when native compaction is active or compaction blocks must be round-tripped.
Extended Anthropic history conversion and streaming to preserve compaction blocks.
Updated Anthropic cache policy so long retention keeps the stable prefix at 1h while trailing conversation content remains 5m/ephemeral.
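
The payload injection described above could look roughly like the following. This is a minimal sketch, not the PR's code: the four param names, `context_management.edits`, and `compact_20260112` come from the PR description, while the helper name `applyNativeCompaction`, the `AnthropicPayload` shape, and the fields inside the edit (`trigger`, `pause_after_compaction`, `instructions`) are assumptions that should be checked against Anthropic's compaction documentation.

```typescript
// Run/model params named in the PR description.
interface AnthropicCompactionParams {
  anthropicServerCompaction?: boolean;
  anthropicCompactThreshold?: number;
  anthropicCompactPauseAfter?: boolean;
  anthropicCompactInstructions?: string;
}

// ASSUMPTION: only `type` is confirmed by the PR text; the other field
// names are a guess at the wire format and may differ in the real API.
interface CompactionEdit {
  type: "compact_20260112";
  trigger?: { type: "input_tokens"; value: number };
  pause_after_compaction?: boolean;
  instructions?: string;
}

// Hypothetical, trimmed-down request payload.
interface AnthropicPayload {
  model: string;
  messages: unknown[];
  context_management?: { edits: CompactionEdit[] };
}

// Returns the payload unchanged unless native compaction is switched on,
// so existing behavior is preserved by default.
function applyNativeCompaction(
  payload: AnthropicPayload,
  params: AnthropicCompactionParams,
): AnthropicPayload {
  if (!params.anthropicServerCompaction) return payload;

  const edit: CompactionEdit = { type: "compact_20260112" };
  if (typeof params.anthropicCompactThreshold === "number") {
    edit.trigger = { type: "input_tokens", value: params.anthropicCompactThreshold };
  }
  if (params.anthropicCompactPauseAfter !== undefined) {
    edit.pause_after_compaction = params.anthropicCompactPauseAfter;
  }
  if (params.anthropicCompactInstructions) {
    edit.instructions = params.anthropicCompactInstructions;
  }
  return { ...payload, context_management: { edits: [edit] } };
}
```

Returning a new object rather than mutating the input keeps the opt-out path trivially identical to the original payload.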

Why

Anthropic prompt caching is exact-prefix, so local prompt rewrites can burn a hot conversation cache.
Long-running Claude coding sessions benefit from a stable 1h cached prefix, but paying 1h write cost for fast-changing conversation blocks is wasteful.
This pairs well with the lossless-claw deferred/background compaction work: provider-native compaction manages the active prompt, while lossless-claw stays the searchable lossless sidecar.
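
The mixed retention policy can be pictured as two cache breakpoints with different TTLs. The sketch below assumes Anthropic's `cache_control: { type: "ephemeral", ttl }` marker with `"5m"` and `"1h"` values; the `PromptBlock` type, the flattened block list, and the helper names are hypothetical simplifications (a real request keeps system, tools, and messages in separate fields). The longer-TTL breakpoint is placed first, since Anthropic's documentation requires longer-TTL entries to precede shorter ones.

```typescript
type CacheTtl = "5m" | "1h";

interface CacheControl {
  type: "ephemeral";
  ttl: CacheTtl;
}

// Hypothetical simplified block; real payloads have more block kinds.
interface PromptBlock {
  type: "text";
  text: string;
  cache_control?: CacheControl;
}

// Put one cache breakpoint on the last block of a segment.
function withBreakpoint(blocks: PromptBlock[], ttl: CacheTtl): PromptBlock[] {
  const cacheControl: CacheControl = { type: "ephemeral", ttl };
  return blocks.map((block, index): PromptBlock =>
    index === blocks.length - 1 ? { ...block, cache_control: cacheControl } : { ...block },
  );
}

// Stable system/tool/workspace prefix gets 1h; trailing conversation gets 5m.
function applyMixedTtl(stablePrefix: PromptBlock[], conversation: PromptBlock[]): PromptBlock[] {
  return [...withBreakpoint(stablePrefix, "1h"), ...withBreakpoint(conversation, "5m")];
}
```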

Validation

pnpm test -- --run src/agents/anthropic-payload-policy.test.ts src/agents/anthropic-transport-stream.test.ts src/agents/pi-embedded-runner-extraparams.test.ts passed
NODE_OPTIONS=--max-old-space-size=8192 pnpm exec tsc -p tsconfig.json --noEmit reports only the pre-existing unrelated Telegram AbortSignal baseline in extensions/telegram/src/bot.ts

Related Work

openclaw/openclaw#65233
Martian-Engineering/lossless-claw#407
Martian-Engineering/lossless-claw#408

Changed files

src/agents/anthropic-payload-policy.test.ts (modified, +79/-2)
src/agents/anthropic-payload-policy.ts (modified, +117/-6)
src/agents/anthropic-transport-stream.test.ts (modified, +384/-0)
src/agents/anthropic-transport-stream.ts (modified, +208/-43)
src/agents/pi-embedded-runner-extraparams.test.ts (modified, +32/-0)
src/agents/pi-embedded-runner/extra-params.ts (modified, +22/-0)
src/agents/pi-embedded-runner/run.incomplete-turn.test.ts (modified, +62/-0)
src/agents/pi-embedded-runner/run.ts (modified, +22/-0)
src/agents/pi-embedded-runner/run/incomplete-turn.ts (modified, +15/-1)
src/agents/transport-message-transform.ts (modified, +9/-1)

PR #408: Defer proactive compaction debt and surface LCM maintenance state

Repository: Martian-Engineering/lossless-claw
Author: 100yenadmin
State: open | merged: False
Link: https://github.com/Martian-Engineering/lossless-claw/pull/408

Description (problem / solution / changelog)

Summary

Closes #407.

This changes proactive compaction from inline turn work into deferred maintenance debt by default, then makes that deferred path cache-safe for Anthropic. The result is a hybrid model: proactive compaction no longer blocks the foreground afterTurn() path, and prompt-mutating deferred compaction no longer rewrites a still-hot Anthropic cache.

Why

Inline proactive compaction can keep the main session lane busy after the assistant reply is already visible.
That starves immediate follow-up turns and can push subagent completion announce traffic into timeouts.
Anthropic prompt caching is exact-prefix, so rewriting the active prompt too soon can burn a freshly written cache and drive up cost.
Orchestrator-heavy sessions make the pain worse: one main agent plus up to 4 subagents may all need fast LCM reads while one hot session is still writing.

What Changed

Added proactiveThresholdCompactionMode: "deferred" | "inline"; default is "deferred".
afterTurn() now records coalesced proactive compaction debt per conversation/session instead of running proactive threshold or leaf compaction inline.
maintain() now consumes that debt only when runtime context explicitly allows deferred execution, and it skips prompt-mutating deferred compaction while Anthropic cache is still hot.
assemble() now consumes deferred prompt-mutating compaction pre-assembly when the cache is cold or the prompt is approaching overflow.
Added persistent maintenance state for pending/running/last success/last failure metadata.
Added cacheAwareCompaction.cacheTTLSeconds (default 300) and telemetry for lastApiCallAt, lastCacheTouchAt, provider, and model.
Extended LCM status/command output and startup banners to surface maintenance state and cache-aware compaction context.
Preserved the local same-turn safety guard in legacy inline mode so leaf and threshold compaction do not both fire on the same turn.
Kept manual, overflow, and timeout compaction synchronous.
Advertised turnMaintenanceMode: "background" for the companion OpenClaw host change.
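
The cache-aware gating in `maintain()` and `assemble()` can be sketched as two small predicates. The names `cacheTTLSeconds` (default 300), `lastCacheTouchAt`, `provider`, and `allowDeferredCompactionExecution` come from the PR text; the function names, the `CacheTelemetry` shape, and the exact hot/cold comparison are assumptions about how the engine might combine them.

```typescript
interface CacheTelemetry {
  provider: string;
  lastCacheTouchAt?: number; // epoch milliseconds; undefined if never touched
}

const DEFAULT_CACHE_TTL_SECONDS = 300;

// ASSUMPTION: "hot" means an Anthropic cache touched within the TTL window.
function isAnthropicCacheHot(
  telemetry: CacheTelemetry,
  nowMs: number,
  cacheTTLSeconds: number = DEFAULT_CACHE_TTL_SECONDS,
): boolean {
  if (telemetry.provider !== "anthropic" || telemetry.lastCacheTouchAt === undefined) {
    return false;
  }
  return nowMs - telemetry.lastCacheTouchAt < cacheTTLSeconds * 1000;
}

// maintain(): consume debt only when the host allows it and the cache is not hot.
function shouldCompactInMaintain(
  telemetry: CacheTelemetry,
  nowMs: number,
  allowDeferredCompactionExecution: boolean,
): boolean {
  return allowDeferredCompactionExecution && !isAnthropicCacheHot(telemetry, nowMs);
}

// assemble(): consume debt when the cache is cold or overflow is near.
function shouldCompactInAssemble(
  telemetry: CacheTelemetry,
  nowMs: number,
  approachingOverflow: boolean,
): boolean {
  return approachingOverflow || !isAnthropicCacheHot(telemetry, nowMs);
}
```

Keeping the overflow check outside the hot-cache test mirrors the PR's stated rule that correctness-driven compaction wins over cache preservation.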

Safety

Legacy inline mode remains available as a rollback/debug escape hatch.
Read-only LCM tools continue to work while compaction debt is pending.
Public tool inputs stay stable apart from the new config option.
Anthropic-active sessions keep their hot cache unless compaction is needed for correctness or the cache has already gone cold.

Validation

npm test -> 40 files / 709 tests passed
npm run build passed

Companion Host Work

Companion OpenClaw issue: openclaw/openclaw#65220
Companion OpenClaw PR: openclaw/openclaw#65233
Anthropic-native companion issue: openclaw/openclaw#65287
Anthropic-native companion PR: openclaw/openclaw#65288

Changed files

.changeset/deferred-compaction-maintenance.md (added, +5/-0)
README.md (modified, +9/-0)
docs/configuration.md (modified, +15/-0)
openclaw.plugin.json (modified, +19/-0)
skills/lossless-claw/references/config.md (modified, +25/-0)
src/db/config.ts (modified, +23/-0)
src/db/migration.ts (modified, +34/-0)
src/engine.ts (modified, +774/-375)
src/plugin/index.ts (modified, +6/-1)
src/plugin/lcm-command.ts (modified, +75/-3)
src/startup-banner-log.ts (modified, +1/-0)
src/store/compaction-maintenance-store.ts (added, +219/-0)
src/store/compaction-telemetry-store.ts (modified, +33/-1)
src/store/index.ts (modified, +5/-0)
test/bootstrap-flood-regression.test.ts (modified, +1/-0)
test/circuit-breaker.test.ts (modified, +1/-0)
test/config.test.ts (modified, +25/-0)
test/engine.test.ts (modified, +391/-27)
test/expansion.test.ts (modified, +1/-0)
test/lcm-command.test.ts (modified, +52/-0)
test/lcm-expand-query-tool.test.ts (modified, +1/-0)
test/lcm-expand-tool.test.ts (modified, +1/-0)
test/lcm-tools.test.ts (modified, +1/-0)
test/plugin-config-registration.test.ts (modified, +17/-3)
test/session-operation-queues.test.ts (modified, +1/-0)

PR #65233: Run context-engine turn maintenance as idle-aware background work

Repository: openclaw/openclaw
Author: 100yenadmin
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/65233

Description (problem / solution / changelog)

Summary

Closes #65220.

This teaches context engines to opt into background turn maintenance and moves turn-triggered maintenance off the foreground session lane when turnMaintenanceMode === "background". For lossless-claw, that means proactive post-turn maintenance can be queued as a hidden task instead of blocking the next user turn.

Why

Today the host awaits both afterTurn() and maintain(... reason: "turn") inline on the session path.
Even if a plugin removes heavy foreground work, the host still controls whether turn maintenance blocks the lane.
The result is visible assistant output followed by dead time before the next command can start.
Subagent completion announce traffic can time out behind a stalled main lane.

What Changed

Added turnMaintenanceMode?: "foreground" | "background" to context-engine info.
Added allowDeferredCompactionExecution?: boolean to runtime context.
When a context engine opts into background mode, runContextEngineMaintenance(... reason: "turn") now:
- queues one hidden maintenance task per session key
- coalesces repeated requests
- waits for the session lane to go idle before executing
- runs maintenance with allowDeferredCompactionExecution: true
- uses task delivery/status infrastructure instead of a visible subagent or announce flow
Long-running or failed deferred maintenance can surface subtle task updates to the session.
Fast deferred maintenance stays silent.
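
The queue behavior listed above (one task per session key, coalesced requests, run on idle) can be illustrated with a small synchronous sketch. Only `allowDeferredCompactionExecution` is taken from the PR; the `TurnMaintenanceQueue` class, its method names, and the explicit `onLaneIdle` callback are hypothetical, and the real host presumably uses its async task infrastructure rather than direct calls.

```typescript
interface MaintenanceContext {
  allowDeferredCompactionExecution: boolean;
}

type MaintenanceRunner = (sessionKey: string, ctx: MaintenanceContext) => void;

// Hypothetical coalescing queue: at most one pending task per session key.
class TurnMaintenanceQueue {
  private readonly pending = new Set<string>();
  private readonly runner: MaintenanceRunner;

  constructor(runner: MaintenanceRunner) {
    this.runner = runner;
  }

  // Returns true if a new task was queued, false if it was coalesced.
  request(sessionKey: string): boolean {
    if (this.pending.has(sessionKey)) return false;
    this.pending.add(sessionKey);
    return true;
  }

  // Called when the session lane goes idle; runs the pending task, if any.
  onLaneIdle(sessionKey: string): boolean {
    if (!this.pending.delete(sessionKey)) return false;
    this.runner(sessionKey, { allowDeferredCompactionExecution: true });
    return true;
  }
}
```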

Guarantees

Proactive turn maintenance no longer blocks the immediate next foreground turn for engines that opt in.
Overflow/timeout/manual compaction remains synchronous in the plugin/runtime paths that require correctness.
Hidden maintenance is not a child agent run and does not create synthetic subagent sessions.

Validation

pnpm test -- --run src/agents/pi-embedded-runner/context-engine-maintenance.test.ts src/infra/backoff.test.ts passed
NODE_OPTIONS=--max-old-space-size=8192 pnpm exec tsc -p tsconfig.json --noEmit still reports preexisting unrelated AbortSignal type errors in unchanged extensions/telegram/src/bot.ts

Companion Plugin Work

Companion lossless-claw issue: Martian-Engineering/lossless-claw#407
Companion lossless-claw PR: Martian-Engineering/lossless-claw#408

Changed files

src/agents/pi-embedded-runner/context-engine-maintenance.test.ts (modified, +1013/-3)
src/agents/pi-embedded-runner/context-engine-maintenance.ts (modified, +566/-21)
src/context-engine/types.ts (modified, +12/-0)
src/infra/backoff.test.ts (modified, +35/-0)
src/infra/backoff.ts (modified, +40/-9)

Issue body (openclaw/openclaw#65287)

Summary

Problem

Anthropic-heavy coding sessions pay large prompt-cache write costs on Opus/Claude when long retention is applied too broadly.
Local prompt rewrites after a turn can invalidate a still-hot conversation cache, reducing the value of lossless-claw compaction even after the no-stall/background-maintenance work lands.
Right now OpenClaw does not expose Anthropic native compaction or a mixed TTL policy that matches Anthropic's cache model.

Design Goal

Use Anthropic native active-context compaction when the provider supports it.
Preserve stable system/tool/workspace prefixes with 1h cache retention.
Keep high-churn conversation/trailing user prefixes at 5m retention by default.
Round-trip Anthropic compaction blocks through the transport so follow-up requests can reuse the provider-native compacted prefix.
Keep this separate from lossless-claw, which should stay the lossless searchable sidecar rather than the primary live prompt shaper.

Proposed Changes

Add Anthropic request params:
- anthropicServerCompaction: boolean
- anthropicCompactThreshold: number
- anthropicCompactPauseAfter?: boolean
- anthropicCompactInstructions?: string
In the Anthropic transport path:
- inject context_management.edits with compact_20260112 when enabled
- add the required Anthropic beta features
- preserve compaction blocks in assistant history and streamed content
In the prompt-cache policy:
- keep system/tool/workspace prefix at 1h
- keep trailing conversation / compaction / user prefix at 5m
- avoid rewriting a still-hot cached conversation prefix unless needed for correctness

Why This Matters

It reduces wasted cache-write spend on volatile conversation content.
It lets provider-native compaction manage the active Claude prompt while lossless-claw continues to provide searchable durable memory.
It preserves cache value for long coding sessions where follow-up turns are frequent.
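
A rough worked example shows where the write-cost saving comes from. The multipliers below (1.25x base input price for a 5m cache write, 2x for a 1h write) are stated here as assumptions based on Anthropic's published pricing model and should be verified against current pricing; the token counts are made up for illustration, and costs are expressed in base-input-token equivalents rather than currency.

```typescript
// ASSUMPTION: cache-write multipliers relative to the base input token price.
const CACHE_WRITE_MULTIPLIER = { "5m": 1.25, "1h": 2.0 } as const;
type CacheTtl = keyof typeof CACHE_WRITE_MULTIPLIER;

// Cost of writing `tokens` to the cache, in base-input-token equivalents.
function cacheWriteCost(tokens: number, ttl: CacheTtl): number {
  return tokens * CACHE_WRITE_MULTIPLIER[ttl];
}

// Hypothetical session: 40k-token stable prefix, 20k tokens of conversation.
const stablePrefixTokens = 40_000;
const conversationTokens = 20_000;

// Everything written at 1h retention.
const allOneHour = cacheWriteCost(stablePrefixTokens + conversationTokens, "1h");

// Mixed policy: 1h for the stable prefix, 5m for the conversation.
const mixed =
  cacheWriteCost(stablePrefixTokens, "1h") + cacheWriteCost(conversationTokens, "5m");

console.log({ allOneHour, mixed, saved: allOneHour - mixed });
// → { allOneHour: 120000, mixed: 105000, saved: 15000 }
```

Under these assumptions the saving applies to each write of the conversation segment, so it compounds in sessions where the trailing content is rewritten often.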

Acceptance Criteria

Anthropic requests can opt into native compaction via model/run params.
Required beta headers are sent only when needed.
Compaction blocks round-trip through request history and streaming output.
Long Anthropic cache retention applies 1h only to the stable prefix and keeps trailing conversational content at 5m.
Existing Anthropic behavior remains unchanged unless the new params are enabled.
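
The "beta headers only when needed" criterion could be checked with a helper like the one below. It is a sketch under assumptions: the beta identifier `compact-2026-01-12` is a guess derived from the `compact_20260112` edit name and must be confirmed against Anthropic's documentation, and `HistoryMessage` and `compactionBetas` are hypothetical names.

```typescript
// ASSUMPTION: beta identifier inferred from the edit type name; verify before use.
const COMPACTION_BETA = "compact-2026-01-12";

// Hypothetical minimal history shape; only block types matter here.
interface HistoryMessage {
  role: "user" | "assistant";
  content: Array<{ type: string }>;
}

// Request the beta when native compaction is enabled, or when earlier
// turns already contain compaction blocks that must be sent back.
function compactionBetas(nativeCompactionEnabled: boolean, history: HistoryMessage[]): string[] {
  const mustRoundTrip = history.some((message) =>
    message.content.some((block) => block.type === "compaction"),
  );
  return nativeCompactionEnabled || mustRoundTrip ? [COMPACTION_BETA] : [];
}
```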

Related Work

openclaw/openclaw#65233 (background turn maintenance for context engines)
Martian-Engineering/lossless-claw#407
Martian-Engineering/lossless-claw#408

extent analysis

TL;DR

Enable Anthropic native compaction by adding request parameters and modifying the transport path to reduce wasted cache-write spend on volatile conversation content.

Guidance

Add Anthropic request parameters such as anthropicServerCompaction and anthropicCompactThreshold to opt into native compaction.
Modify the Anthropic transport path to inject context_management.edits with compact_20260112 when enabled.
Update the prompt-cache policy to preserve stable system/tool/workspace prefixes with 1h cache retention and keep high-churn conversation/trailing user prefixes at 5m retention.
Verify that compaction blocks round-trip through request history and streaming output to ensure correct functionality.
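
To make the round-trip check concrete, here is a minimal sketch of a history converter that keeps `compaction` blocks instead of dropping them as unknown content. The block shapes (`{ type: "compaction", content }`), the `internal_note` host-only block, and the `toWireContent` name are all assumptions for illustration, not the transport's actual types.

```typescript
// Hypothetical stored history blocks, including one host-only kind.
type StoredBlock =
  | { type: "text"; text: string }
  | { type: "compaction"; content: string } // ASSUMPTION: provider block shape
  | { type: "internal_note"; note: string };

// Blocks that are sent back to the provider.
type WireBlock =
  | { type: "text"; text: string }
  | { type: "compaction"; content: string };

// Convert stored history to wire content, preserving compaction blocks
// in their original position and dropping host-only blocks.
function toWireContent(blocks: StoredBlock[]): WireBlock[] {
  const out: WireBlock[] = [];
  for (const block of blocks) {
    switch (block.type) {
      case "text":
        out.push({ type: "text", text: block.text });
        break;
      case "compaction":
        out.push({ type: "compaction", content: block.content });
        break;
      default:
        break; // host-only block, never sent to the provider
    }
  }
  return out;
}
```

A unit test for the real transport would assert the same property: a compaction block present in the stored history appears unchanged, and in order, in the next request.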

Example

No explicit code snippet is provided, but the proposed changes suggest adding parameters such as anthropicServerCompaction: boolean to Anthropic requests.

Notes

The solution relies on the Anthropic provider supporting native compaction, and the implementation should ensure that existing behavior remains unchanged unless the new parameters are enabled.

Recommendation

Enable Anthropic native compaction through the new request parameters once the linked PRs are available in your build. This is an opt-in feature rather than a workaround; it reduces wasted cache-write spend and preserves cache value for long coding sessions.
