openclaw - ✅(Solved) Fix Add Anthropic native compaction and mixed prompt-cache TTL support [3 pull requests, 1 participants]

GitHub stats
openclaw/openclaw#65287 · Fetched 2026-04-12 13:24:55
View on GitHub
Comments
0
Participants
1
Timeline
3
Reactions
0
Timeline (top)
cross-referenced ×3

Anthropic prompt caching is exact-prefix and cache-invalidating on prompt-prefix rewrites, so local prompt-mutating compaction can burn a freshly written cache even when lossless-claw is no longer stalling the foreground turn. OpenClaw already has a provider-native compaction precedent for OpenAI Responses; Anthropic should get the same first-class support, plus mixed prompt-cache TTLs that keep the expensive stable prefix warm without paying 1-hour write costs for high-churn conversation content.

Root Cause

  • Anthropic prompt caching is exact-prefix, so a local compaction pass that rewrites the prompt prefix invalidates a cache that was just written.
  • OpenClaw did not expose Anthropic native compaction, so the active Claude prompt could only be compacted through local prompt rewrites.
  • Long (1h) cache retention was applied too broadly, so high-churn conversation content paid 1-hour write costs.

Fix Action

Fixed

PR fix notes

PR #65288: Add Anthropic native compaction and mixed cache TTLs

Description (problem / solution / changelog)

Summary

Closes #65287.

This adds Anthropic-native compaction support and a mixed Anthropic cache-retention policy that uses 1h retention for the stable system/tool/workspace prefix and 5m retention for high-churn conversation content. It also round-trips Anthropic compaction blocks through request history and streaming so provider-native active-context compaction can manage the live Claude prompt without giving up durable searchable memory in lossless-claw.

What Changed

  • Added Anthropic run/model params:
    • anthropicServerCompaction
    • anthropicCompactThreshold
    • anthropicCompactPauseAfter
    • anthropicCompactInstructions
  • Injected context_management.edits with compact_20260112 when Anthropic native compaction is enabled.
  • Added the required Anthropic beta features only when native compaction is active or compaction blocks must be round-tripped.
  • Extended Anthropic history conversion and streaming to preserve compaction blocks.
  • Updated Anthropic cache policy so long retention keeps the stable prefix at 1h while trailing conversation content remains 5m/ephemeral.
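
The injection step above can be sketched as follows. This is a minimal illustration, not the PR's actual code: only the four param names, `context_management.edits`, and `compact_20260112` come from the PR text. The `trigger`, `pause_after_compaction`, and `instructions` field names, the `AnthropicPayload` type, and the `applyNativeCompaction` helper are assumptions made for the example.

```typescript
// Run/model params named in the PR; all optional here for the sketch.
interface AnthropicCompactionParams {
  anthropicServerCompaction?: boolean;
  anthropicCompactThreshold?: number;
  anthropicCompactPauseAfter?: boolean;
  anthropicCompactInstructions?: string;
}

// Assumed wire shape of one compaction edit; only `type` is taken from the PR.
interface CompactEdit {
  type: "compact_20260112";
  trigger?: { type: "input_tokens"; value: number };
  pause_after_compaction?: boolean;
  instructions?: string;
}

// Hypothetical, trimmed-down request payload.
interface AnthropicPayload {
  model: string;
  messages: unknown[];
  context_management?: { edits: CompactEdit[] };
}

function applyNativeCompaction(
  payload: AnthropicPayload,
  params: AnthropicCompactionParams,
): AnthropicPayload {
  // Existing behavior stays unchanged unless the caller opts in.
  if (!params.anthropicServerCompaction) return payload;

  const edit: CompactEdit = { type: "compact_20260112" };
  if (typeof params.anthropicCompactThreshold === "number") {
    edit.trigger = { type: "input_tokens", value: params.anthropicCompactThreshold };
  }
  if (params.anthropicCompactPauseAfter !== undefined) {
    edit.pause_after_compaction = params.anthropicCompactPauseAfter;
  }
  if (params.anthropicCompactInstructions) {
    edit.instructions = params.anthropicCompactInstructions;
  }

  // Append to any edits already present rather than overwriting them.
  const existing = payload.context_management?.edits ?? [];
  return { ...payload, context_management: { edits: [...existing, edit] } };
}
```

Returning the original payload object when the feature is off makes the "unchanged unless enabled" guarantee easy to assert in a test.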

Why

  • Anthropic prompt caching is exact-prefix, so local prompt rewrites can burn a hot conversation cache.
  • Long-running Claude coding sessions benefit from a stable 1h cached prefix, but paying 1h write cost for fast-changing conversation blocks is wasteful.
  • This pairs well with the lossless-claw deferred/background compaction work: provider-native compaction manages the active prompt, while lossless-claw stays the searchable lossless sidecar.
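
A rough sketch of the mixed-retention idea, assuming the prompt has already been split into a stable prefix and a conversation tail. The `cache_control` shape with `ttl: "5m" | "1h"` follows Anthropic's prompt-caching API; the `TextBlock` type and both helper functions are hypothetical and are not the code in `anthropic-payload-policy.ts`.

```typescript
type CacheTtl = "5m" | "1h";

interface CacheControl {
  type: "ephemeral";
  ttl?: CacheTtl;
}

interface TextBlock {
  type: "text";
  text: string;
  cache_control?: CacheControl;
}

// Mark the last block of a segment as a cache breakpoint with the given TTL.
function withBreakpoint(blocks: TextBlock[], ttl: CacheTtl): TextBlock[] {
  return blocks.map((block, i) =>
    i === blocks.length - 1
      ? { ...block, cache_control: { type: "ephemeral" as const, ttl } }
      : block,
  );
}

// Stable system/tool/workspace content gets 1h; trailing conversation gets 5m.
function applyMixedCachePolicy(stablePrefix: TextBlock[], conversationTail: TextBlock[]) {
  return {
    stablePrefix: withBreakpoint(stablePrefix, "1h"),
    conversationTail: withBreakpoint(conversationTail, "5m"),
  };
}
```

The ordering matters: the 1h breakpoint sits earlier in the prompt than the 5m one, which matches the provider's requirement that longer-TTL entries precede shorter-TTL entries.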

Validation

  • pnpm test -- --run src/agents/anthropic-payload-policy.test.ts src/agents/anthropic-transport-stream.test.ts src/agents/pi-embedded-runner-extraparams.test.ts passed
  • NODE_OPTIONS=--max-old-space-size=8192 pnpm exec tsc -p tsconfig.json --noEmit reports only the pre-existing unrelated Telegram AbortSignal baseline in extensions/telegram/src/bot.ts

Related Work

  • openclaw/openclaw#65233
  • Martian-Engineering/lossless-claw#407
  • Martian-Engineering/lossless-claw#408

Changed files

  • src/agents/anthropic-payload-policy.test.ts (modified, +79/-2)
  • src/agents/anthropic-payload-policy.ts (modified, +117/-6)
  • src/agents/anthropic-transport-stream.test.ts (modified, +384/-0)
  • src/agents/anthropic-transport-stream.ts (modified, +208/-43)
  • src/agents/pi-embedded-runner-extraparams.test.ts (modified, +32/-0)
  • src/agents/pi-embedded-runner/extra-params.ts (modified, +22/-0)
  • src/agents/pi-embedded-runner/run.incomplete-turn.test.ts (modified, +62/-0)
  • src/agents/pi-embedded-runner/run.ts (modified, +22/-0)
  • src/agents/pi-embedded-runner/run/incomplete-turn.ts (modified, +15/-1)
  • src/agents/transport-message-transform.ts (modified, +9/-1)

PR #408: Defer proactive compaction debt and surface LCM maintenance state

Description (problem / solution / changelog)

Summary

Closes #407.

This changes proactive compaction from inline turn work into deferred maintenance debt by default, then makes that deferred path cache-safe for Anthropic. The result is a hybrid model: proactive compaction no longer blocks the foreground afterTurn() path, and prompt-mutating deferred compaction no longer rewrites a still-hot Anthropic cache.

Why

  • Inline proactive compaction can keep the main session lane busy after the assistant reply is already visible.
  • That starves immediate follow-up turns and can push subagent completion announce traffic into timeouts.
  • Anthropic prompt caching is exact-prefix, so rewriting the active prompt too soon can burn a freshly written cache and drive up cost.
  • Orchestrator-heavy sessions make the pain worse: one main agent plus up to 4 subagents may all need fast LCM reads while one hot session is still writing.

What Changed

  • Added proactiveThresholdCompactionMode: "deferred" | "inline"; default is "deferred".
  • afterTurn() now records coalesced proactive compaction debt per conversation/session instead of running proactive threshold or leaf compaction inline.
  • maintain() now consumes that debt only when runtime context explicitly allows deferred execution, and it skips prompt-mutating deferred compaction while Anthropic cache is still hot.
  • assemble() now consumes deferred prompt-mutating compaction pre-assembly when the cache is cold or the prompt is approaching overflow.
  • Added persistent maintenance state for pending/running/last success/last failure metadata.
  • Added cacheAwareCompaction.cacheTTLSeconds (default 300) and telemetry for lastApiCallAt, lastCacheTouchAt, provider, and model.
  • Extended LCM status/command output and startup banners to surface maintenance state and cache-aware compaction context.
  • Preserved the local same-turn safety guard in legacy inline mode so leaf and threshold compaction do not both fire on the same turn.
  • Kept manual, overflow, and timeout compaction synchronous.
  • Advertised turnMaintenanceMode: "background" for the companion OpenClaw host change.
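
The decision logic described above can be sketched as pure functions. Only `cacheTTLSeconds`, `lastCacheTouchAt`, and `allowDeferredCompactionExecution` are names from the PR; the types, the function names, and the assumption that the hot-cache guard applies only when the provider is `"anthropic"` are illustrative and may not match `src/engine.ts`.

```typescript
interface CompactionDebt {
  pending: boolean;
  requests: number; // turns that asked for compaction since the last run
}

interface CacheState {
  provider: string;
  lastCacheTouchAt: number | null; // epoch milliseconds, null if never touched
  cacheTTLSeconds: number; // the PR's default is 300
}

// afterTurn(): record debt instead of compacting inline; repeats coalesce.
function recordDebt(debt: CompactionDebt): CompactionDebt {
  return { pending: true, requests: debt.requests + 1 };
}

// A cache counts as hot while the last touch is younger than the TTL.
function isCacheHot(cache: CacheState, nowMs: number): boolean {
  if (cache.provider !== "anthropic" || cache.lastCacheTouchAt === null) return false;
  return nowMs - cache.lastCacheTouchAt < cache.cacheTTLSeconds * 1000;
}

// maintain(): consume debt only when the host allows it and the cache is cold.
function maintainShouldCompact(
  debt: CompactionDebt,
  cache: CacheState,
  allowDeferredCompactionExecution: boolean,
  nowMs: number,
): boolean {
  return debt.pending && allowDeferredCompactionExecution && !isCacheHot(cache, nowMs);
}

// assemble(): consume debt pre-assembly when the cache is cold or overflow is near.
function assembleShouldCompact(
  debt: CompactionDebt,
  cache: CacheState,
  nearOverflow: boolean,
  nowMs: number,
): boolean {
  return debt.pending && (nearOverflow || !isCacheHot(cache, nowMs));
}
```

Keeping the clock as an explicit `nowMs` argument makes the hot/cold boundary straightforward to test without timers.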

Safety

  • Legacy inline mode remains available as a rollback/debug escape hatch.
  • Read-only LCM tools continue to work while compaction debt is pending.
  • Public tool inputs stay stable apart from the new config option.
  • Anthropic-active sessions keep their hot cache unless compaction is needed for correctness or the cache has already gone cold.

Validation

  • npm test -> 40 files / 709 tests passed
  • npm run build passed

Companion Host Work

  • Companion OpenClaw issue: openclaw/openclaw#65220
  • Companion OpenClaw PR: openclaw/openclaw#65233
  • Anthropic-native companion issue: openclaw/openclaw#65287
  • Anthropic-native companion PR: openclaw/openclaw#65288

Changed files

  • .changeset/deferred-compaction-maintenance.md (added, +5/-0)
  • README.md (modified, +9/-0)
  • docs/configuration.md (modified, +15/-0)
  • openclaw.plugin.json (modified, +19/-0)
  • skills/lossless-claw/references/config.md (modified, +25/-0)
  • src/db/config.ts (modified, +23/-0)
  • src/db/migration.ts (modified, +34/-0)
  • src/engine.ts (modified, +774/-375)
  • src/plugin/index.ts (modified, +6/-1)
  • src/plugin/lcm-command.ts (modified, +75/-3)
  • src/startup-banner-log.ts (modified, +1/-0)
  • src/store/compaction-maintenance-store.ts (added, +219/-0)
  • src/store/compaction-telemetry-store.ts (modified, +33/-1)
  • src/store/index.ts (modified, +5/-0)
  • test/bootstrap-flood-regression.test.ts (modified, +1/-0)
  • test/circuit-breaker.test.ts (modified, +1/-0)
  • test/config.test.ts (modified, +25/-0)
  • test/engine.test.ts (modified, +391/-27)
  • test/expansion.test.ts (modified, +1/-0)
  • test/lcm-command.test.ts (modified, +52/-0)
  • test/lcm-expand-query-tool.test.ts (modified, +1/-0)
  • test/lcm-expand-tool.test.ts (modified, +1/-0)
  • test/lcm-tools.test.ts (modified, +1/-0)
  • test/plugin-config-registration.test.ts (modified, +17/-3)
  • test/session-operation-queues.test.ts (modified, +1/-0)

PR #65233: Run context-engine turn maintenance as idle-aware background work

Description (problem / solution / changelog)

Summary

Closes #65220.

This teaches context engines to opt into background turn maintenance and moves turn-triggered maintenance off the foreground session lane when turnMaintenanceMode === "background". For lossless-claw, that means proactive post-turn maintenance can be queued as a hidden task instead of blocking the next user turn.

Why

  • Today the host awaits both afterTurn() and maintain(... reason: "turn") inline on the session path.
  • Even if a plugin removes heavy foreground work, the host still controls whether turn maintenance blocks the lane.
  • The result is visible assistant output followed by dead time before the next command can start.
  • Subagent completion announce traffic can time out behind a stalled main lane.

What Changed

  • Added turnMaintenanceMode?: "foreground" | "background" to context-engine info.
  • Added allowDeferredCompactionExecution?: boolean to runtime context.
  • When a context engine opts into background mode, runContextEngineMaintenance(... reason: "turn") now:
    • queues one hidden maintenance task per session key
    • coalesces repeated requests
    • waits for the session lane to go idle before executing
    • runs maintenance with allowDeferredCompactionExecution: true
    • uses task delivery/status infrastructure instead of a visible subagent or announce flow
  • Long-running or failed deferred maintenance can surface subtle task updates to the session.
  • Fast deferred maintenance stays silent.
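
The queueing behavior listed above (one hidden task per session key, coalesced requests, wait for idle, then run with deferred execution allowed) might look roughly like this. The class, its method names, and the `waitForIdle` callback are hypothetical; the real implementation lives in `context-engine-maintenance.ts` and uses the host's task infrastructure, which this sketch does not model.

```typescript
type MaintenanceFn = (ctx: { allowDeferredCompactionExecution: boolean }) => Promise<void>;

class BackgroundMaintenanceQueue {
  // At most one in-flight maintenance task per session key.
  private readonly pending = new Map<string, Promise<void>>();
  private readonly waitForIdle: (sessionKey: string) => Promise<void>;

  constructor(waitForIdle: (sessionKey: string) => Promise<void>) {
    this.waitForIdle = waitForIdle;
  }

  request(sessionKey: string, run: MaintenanceFn): Promise<void> {
    // Coalesce: a repeated request joins the task that is already queued.
    // Simplification: requests arriving mid-run also join the running task.
    const existing = this.pending.get(sessionKey);
    if (existing) return existing;

    const task = Promise.resolve()
      .then(async () => {
        // Do not compete with a foreground turn; wait for the lane to go idle.
        await this.waitForIdle(sessionKey);
        await run({ allowDeferredCompactionExecution: true });
      })
      .finally(() => {
        this.pending.delete(sessionKey);
      });

    this.pending.set(sessionKey, task);
    return task;
  }
}
```

Starting the work from `Promise.resolve().then(...)` ensures the map entry is registered before any of the task body runs, so a synchronous failure in `waitForIdle` cannot leave a stale entry behind.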

Guarantees

  • Proactive turn maintenance no longer blocks the immediate next foreground turn for engines that opt in.
  • Overflow/timeout/manual compaction remains synchronous in the plugin/runtime paths that require correctness.
  • Hidden maintenance is not a child agent run and does not create synthetic subagent sessions.

Validation

  • pnpm test -- --run src/agents/pi-embedded-runner/context-engine-maintenance.test.ts src/infra/backoff.test.ts passed
  • NODE_OPTIONS=--max-old-space-size=8192 pnpm exec tsc -p tsconfig.json --noEmit still reports preexisting unrelated AbortSignal type errors in unchanged extensions/telegram/src/bot.ts

Companion Plugin Work

  • Companion lossless-claw issue: Martian-Engineering/lossless-claw#407
  • Companion lossless-claw PR: Martian-Engineering/lossless-claw#408

Changed files

  • src/agents/pi-embedded-runner/context-engine-maintenance.test.ts (modified, +1013/-3)
  • src/agents/pi-embedded-runner/context-engine-maintenance.ts (modified, +566/-21)
  • src/context-engine/types.ts (modified, +12/-0)
  • src/infra/backoff.test.ts (modified, +35/-0)
  • src/infra/backoff.ts (modified, +40/-9)

Issue body (openclaw/openclaw#65287)

Summary

Anthropic prompt caching is exact-prefix and cache-invalidating on prompt-prefix rewrites, so local prompt-mutating compaction can burn a freshly written cache even when lossless-claw is no longer stalling the foreground turn. OpenClaw already has a provider-native compaction precedent for OpenAI Responses; Anthropic should get the same first-class support, plus mixed prompt-cache TTLs that keep the expensive stable prefix warm without paying 1-hour write costs for high-churn conversation content.

Problem

  • Anthropic-heavy coding sessions pay large prompt-cache write costs on Opus/Claude when long retention is applied too broadly.
  • Local prompt rewrites after a turn can invalidate a still-hot conversation cache, reducing the value of lossless-claw compaction even after the no-stall/background-maintenance work lands.
  • Right now OpenClaw does not expose Anthropic native compaction or a mixed TTL policy that matches Anthropic's cache model.

Design Goal

  • Use Anthropic native active-context compaction when the provider supports it.
  • Preserve stable system/tool/workspace prefixes with 1h cache retention.
  • Keep high-churn conversation/trailing user prefixes at 5m retention by default.
  • Round-trip Anthropic compaction blocks through the transport so follow-up requests can reuse the provider-native compacted prefix.
  • Keep this separate from lossless-claw, which should stay the lossless searchable sidecar rather than the primary live prompt shaper.

Proposed Changes

  • Add Anthropic request params:
    • anthropicServerCompaction: boolean
    • anthropicCompactThreshold: number
    • anthropicCompactPauseAfter?: boolean
    • anthropicCompactInstructions?: string
  • In the Anthropic transport path:
    • inject context_management.edits with compact_20260112 when enabled
    • add the required Anthropic beta features
    • preserve compaction blocks in assistant history and streamed content
  • In the prompt-cache policy:
    • keep system/tool/workspace prefix at 1h
    • keep trailing conversation / compaction / user prefix at 5m
    • avoid rewriting a still-hot cached conversation prefix unless needed for correctness
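
To illustrate what preserving compaction blocks could mean in the transport, here is a small round-trip sketch. Both block shapes are assumptions: `HistoryBlock` is a made-up internal transcript type, and the wire-side `{ type: "compaction", content }` shape is a guess at the provider format, not something the issue specifies.

```typescript
// Hypothetical internal transcript blocks.
type HistoryBlock =
  | { kind: "text"; text: string }
  | { kind: "compaction"; summary: string };

// Assumed provider wire blocks.
type WireBlock =
  | { type: "text"; text: string }
  | { type: "compaction"; content: string };

// Outbound: keep compaction blocks in the request instead of dropping them.
function toWire(block: HistoryBlock): WireBlock {
  switch (block.kind) {
    case "text":
      return { type: "text", text: block.text };
    case "compaction":
      return { type: "compaction", content: block.summary };
  }
}

// Inbound: store compaction blocks from the response so they can be resent.
function fromWire(block: WireBlock): HistoryBlock {
  switch (block.type) {
    case "text":
      return { kind: "text", text: block.text };
    case "compaction":
      return { kind: "compaction", summary: block.content };
  }
}
```

The property worth testing is that `fromWire(toWire(x))` returns a block equal to `x`, so a follow-up request can resend the same compacted prefix the provider produced.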

Why This Matters

  • It reduces wasted cache-write spend on volatile conversation content.
  • It lets provider-native compaction manage the active Claude prompt while lossless-claw continues to provide searchable durable memory.
  • It preserves cache value for long coding sessions where follow-up turns are frequent.

Acceptance Criteria

  • Anthropic requests can opt into native compaction via model/run params.
  • Required beta headers are sent only when needed.
  • Compaction blocks round-trip through request history and streaming output.
  • Long Anthropic cache retention applies 1h only to the stable prefix and keeps trailing conversational content at 5m.
  • Existing Anthropic behavior remains unchanged unless the new params are enabled.
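
The "only when needed" criterion for beta headers can be expressed as a small predicate. This is a sketch under assumptions: the beta identifier string, the `WireMessage` type, and the function names are illustrative and are not taken from the issue.

```typescript
interface WireMessage {
  role: "user" | "assistant";
  content: string | Array<{ type: string }>;
}

// Assumed beta identifier; check the provider documentation for the real value.
const COMPACTION_BETA = "compact-2026-01-12";

// True when any message already carries a compaction block that must be resent.
function hasCompactionBlock(messages: WireMessage[]): boolean {
  return messages.some(
    (m) => Array.isArray(m.content) && m.content.some((b) => b.type === "compaction"),
  );
}

function resolveBetaFeatures(opts: {
  serverCompaction: boolean;
  messages: WireMessage[];
  existing?: string[];
}): string[] {
  const betas = new Set<string>(opts.existing ?? []);
  // Needed when compaction is active OR when history must round-trip a block.
  if (opts.serverCompaction || hasCompactionBlock(opts.messages)) {
    betas.add(COMPACTION_BETA);
  }
  return Array.from(betas);
}
```

A caller would typically join the result with commas into the `anthropic-beta` request header and omit the header entirely when the list is empty.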

Related Work

  • openclaw/openclaw#65233 (background turn maintenance for context engines)
  • Martian-Engineering/lossless-claw#407
  • Martian-Engineering/lossless-claw#408

extent analysis

TL;DR

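Enable Anthropic native compaction through the new request parameters and transport changes so that local prompt rewrites stop invalidating a hot cache, and apply mixed cache TTLs (1h for the stable prefix, 5m for conversation content) to reduce wasted cache-write spend.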

Guidance

  • Add Anthropic request parameters such as anthropicServerCompaction and anthropicCompactThreshold to opt into native compaction.
  • Modify the Anthropic transport path to inject context_management.edits with compact_20260112 when enabled.
  • Update the prompt-cache policy to preserve stable system/tool/workspace prefixes with 1h cache retention and keep high-churn conversation/trailing user prefixes at 5m retention.
  • Verify that compaction blocks round-trip through request history and streaming output to ensure correct functionality.

Example

No explicit code snippet is provided, but the proposed changes suggest adding parameters such as anthropicServerCompaction: boolean to Anthropic requests.

Notes

The solution relies on the Anthropic provider supporting native compaction, and the implementation should ensure that existing behavior remains unchanged unless the new parameters are enabled.

Recommendation

Adopt the fix: enable Anthropic native compaction through the new parameters and apply the mixed 1h/5m cache-retention policy. Together these reduce wasted cache-write spend and preserve cache value for long coding sessions.
