openclaw - ✅(Solved) Fix cache_control not applied to system prompt on direct Anthropic provider path (cacheRead=0) [1 pull request, 1 participant]

GitHub stats
openclaw/openclaw#57958 · Fetched 2026-04-08 01:55:41
Comments: 0 · Participants: 1 · Timeline: 3 · Reactions: 0
Timeline (top): closed ×1, cross-referenced ×1, locked ×1

Prompt caching (cache_control: { type: "ephemeral" }) is applied to system messages on the OpenRouter provider path via createOpenRouterSystemCacheWrapper, but is NOT applied on the direct Anthropic provider path. This results in cacheRead=0 on every turn for users using the direct Anthropic API, causing significant unnecessary cost.

PR fix notes

PR #59054: fix(agents): split system prompt cache prefix by transport

Description (problem / solution / changelog)

Summary

  • Problem: Anthropic-family prompt caching still lost KV reuse when dynamic lab/session/system additions changed, because the current transport stack no longer had the old cache seam and the shared payload policy cache-tagged the whole system block.
  • Why it matters: with many labs and large prompt surfaces, per-turn dynamic suffix churn rewrites expensive stable prompt prefixes and multiplies cache misses.
  • What changed: restored an internal system-prompt cache boundary as a shared helper, moved context-engine prompt additions behind that boundary, split Anthropic system blocks into cached stable and uncached dynamic regions in the shared payload policy, and stripped the internal marker before emission across non-Anthropic transports plus CLI backend system-prompt args.
  • What did NOT change (scope boundary): provider-visible prompt text is unchanged apart from removing the internal marker; no config/env changes were introduced; no new cache writes happen when cacheRetention: "none".
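
The split described in the summary can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the marker string, the helper name splitSystemPromptForAnthropic, and the two-value cacheRetention type are assumptions, since the PR text does not show the real marker or the shape of the shared payload policy.

```typescript
// Hypothetical marker; the real internal boundary string is not shown in the PR text.
const CACHE_BOUNDARY = "<<system-prompt-cache-boundary>>";

type SystemBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};

// Split a system prompt at the boundary: the stable prefix gets a cache
// breakpoint, the dynamic suffix is sent as a separate untagged block.
function splitSystemPromptForAnthropic(
  systemPrompt: string,
  cacheRetention: "none" | "long",
): SystemBlock[] {
  const idx = systemPrompt.indexOf(CACHE_BOUNDARY);
  const stable = idx === -1 ? systemPrompt : systemPrompt.slice(0, idx);
  const dynamic = idx === -1 ? "" : systemPrompt.slice(idx + CACHE_BOUNDARY.length);

  const blocks: SystemBlock[] = [];
  if (stable.length > 0) {
    const block: SystemBlock = { type: "text", text: stable };
    // No cache write is requested when retention is "none".
    if (cacheRetention !== "none") block.cache_control = { type: "ephemeral" };
    blocks.push(block);
  }
  if (dynamic.length > 0) blocks.push({ type: "text", text: dynamic });
  return blocks;
}

const exampleBlocks = splitSystemPromptForAnthropic(
  `You are a coding agent.${CACHE_BOUNDARY}Active lab: demo`,
  "long",
);
console.log(JSON.stringify(exampleBlocks, null, 2));
```

Because only the first block carries cache_control, a change to the dynamic suffix leaves the cached prefix byte-identical, and the marker itself never appears in either block.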

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #50511
  • Closes #57958
  • Related #53225
  • This PR fixes a bug or regression

Root Cause / Regression History (if applicable)

  • Root cause: the original fix lived in an older wrapper path, but upstream provider transport/KV work centralized Anthropic request shaping in shared payload policy code and removed the effective cache seam from the current path.
  • Secondary regressions: the initial refresh restored the Anthropic seam but missed the OpenAI Completions serializer path, and the CLI backend path still forwarded the internal cache boundary marker unchanged via systemPromptArg.
  • Missing detection / guardrail: there was no current-main regression test ensuring dynamic system additions, including context-engine prepends, stay outside the cached Anthropic prefix while the internal marker is stripped for all other emission paths.
  • Contributing context (if known): upstream merged a large amount of KV/provider transport work after the original PR branch was cut, so the old patch shape no longer matched the live request path.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file:
    • src/agents/system-prompt-cache-boundary.test.ts
    • src/agents/anthropic-payload-policy.test.ts
    • src/agents/openai-transport-stream.test.ts
    • src/agents/openai-ws-stream.test.ts
    • src/agents/cli-runner.helpers.test.ts
  • Scenario the test should lock in: Anthropic-family requests cache only the stable system prefix; dynamic lab/session suffix content stays uncached; OpenAI Responses, OpenAI Completions, OpenAI WebSocket, Google transport paths, and CLI backend system-prompt args strip the internal boundary marker before emission.
  • Why this is the smallest reliable guardrail: the regression lives in prompt assembly plus transport/CLI payload shaping, not in end-user workflow code.
  • Existing test that already covers this (if any): prompt stability coverage existed, but it did not protect the transport-layer cache seam.
  • If no new test is added, why not: N/A.
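
The stripping half of that scenario might be locked in with a helper along these lines. It is a sketch under assumptions: the marker string, the function names, and the "--system-prompt" flag are hypothetical stand-ins, not the repository's actual identifiers.

```typescript
// Hypothetical marker; stands in for the internal boundary string.
const CACHE_BOUNDARY = "<<system-prompt-cache-boundary>>";

// Remove every occurrence of the marker, leaving the rest of the text untouched.
function stripCacheBoundary(systemPrompt: string): string {
  return systemPrompt.split(CACHE_BOUNDARY).join("");
}

// Minimal stand-ins for two of the emission paths named in the scenario.
function buildChatCompletionsSystemMessage(systemPrompt: string) {
  return { role: "system" as const, content: stripCacheBoundary(systemPrompt) };
}

function buildCliSystemPromptArgs(flag: string, systemPrompt: string): string[] {
  return [flag, stripCacheBoundary(systemPrompt)];
}

const rawPrompt = `stable prefix${CACHE_BOUNDARY}dynamic suffix`;
const systemMessage = buildChatCompletionsSystemMessage(rawPrompt);
const cliArgs = buildCliSystemPromptArgs("--system-prompt", rawPrompt); // flag name is hypothetical

// The marker must never reach a provider or a CLI backend.
console.log(systemMessage.content.includes(CACHE_BOUNDARY));
console.log(cliArgs.some((arg) => arg.includes(CACHE_BOUNDARY)));
```

A regression test per transport would then assert that the emitted payload contains the original prompt text and no marker.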

User-visible / Behavior Changes

Anthropic-family sessions can preserve the stable system prompt prefix across turns again even when lab/group/session additions change later in the prompt. Other request paths keep seeing the same prompt text, with the internal boundary stripped before emission.

Diagram (if applicable)

Before:
stable prompt prefix + dynamic lab/session suffix -> one cached Anthropic system block -> suffix churn rewrites the full prefix
non-Anthropic/CLI emission paths -> some paths still forwarded the raw internal boundary marker

After:
stable prompt prefix -> cached Anthropic block
dynamic lab/session suffix -> uncached Anthropic block
non-Anthropic/CLI emission paths -> same prompt text with boundary stripped before emission

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 25 / pnpm
  • Model/provider: Anthropic-family payload shaping plus OpenAI/Google/WebSocket/CLI strip paths
  • Integration/channel (if any): N/A
  • Relevant config (redacted): prompt boundary present vs absent; cacheRetention long vs none

Steps

  1. Build a system prompt containing the internal cache boundary.
  2. Apply Anthropic payload policy with cache retention enabled.
  3. Apply non-Anthropic transport request builders or CLI backend arg builders using the same prompt.

Expected

  • Anthropic-family payloads split cached stable system content from uncached dynamic suffix content.
  • Other emission paths strip the internal boundary marker and emit unchanged prompt text semantics.

Actual

  • Before this update, current-main shaping cache-tagged the whole Anthropic system block and had no equivalent seam for later dynamic additions.
  • Before the follow-up fixes in this PR, the OpenAI Completions path and the CLI backend systemPromptArg path could still emit the raw internal marker.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: git diff --check; direct module import sanity for touched source files; direct sanity check that OpenAI Completions now strips the internal boundary marker.
  • Edge cases checked: boundary stripped when cache retention is disabled; context-engine prepend path moves additions behind the cache seam; non-Anthropic request builders strip the marker before emission, including OpenAI Completions; CLI backend system-prompt args also strip the marker before emission.
  • What you did not verify: full Vitest lanes, pnpm check, pnpm build, or live provider telemetry in this update path.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: an internal boundary marker could leak into provider requests on an emission path that was missed.
    • Mitigation: Anthropic strips/splits in shared payload policy; OpenAI Responses, OpenAI Completions, OpenAI WebSocket, Google builders, and CLI backend system-prompt args explicitly strip before emission.
  • Risk: moving the seam too early would reduce cache leverage for large stable bootstrap context.
    • Mitigation: the boundary is placed late in buildAgentSystemPrompt() so stable prompt context stays cacheable and only dynamic lab/session additions sit behind it.
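
To illustrate why late placement matters, here is a small sketch of prompt assembly. The marker string and the functions assembleSystemPrompt and cacheablePrefix are hypothetical stand-ins for buildAgentSystemPrompt() and its helpers, whose real implementation is not shown in the PR text.

```typescript
// Hypothetical marker; stands in for the internal boundary string.
const CACHE_BOUNDARY = "<<system-prompt-cache-boundary>>";

interface PromptSections {
  stable: string[]; // identical across turns: identity, tool docs, bootstrap context
  dynamic: string[]; // may change per turn: lab/session/context-engine additions
}

// All stable sections first, marker as late as possible, dynamic sections last.
function assembleSystemPrompt(sections: PromptSections): string {
  const stable = sections.stable.join("\n\n");
  if (sections.dynamic.length === 0) return stable;
  return stable + CACHE_BOUNDARY + sections.dynamic.join("\n\n");
}

// Everything before the marker is the cacheable prefix.
function cacheablePrefix(systemPrompt: string): string {
  const idx = systemPrompt.indexOf(CACHE_BOUNDARY);
  return idx === -1 ? systemPrompt : systemPrompt.slice(0, idx);
}

const stableSections = ["You are a coding agent.", "Tool docs ..."];
const turn1 = assembleSystemPrompt({ stable: stableSections, dynamic: ["Active lab: alpha"] });
const turn2 = assembleSystemPrompt({ stable: stableSections, dynamic: ["Active lab: beta"] });

// The dynamic suffix changed between turns, but the cacheable prefix did not.
console.log(cacheablePrefix(turn1) === cacheablePrefix(turn2)); // prints true
```

If a dynamic section were placed before the marker instead, the two prefixes would differ and every turn would miss the cache.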

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/agents/anthropic-payload-policy.test.ts (modified, +62/-0)
  • src/agents/anthropic-payload-policy.ts (modified, +56/-3)
  • src/agents/cli-runner.helpers.test.ts (modified, +16/-0)
  • src/agents/cli-runner/helpers.ts (modified, +2/-1)
  • src/agents/google-transport-stream.ts (modified, +6/-1)
  • src/agents/openai-transport-stream.test.ts (modified, +97/-0)
  • src/agents/openai-transport-stream.ts (modified, +9/-2)
  • src/agents/openai-ws-request.ts (modified, +4/-1)
  • src/agents/openai-ws-stream.test.ts (modified, +31/-0)
  • src/agents/openai-ws-stream.ts (modified, +4/-1)
  • src/agents/pi-embedded-runner/run/attempt.prompt-helpers.ts (modified, +2/-4)
  • src/agents/pi-embedded-runner/run/attempt.test.ts (modified, +61/-0)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +21/-2)
  • src/agents/provider-transport-stream.ts (modified, +24/-13)
  • src/agents/system-prompt-cache-boundary.test.ts (added, +33/-0)
  • src/agents/system-prompt-cache-boundary.ts (added, +38/-0)
  • src/agents/system-prompt.ts (modified, +13/-6)

Bug Report

OpenClaw version: 2026.3.28
Provider: anthropic (direct)
Model: anthropic/claude-sonnet-4-6

Summary

Prompt caching (cache_control: { type: "ephemeral" }) is applied to system messages on the OpenRouter provider path via createOpenRouterSystemCacheWrapper, but is NOT applied on the direct Anthropic provider path. This results in cacheRead=0 on every turn for users using the direct Anthropic API, causing significant unnecessary cost.

Root Cause (from source analysis of pi-embedded-BaSvmUpW.js)

The function createOpenRouterSystemCacheWrapper correctly wraps system prompt content with cache_control: { type: "ephemeral" } but is gated behind isOpenRouterAnthropicModel(). The direct Anthropic path (provider === "anthropic") goes through resolveCacheRetention(), which sets the TTL duration but does NOT inject the cache_control marker on the message content, which is the actual mechanism Anthropic uses to establish a cache entry.
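
As a rough illustration of the difference, the sketch below contrasts a request whose system prompt is a plain string with one whose system prompt is a list of text blocks carrying a cache breakpoint. The request shape follows the public Anthropic Messages API; the helper hasSystemCacheBreakpoint and the prompt text are made up for this example and are not part of OpenClaw.

```typescript
type TextBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};

interface MessagesRequest {
  model: string;
  max_tokens: number;
  system?: string | TextBlock[];
  messages: { role: "user" | "assistant"; content: string }[];
}

// True only if at least one system block carries a cache breakpoint.
// A plain-string system prompt has nowhere to put cache_control.
function hasSystemCacheBreakpoint(req: MessagesRequest): boolean {
  if (!Array.isArray(req.system)) return false;
  return req.system.some((block) => block.cache_control !== undefined);
}

const baseRequest = {
  model: "claude-sonnet-4-6", // model name as given in the report
  max_tokens: 1024,
  messages: [{ role: "user" as const, content: "hello" }],
};

// What the report describes: no marker, so no cache entry is established.
const untagged: MessagesRequest = { ...baseRequest, system: "Long stable system prompt ..." };

// What the report asks for: the system content block carries cache_control.
const tagged: MessagesRequest = {
  ...baseRequest,
  system: [
    { type: "text", text: "Long stable system prompt ...", cache_control: { type: "ephemeral" } },
  ],
};

console.log(hasSystemCacheBreakpoint(untagged), hasSystemCacheBreakpoint(tagged));
```

A check like this on the outgoing payload is one way to tell a missing marker apart from a cache that is expiring between turns.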

Impact

  • cacheRead=0 on every turn for direct Anthropic users
  • Full input token cost on every turn for repeated system prompt content
  • On a long working session (100k+ token system prompt, 100+ turns): ~10x higher cost than expected
  • Real-world example: single 14-hour session cost .89 due to this issue
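
The "~10x" figure can be sanity-checked with a back-of-the-envelope calculation. The sketch below assumes a base input price of 3 USD per million tokens (an assumption, check current pricing for the model in use), Anthropic's published multipliers of 1.25x for a 5-minute cache write and 0.1x for a cache read, and a cache that stays warm between turns. It counts only the system prompt and ignores conversation history and output tokens.

```typescript
// Assumed base price in USD per million input tokens; not taken from the report.
const BASE_INPUT_PER_MTOK = 3.0;
const CACHE_WRITE_PER_MTOK = BASE_INPUT_PER_MTOK * 1.25; // 5-minute cache write
const CACHE_READ_PER_MTOK = BASE_INPUT_PER_MTOK * 0.1; // cache read

// Cost of resending the same system prompt on every turn of a session.
function systemPromptCost(promptTokens: number, turns: number, cached: boolean): number {
  const mtok = promptTokens / 1_000_000;
  if (!cached) return turns * mtok * BASE_INPUT_PER_MTOK;
  // First turn writes the cache; every later turn reads it.
  return mtok * CACHE_WRITE_PER_MTOK + (turns - 1) * mtok * CACHE_READ_PER_MTOK;
}

const uncachedCost = systemPromptCost(100_000, 100, false);
const cachedCost = systemPromptCost(100_000, 100, true);

console.log(`uncached: $${uncachedCost.toFixed(2)}`);
console.log(`cached:   $${cachedCost.toFixed(2)}`);
console.log(`ratio:    ${(uncachedCost / cachedCost).toFixed(1)}x`);
```

Under these assumptions the ratio comes out a little under 10x, which is consistent with the estimate in the report; the exact number depends on pricing, turn count, and how often the cache expires.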

Expected Behavior

Direct Anthropic provider path should apply cache_control: { type: "ephemeral" } to system prompt messages, consistent with OpenRouter path behavior.

Workaround

Routing through openrouter/auto activates the cache wrapper but introduces uncontrolled model selection (routes to Opus instead of Sonnet despite account-level defaults), negating cost savings. No clean workaround exists.

Request

Apply equivalent cache_control injection to the direct Anthropic provider path, or expose a config option to explicitly enable it.

extent analysis

Fix Plan

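To apply equivalent cache_control injection to the direct Anthropic provider path, as the issue requests, follow the steps below. Note that the merged fix in PR #59054 took a different route: it splits the system prompt into cached and uncached blocks in the shared Anthropic payload policy.

  • Modify the resolveCacheRetention function to inject the cache_control marker on the message content.
  • Add a check for the anthropic provider and apply the cache_control wrapper when necessary.

Example code (illustrative; the real signature of resolveCacheRetention is not shown in the issue):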

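function resolveCacheRetention(provider, message) {
  if (provider === 'anthropic') {
    // Anthropic reads cache_control from content blocks, not from the message
    // object, so normalize string content into a text block before tagging it.
    const blocks = typeof message.content === 'string'
      ? [{ type: 'text', text: message.content }]
      : message.content;
    // Tag the last block: the cached prefix covers everything up to and including it.
    if (blocks.length > 0) blocks[blocks.length - 1].cache_control = { type: 'ephemeral' };
    message.content = blocks;
  }
  // ... existing TTL duration logic ...
}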

Alternatively, expose a config option to explicitly enable cache control injection for the direct Anthropic provider path:

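const config = {
  // ... existing config options ...
  enableCacheControl: true, // default: false
};

function resolveCacheRetention(provider, message) {
  if (provider === 'anthropic' && config.enableCacheControl) {
    // cache_control belongs on a content block, so normalize string content first.
    const blocks = typeof message.content === 'string'
      ? [{ type: 'text', text: message.content }]
      : message.content;
    if (blocks.length > 0) blocks[blocks.length - 1].cache_control = { type: 'ephemeral' };
    message.content = blocks;
  }
  // ... existing TTL duration logic ...
}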

Verification

To verify the fix, monitor the cacheRead metric for direct Anthropic users and ensure it is no longer 0 on every turn. Additionally, check the input token cost for repeated system prompt content and verify it is reduced as expected.
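
One way to check this from raw responses is to aggregate the usage object that the Anthropic Messages API returns on each turn. The sketch below assumes that OpenClaw's cacheRead metric corresponds to the API's cache_read_input_tokens field, which the issue does not state explicitly; the summarizeCacheUsage helper and the sample numbers are illustrative only.

```typescript
// Usage fields as returned on Anthropic Messages API responses.
interface Usage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number | null;
  cache_read_input_tokens?: number | null;
}

interface CacheSummary {
  cacheRead: number;
  cacheWrite: number;
  uncachedInput: number;
  turnsWithoutRead: number;
}

function summarizeCacheUsage(turns: Usage[]): CacheSummary {
  const summary: CacheSummary = { cacheRead: 0, cacheWrite: 0, uncachedInput: 0, turnsWithoutRead: 0 };
  turns.forEach((usage, index) => {
    const read = usage.cache_read_input_tokens ?? 0;
    summary.cacheRead += read;
    summary.cacheWrite += usage.cache_creation_input_tokens ?? 0;
    summary.uncachedInput += usage.input_tokens;
    // The first turn can only write the cache, so a zero read there is expected.
    if (index > 0 && read === 0) summary.turnsWithoutRead += 1;
  });
  return summary;
}

// Illustrative numbers, not measurements.
const sessionUsage: Usage[] = [
  { input_tokens: 50, output_tokens: 200, cache_creation_input_tokens: 100_000, cache_read_input_tokens: 0 },
  { input_tokens: 80, output_tokens: 150, cache_creation_input_tokens: 0, cache_read_input_tokens: 100_000 },
];

console.log(summarizeCacheUsage(sessionUsage));
```

With the bug present, every turn after the first would count toward turnsWithoutRead; after the fix that count should stay near zero as long as turns arrive within the cache TTL.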

Extra Tips

  • Ensure the cache_control wrapper is correctly applied to system prompt messages for both OpenRouter and direct Anthropic provider paths.
  • Consider adding logging or monitoring to track cache hits and misses for further optimization.
