openclaw - ✅(Solved) Fix cache_control not applied to system prompt on direct Anthropic provider path (cacheRead=0) [1 pull requests, 1 participants]

openclaw · 2026-03-30 22:05:52

GitHub stats

openclaw/openclaw#57958•Fetched 2026-04-08 01:55:41

Author

BuildIQAdvisors

Participants

BuildIQAdvisors

Timeline (top)

closed ×1, cross-referenced ×1, locked ×1

Prompt caching (`cache_control: { type: "ephemeral" }`) is applied to system messages on the OpenRouter provider path via `createOpenRouterSystemCacheWrapper`, but is NOT applied on the direct Anthropic provider path. This results in `cacheRead=0` on every turn for users of the direct Anthropic API, causing significant unnecessary cost.

Workaround

Routing through `openrouter/auto` activates the cache wrapper but introduces uncontrolled model selection (it routes to Opus instead of Sonnet despite account-level defaults), negating the cost savings. No clean workaround exists.

PR fix notes

PR #59054: fix(agents): split system prompt cache prefix by transport

Repository: openclaw/openclaw
Author: vincentkoc
State: closed | merged: True
Link: https://github.com/openclaw/openclaw/pull/59054

Description (problem / solution / changelog)

Summary

Problem: Anthropic-family prompt caching still lost KV reuse when dynamic lab/session/system additions changed, because the current transport stack no longer had the old cache seam and the shared payload policy cache-tagged the whole system block.
Why it matters: with many labs and large prompt surfaces, per-turn dynamic suffix churn rewrites expensive stable prompt prefixes and multiplies cache misses.
What changed: restored an internal system-prompt cache boundary as a shared helper, moved context-engine prompt additions behind that boundary, split Anthropic system blocks into cached stable and uncached dynamic regions in the shared payload policy, and stripped the internal marker before emission across non-Anthropic transports plus CLI backend system-prompt args.
What did NOT change (scope boundary): provider-visible prompt text is unchanged apart from removing the internal marker; no config/env changes were introduced; no new cache writes happen when cacheRetention: "none".

Linked Issue/PR

Closes #50511
Closes #57958
Related #53225
This PR fixes a bug or regression

Root Cause / Regression History (if applicable)

Root cause: the original fix lived in an older wrapper path, but upstream provider transport/KV work centralized Anthropic request shaping in shared payload policy code and removed the effective cache seam from the current path.
Secondary regressions: the initial refresh restored the Anthropic seam but missed the OpenAI Completions serializer path, and the CLI backend path still forwarded the internal cache boundary marker unchanged via systemPromptArg.
Missing detection / guardrail: there was no current-main regression test ensuring dynamic system additions, including context-engine prepends, stay outside the cached Anthropic prefix while the internal marker is stripped for all other emission paths.
Contributing context (if known): upstream merged a large amount of KV/provider transport work after the original PR branch was cut, so the old patch shape no longer matched the live request path.

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- Unit test
- Seam / integration test
- End-to-end test
- Existing coverage already sufficient
Target test or file:
- src/agents/system-prompt-cache-boundary.test.ts
- src/agents/anthropic-payload-policy.test.ts
- src/agents/openai-transport-stream.test.ts
- src/agents/openai-ws-stream.test.ts
- src/agents/cli-runner.helpers.test.ts
Scenario the test should lock in: Anthropic-family requests cache only the stable system prefix; dynamic lab/session suffix content stays uncached; OpenAI Responses, OpenAI Completions, OpenAI WebSocket, Google transport paths, and CLI backend system-prompt args strip the internal boundary marker before emission.
Why this is the smallest reliable guardrail: the regression lives in prompt assembly plus transport/CLI payload shaping, not in end-user workflow code.
Existing test that already covers this (if any): prompt stability coverage existed, but it did not protect the transport-layer cache seam.
If no new test is added, why not: N/A.

User-visible / Behavior Changes

Anthropic-family sessions can preserve the stable system prompt prefix across turns again even when lab/group/session additions change later in the prompt. Other request paths keep seeing the same prompt text, with the internal boundary stripped before emission.

Diagram (if applicable)

Before:
stable prompt prefix + dynamic lab/session suffix -> one cached Anthropic system block -> suffix churn rewrites the full prefix
non-Anthropic/CLI emission paths -> some paths still forwarded the raw internal boundary marker

After:
stable prompt prefix -> cached Anthropic block
dynamic lab/session suffix -> uncached Anthropic block
non-Anthropic/CLI emission paths -> same prompt text with boundary stripped before emission
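
The "After" flow can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the PR: the marker string `CACHE_BOUNDARY` and the function name `toAnthropicSystemBlocks` are hypothetical, since the PR notes do not show the real internal marker or helper names. It relies only on the fact that the Anthropic Messages API accepts `system` as an array of text blocks and treats `cache_control` on a block as the end of a cacheable prefix.

```typescript
// Hypothetical marker; the real internal boundary string is not shown in the PR notes.
const CACHE_BOUNDARY = "<<system-prompt-cache-boundary>>";

interface SystemBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

// Split the prompt at the boundary: everything before it is the stable,
// cacheable prefix; everything after it is the dynamic, uncached suffix.
function toAnthropicSystemBlocks(prompt: string, cacheEnabled: boolean): SystemBlock[] {
  const idx = prompt.indexOf(CACHE_BOUNDARY);
  const stable = idx === -1 ? prompt : prompt.slice(0, idx);
  const dynamic = idx === -1 ? "" : prompt.slice(idx + CACHE_BOUNDARY.length);

  const blocks: SystemBlock[] = [];
  if (stable.length > 0) {
    const block: SystemBlock = { type: "text", text: stable };
    // Only the stable prefix is tagged, and only when caching is enabled.
    if (cacheEnabled) block.cache_control = { type: "ephemeral" };
    blocks.push(block);
  }
  if (dynamic.length > 0) {
    // The dynamic suffix is never tagged, so its churn cannot invalidate the prefix.
    blocks.push({ type: "text", text: dynamic });
  }
  return blocks;
}
```

Because the marker is consumed by the split, it never reaches the provider, and concatenating the block texts gives back the original prompt minus the marker.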

Security Impact (required)

New permissions/capabilities? (No)
Secrets/tokens handling changed? (No)
New/changed network calls? (No)
Command/tool execution surface changed? (No)
Data access scope changed? (No)
If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

OS: macOS
Runtime/container: Node 25 / pnpm
Model/provider: Anthropic-family payload shaping plus OpenAI/Google/WebSocket/CLI strip paths
Integration/channel (if any): N/A
Relevant config (redacted): prompt boundary present vs absent; cacheRetention long vs none

Steps

Build a system prompt containing the internal cache boundary.
Apply Anthropic payload policy with cache retention enabled.
Apply non-Anthropic transport request builders or CLI backend arg builders using the same prompt.

Expected

Anthropic-family payloads split cached stable system content from uncached dynamic suffix content.
Other emission paths strip the internal boundary marker and emit unchanged prompt text semantics.

Actual

Before this update, current-main shaping cache-tagged the whole Anthropic system block and had no equivalent seam for later dynamic additions.
Before the follow-up fixes in this PR, the OpenAI Completions path and the CLI backend systemPromptArg path could still emit the raw internal marker.

Evidence

Attach at least one:

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Human Verification (required)

Verified scenarios: git diff --check; direct module import sanity for touched source files; direct sanity check that OpenAI Completions now strips the internal boundary marker.
Edge cases checked: boundary stripped when cache retention is disabled; context-engine prepend path moves additions behind the cache seam; non-Anthropic request builders strip the marker before emission, including OpenAI Completions; CLI backend system-prompt args also strip the marker before emission.
What you did not verify: full Vitest lanes, pnpm check, pnpm build, or live provider telemetry in this update path.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

Backward compatible? (Yes)
Config/env changes? (No)
Migration needed? (No)
If yes, exact upgrade steps: N/A

Risks and Mitigations

Risk: an internal boundary marker could leak into provider requests on an emission path that was missed.
- Mitigation: Anthropic strips/splits in shared payload policy; OpenAI Responses, OpenAI Completions, OpenAI WebSocket, Google builders, and CLI backend system-prompt args explicitly strip before emission.
Risk: moving the seam too early would reduce cache leverage for large stable bootstrap context.
- Mitigation: the boundary is placed late in buildAgentSystemPrompt() so stable prompt context stays cacheable and only dynamic lab/session additions sit behind it.
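
The first mitigation (stripping the marker on every non-Anthropic path) could look something like the sketch below. The marker value and the names `stripCacheBoundary` and `toOpenAiSystemMessage` are assumptions made for illustration; only the `{ role: "system", content }` message shape is taken from the OpenAI Chat Completions format.

```typescript
// Hypothetical marker; the real internal boundary string is not shown in the PR notes.
const CACHE_BOUNDARY = "<<system-prompt-cache-boundary>>";

// Remove every occurrence of the marker and leave all other text untouched.
function stripCacheBoundary(prompt: string): string {
  return prompt.split(CACHE_BOUNDARY).join("");
}

// Example emission path: an OpenAI-style system message built from the same prompt.
function toOpenAiSystemMessage(prompt: string): { role: "system"; content: string } {
  return { role: "system", content: stripCacheBoundary(prompt) };
}
```

Routing every builder through one shared strip helper, rather than stripping inline in each transport, makes a missed emission path easier to catch with a single regression test.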

Changed files

CHANGELOG.md (modified, +1/-0)
src/agents/anthropic-payload-policy.test.ts (modified, +62/-0)
src/agents/anthropic-payload-policy.ts (modified, +56/-3)
src/agents/cli-runner.helpers.test.ts (modified, +16/-0)
src/agents/cli-runner/helpers.ts (modified, +2/-1)
src/agents/google-transport-stream.ts (modified, +6/-1)
src/agents/openai-transport-stream.test.ts (modified, +97/-0)
src/agents/openai-transport-stream.ts (modified, +9/-2)
src/agents/openai-ws-request.ts (modified, +4/-1)
src/agents/openai-ws-stream.test.ts (modified, +31/-0)
src/agents/openai-ws-stream.ts (modified, +4/-1)
src/agents/pi-embedded-runner/run/attempt.prompt-helpers.ts (modified, +2/-4)
src/agents/pi-embedded-runner/run/attempt.test.ts (modified, +61/-0)
src/agents/pi-embedded-runner/run/attempt.ts (modified, +21/-2)
src/agents/provider-transport-stream.ts (modified, +24/-13)
src/agents/system-prompt-cache-boundary.test.ts (added, +33/-0)
src/agents/system-prompt-cache-boundary.ts (added, +38/-0)
src/agents/system-prompt.ts (modified, +13/-6)

Bug Report

OpenClaw version: 2026.3.28
Provider: anthropic (direct)
Model: anthropic/claude-sonnet-4-6

Root Cause (from source analysis of pi-embedded-BaSvmUpW.js)

The function `createOpenRouterSystemCacheWrapper` correctly wraps system prompt content with `cache_control: { type: "ephemeral" }` but is gated behind `isOpenRouterAnthropicModel()`. The direct Anthropic path (`provider === "anthropic"`) goes through `resolveCacheRetention()`, which sets the TTL duration but does NOT inject the `cache_control` marker on the message content, which is the actual mechanism Anthropic uses to establish a cache entry.
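
To make the missing piece concrete, here is a minimal sketch of the `system` shape the direct Anthropic path would need to emit. The Anthropic Messages API accepts `system` as an array of text blocks, and `cache_control` on a block marks the end of a cacheable prefix. The helper name `buildAnthropicSystem` is hypothetical, and the `CacheRetention` type lists only the two values mentioned in the PR notes (`"long"` and `"none"`); the real option may accept more.

```typescript
// Hypothetical helper, not OpenClaw's implementation.
// Only the retention values named in the PR notes are modelled here.
type CacheRetention = "none" | "long";

interface SystemTextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

function buildAnthropicSystem(
  systemPrompt: string,
  retention: CacheRetention,
): SystemTextBlock[] {
  const block: SystemTextBlock = { type: "text", text: systemPrompt };
  // Without this tag no cache entry is written, so later turns cannot read one.
  if (retention !== "none") {
    block.cache_control = { type: "ephemeral" };
  }
  return [block];
}
```

The returned array would be passed as the `system` field of the request body; with retention set to `"none"` the block is emitted untagged, so no cache write occurs.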

Impact

`cacheRead=0` on every turn for direct Anthropic users
Full input token cost on every turn for repeated system prompt content
On a long working session (100k+ token system prompt, 100+ turns): ~10x higher cost than expected
Real-world example: single 14-hour session cost .89 due to this issue
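
The "~10x" figure can be sanity-checked with a rough calculation. The sketch below assumes Anthropic's published multipliers for the default 5-minute cache (writes billed at 1.25x the base input price, reads at 0.1x) and assumes the cache stays warm between turns. The base price passed in is a placeholder rather than a quoted rate, and the ratio does not depend on it.

```typescript
interface CostInputs {
  promptTokens: number; // tokens in the repeated system prompt
  turns: number; // number of turns in the session
  pricePerMTok: number; // base input price in USD per million tokens (placeholder)
}

// Every turn pays full price for the whole system prompt.
function uncachedCost({ promptTokens, turns, pricePerMTok }: CostInputs): number {
  return (promptTokens * turns * pricePerMTok) / 1_000_000;
}

// First turn writes the cache (assumed 1.25x); later turns read it (assumed 0.1x).
function cachedCost({ promptTokens, turns, pricePerMTok }: CostInputs): number {
  const writeTokens = promptTokens * 1.25;
  const readTokens = promptTokens * 0.1 * Math.max(turns - 1, 0);
  return ((writeTokens + readTokens) * pricePerMTok) / 1_000_000;
}

const session: CostInputs = { promptTokens: 100_000, turns: 100, pricePerMTok: 3 };
const ratio = uncachedCost(session) / cachedCost(session);
console.log(`uncached / cached = ${ratio.toFixed(2)}`);
```

Under these assumptions a 100k-token prompt over 100 turns costs roughly 9 times more without caching, which is in line with the estimate above. Longer idle gaps that let the cache expire would lower that ratio.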

Expected Behavior

The direct Anthropic provider path should apply `cache_control: { type: "ephemeral" }` to system prompt messages, consistent with the OpenRouter path's behavior.

Request

Apply equivalent `cache_control` injection to the direct Anthropic provider path, or expose a config option to explicitly enable it.

extent analysis

Fix Plan

To apply equivalent cache control injection to the direct Anthropic provider path, follow these steps:

Modify the resolveCacheRetention function to inject the cache_control marker on the message content.
Add a check for the anthropic provider and apply the cache_control wrapper when necessary.

Example code:

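function resolveCacheRetention(provider, message) {
  if (provider === 'anthropic') {
    // Anthropic reads cache_control from a content block, not from the message
    // itself, so wrap plain-string content in a text block before tagging it.
    if (typeof message.content === 'string') {
      message.content = [{ type: 'text', text: message.content }];
    }
    const last = message.content[message.content.length - 1];
    if (last) last.cache_control = { type: 'ephemeral' };
  }
  // ... existing TTL duration logic ...
}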

Alternatively, expose a config option to explicitly enable cache control injection for the direct Anthropic provider path:

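const config = {
  // ... existing config options ...
  enableCacheControl: true, // proposed option; default: false
};

function resolveCacheRetention(provider, message) {
  if (provider === 'anthropic' && config.enableCacheControl) {
    // Anthropic reads cache_control from a content block, not from the message
    // itself, so wrap plain-string content in a text block before tagging it.
    if (typeof message.content === 'string') {
      message.content = [{ type: 'text', text: message.content }];
    }
    const last = message.content[message.content.length - 1];
    if (last) last.cache_control = { type: 'ephemeral' };
  }
  // ... existing TTL duration logic ...
}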

Verification

To verify the fix, monitor the cacheRead metric for direct Anthropic users and ensure it is no longer 0 on every turn. Additionally, check the input token cost for repeated system prompt content and verify it is reduced as expected.
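
One way to check this from raw provider responses is sketched below. The `usage` field names (`input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`) are those returned by the Anthropic Messages API; the helper `cacheReadRatio` is hypothetical, and how OpenClaw maps these fields onto its own `cacheRead` metric is not shown in the source.

```typescript
// Subset of the `usage` object returned by the Anthropic Messages API.
interface AnthropicUsage {
  input_tokens: number; // uncached input tokens
  output_tokens: number;
  cache_creation_input_tokens?: number | null; // tokens written to the cache
  cache_read_input_tokens?: number | null; // tokens served from the cache
}

// Fraction of all input tokens across a session that were served from cache.
function cacheReadRatio(turns: AnthropicUsage[]): number {
  let read = 0;
  let total = 0;
  for (const usage of turns) {
    const r = usage.cache_read_input_tokens ?? 0;
    const w = usage.cache_creation_input_tokens ?? 0;
    read += r;
    total += r + w + usage.input_tokens;
  }
  return total === 0 ? 0 : read / total;
}
```

A session that stays at a ratio of 0 after the first turn matches the symptom reported in this issue. After a working fix, the first turn should show cache creation tokens and later turns should show cache reads.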

Extra Tips

Ensure the cache_control wrapper is correctly applied to system prompt messages for both OpenRouter and direct Anthropic provider paths.
Consider adding logging or monitoring to track cache hits and misses for further optimization.
