openclaw [Bug]: Prompt cache busted cross-run for task-scoped adapters: per-run chat_id in inbound_meta placed above cache breakpoint (solved; 1 pull request, 1 participant)

GitHub stats
openclaw/openclaw#65056 · Fetched 2026-04-12 13:25:46
Comments: 0 · Participants: 1 · Timeline: 3 · Reactions: 0
Timeline (top): labeled ×2, cross-referenced ×1

For adapters that use per-task or per-issue session scoping (e.g. the openclaw_gateway Paperclip adapter with sessionKeyStrategy: "issue"), every run cold-starts with cacheRead=0 and pays a full cacheWrite of 18–25K tokens. This makes cross-run prompt caching impossible regardless of cacheRetention setting.

Root Cause

The chat_id field in the ## Inbound Context (trusted metadata) block maps to ctx.OriginatingTo, which for task-scoped adapters resolves to the per-task session key (e.g. paperclip:issue:<TASK_UUID>). Because the UUID is unique per task, every system prompt is unique and the Anthropic prompt cache never hits. The entire system array has only one cache_control: { type: "ephemeral" } breakpoint, so a volatile value anywhere before it invalidates the whole cached prefix.
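To make the mechanism concrete, the cached prefix can be modeled as all the text that precedes the breakpoint. The sketch below is illustrative only: `cachePrefix`, the `InboundMeta` shape, the section strings, and the `task-a` / `task-b` IDs are hypothetical stand-ins rather than OpenClaw code, and it assumes a cache hit requires the prefix to match exactly.

```typescript
// Hypothetical model: a cache hit needs the text before the breakpoint
// to be identical between two requests.
interface InboundMeta {
  schema: string;
  chat_id: string;
  channel: string;
}

// Joins the system sections in the order described in the issue. The single
// breakpoint sits after the last section, so all of this text is in the prefix.
function cachePrefix(meta: InboundMeta): string {
  return [
    "[framework sections, stable across runs]",
    "## Inbound Context (trusted metadata)\n" + JSON.stringify(meta),
    "## Project Context",
    "## Runtime",
  ].join("\n\n");
}

function metaFor(chatId: string): InboundMeta {
  return { schema: "openclaw.inbound_meta.v1", chat_id: chatId, channel: "paperclip" };
}

// Per-task session keys: the prefixes differ, so no cross-run reuse.
const issueRunA = cachePrefix(metaFor("paperclip:issue:task-a"));
const issueRunB = cachePrefix(metaFor("paperclip:issue:task-b"));

// Fixed session key: the prefixes are identical, so the cache can be reused.
const fixedRunA = cachePrefix(metaFor("paperclip"));
const fixedRunB = cachePrefix(metaFor("paperclip"));

console.log("per-task prefixes match:", issueRunA === issueRunB);
console.log("fixed-key prefixes match:", fixedRunA === fixedRunB);
```

Only the chat_id differs between the two per-task runs, yet that is enough to change the prefix, which matches the cacheRead=0 pattern reported in the issue.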

Fix Action

Fix / Workaround

Workaround confirmed: switching to sessionKeyStrategy: "fixed" with sessionKey: "paperclip" stabilizes chat_id and enables cross-run caching. Tradeoff: all runs share one session and accumulate conversation history across tasks.

PR fix notes

PR #65071: fix(prompt): keep inbound chat ids out of system prefix

Description (problem / solution / changelog)

What changed

  • removed volatile chat_id from the trusted inbound system metadata so task-scoped adapters do not poison the cache-sensitive system prefix
  • moved chat_id into the user-role conversation info block so the model still sees the routing context per turn
  • fixed the empty-body fast-path so prefix-only metadata does not trigger agent runs, while media-only turns still preserve inbound context plus the existing placeholder
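The first two changes above amount to splitting inbound metadata into a stable part for the system prompt and a per-turn part for the user role. A minimal sketch of that idea follows, assuming hypothetical names (`InboundContext`, `splitInboundMeta`, `systemMeta`, `userInfo`); the actual code in src/auto-reply/reply/inbound-meta.ts may be organized differently.

```typescript
// Hypothetical input shape; field names are illustrative.
interface InboundContext {
  chatId: string;
  accountId: string;
  channel: string;
  chatType: string;
}

interface SplitMeta {
  systemMeta: Record<string, string>; // goes above the cache breakpoint
  userInfo: Record<string, string>;   // goes into the user-role conversation info
}

// Keeps stable fields in the system metadata and moves the volatile
// chat_id into the per-turn user context.
function splitInboundMeta(ctx: InboundContext): SplitMeta {
  return {
    systemMeta: {
      schema: "openclaw.inbound_meta.v1",
      account_id: ctx.accountId,
      channel: ctx.channel,
      chat_type: ctx.chatType,
    },
    userInfo: { chat_id: ctx.chatId },
  };
}

const base = { accountId: "default", channel: "paperclip", chatType: "direct" };
const splitA = splitInboundMeta({ ...base, chatId: "paperclip:issue:task-a" });
const splitB = splitInboundMeta({ ...base, chatId: "paperclip:issue:task-b" });

// The system-side metadata is now byte-identical across tasks.
console.log(JSON.stringify(splitA.systemMeta) === JSON.stringify(splitB.systemMeta));
```

The model still receives the routing context on each turn through `userInfo`, but it no longer affects the cache-sensitive prefix.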

Why

Task-scoped adapters such as Paperclip were placing a per-run session key in chat_id inside ## Inbound Context (trusted metadata). Because that block lives above the prompt-cache breakpoint, every task changed the system prompt bytes and prevented cross-run cache reuse.

Impact

  • restores cache stability for task-scoped adapter runs addressed by #65056
  • keeps direct external-channel chat_id available to the model in per-turn context
  • avoids a regression where metadata-only turns could bypass the empty-message reply path

Validation

  • pnpm test src/auto-reply/reply/inbound-meta.test.ts
  • pnpm test src/agents/prompt-composition.test.ts
  • pnpm test src/auto-reply/reply/get-reply-run.media-only.test.ts

Notes

  • repo-wide pnpm check is currently failing on unrelated pre-existing type issues in extensions/msteams, src/agents/tools-effective-inventory.ts, src/channels/plugins/registry-loaded.ts, src/gateway/server-channels.test.ts, and src/plugin-sdk/channel-runtime-context.ts

Linked Issue/PR

Closes #65056

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/auto-reply/reply/get-reply-run.media-only.test.ts (modified, +36/-0)
  • src/auto-reply/reply/get-reply-run.ts (modified, +7/-6)
  • src/auto-reply/reply/inbound-meta.test.ts (modified, +26/-2)
  • src/auto-reply/reply/inbound-meta.ts (modified, +6/-4)
  • test/helpers/plugins/tts-contract-suites.ts (modified, +22/-21)


Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

For adapters that use per-task or per-issue session scoping (e.g. the openclaw_gateway Paperclip adapter with sessionKeyStrategy: "issue"), every run cold-starts with cacheRead=0 and pays a full cacheWrite of 18–25K tokens. This makes cross-run prompt caching impossible regardless of cacheRetention setting.

Steps to reproduce

  1. Configure OpenClaw with openclaw_gateway Paperclip adapter using sessionKeyStrategy: "issue".
  2. Run the agent on multiple separate Paperclip tasks (each gets a unique task UUID).
  3. Observe session JSONL data: every run shows cacheRead=0 and a full cacheWrite of 18–25K tokens, regardless of cacheRetention setting.
  4. Compare: switch to sessionKeyStrategy: "fixed" with a shared sessionKey: "paperclip" — cross-run caching works correctly (cacheRead dominates from T2 onward).
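The difference between the two strategies in steps 1 and 4 can be sketched as follows. This is a simplified illustration, not the adapter's implementation: `resolveSessionKey` and the config type are hypothetical, and only the `sessionKeyStrategy` / `sessionKey` option names and the `paperclip:issue:<TASK_UUID>` key format come from the issue.

```typescript
// Hypothetical config shape using the two option names from the issue.
type AdapterSessionConfig =
  | { sessionKeyStrategy: "issue" }
  | { sessionKeyStrategy: "fixed"; sessionKey: string };

// "issue": one session key per task, which ends up in chat_id and varies per run.
// "fixed": one shared key, so chat_id stays stable across runs.
function resolveSessionKey(config: AdapterSessionConfig, taskId: string): string {
  if (config.sessionKeyStrategy === "fixed") {
    return config.sessionKey;
  }
  return `paperclip:issue:${taskId}`;
}

const perIssue: AdapterSessionConfig = { sessionKeyStrategy: "issue" };
const shared: AdapterSessionConfig = { sessionKeyStrategy: "fixed", sessionKey: "paperclip" };

console.log(resolveSessionKey(perIssue, "task-a")); // paperclip:issue:task-a
console.log(resolveSessionKey(perIssue, "task-b")); // paperclip:issue:task-b
console.log(resolveSessionKey(shared, "task-a"));   // paperclip
console.log(resolveSessionKey(shared, "task-b"));   // paperclip
```

With the fixed strategy every task maps to the same key, which is why caching works but conversation history is shared across tasks.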

Expected behavior

Cross-run prompt caching should work for task-scoped adapters. With cacheRetention: "long", T2+ runs within the TTL window should show cacheRead dominating and near-zero input token costs, as observed with stable-chat_id adapters like Telegram.

Actual behavior

Every run cold-starts with cacheRead=0 and pays a full cacheWrite of 18–25K tokens. Cross-run prompt caching never engages regardless of cacheRetention setting. Session JSONL data from 8 consecutive Paperclip runs confirms this pattern:

| Run (session key) | T1 cacheRead | T1 cacheWrite |
|---|---|---|
| paperclip:issue:c585d0cc | 0 | 20,192 |
| paperclip:issue:ca527062 | 0 | 20,133 |
| paperclip:issue:3f22f673 | 0 | 19,584 |
| paperclip:issue:d7d6ed57 | 0 | 22,273 |
| paperclip:issue:5c1165ee | 0 | 17,823 |
| paperclip:issue:35edf3ee | 0 | 24,840 |

Root cause: the chat_id field in the ## Inbound Context (trusted metadata) block maps to ctx.OriginatingTo, which for task-scoped adapters resolves to the per-task session key (e.g. paperclip:issue:<TASK_UUID>). This UUID is unique per task, making every system prompt unique and busting the Anthropic prompt cache unconditionally. There is only one cache_control: { type: "ephemeral" } breakpoint on the entire system array, so any volatile value anywhere invalidates the entire cache.

For comparison, Telegram (stable chat_id per peer) shows cacheRead dominating from T2 onward — caching works correctly there.

OpenClaw version

2026.3.23-2

Operating system

macOS 26.3

Install method

No response

Model

anthropic/claude-sonnet (direct Anthropic API, api_key auth profile)

Provider / routing chain

openclaw -> direct Anthropic API (api_key auth profile)

Additional provider/model setup details

Direct Anthropic API with api_key auth profile. openclaw_gateway adapter (Paperclip), sessionKeyStrategy: "issue". System context ~18–22K tokens (standard workspace bootstrap files + tool list + skill list). cacheRetention: "long" (also reproduced with auto-seeded "short").

Logs, screenshots, and evidence

Session JSONL data from 8 consecutive Paperclip runs (same model, same API key, all on OpenClaw 2026.3.23-2):

| Run (session key) | T1 cacheRead | T1 cacheWrite |
|---|---|---|
| paperclip:issue:c585d0cc | 0 | 20,192 |
| paperclip:issue:ca527062 | 0 | 20,133 |
| paperclip:issue:3f22f673 | 0 | 19,584 |
| paperclip:issue:d7d6ed57 | 0 | 22,273 |
| paperclip:issue:5c1165ee | 0 | 17,823 |
| paperclip:issue:35edf3ee | 0 | 24,840 |

The problematic system prompt structure:


[OpenClaw framework sections — stable across runs]

Group Chat Context                        ← extraSystemPrompt injection point

Inbound Context (trusted metadata)

{
  "schema": "openclaw.inbound_meta.v1",
  "chat_id": "paperclip:issue:c585d0cc-...",  ← VOLATILE: unique per Paperclip task
  "account_id": "default",
  "channel": "paperclip",
  "provider": "paperclip",
  "surface": "paperclip",
  "chat_type": "direct"
}

Project Context                            ← AGENTS.md, SOUL.md, MEMORY.md, etc.

Silent Replies
Heartbeats
Runtime                                   ← single cache_control: {type: "ephemeral"} here


Workaround confirmed: switching to `sessionKeyStrategy: "fixed"` with `sessionKey: "paperclip"` stabilizes `chat_id` and enables cross-run caching. Tradeoff: all runs share one session and accumulate conversation history across tasks.

Impact and severity

Affected: All users running openclaw_gateway adapters with sessionKeyStrategy: "issue" (or any per-task session scoping) on the direct Anthropic API.

Severity: High. Cross-run prompt caching is completely non-functional; every task pays full cold-start token costs.

Frequency: Always (100% reproducible with a per-task sessionKeyStrategy).

Consequence: Meaningful extra cost, roughly ~20K cold-start tokens × Anthropic Sonnet input pricing × multiple Paperclip tasks per day during active sprint work. With cross-run caching working, T2+ runs within the TTL window pay ~0.1x the cold-start cost. This makes it a high-leverage fix for any user running OpenClaw as a backend task agent rather than an interactive chat assistant.
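As a rough back-of-the-envelope check of that cost claim, the six cacheWrite values listed in the table can be compared against a scenario where only the first run is cold. This is a simplified sketch: it takes the ~0.1x factor from the issue text, assumes every later run would have been fully served from cache within the TTL window, and ignores any cache-write premium and per-run differences in prompt content.

```typescript
// T1 cacheWrite token counts from the six runs listed in the issue's table.
const cacheWriteTokens: number[] = [20192, 20133, 19584, 22273, 17823, 24840];

// Assumption taken from the issue text: a cached run costs ~0.1x a cold start.
const CACHED_COST_FACTOR = 0.1;

// Observed behavior: every run is a cold start (total is 124845 token units).
const coldTotal = cacheWriteTokens.reduce((sum, tokens) => sum + tokens, 0);

// Idealized behavior: only the first run is cold, later runs hit the cache.
const firstRun = cacheWriteTokens[0] ?? 0;
const laterRuns = cacheWriteTokens.slice(1);
const warmTotal =
  firstRun + laterRuns.reduce((sum, tokens) => sum + tokens * CACHED_COST_FACTOR, 0);

console.log("cold-start token units:", coldTotal);
console.log("idealized cached token units:", warmTotal.toFixed(1));
console.log("ratio:", (warmTotal / coldTotal).toFixed(2));
```

Under these assumptions the six runs would cost roughly a quarter of what they cost with every run cold, and the gap widens as more tasks fall inside one TTL window.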

Additional information

Recommended Fix

One or more of the following, in rough order of preference:

  1. Stabilize chat_id for backend/adapter channels. For channel == "paperclip" (or any non-chat, non-messaging channel), chat_id has no user-facing display purpose. Replace it with a stable identifier — the agent ID, the channel name, or omit it entirely. This is a minimal, targeted fix.

  2. Move per-run identifiers to the user turn rather than the system prompt. The wake message already carries full task context (PAPERCLIP_TASK_ID, issue_id, etc.); the system prompt doesn't need chat_id for task routing.

  3. Add a second cache_control breakpoint (the #49700 static/volatile split) so the stable prefix (tools, skills, workspace files) is cached independently of the volatile suffix. This is the highest-leverage fix and also addresses MEMORY.md churn between sessions.

  4. Config knob to suppress chat_id from inbound_meta per adapter or per channel type.
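Option 3 can be sketched in terms of the Anthropic Messages API `system` array, where each text block may carry its own `cache_control` marker. The block shape below follows that API; `buildSystemBlocks` and the section strings are hypothetical, and the sketch assumes the stable sections can be ordered ahead of the volatile ones.

```typescript
// Shape of a text block in the Anthropic Messages API `system` array.
interface SystemTextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

// Hypothetical builder: the first breakpoint closes the stable prefix, the
// second closes the volatile suffix, so the prefix is cached independently.
function buildSystemBlocks(
  stableSections: string[],
  volatileSections: string[],
): [SystemTextBlock, SystemTextBlock] {
  return [
    { type: "text", text: stableSections.join("\n\n"), cache_control: { type: "ephemeral" } },
    { type: "text", text: volatileSections.join("\n\n"), cache_control: { type: "ephemeral" } },
  ];
}

const stableSections = ["[framework sections, tools, skills]", "## Project Context"];
const blocksA = buildSystemBlocks(stableSections, ['{"chat_id":"paperclip:issue:task-a"}']);
const blocksB = buildSystemBlocks(stableSections, ['{"chat_id":"paperclip:issue:task-b"}']);

// The first block is identical across tasks, so its cache entry can be reused
// even though the second block changes on every run.
console.log(blocksA[0].text === blocksB[0].text);
console.log(blocksA[1].text === blocksB[1].text);
```

The design trade-off is that section order now matters: anything placed in the first block must be genuinely stable, or the same cache busting reappears one level up.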

Related Issues

  • #49700 — static/volatile system prompt split (would address this more cleanly at the architectural level)
  • #19279, #37325 — resolveCacheRetention() provider string validation
  • #48753 — bootstrap truncation warnings moved out of system prompt (same class of fix: volatile content removed from system prompt)
  • #21785 — dynamic inbound flags moved from system metadata to user-context (same class of fix)

extent analysis

TL;DR

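Keeping the per-task chat_id out of the cached system prefix restores cross-run prompt caching for task-scoped adapters. PR #65071 does this by removing chat_id from the trusted inbound system metadata and moving it into the user-role conversation info block.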

Guidance

  1. Replace chat_id with a stable identifier: For non-chat channels like "paperclip", consider replacing chat_id with a stable identifier such as the agent ID or channel name to prevent cache busting.
  2. Move per-run identifiers to the user turn: Instead of including chat_id in the system prompt, move per-run identifiers to the user turn where they can be processed without affecting the cache.
  3. Add a second cache control breakpoint: Implementing a second cache_control breakpoint can help cache the stable prefix of the system prompt independently of the volatile suffix, addressing the caching issue.
  4. Configure adapters to suppress chat_id: Consider adding a config knob to suppress chat_id from inbound_meta per adapter or channel type to mitigate the caching issue.
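Items 1 and 4 above both come down to deciding, per channel, whether the raw chat_id should reach the system prompt. A minimal sketch of that decision, assuming a hypothetical `BACKEND_CHANNELS` set and `systemChatId` helper (neither exists in the source), could look like this:

```typescript
// Hypothetical list of channels with no user-facing chat, where chat_id is
// only a routing key. In practice this might come from per-adapter config.
const BACKEND_CHANNELS = new Set<string>(["paperclip"]);

// For backend channels, substitute the stable channel name for the per-task
// chat_id; for chat channels, pass the real chat_id through unchanged.
function systemChatId(channel: string, chatId: string): string {
  return BACKEND_CHANNELS.has(channel) ? channel : chatId;
}

console.log(systemChatId("paperclip", "paperclip:issue:task-a")); // paperclip
console.log(systemChatId("paperclip", "paperclip:issue:task-b")); // paperclip
console.log(systemChatId("telegram", "12345"));                   // 12345
```

This keeps behavior unchanged for messaging channels such as Telegram, whose chat_id is already stable per peer.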

Notes

The provided guidance is based on the information given in the issue and may require further modifications or testing to fully resolve the caching issue. The recommended fixes are in rough order of preference, with stabilizing chat_id being the most minimal and targeted solution.

Recommendation

Until a release containing PR #65071 is available, apply the confirmed workaround: set sessionKeyStrategy: "fixed" with sessionKey: "paperclip" to stabilize chat_id. This needs no architectural changes, but all runs then share one session and accumulate conversation history across tasks.
