openclaw - ✅(Solved) Fix feat: prompt cache keep-warm pings to prevent TTL expiry between turns [2 pull requests, 1 participant]

GitHub stats
openclaw/openclaw#62475 · Fetched 2026-04-08 03:03:46
Comments: 0
Participants: 1
Timeline: 7
Reactions: 0
Timeline (top): referenced ×5, cross-referenced ×2

Fix Action

Fixed

PR fix notes

PR #62179: feat: expose prompt-cache runtime context to context engines

Description (problem / solution / changelog)

Summary

  • Problem: context engines only receive messages, token budget, and runtime context, so they cannot inspect OpenClaw prompt-cache telemetry when deciding post-turn maintenance or compaction.
  • Why it matters: engines like lossless-claw need resolved cache retention, normalized last-call usage, and cache-break observations to make cache-aware decisions without duplicating embedded-runner logic.
  • What changed: added a typed runtimeContext.promptCache payload, populated it in the embedded runner for afterTurn(), and forwarded it into compact() on the existing recovery paths that already have the attempt result.
  • What did NOT change (scope boundary): no compaction policy changed, no lossless-claw behavior changed, and no expiry timestamps are invented for providers where OpenClaw cannot know them confidently.
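
As a rough illustration of the payload described above, the sketch below shows one way a typed, optional prompt-cache field could be shaped and populated. All names here (`PromptCacheInfo`, `lastCallUsage`, `cacheBreak`, `buildPromptCacheInfo`) are hypothetical and chosen for illustration; the actual type added in src/context-engine/types.ts may differ.

```typescript
// Hypothetical shape of a prompt-cache payload; not the actual OpenClaw type.
type CacheRetention = "short" | "long";

interface PromptCacheUsage {
  input: number;
  cacheRead: number;
  cacheWrite: number;
}

interface PromptCacheInfo {
  retention?: CacheRetention;        // resolved effective retention
  lastCallUsage?: PromptCacheUsage;  // normalized usage of the last call
  cacheBreak?: { previousCacheRead: number; currentCacheRead: number };
  expiresAt?: number;                // epoch ms; omitted when not known confidently
}

interface RuntimeContext {
  promptCache?: PromptCacheInfo;
}

// Builds the payload only from data that is actually available, so the
// field stays absent (or partial) instead of carrying guessed values.
function buildPromptCacheInfo(
  retention: CacheRetention | undefined,
  usage: PromptCacheUsage | undefined,
): PromptCacheInfo | undefined {
  if (retention === undefined && usage === undefined) return undefined;
  const info: PromptCacheInfo = {};
  if (retention !== undefined) info.retention = retention;
  if (usage !== undefined) info.lastCallUsage = usage;
  return info;
}
```

An engine that ignores `promptCache` is unaffected, which matches the additive scope stated in the summary.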

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: N/A
  • Missing detection / guardrail: N/A
  • Contributing context (if known): N/A

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts, src/agents/pi-embedded-runner/run.timeout-triggered-compaction.test.ts, and the existing buildAfterTurnRuntimeContext tests in src/agents/pi-embedded-runner/run/attempt.test.ts.
  • Scenario the test should lock in: afterTurn() receives prompt-cache info when available, the field stays absent/partial when unavailable, retention and cache-break observations are threaded through correctly, and compact() receives the same payload on the current recovery paths.
  • Why this is the smallest reliable guardrail: the behavior is owned by the embedded-runner/context-engine seam, so extending the existing attempt helper and recovery-path harnesses exercises the actual plumbing without adding broad integration coverage.
  • Existing test that already covers this (if any): existing buildAfterTurnRuntimeContext coverage remains in place and stays compatible with callers that ignore the new field.
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

None.

Diagram (if applicable)

Before:
[embedded runner finishes turn] -> [afterTurn(runtimeContext without prompt-cache data)]

After:
[embedded runner finishes turn]
  -> [resolve effective retention + normalize last-call usage + finalize cache observation]
  -> [afterTurn(runtimeContext.promptCache)]
  -> [retry/timeout compaction path reuses same promptCache payload]

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: local Node/pnpm repo workflow
  • Model/provider: N/A
  • Integration/channel (if any): embedded runner / context-engine seam
  • Relevant config (redacted): none required beyond test defaults

Steps

  1. Run pnpm test src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts src/agents/pi-embedded-runner/run.timeout-triggered-compaction.test.ts.
  2. Run pnpm test src/agents/pi-embedded-runner/run/attempt.test.ts -t "buildAfterTurnRuntimeContext".
  3. Confirm the prompt-cache payload appears only where expected and existing callers still pass.

Expected

  • Targeted tests pass.
  • runtimeContext.promptCache is available to context engines when the embedded runner has cache telemetry.
  • Existing behavior is unchanged for engines that ignore the new field.

Actual

  • Matched expected results locally.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: ran the targeted attempt/context-engine and timeout-compaction tests, plus the focused buildAfterTurnRuntimeContext coverage, against the final committed tree.
  • Edge cases checked: missing cache data leaves promptCache absent, effective retention is the resolved value, and cache-break observations carry through when the heuristic reports a meaningful drop.
  • What you did not verify: no broader end-to-end model-provider run, and no lossless-claw compaction policy changes in this PR.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps:

Risks and Mitigations

  • Risk: compact() only receives prompt-cache info on recovery paths that already have an attempt result.
    • Mitigation: this keeps the change additive and aligned with the current call graph; broader compaction-context sourcing can be added later if an engine needs it.
  • Risk: expiresAt is intentionally absent for Anthropic/OpenAI because OpenClaw cannot know it confidently.
    • Mitigation: the new type makes the field optional so engines can branch on real availability instead of guessed data.
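
To make the second mitigation concrete, here is a minimal sketch of how an engine might branch on whether an expiry timestamp is really available. The type and function names are assumptions for illustration, not part of the PR.

```typescript
// Hypothetical subset of the prompt-cache payload.
interface PromptCacheInfo {
  retention?: "short" | "long";
  expiresAt?: number; // epoch ms; absent when the provider does not expose it
}

type CacheStatus = "warm" | "expired" | "unknown";

// Reports "unknown" rather than guessing when no expiry is available.
function cacheStatus(info: PromptCacheInfo | undefined, now: number): CacheStatus {
  if (info?.expiresAt === undefined) return "unknown";
  return now < info.expiresAt ? "warm" : "expired";
}
```

For Anthropic/OpenAI, where the PR leaves `expiresAt` absent, such a check would return "unknown", and the engine can fall back to whatever policy it already uses.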

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/agents/pi-embedded-runner/cache-ttl.test.ts (modified, +56/-1)
  • src/agents/pi-embedded-runner/cache-ttl.ts (modified, +34/-1)
  • src/agents/pi-embedded-runner/extensions.ts (modified, +4/-1)
  • src/agents/pi-embedded-runner/run.overflow-compaction.test.ts (modified, +52/-0)
  • src/agents/pi-embedded-runner/run.timeout-triggered-compaction.test.ts (modified, +25/-0)
  • src/agents/pi-embedded-runner/run.ts (modified, +2/-0)
  • src/agents/pi-embedded-runner/run/attempt.prompt-helpers.ts (modified, +32/-25)
  • src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts (modified, +180/-0)
  • src/agents/pi-embedded-runner/run/attempt.spawn-workspace.test-support.ts (modified, +41/-1)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +115/-18)
  • src/agents/pi-embedded-runner/run/types.ts (modified, +2/-1)
  • src/context-engine/types.ts (modified, +45/-0)

PR #62501: feat: add session activity timestamps for cache-aware maintenance

Description (problem / solution / changelog)

Summary

Add three optional timestamp fields to SessionEntry that track when the last user message arrived, when the last assistant reply was persisted, and when the last prompt-cache touch occurred.

These timestamps are the foundational infrastructure for time-based session maintenance features such as cache keep-warm (#62475).

Changes

src/config/sessions/types.ts

Added three optional fields to SessionEntry:

  • lastUserMessageAt?: number — epoch ms of last user message received
  • lastAssistantMessageAt?: number — epoch ms of last assistant reply completed
  • lastCacheTouchAt?: number — epoch ms of last confirmed prompt-cache touch (write or read > 0)

src/agents/agent-command.ts

Sets lastUserMessageAt = Date.now() when a user message is received (in the session-persist block before the agent run).

src/auto-reply/reply/session-usage.ts

Sets lastAssistantMessageAt = Date.now() on every turn completion. Sets lastCacheTouchAt = Date.now() when cacheRead > 0 || cacheWrite > 0.

src/agents/command/session-store.ts

Same logic for the pi-embedded agent path (updateSessionStoreAfterAgentRun): sets lastAssistantMessageAt on every turn with usage, sets lastCacheTouchAt when cache counters are non-zero.
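
The update rules described above can be sketched as a small pure function. The three timestamp field names come from the PR description; `TurnUsage` and `applyTurnTimestamps` are hypothetical names, and the real code mutates or merges session entries in its own way.

```typescript
interface SessionEntry {
  updatedAt?: number;
  lastUserMessageAt?: number;      // epoch ms of last user message received
  lastAssistantMessageAt?: number; // epoch ms of last assistant reply completed
  lastCacheTouchAt?: number;       // epoch ms of last confirmed cache touch
}

// Hypothetical usage shape for one completed turn.
interface TurnUsage {
  cacheRead?: number;
  cacheWrite?: number;
}

// Always stamps the assistant reply; stamps the cache touch only when
// the turn reported a cache read or write greater than zero.
function applyTurnTimestamps(
  entry: SessionEntry,
  usage: TurnUsage,
  now: number = Date.now(),
): SessionEntry {
  const touched = (usage.cacheRead ?? 0) > 0 || (usage.cacheWrite ?? 0) > 0;
  return {
    ...entry,
    lastAssistantMessageAt: now,
    ...(touched ? { lastCacheTouchAt: now } : {}),
  };
}
```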

src/config/sessions/session-activity-timestamps.test.ts

8 tests covering all three fields: type presence, optional defaults, merge behavior, and the cache-touch conditional logic.

Constraints

  • ✅ Purely additive — no existing behavior changed
  • ✅ All fields optional — no breaking changes
  • ✅ Uses Date.now() consistently with existing updatedAt pattern
  • ✅ All 8 new tests pass

References

  • Closes: #62475 (cache keep-warm foundation)
  • Inspired by: #56575 (JeremyGuo)

Changed files

  • src/agents/agent-command.ts (modified, +1/-0)
  • src/agents/command/session-store.ts (modified, +4/-0)
  • src/auto-reply/reply/session-usage.ts (modified, +4/-0)
  • src/config/sessions/session-activity-timestamps.test.ts (added, +119/-0)
  • src/config/sessions/types.ts (modified, +6/-0)

Code Example

// openclaw.json
{
  "agents": {
    "defaults": {
      "promptCacheKeepWarm": {
        "enabled": false,           // opt-in
        "maxCostPerHourUSD": 0.10,  // hard cap, auto-disables
        "durationMs": 3600000,      // stop after 1h of inactivity
        "providers": ["anthropic"]
      }
    }
  }
}

Problem

Anthropic prompt cache expires after ~5 min (cacheRetention: "short") or ~1 hour (cacheRetention: "long"). Between user messages, when no API call occurs, the cache goes cold. The next turn re-caches the entire system prompt prefix + conversation history at write pricing ($3.75/MTok) instead of read pricing ($0.375/MTok) — a 10× cost multiplier on every cache miss.

For sessions with moderate inter-message gaps (5–60 min), this is the dominant cost driver. A typical 84k system prompt + 80k conversation history = ~164k cached tokens. Each cache miss costs $0.62 vs a cache hit at $0.062.
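
The arithmetic behind those figures can be reproduced with a few lines. The per-MTok prices are the ones quoted in this issue, not an authoritative price list, and `costUSD` is a helper name introduced here.

```typescript
// Prices as quoted in this issue (USD per million tokens).
const WRITE_USD_PER_MTOK = 3.75;
const READ_USD_PER_MTOK = 0.375;

function costUSD(tokens: number, usdPerMTok: number): number {
  return (tokens / 1_000_000) * usdPerMTok;
}

// 84k system prompt + 80k conversation history = 164k cached tokens.
const cachedTokens = 84_000 + 80_000;

const missCost = costUSD(cachedTokens, WRITE_USD_PER_MTOK); // about 0.615
const hitCost = costUSD(cachedTokens, READ_USD_PER_MTOK);   // about 0.0615
const savedPerAvoidedMiss = missCost - hitCost;             // about 0.55
```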

Prior Art

This resurrects the core idea from #24711 (closed/stale) and addresses part of #57569 (cache pre-warming after reconnect).

Proposed Solution

A lightweight, invisible, non-turn mechanism that fires minimal API pings (max_tokens: 1) to keep the prompt cache warm between user interactions.

Key Design Decisions

  • Adaptive interval: Computed automatically as TTL × 0.8 (4 min for short, ~48 min for long). Not user-configurable, which eliminates a class of misconfiguration.
  • Not a heartbeat: No agent turn, no tool calls, no compaction, no session history entry.
  • Compaction-aware: Dirty flag cancels pending pings when system prompt changes (prevents warming stale cache keys).
  • Async verification: Check cache_read_input_tokens in ping response to confirm cache hit.
  • Hard cost cap: Default $0.10/hr, auto-disables if exceeded.
  • Same payload construction path: Ping uses identical system blocks + tools as real turns via resolveAnthropicPayloadPolicy() + applyAnthropicCacheControlToSystem() to match the cache key exactly.
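
Two of the decisions above, the adaptive interval and the cache-hit verification, are simple enough to sketch. The TTL values are the approximate ones stated in this issue; `keepWarmIntervalMs` and `pingHitCache` are hypothetical helper names, while `cache_read_input_tokens` is the usage field named in the proposal.

```typescript
// Approximate TTLs from this issue: ~5 min (short), ~1 hour (long).
const TTL_MS = { short: 5 * 60_000, long: 60 * 60_000 } as const;
type CacheRetention = keyof typeof TTL_MS;

// Interval is derived from the TTL, never configured by the user.
function keepWarmIntervalMs(retention: CacheRetention): number {
  return Math.round(TTL_MS[retention] * 0.8);
}

// A ping counts as a confirmed cache hit only if it read cached tokens.
function pingHitCache(usage: { cache_read_input_tokens?: number | null }): boolean {
  return (usage.cache_read_input_tokens ?? 0) > 0;
}
```

With these inputs the interval comes out to 4 minutes for "short" and 48 minutes for "long", matching the figures in the list.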

Config

// openclaw.json
{
  "agents": {
    "defaults": {
      "promptCacheKeepWarm": {
        "enabled": false,           // opt-in
        "maxCostPerHourUSD": 0.10,  // hard cap, auto-disables
        "durationMs": 3600000,      // stop after 1h of inactivity
        "providers": ["anthropic"]
      }
    }
  }
}
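
One possible way to enforce `maxCostPerHourUSD` is a rolling one-hour spend window, sketched below. The class name and its methods are assumptions for illustration; the proposal does not specify how the cap is tracked.

```typescript
// Hypothetical rolling-window tracker for the hourly cost cap.
class HourlyCostCap {
  private spend: Array<{ at: number; usd: number }> = [];
  private readonly maxUsdPerHour: number;

  constructor(maxUsdPerHour: number) {
    this.maxUsdPerHour = maxUsdPerHour;
  }

  // Record the estimated cost of one ping at time `now` (epoch ms).
  record(usd: number, now: number): void {
    this.spend.push({ at: now, usd });
  }

  // True when spend within the last hour is above the cap.
  exceeded(now: number): boolean {
    const cutoff = now - 3_600_000;
    this.spend = this.spend.filter((s) => s.at > cutoff);
    const total = this.spend.reduce((sum, s) => sum + s.usd, 0);
    return total > this.maxUsdPerHour;
  }
}
```

A keep-warm loop would call `exceeded()` before each ping and disable itself when it returns true.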

Lifecycle

  1. After assistant reply → capture PromptCacheSnapshot, start interval timer
  2. On timer fire → check dirty flag, check cost cap, build minimal payload, fire ping
  3. On user message → cancel timer (real turn takes over)
  4. On compaction/model-switch → set dirty flag (next tick cancels)
  5. On session close / SIGTERM → cancel all timers
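
The lifecycle above can be sketched as a small timer wrapper. Everything here (`KeepWarmTimer`, its method names, the injected `overCostCap` and `ping` callbacks) is hypothetical; snapshot capture, payload construction, and the `durationMs` inactivity limit are left out of this sketch.

```typescript
type TickResult = "ping" | "cancelled-dirty" | "cancelled-cost-cap";

class KeepWarmTimer {
  private timer: ReturnType<typeof setTimeout> | undefined;
  private dirty = false;
  private readonly intervalMs: number;
  private readonly overCostCap: () => boolean;
  private readonly ping: () => void;

  constructor(intervalMs: number, overCostCap: () => boolean, ping: () => void) {
    this.intervalMs = intervalMs;
    this.overCostCap = overCostCap;
    this.ping = ping;
  }

  get active(): boolean {
    return this.timer !== undefined;
  }

  // Step 1: after an assistant reply, start a fresh interval.
  onAssistantReply(): void {
    this.cancel();
    this.dirty = false;
    this.schedule();
  }

  // Step 2: on timer fire, check dirty flag and cost cap, then ping.
  tick(): TickResult {
    this.cancel();
    if (this.dirty) return "cancelled-dirty";
    if (this.overCostCap()) return "cancelled-cost-cap";
    this.ping();
    this.schedule();
    return "ping";
  }

  // Step 3: a real user turn takes over.
  onUserMessage(): void {
    this.cancel();
  }

  // Step 4: compaction or model switch; the next tick cancels.
  markDirty(): void {
    this.dirty = true;
  }

  // Step 5: also called on session close or SIGTERM.
  cancel(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
  }

  private schedule(): void {
    this.timer = setTimeout(() => this.tick(), this.intervalMs);
  }
}
```

Keeping `tick()` callable directly makes the dirty-flag and cost-cap branches testable without waiting on real timers.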

Build-On Opportunity

PR #56575 already adds session timestamps (lastUserMessageAt, lastAssistantMessageAt, lastCacheTouchAt) and a gateway maintenance sweep timer. This feature could be implemented as an additional mode ("ping") alongside the existing "compact" and "reset" modes, reusing the timer infrastructure.

PR #62179 (exposes runtimeContext.promptCache to context engines) provides the cache-break detection needed for dirty flag awareness.

Cost Analysis

Config                              Ping cost      Cache miss cost  Savings per avoided miss
84k tokens, short TTL               ~$0.032/ping   $0.315/miss      $0.28
164k tokens (long conv), long TTL   ~$0.062/ping   $0.615/miss      $0.55

With cacheRetention: "long", keep-warm fires ~1 ping per gap period. Break-even at 1 avoided cache miss per session.

Provider Abstraction

  • Anthropic: Full message, max_tokens: 1 (no lighter endpoint exists)
  • OpenAI: No-op (automatic caching, nothing to warm)
  • Google: Future — cache TTL extension API call (different mechanism)
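
A minimal sketch of how this per-provider split might be expressed in code. The `KeepWarmStrategy` union and `keepWarmStrategy` function are illustrative names only; nothing here calls a real provider API.

```typescript
type Provider = "anthropic" | "openai" | "google";

// Hypothetical strategy descriptor per provider.
type KeepWarmStrategy =
  | { kind: "full-message-ping"; maxTokens: 1 }
  | { kind: "noop"; reason: string }
  | { kind: "unsupported"; reason: string };

function keepWarmStrategy(provider: Provider): KeepWarmStrategy {
  switch (provider) {
    case "anthropic":
      // No lighter endpoint exists, so send the full message with max_tokens: 1.
      return { kind: "full-message-ping", maxTokens: 1 };
    case "openai":
      return { kind: "noop", reason: "automatic caching, nothing to warm" };
    case "google":
      return { kind: "unsupported", reason: "future: cache TTL extension API call" };
  }
}
```

A discriminated union keeps the no-op and not-yet-supported cases explicit, so a caller cannot accidentally send pings to a provider where they would have no effect.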

Upstream Dream Ask

An Anthropic POST /cache/refresh endpoint that extends TTL without requiring a full message payload would reduce ping cost ~99%.

References

  • #24711 — original keepalive ping proposal (closed/stale)
  • #57569 — channel reconnect cache preservation
  • #56575 — time-based session maintenance (build-on candidate)
  • #62179 — prompt cache runtime context for context engines
  • #59054 — OPENCLAW_CACHE_BOUNDARY (merged, foundation)
  • #61422 — context files before boundary (cache stability)
  • #26750 — dynamic date in stable prefix (cache stability)

extent analysis

TL;DR

Implement a lightweight, invisible, non-turn mechanism that fires minimal API pings to keep the prompt cache warm between user interactions, reducing the 10× cost multiplier on every cache miss.

Guidance

  • Implement the proposed promptCacheKeepWarm feature with an adaptive interval computed as TTL × 0.8 to minimize cache misses and reduce costs.
  • Configure the promptCacheKeepWarm settings in openclaw.json to enable the feature, set the maximum cost per hour, and specify the duration and providers.
  • Use the resolveAnthropicPayloadPolicy() and applyAnthropicCacheControlToSystem() functions to construct the ping payload and match the cache key exactly.
  • Monitor the cache_read_input_tokens in the ping response to confirm cache hits and adjust the configuration as needed.

Example

// openclaw.json example configuration
{
  "agents": {
    "defaults": {
      "promptCacheKeepWarm": {
        "enabled": true,
        "maxCostPerHourUSD": 0.10,
        "durationMs": 3600000,
        "providers": ["anthropic"]
      }
    }
  }
}

Notes

The proposed solution relies on the resolveAnthropicPayloadPolicy() and applyAnthropicCacheControlToSystem() functions, which may require additional implementation or modification. The promptCacheKeepWarm feature should be thoroughly tested to ensure it works as expected and does not introduce any unintended consequences.

Recommendation

Apply the proposed promptCacheKeepWarm feature to reduce the cost multiplier on cache misses, as it provides a significant cost savings per avoided miss. This feature can be implemented as an additional mode alongside the existing modes, reusing the timer infrastructure.
