openclaw - ✅ (Solved) feat: prompt cache keep-warm pings to prevent TTL expiry between turns [2 pull requests, 1 participant]

100yenadmin · 2026-04-07T12:21:22Z

GitHub stats

openclaw/openclaw#62475 • Fetched 2026-04-08 03:03:46

Author: 100yenadmin
Participants: 100yenadmin
Timeline (top): referenced ×5, cross-referenced ×2

Fix Action

Fixed

Fixed by PR: feat: expose prompt-cache runtime context to context engines (https://github.com/openclaw/openclaw/pull/62179)
Fixed by PR: feat: add session activity timestamps for cache-aware maintenance (https://github.com/openclaw/openclaw/pull/62501)

PR fix notes

PR #62179: feat: expose prompt-cache runtime context to context engines

Repository: openclaw/openclaw
Author: jalehman
State: closed | merged: True
Link: https://github.com/openclaw/openclaw/pull/62179

Description (problem / solution / changelog)

Summary

Problem: context engines only receive messages, token budget, and runtime context, so they cannot inspect OpenClaw prompt-cache telemetry when deciding post-turn maintenance or compaction.
Why it matters: engines like lossless-claw need resolved cache retention, normalized last-call usage, and cache-break observations to make cache-aware decisions without duplicating embedded-runner logic.
What changed: added a typed runtimeContext.promptCache payload, populated it in the embedded runner for afterTurn(), and forwarded it into compact() on the existing recovery paths that already have the attempt result.
What did NOT change (scope boundary): no compaction policy changed, no lossless-claw behavior changed, and no expiry timestamps are invented for providers where OpenClaw cannot know them confidently.

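Change Type (select all)

- [ ] Bug fix
- [x] Feature
- [ ] Refactor required for the fix
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra

Scope (select all touched areas)

- [x] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [ ] Integrations
- [x] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra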

Linked Issue/PR

Closes #
Related #
- [ ] This PR fixes a bug or regression

Root Cause (if applicable)

Root cause: N/A
Missing detection / guardrail: N/A
Contributing context (if known): N/A

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- [x] Unit test
- [x] Seam / integration test
- [ ] End-to-end test
- [ ] Existing coverage already sufficient
Target test or file: src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts, src/agents/pi-embedded-runner/run.timeout-triggered-compaction.test.ts, and the existing buildAfterTurnRuntimeContext tests in src/agents/pi-embedded-runner/run/attempt.test.ts.
Scenario the test should lock in: afterTurn() receives prompt-cache info when available, the field stays absent/partial when unavailable, retention and cache-break observations are threaded through correctly, and compact() receives the same payload on the current recovery paths.
Why this is the smallest reliable guardrail: the behavior is owned by the embedded-runner/context-engine seam, so extending the existing attempt helper and recovery-path harnesses exercises the actual plumbing without adding broad integration coverage.
Existing test that already covers this (if any): existing buildAfterTurnRuntimeContext coverage remains in place and stays compatible with callers that ignore the new field.
If no new test is added, why not: N/A

User-visible / Behavior Changes

None.

Diagram (if applicable)

Before:
[embedded runner finishes turn] -> [afterTurn(runtimeContext without prompt-cache data)]

After:
[embedded runner finishes turn]
  -> [resolve effective retention + normalize last-call usage + finalize cache observation]
  -> [afterTurn(runtimeContext.promptCache)]
  -> [retry/timeout compaction path reuses same promptCache payload]

Security Impact (required)

New permissions/capabilities? (No)
Secrets/tokens handling changed? (No)
New/changed network calls? (No)
Command/tool execution surface changed? (No)
Data access scope changed? (No)
If any Yes, explain risk + mitigation:

Repro + Verification

Environment

OS: macOS
Runtime/container: local Node/pnpm repo workflow
Model/provider: N/A
Integration/channel (if any): embedded runner / context-engine seam
Relevant config (redacted): none required beyond test defaults

Steps

1. Run `pnpm test src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts src/agents/pi-embedded-runner/run.timeout-triggered-compaction.test.ts`.
2. Run `pnpm test src/agents/pi-embedded-runner/run/attempt.test.ts -t "buildAfterTurnRuntimeContext"`.
3. Confirm the prompt-cache payload appears only where expected and existing callers still pass.

Expected

Targeted tests pass.
runtimeContext.promptCache is available to context engines when the embedded runner has cache telemetry.
Existing behavior is unchanged for engines that ignore the new field.

Actual

Matched expected results locally.

Evidence

- [x] Failing test/log before + passing after
- [ ] Trace/log snippets
- [ ] Screenshot/recording
- [ ] Perf numbers (if relevant)

Human Verification (required)

Verified scenarios: ran the targeted attempt/context-engine and timeout-compaction tests, plus the focused buildAfterTurnRuntimeContext coverage, against the final committed tree.
Edge cases checked: missing cache data leaves promptCache absent, effective retention is the resolved value, and cache-break observations carry through when the heuristic reports a meaningful drop.
What you did not verify: no broader end-to-end model-provider run, and no lossless-claw compaction policy changes in this PR.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

Backward compatible? (Yes)
Config/env changes? (No)
Migration needed? (No)
If yes, exact upgrade steps:

Risks and Mitigations

Risk: compact() only receives prompt-cache info on recovery paths that already have an attempt result.
- Mitigation: this keeps the change additive and aligned with the current call graph; broader compaction-context sourcing can be added later if an engine needs it.
Risk: expiresAt is intentionally absent for Anthropic/OpenAI because OpenClaw cannot know it confidently.
- Mitigation: the new type makes the field optional so engines can branch on real availability instead of guessed data.
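
To illustrate how a context engine might branch on real availability, here is a minimal TypeScript sketch. The PR names the concepts (resolved retention, last-call usage, cache-break observations, an optional `expiresAt`), but the exact field names below (`retention`, `lastCallUsage`, `cacheBreakObserved`) and the `classifyCache` helper are assumptions for illustration, not the merged type in `src/context-engine/types.ts`.

```typescript
// Hypothetical payload shape: only `promptCache` and `expiresAt` are named in
// the PR text; every other field name here is an illustrative assumption.
type CacheRetention = "short" | "long";

interface PromptCacheInfo {
  retention?: CacheRetention;
  lastCallUsage?: { cacheRead: number; cacheWrite: number };
  cacheBreakObserved?: boolean;
  expiresAt?: number; // epoch ms; absent when the provider's expiry is unknown
}

interface RuntimeContext {
  promptCache?: PromptCacheInfo;
}

type CacheDecision =
  | "no-telemetry"
  | "cache-broken"
  | "unknown-expiry"
  | "expired"
  | "warm";

// Branch on what is actually present instead of guessing missing values.
function classifyCache(ctx: RuntimeContext, now: number): CacheDecision {
  const cache = ctx.promptCache;
  if (!cache) return "no-telemetry"; // engine keeps its pre-PR behavior
  if (cache.cacheBreakObserved) return "cache-broken";
  if (cache.expiresAt === undefined) return "unknown-expiry";
  return cache.expiresAt <= now ? "expired" : "warm";
}
```

An engine that ignores `promptCache` entirely is unaffected, which matches the backward-compatibility claim above.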

Changed files

CHANGELOG.md (modified, +1/-0)
src/agents/pi-embedded-runner/cache-ttl.test.ts (modified, +56/-1)
src/agents/pi-embedded-runner/cache-ttl.ts (modified, +34/-1)
src/agents/pi-embedded-runner/extensions.ts (modified, +4/-1)
src/agents/pi-embedded-runner/run.overflow-compaction.test.ts (modified, +52/-0)
src/agents/pi-embedded-runner/run.timeout-triggered-compaction.test.ts (modified, +25/-0)
src/agents/pi-embedded-runner/run.ts (modified, +2/-0)
src/agents/pi-embedded-runner/run/attempt.prompt-helpers.ts (modified, +32/-25)
src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts (modified, +180/-0)
src/agents/pi-embedded-runner/run/attempt.spawn-workspace.test-support.ts (modified, +41/-1)
src/agents/pi-embedded-runner/run/attempt.ts (modified, +115/-18)
src/agents/pi-embedded-runner/run/types.ts (modified, +2/-1)
src/context-engine/types.ts (modified, +45/-0)

PR #62501: feat: add session activity timestamps for cache-aware maintenance

Repository: openclaw/openclaw
Author: 100yenadmin
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/62501

Description (problem / solution / changelog)

Summary

Add three optional timestamp fields to SessionEntry that track when the last user message arrived, when the last assistant reply was persisted, and when the last prompt-cache touch occurred.

These timestamps are the foundational infrastructure for time-based session maintenance features such as cache keep-warm (#62475).

Changes

`src/config/sessions/types.ts`

Added three optional fields to SessionEntry:

lastUserMessageAt?: number — epoch ms of last user message received
lastAssistantMessageAt?: number — epoch ms of last assistant reply completed
lastCacheTouchAt?: number — epoch ms of last confirmed prompt-cache touch (write or read > 0)

`src/agents/agent-command.ts`

Sets lastUserMessageAt = Date.now() when a user message is received (in the session-persist block before the agent run).

`src/auto-reply/reply/session-usage.ts`

Sets lastAssistantMessageAt = Date.now() on every turn completion. Sets lastCacheTouchAt = Date.now() when cacheRead > 0 || cacheWrite > 0.

`src/agents/command/session-store.ts`

Same logic for the pi-embedded agent path (updateSessionStoreAfterAgentRun): sets lastAssistantMessageAt on every turn with usage, sets lastCacheTouchAt when cache counters are non-zero.

`src/config/sessions/session-activity-timestamps.test.ts`

8 tests covering all three fields: type presence, optional defaults, merge behavior, and the cache-touch conditional logic.
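
The timestamp rules described above can be sketched as a small pure function. The three field names and the `cacheRead > 0 || cacheWrite > 0` condition come from the PR description; the `TurnUsage` type and the `recordTurnCompletion` helper are hypothetical names used only for this example and may not match the actual code in `session-usage.ts` or `session-store.ts`.

```typescript
// Field names follow the PR description; everything else is illustrative.
interface SessionEntry {
  updatedAt?: number;
  lastUserMessageAt?: number;      // epoch ms of last user message received
  lastAssistantMessageAt?: number; // epoch ms of last assistant reply completed
  lastCacheTouchAt?: number;       // epoch ms of last confirmed cache write or read
}

interface TurnUsage {
  cacheRead?: number;
  cacheWrite?: number;
}

// Returns a new entry; the input entry is not mutated.
function recordTurnCompletion(
  entry: SessionEntry,
  usage: TurnUsage,
  now: number = Date.now(),
): SessionEntry {
  const cacheRead = usage.cacheRead ?? 0;
  const cacheWrite = usage.cacheWrite ?? 0;
  const touchedCache = cacheRead > 0 || cacheWrite > 0;
  return {
    ...entry,
    lastAssistantMessageAt: now, // set on every completed turn
    // Only overwrite the cache-touch timestamp when the provider confirmed one.
    ...(touchedCache ? { lastCacheTouchAt: now } : {}),
  };
}
```

Leaving `lastCacheTouchAt` unchanged on a turn with zero cache counters means the field always reflects the last confirmed touch rather than the last attempt.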

Constraints

✅ Purely additive — no existing behavior changed
✅ All fields optional — no breaking changes
✅ Uses Date.now() consistently with existing updatedAt pattern
✅ All 8 new tests pass

References

Closes: #62475 (cache keep-warm foundation)
Inspired by: #56575 (JeremyGuo)

Changed files

src/agents/agent-command.ts (modified, +1/-0)
src/agents/command/session-store.ts (modified, +4/-0)
src/auto-reply/reply/session-usage.ts (modified, +4/-0)
src/config/sessions/session-activity-timestamps.test.ts (added, +119/-0)
src/config/sessions/types.ts (modified, +6/-0)

Issue #62475: feat: prompt cache keep-warm pings to prevent TTL expiry between turns

Problem

Anthropic prompt cache expires after ~5 min (cacheRetention: "short") or ~1 hour (cacheRetention: "long"). Between user messages, when no API call occurs, the cache goes cold. The next turn re-caches the entire system prompt prefix + conversation history at write pricing ($3.75/MTok) instead of read pricing ($0.375/MTok) — a 10× cost multiplier on every cache miss.

For sessions with moderate inter-message gaps (5–60 min), this is the dominant cost driver. A typical 84k system prompt + 80k conversation history = ~164k cached tokens. Each cache miss costs $0.62 vs a cache hit at $0.062.

Prior Art

This resurrects the core idea from #24711 (closed/stale) and addresses part of #57569 (cache pre-warming after reconnect).

Proposed Solution

A lightweight, invisible, non-turn mechanism that fires minimal API pings (max_tokens: 1) to keep the prompt cache warm between user interactions.

Key Design Decisions

Adaptive interval: Computed automatically as TTL × 0.8 (4 min for short, ~48 min for long). Not user-configurable, which eliminates a whole class of misconfiguration.
Not a heartbeat: No agent turn, no tool calls, no compaction, no session history entry.
Compaction-aware: Dirty flag cancels pending pings when system prompt changes (prevents warming stale cache keys).
Async verification: Check cache_read_input_tokens in ping response to confirm cache hit.
Hard cost cap: Default $0.10/hr, auto-disables if exceeded.
Same payload construction path: Ping uses identical system blocks + tools as real turns via resolveAnthropicPayloadPolicy() + applyAnthropicCacheControlToSystem() to match the cache key exactly.
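
The adaptive interval rule can be written down in a few lines. This sketch assumes the TTLs stated in the Problem section (about 5 minutes for "short", about 1 hour for "long"); the names `CACHE_TTL_MS`, `SAFETY_FACTOR`, and `keepWarmIntervalMs` are hypothetical.

```typescript
type CacheRetention = "short" | "long";

// Approximate TTLs as described in this issue, in milliseconds.
const CACHE_TTL_MS: Record<CacheRetention, number> = {
  short: 5 * 60 * 1000,
  long: 60 * 60 * 1000,
};

// Ping at 80% of the TTL so the refresh lands before expiry.
const SAFETY_FACTOR = 0.8;

function keepWarmIntervalMs(retention: CacheRetention): number {
  // Math.round guards against floating-point noise from the multiplication.
  return Math.round(CACHE_TTL_MS[retention] * SAFETY_FACTOR);
}
```

Deriving the interval from the resolved retention, rather than exposing it as a setting, is what removes the possibility of a user picking an interval longer than the TTL.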

Config

// openclaw.json
{
  "agents": {
    "defaults": {
      "promptCacheKeepWarm": {
        "enabled": false,           // opt-in
        "maxCostPerHourUSD": 0.10,  // hard cap, auto-disables
        "durationMs": 3600000,      // stop after 1h of inactivity
        "providers": ["anthropic"]
      }
    }
  }
}
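
As a rough illustration of how this config and its hard cost cap might be consumed, the sketch below types the four keys and tracks spend over a rolling hour. The key names and defaults come from the config above; the rolling-window interpretation of "per hour", and the names `resolveKeepWarmConfig` and `HourlyCostCap`, are assumptions. The issue does not specify how the cap is measured or how auto-disable is implemented.

```typescript
interface PromptCacheKeepWarmConfig {
  enabled: boolean;
  maxCostPerHourUSD: number;
  durationMs: number;
  providers: string[];
}

// Defaults mirror the proposed openclaw.json fragment.
const KEEP_WARM_DEFAULTS: PromptCacheKeepWarmConfig = {
  enabled: false,
  maxCostPerHourUSD: 0.1,
  durationMs: 3600000,
  providers: ["anthropic"],
};

function resolveKeepWarmConfig(
  partial: Partial<PromptCacheKeepWarmConfig> = {},
): PromptCacheKeepWarmConfig {
  return { ...KEEP_WARM_DEFAULTS, ...partial };
}

const HOUR_MS = 3600000;

// Assumption: the cap is enforced over a rolling one-hour window.
class HourlyCostCap {
  private spend: Array<{ at: number; usd: number }> = [];
  private readonly maxUsdPerHour: number;

  constructor(maxUsdPerHour: number) {
    this.maxUsdPerHour = maxUsdPerHour;
  }

  record(usd: number, now: number): void {
    this.spend.push({ at: now, usd });
  }

  spentLastHour(now: number): number {
    const cutoff = now - HOUR_MS;
    this.spend = this.spend.filter((s) => s.at > cutoff); // drop stale entries
    return this.spend.reduce((sum, s) => sum + s.usd, 0);
  }

  // True when one more ping of the given cost stays within the cap.
  allows(nextUsd: number, now: number): boolean {
    return this.spentLastHour(now) + nextUsd <= this.maxUsdPerHour;
  }
}
```

A caller would check `allows()` before each ping and stop scheduling further pings once it returns false.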

Lifecycle

After assistant reply → capture PromptCacheSnapshot, start interval timer
On timer fire → check dirty flag, check cost cap, build minimal payload, fire ping
On user message → cancel timer (real turn takes over)
On compaction/model-switch → set dirty flag (next tick cancels)
On session close / SIGTERM → cancel all timers
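
The lifecycle above can be sketched as a small state machine that is driven by events and leaves the actual timer to the host. The class and method names are hypothetical, and two details are assumptions not stated in the issue: a fresh assistant reply clears the dirty flag (because a new snapshot is captured), and a cost-cap failure disarms the session until the next reply.

```typescript
// Event-driven sketch of the keep-warm lifecycle; no real timers or network.
class KeepWarmSession {
  private armed = false; // true while a ping timer would be running
  private dirty = false; // set when the cached prefix may have changed
  pingsSent = 0;

  // After assistant reply: capture snapshot, start interval timer.
  onAssistantReply(): void {
    this.armed = true;
    this.dirty = false; // assumption: a new snapshot supersedes the old one
  }

  // On user message: cancel timer, the real turn takes over.
  onUserMessage(): void {
    this.armed = false;
  }

  // On compaction or model switch: the next tick must cancel.
  onCompactionOrModelSwitch(): void {
    this.dirty = true;
  }

  // On session close or SIGTERM: cancel everything.
  onClose(): void {
    this.armed = false;
  }

  // Called when the host's timer fires. Returns true if a ping should be sent.
  onTimerFire(withinCostCap: boolean): boolean {
    if (!this.armed) return false;
    if (this.dirty || !withinCostCap) {
      this.armed = false; // stop pinging rather than warm a stale cache key
      return false;
    }
    this.pingsSent += 1;
    return true;
  }
}
```

Keeping the decision logic separate from `setInterval` makes the cancel and dirty-flag paths testable without waiting on real time.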

Build-On Opportunity

PR #56575 already adds session timestamps (lastUserMessageAt, lastAssistantMessageAt, lastCacheTouchAt) and a gateway maintenance sweep timer. This feature could be implemented as an additional mode ("ping") alongside the existing "compact" and "reset" modes, reusing the timer infrastructure.

PR #62179 (exposes runtimeContext.promptCache to context engines) provides the cache-break detection needed for dirty flag awareness.

Cost Analysis

| Config | Ping cost | Cache miss cost | Savings per avoided miss |
| --- | --- | --- | --- |
| 84k tokens, short TTL | ~$0.032/ping | $0.315/miss | $0.28 |
| 164k tokens (long conv), long TTL | ~$0.062/ping | $0.615/miss | $0.55 |

With cacheRetention: "long", keep-warm fires ~1 ping per gap period. Break-even at 1 avoided cache miss per session.
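
The figures in the table follow directly from the per-token prices quoted in the Problem section. The sketch below reproduces that arithmetic; the prices are the ones stated in this issue (not independently verified here), the helper names are hypothetical, and the tiny cost of the single output token per ping is ignored.

```typescript
// Prices as quoted in this issue, in USD per million tokens.
const WRITE_USD_PER_MTOK = 3.75;
const READ_USD_PER_MTOK = 0.375;

function costUsd(tokens: number, usdPerMTok: number): number {
  return (tokens / 1000000) * usdPerMTok;
}

function keepWarmEconomics(cachedTokens: number) {
  const pingCost = costUsd(cachedTokens, READ_USD_PER_MTOK);  // one cache read
  const missCost = costUsd(cachedTokens, WRITE_USD_PER_MTOK); // full re-write
  return { pingCost, missCost, savingsPerAvoidedMiss: missCost - pingCost };
}

// Net effect of spending `pings` keep-warm pings to avoid one cache miss.
function netSavingsUsd(cachedTokens: number, pings: number): number {
  const e = keepWarmEconomics(cachedTokens);
  return e.savingsPerAvoidedMiss - pings * e.pingCost;
}
```

Under these prices a single ping per gap is clearly worthwhile, but the net turns negative once roughly nine or more pings are spent bridging one gap, which is one reason the long retention setting (about one ping per gap) looks more attractive than pinging every 4 minutes across a long idle period.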

Provider Abstraction

Anthropic: Full message, max_tokens: 1 (no lighter endpoint exists)
OpenAI: No-op (automatic caching, nothing to warm)
Google: Future — cache TTL extension API call (different mechanism)
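
One way to express this provider split is a function that maps a provider id to a keep-warm plan. This is a sketch only: the `KeepWarmPlan` type and `planKeepWarm` function are hypothetical, and the provider id strings are assumed to match the `providers` config values.

```typescript
type KeepWarmPlan =
  | { kind: "ping"; maxTokens: 1 }        // full message, minimal output
  | { kind: "noop"; reason: string }      // nothing to warm
  | { kind: "unsupported"; reason: string };

function planKeepWarm(provider: string): KeepWarmPlan {
  switch (provider) {
    case "anthropic":
      return { kind: "ping", maxTokens: 1 };
    case "openai":
      return { kind: "noop", reason: "automatic caching, nothing to warm" };
    case "google":
      return { kind: "unsupported", reason: "future: cache TTL extension API call" };
    default:
      return { kind: "unsupported", reason: "unknown provider" };
  }
}
```

Returning a plan object instead of performing the call keeps provider selection separate from payload construction, which the issue says must reuse the real-turn path.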

Upstream Dream Ask

An Anthropic POST /cache/refresh endpoint that extends TTL without requiring a full message payload would reduce ping cost ~99%.

References

#24711 — original keepalive ping proposal (closed/stale)
#57569 — channel reconnect cache preservation
#56575 — time-based session maintenance (build-on candidate)
#62179 — prompt cache runtime context for context engines
#59054 — OPENCLAW_CACHE_BOUNDARY (merged, foundation)
#61422 — context files before boundary (cache stability)
#26750 — dynamic date in stable prefix (cache stability)

extent analysis

TL;DR

Implement a lightweight, invisible, non-turn mechanism that fires minimal API pings to keep the prompt cache warm between user interactions, reducing the 10× cost multiplier on every cache miss.

Guidance

Implement the proposed promptCacheKeepWarm feature with an adaptive interval computed as TTL × 0.8 to minimize cache misses and reduce costs.
Configure the promptCacheKeepWarm settings in openclaw.json to enable the feature, set the maximum cost per hour, and specify the duration and providers.
Use the resolveAnthropicPayloadPolicy() and applyAnthropicCacheControlToSystem() functions to construct the ping payload and match the cache key exactly.
Monitor the cache_read_input_tokens in the ping response to confirm cache hits and adjust the configuration as needed.
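
The verification step in the last bullet might look like the following. The `cache_read_input_tokens` and `cache_creation_input_tokens` fields are part of the usage object returned by the Anthropic Messages API; the `PingOutcome` labels and the `classifyPing` helper are hypothetical.

```typescript
// Subset of the usage object returned with an Anthropic Messages response.
interface AnthropicUsage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number | null;
  cache_read_input_tokens?: number | null;
}

type PingOutcome = "hit" | "partial" | "rewrite" | "uncached";

function classifyPing(usage: AnthropicUsage): PingOutcome {
  const read = usage.cache_read_input_tokens ?? 0;
  const created = usage.cache_creation_input_tokens ?? 0;
  if (read > 0 && created === 0) return "hit";  // cache was warm and was refreshed
  if (read > 0 && created > 0) return "partial"; // part of the prefix was re-written
  if (created > 0) return "rewrite";             // cache was cold; ping paid write pricing
  return "uncached";                             // no cache activity at all
}
```

A "rewrite" or "uncached" outcome suggests the ping payload did not match the cached prefix or the cache had already expired, so continuing to ping on the same schedule would likely add cost without benefit.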

Example

// openclaw.json example configuration
{
  "agents": {
    "defaults": {
      "promptCacheKeepWarm": {
        "enabled": true,
        "maxCostPerHourUSD": 0.10,
        "durationMs": 3600000,
        "providers": ["anthropic"]
      }
    }
  }
}

Notes

The proposed solution relies on the resolveAnthropicPayloadPolicy() and applyAnthropicCacheControlToSystem() functions, which may require additional implementation or modification. The promptCacheKeepWarm feature should be thoroughly tested to ensure it works as expected and does not introduce any unintended consequences.

Recommendation

Apply the proposed promptCacheKeepWarm feature to reduce the cost multiplier on cache misses, as it provides a significant cost savings per avoided miss. This feature can be implemented as an additional mode alongside the existing modes, reusing the timer infrastructure.
