openclaw - 💡(How to fix) Fix Haiku requests show 0% cache hit rate — is prompt caching applied for anthropic/claude-haiku-4-5?

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Our Anthropic usage reports show 0% cache hit rate on Haiku 4.5 requests routed through the OpenClaw gateway, while Sonnet 4.6 consistently hits 44–71% on the same days. Asking whether this is expected behavior or a configuration issue.

Root Cause

Our Anthropic usage reports show 0% cache hit rate on Haiku 4.5 requests routed through the OpenClaw gateway, while Sonnet 4.6 consistently hits 44–71% on the same days. Asking whether this is expected behavior or a configuration issue.

Fix Action

Fix / Workaround

Happy to help

  • Can share cache-trace output (diagnostics.cacheTrace.enabled) for a Haiku run if that helps debugging.
  • Can test a pre-release or config patch.

Code Example

models:
    default: anthropic/claude-sonnet-4-6
RAW_BUFFERClick to expand / collapse

Summary

Our Anthropic usage reports show 0% cache hit rate on Haiku 4.5 requests routed through the OpenClaw gateway, while Sonnet 4.6 consistently hits 44–71% on the same days. Asking whether this is expected behavior or a configuration issue.

Environment

  • OpenClaw: 2026.4.11 (769908e) (Homebrew cask, macOS 15.x)
  • Gateway: openclaw-gateway running as LaunchAgent
  • Node runtime
  • Config snippet (models):
    models:
      default: anthropic/claude-sonnet-4-6
  • Multiple agents configured with Haiku as primary or fallback.

What we observed

Queried Anthropic Admin API (/v1/organizations/usage_report/messages?group_by[]=model) for April 15–17, 2026:

DateModelUncachedCache-Write (5m)Cache-ReadHit-Rate
2026-04-15Haiku 4.549,517000.0%
2026-04-15Sonnet 4.634,61873,142,955165,725,73569.4%
2026-04-16Haiku 4.546,068000.0%
2026-04-16Sonnet 4.610,84022,254,21917,769,05544.4%
2026-04-17Haiku 4.534,817000.0%
2026-04-17Sonnet 4.616,85318,164,21245,390,55371.4%

Haiku shows zero cache_creation and zero cache_read_input_tokens across all observed days.

What we expect

`cache_control: {"type": "ephemeral"}` breakpoints should be applied to system prompts for Haiku requests the same way they are for Sonnet, given that:

  • Haiku 4.5 supports prompt caching per Anthropic's API docs.
  • Our Haiku agents load sizeable skill prompts (>4k tokens) that would benefit.

Questions

  1. Is prompt caching applied automatically for Haiku requests routed through the gateway?
  2. If yes: are there size/cost thresholds that skip caching when the gateway deems it uneconomical? (Haiku write premium vs. break-even might explain auto-skip.)
  3. If no: is there a config option to opt in (e.g. models.providers.anthropic.caching.haiku: true)?
  4. Is this behavior documented somewhere we missed?

Impact

At current Haiku volume (~30–50k input tokens/day across our agents), enabling caching for stable skill content would reduce input cost 80–90% on repeats and cut TTFT. Over a month this is minor in absolute euros (€10–20), but matters for always-on agents where latency is perceived.

Happy to help

  • Can share cache-trace output (diagnostics.cacheTrace.enabled) for a Haiku run if that helps debugging.
  • Can test a pre-release or config patch.

Thanks!

extent analysis

TL;DR

The most likely fix is to configure prompt caching for Haiku requests, potentially by adding a config option to opt-in, as the current setup shows 0% cache hit rate for Haiku 4.5 requests.

Guidance

  • Investigate if there's a config option to enable prompt caching for Haiku requests, such as models.providers.anthropic.caching.haiku: true, as the current config snippet only specifies the default model.
  • Check the Anthropic API documentation and OpenClaw gateway documentation for any size or cost thresholds that might be skipping caching for Haiku requests.
  • Verify if prompt caching is applied automatically for Haiku requests by checking the cache-trace output (diagnostics.cacheTrace.enabled) for a Haiku run.
  • Test a pre-release or config patch to enable caching for Haiku requests and monitor the cache hit rate.

Example

No code snippet is provided as the issue is related to configuration and caching behavior.

Notes

The issue might be related to the specific configuration of the OpenClaw gateway and the Anthropic API, and further investigation is needed to determine the root cause. The provided information suggests that Haiku 4.5 supports prompt caching, but it's not being applied.

Recommendation

Apply a workaround by configuring prompt caching for Haiku requests, as the current behavior is likely due to a configuration issue rather than an expected behavior. This can be done by adding a config option to opt-in to caching for Haiku requests.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING