openclaw - 💡(How to fix) Fix Haiku requests show 0% cache hit rate — is prompt caching applied for anthropic/claude-haiku-4-5?

StepCodex · 2026-04-20T21:19:54Z

[openclaw] Our Anthropic usage reports show 0% cache hit rate on Haiku 4.5 requests routed through the OpenClaw gateway, while Sonnet 4.6 consistently hits 44–… Our Anthropic usage reports show **0% cache hit rate on Haiku 4.5 requests** routed through the OpenClaw gateway, while Sonnet 4.6 consistently hits 44–71% on the same days. Asking whether this is expected behavior or a configuration issue. ## Fix / Workaround ### Happy to help - Can share cache-trace output (`diagnostics.cacheTrace.enabled`) for a Haiku run if that helps debugging. - Can test a pre-release or config patch. ### Summary Our Anthropic usage reports show **0% cache hit rate on Haiku 4.5 requests** routed through the OpenClaw gateway, while Sonnet 4.6 consistently hits 44–71% on the same days. Asking whether this is expected behavior or a configuration issue. ### Environment - OpenClaw: `2026.4.11 (769908e)` (Homebrew cask, macOS 15.x) - Gateway: `openclaw-gateway` running as LaunchAgent - Node runtime - Config snippet (models): ```yaml models: default: anthropic/claude-sonnet-4-6 ``` - Multiple agents configured with Haiku as primary or fallback. ### What we observed Queried Anthropic Admin API (`/v1/organizations/usage_report/messages?group_by[]=model`) for April 15–17, 2026: | Date | Model | Uncached | Cache-Write (5m) | Cache-Read | Hit-Rate | |---|---|---|---|---|---| | 2026-04-15 | Haiku 4.5 | 49,517 | 0 | 0 | **0.0%** | | 2026-04-15 | Sonnet 4.6 | 34,618 | 73,142,955 | 165,725,735 | 69.4% | | 2026-04-16 | Haiku 4.5 | 46,068 | 0 | 0 | **0.0%** | | 2026-04-16 | Sonnet 4.6 | 10,840 | 22,254,219 | 17,769,055 | 44.4% | | 2026-04-17 | Haiku 4.5 | 34,817 | 0 | 0 | **0.0%** | | 2026-04-17 | Sonnet 4.6 | 16,853 | 18,164,212 | 45,390,553 | 71.4% | Haiku shows zero `cache_creation` and zero `cache_read_input_tokens` across all observed days. ### What we expect \`cache_control: {"type": "ephemeral"}\` breakpoints should be applied to system prompts for Haiku requests the same way they are for Sonnet, given that: - Haiku 4.5 supports prompt caching per Anthropic's API docs. - Our Haiku agents load sizeable skill prompts (>4k tokens) that would benefit. ### Questions 1. **Is prompt caching applied automatically** for Haiku requests routed through the gateway? 2. If yes: are there **size/cost thresholds** that skip caching when the gateway deems it uneconomical? (Haiku write premium vs. break-even might explain auto-skip.) 3. If no: is there a **config option** to opt in (e.g. `models.providers.anthropic.caching.haiku: true`)? 4. Is this behavior **documented** somewhere we missed? ### Impact At current Haiku volume (~30–50k input tokens/day across our agents), enabling caching for stable skill content would reduce input cost ~80–90% on repeats and cut TTFT. Over a month this is minor in absolute euros (~€10–20), but matters for always-on agents where latency is perceived. ### Happy to help - Can share cache-trace output (`diagnostics.cacheTrace.enabled`) for a Haiku run if that helps debugging. - Can test a pre-release or config patch. Thanks!

openclaw2026-04-20 21:19:54

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

Our Anthropic usage reports show 0% cache hit rate on Haiku 4.5 requests routed through the OpenClaw gateway, while Sonnet 4.6 consistently hits 44–71% on the same days. Asking whether this is expected behavior or a configuration issue.

Root Cause

Fix Action

Fix / Workaround

Happy to help

Can share cache-trace output (diagnostics.cacheTrace.enabled) for a Haiku run if that helps debugging.
Can test a pre-release or config patch.

Code Example

models:
    default: anthropic/claude-sonnet-4-6

RAW_BUFFERClick to expand / collapse

Summary

Environment

OpenClaw: 2026.4.11 (769908e) (Homebrew cask, macOS 15.x)
Gateway: openclaw-gateway running as LaunchAgent
Node runtime

Config snippet (models):

models:
  default: anthropic/claude-sonnet-4-6

Multiple agents configured with Haiku as primary or fallback.

What we observed

Queried Anthropic Admin API (/v1/organizations/usage_report/messages?group_by[]=model) for April 15–17, 2026:

Date	Model	Uncached	Cache-Write (5m)	Cache-Read	Hit-Rate
2026-04-15	Haiku 4.5	49,517	0	0	0.0%
2026-04-15	Sonnet 4.6	34,618	73,142,955	165,725,735	69.4%
2026-04-16	Haiku 4.5	46,068	0	0	0.0%
2026-04-16	Sonnet 4.6	10,840	22,254,219	17,769,055	44.4%
2026-04-17	Haiku 4.5	34,817	0	0	0.0%
2026-04-17	Sonnet 4.6	16,853	18,164,212	45,390,553	71.4%

Haiku shows zero cache_creation and zero cache_read_input_tokens across all observed days.

What we expect

`cache_control: {"type": "ephemeral"}` breakpoints should be applied to system prompts for Haiku requests the same way they are for Sonnet, given that:

Haiku 4.5 supports prompt caching per Anthropic's API docs.
Our Haiku agents load sizeable skill prompts (>4k tokens) that would benefit.

Questions

Is prompt caching applied automatically for Haiku requests routed through the gateway?
If yes: are there size/cost thresholds that skip caching when the gateway deems it uneconomical? (Haiku write premium vs. break-even might explain auto-skip.)
If no: is there a config option to opt in (e.g. models.providers.anthropic.caching.haiku: true)?
Is this behavior documented somewhere we missed?

Impact

At current Haiku volume (~30–50k input tokens/day across our agents), enabling caching for stable skill content would reduce input cost ~~80–90% on repeats and cut TTFT. Over a month this is minor in absolute euros (~~€10–20), but matters for always-on agents where latency is perceived.

Happy to help

Can share cache-trace output (diagnostics.cacheTrace.enabled) for a Haiku run if that helps debugging.
Can test a pre-release or config patch.

Thanks!

extent analysis

TL;DR

The most likely fix is to configure prompt caching for Haiku requests, potentially by adding a config option to opt-in, as the current setup shows 0% cache hit rate for Haiku 4.5 requests.

Guidance

Investigate if there's a config option to enable prompt caching for Haiku requests, such as models.providers.anthropic.caching.haiku: true, as the current config snippet only specifies the default model.
Check the Anthropic API documentation and OpenClaw gateway documentation for any size or cost thresholds that might be skipping caching for Haiku requests.
Verify if prompt caching is applied automatically for Haiku requests by checking the cache-trace output (diagnostics.cacheTrace.enabled) for a Haiku run.
Test a pre-release or config patch to enable caching for Haiku requests and monitor the cache hit rate.

Example

No code snippet is provided as the issue is related to configuration and caching behavior.

Notes

The issue might be related to the specific configuration of the OpenClaw gateway and the Anthropic API, and further investigation is needed to determine the root cause. The provided information suggests that Haiku 4.5 supports prompt caching, but it's not being applied.

Recommendation

Apply a workaround by configuring prompt caching for Haiku requests, as the current behavior is likely due to a configuration issue rather than an expected behavior. This can be done by adding a config option to opt-in to caching for Haiku requests.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #configuration error #environment variable #network issue #logging issue

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

openclaw - 💡(How to fix) Fix Haiku requests show 0% cache hit rate — is prompt caching applied for anthropic/claude-haiku-4-5?

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Fix / Workaround

Happy to help

Code Example

Summary

Environment

What we observed

What we expect

Questions

Impact

Happy to help

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

TRENDING

openclaw - 💡(How to fix) Fix Haiku requests show 0% cache hit rate — is prompt caching applied for anthropic/claude-haiku-4-5?

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Fix / Workaround

Happy to help

Code Example

Summary

Environment

What we observed

What we expect

Questions

Impact

Happy to help

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING