claude-code - 💡(How to fix) Fix Expose `cache_control.ttl` (5m / 1h) as a user-configurable setting

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Expose cache_control.ttl (default 5m, options 5m / 1h) as a user-configurable setting in Claude Code, either via a settings.json key (e.g. "promptCacheTtl": "1h") or an environment variable (e.g. CLAUDE_CODE_PROMPT_CACHE_TTL).

Currently the binary appears to hardcode cache_control: {type: "ephemeral"} with no ttl field, which the Anthropic API treats as the 5-minute tier. Users with workflows that include >5-minute pauses (planning, code review, document writing) repeatedly pay the cache-create cost on resumption.

Error Message

Valid values: 5m (current default), 1h. Anything else → error at startup.

Root Cause

Expose cache_control.ttl (default 5m, options 5m / 1h) as a user-configurable setting in Claude Code, either via a settings.json key (e.g. "promptCacheTtl": "1h") or an environment variable (e.g. CLAUDE_CODE_PROMPT_CACHE_TTL).

Currently the binary appears to hardcode cache_control: {type: "ephemeral"} with no ttl field, which the Anthropic API treats as the 5-minute tier. Users with workflows that include >5-minute pauses (planning, code review, document writing) repeatedly pay the cache-create cost on resumption.

Code Example

"env": {
    "CLAUDE_CODE_PROMPT_CACHE_TTL": "1h"
  }

---

"promptCacheTtl": "1h"
RAW_BUFFERClick to expand / collapse

Summary

Expose cache_control.ttl (default 5m, options 5m / 1h) as a user-configurable setting in Claude Code, either via a settings.json key (e.g. "promptCacheTtl": "1h") or an environment variable (e.g. CLAUDE_CODE_PROMPT_CACHE_TTL).

Currently the binary appears to hardcode cache_control: {type: "ephemeral"} with no ttl field, which the Anthropic API treats as the 5-minute tier. Users with workflows that include >5-minute pauses (planning, code review, document writing) repeatedly pay the cache-create cost on resumption.

Motivation

I run Claude Code sessions on the claude-opus-4-7 model that frequently pause for 5+ minutes between turns (reading large outputs, planning multi-step work). When I measure the cache_create token usage in the OpenTelemetry logs my session emits, I see a clear pattern:

Inter-turn gapAvg cache_create tokens on next turn
< 5 min~10k
> 5 min~154k

That is a ~15× cache-miss penalty whenever a pause exceeds the 5-minute ephemeral TTL — which on a thinking-heavy day happens 10-20× per session. At opus pricing this is real money: on one 3-day window I measured roughly $25/day per active session burned on cache TTL expiry alone.

The Anthropic API supports "ttl": "1h" on the cache_control field, but Claude Code does not appear to set it, and I can find no environment variable or settings key that controls it. From strings of the 2.1.133 binary I see literal {type: "ephemeral"} and cache_control changed (scope or TTL), but no CLAUDE_CODE_*_CACHE_* or CLAUDE_CODE_*_TTL* env var name.

Proposed API surface

Two options (either works for me, both is fine):

  • settings.json:

    "env": {
      "CLAUDE_CODE_PROMPT_CACHE_TTL": "1h"
    }

    or top-level:

    "promptCacheTtl": "1h"
  • CLI flag (for one-off override): claude --prompt-cache-ttl 1h

Valid values: 5m (current default), 1h. Anything else → error at startup.

Cost model context

For my use case (sessions with cache reuse > 24% within an hour), the 1h tier is strictly cheaper than 5m, per the published price table:

  • 5m: cache_create = 1.25× input, cache_read = 0.1× input → breakeven at ~12% reuse
  • 1h: cache_create = 2.0× input, cache_read = 0.1× input → breakeven at ~24% reuse

Many real-world sessions clear 24% reuse easily, especially anything where the prefix (CLAUDE.md, AGENTS.md, tools) is large and stable.

What I'd like to avoid

Building a local TLS-terminating proxy that mutates outbound request bodies to inject ttl: "1h". That's where I'm headed if there's no first-class config — it works but it's a maintenance burden that scales linearly with API churn.

Environment

  • Claude Code 2.1.133 (also confirmed same behaviour on 2.1.132)
  • Ubuntu 24.04, brew install
  • claude-opus-4-7 model
  • Prompt caching enabled (default)

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING