litellm - 💡(How to fix) Fix [Bug]: bedrock Anthropic: cache_creation_input_token_cost and cache_read_input_token_cost silently billed at $0

litellm 2026-05-28 11:34:29

Error Message

This is not a minor rounding error. Due to this bug, all cost tracking, budget enforcement, team spend limits, and usage-based access controls in LiteLLM are rendered ineffective for any bedrock_converse model that uses prompt caching.

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

For models using the bedrock_converse provider with prompt caching enabled, both cache_creation_input_token_cost and cache_read_input_token_cost are completely ignored by the cost calculator — even when both fields are correctly present in the model map with non-zero values.

cost_breakdown.input_cost only reflects uncached text_tokens at the base input_cost_per_token rate. Cache creation tokens and cache read tokens each contribute $0.00, causing massive and silent under-billing.

Affected versions: 1.84.1, 1.85.1, 1.86.1, 1.86.2
Provider: bedrock_converse
Model: eu.anthropic.claude-* (Sonnet, Haiku, Opus 4.5, 4.6, 4.7)

Model config (as reported in `model_map_information` of spend logs)

"input_cost_per_token":            0.0000033,
"output_cost_per_token":           0.0000165,
"cache_read_input_token_cost":     3.3e-7,
"cache_creation_input_token_cost": 0.000004125

All four fields are present and correct. The calculator ignores the bottom two entirely and applies wrong rates for the top two.
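The claim about wrong rates can be checked by dividing the reported costs (listed under Reproduction below) by the token counts. The following is a minimal sketch of that arithmetic only; it does not call LiteLLM, and the variable names are made up for illustration. Note that the reported input cost of $0.000003 for a single token may simply be the configured $0.0000033 rounded to six decimals, so only the output rate is conclusive here.

```python
from decimal import Decimal

# Configured rates, copied from the model config above.
configured_input_rate = Decimal("0.0000033")
configured_output_rate = Decimal("0.0000165")

# Reported costs and token counts, copied from the Reproduction section.
reported_input_cost, text_tokens = Decimal("0.000003"), 1
reported_output_cost, completion_tokens = Decimal("0.010080"), 672

# Per-token rates implied by what the calculator actually reported.
implied_input_rate = reported_input_cost / text_tokens
implied_output_rate = reported_output_cost / completion_tokens

# How far the configured output rate sits above the implied one.
output_rate_ratio = configured_output_rate / implied_output_rate

print("implied input rate: ", implied_input_rate)
print("implied output rate:", implied_output_rate)
print("configured / implied output rate:", output_rate_ratio)
```

The implied output rate differs from the configured output_cost_per_token, which is consistent with the calculator pricing output tokens from a different rate than the one shown in model_map_information.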

Reproduction

text_tokens:                  1
cache_creation_input_tokens:  1190  (ephemeral_5m)
cache_read_input_tokens:      32634
completion_tokens:            672

| Token type | Tokens | Rate (per token) | Expected cost |
| --- | --- | --- | --- |
| Regular input | 1 | $0.0000033 | $0.0000033 |
| Cache creation | 1,190 | $0.000004125 | $0.004909 |
| Cache read | 32,634 | $0.00000033 | $0.010769 |
| Output | 672 | $0.0000165 | $0.011088 |
| Expected total | | | $0.026769 |

Reported: input_cost: $0.000003 / output_cost: $0.010080 / total: $0.010083
Under-billed by ~$0.0167 (62%)
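For anyone who wants to re-derive the figures above, here is a small sketch of the expected-cost arithmetic using the token counts and rates from this report. It is plain Python with `decimal` to avoid floating-point noise; it is not LiteLLM's cost calculator, and the dictionary layout is an assumption made for readability.

```python
from decimal import Decimal

# Per-token rates from the model config, keyed by the usage bucket they price.
rates = {
    "text_tokens": Decimal("0.0000033"),
    "cache_creation_input_tokens": Decimal("0.000004125"),
    "cache_read_input_tokens": Decimal("0.00000033"),  # 3.3e-7
    "completion_tokens": Decimal("0.0000165"),
}

# Token counts from the reproduction request.
usage = {
    "text_tokens": 1,
    "cache_creation_input_tokens": 1190,
    "cache_read_input_tokens": 32634,
    "completion_tokens": 672,
}

# Expected cost per bucket and in total.
expected = {bucket: rates[bucket] * count for bucket, count in usage.items()}
expected_total = sum(expected.values(), Decimal(0))

# Total that LiteLLM reported for the same request.
reported_total = Decimal("0.010083")
shortfall = expected_total - reported_total
shortfall_pct = shortfall / expected_total * 100

for bucket, cost in expected.items():
    print(f"{bucket:30s} {cost}")
print("expected total:", round(expected_total, 6))
print("shortfall:     ", round(shortfall, 6), f"({shortfall_pct:.0f}%)")
```

The two cache buckets together account for most of the expected input cost, which is why dropping them produces such a large gap on cache-heavy requests.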

Why adding cache costs to `model_info` in config.yaml does NOT fix this

The model_map_information in the spend logs already contains the correct cache cost fields with non-zero values. The issue is not missing configuration: the cost calculator simply does not read cache_creation_input_tokens and cache_read_input_tokens when computing input_cost for bedrock_converse models.

⚠️ Critical Billing Failure — Budget Limits and Spend Controls Are Ineffective

Concrete impact:

- Team budgets configured in LiteLLM are not enforced: teams can vastly exceed their allocated spend without LiteLLM intervening
- Per-API-key spend limits are ineffective for cache-heavy workloads
- Cost dashboards and spend reports show a fraction of real costs
- Operators have no reliable visibility into actual AWS Bedrock expenditure through LiteLLM
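Until the calculator is fixed, one way to regain some visibility is to recompute cost outside LiteLLM and flag requests whose reported cost falls short. The sketch below assumes a hypothetical row shape (`id`, `usage`, `model_info`, `reported_cost`) for exported spend-log data; these field names and both helper functions are illustrative and are not part of LiteLLM's API or schema.

```python
from decimal import Decimal

# (usage bucket, model_info rate field) pairs that should all contribute to cost.
BUCKETS = [
    ("text_tokens", "input_cost_per_token"),
    ("cache_creation_input_tokens", "cache_creation_input_token_cost"),
    ("cache_read_input_tokens", "cache_read_input_token_cost"),
    ("completion_tokens", "output_cost_per_token"),
]

def recompute_cost(usage: dict, model_info: dict) -> Decimal:
    """Recompute a request's cost from every token bucket (hypothetical helper)."""
    return sum(
        (Decimal(str(model_info.get(rate, 0))) * usage.get(bucket, 0)
         for bucket, rate in BUCKETS),
        Decimal(0),
    )

def find_underbilled(rows, tolerance=Decimal("0.01")):
    """Return (id, reported, recomputed) for rows under-reported by more than `tolerance`."""
    flagged = []
    for row in rows:
        recomputed = recompute_cost(row["usage"], row["model_info"])
        reported = Decimal(str(row["reported_cost"]))
        if recomputed > 0 and (recomputed - reported) / recomputed > tolerance:
            flagged.append((row["id"], reported, recomputed))
    return flagged

model_info = {
    "input_cost_per_token": 0.0000033,
    "output_cost_per_token": 0.0000165,
    "cache_read_input_token_cost": 3.3e-7,
    "cache_creation_input_token_cost": 0.000004125,
}
rows = [
    {   # the request from this report
        "id": "req-1",
        "usage": {"text_tokens": 1, "cache_creation_input_tokens": 1190,
                  "cache_read_input_tokens": 32634, "completion_tokens": 672},
        "model_info": model_info,
        "reported_cost": "0.010083",
    },
    {   # made-up request without caching, reported correctly
        "id": "req-2",
        "usage": {"text_tokens": 100, "completion_tokens": 10},
        "model_info": model_info,
        "reported_cost": "0.000495",
    },
]
flagged = find_underbilled(rows)
for row_id, reported, recomputed in flagged:
    print(row_id, "reported", reported, "recomputed", recomputed)
```

The relative tolerance is there so that display rounding in the reported figure does not trigger false positives; the 1% default is an arbitrary choice for this sketch.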

Enterprise Support Request

We are a LiteLLM Enterprise customer and are urgently requesting prioritized attention on this issue.

Steps to Reproduce

Relevant log output

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

v1.86.2

Twitter / LinkedIn details

No response
