litellm: fix(anthropic): support top-level cache_control in /v1/messages for automatic prompt caching

GitHub stats
BerriAI/litellm#26320, fetched 2026-04-24 05:52:32
Comments: 0, Participants: 1, Timeline events: 1 (labeled ×1), Reactions: 0

Description

PR #22442 added cache_control support to map_openai_params() in litellm/llms/anthropic/chat/transformation.py. This resolves the issue for the OpenAI-compatible path (/v1/chat/completions), but the native Anthropic path (/v1/messages) still rejects cache_control as a top-level field.

The failure occurs during Pydantic validation in the /v1/messages request model, before the request reaches the provider layer.


Error

cache_control: Extra inputs are not permitted

Full error:

status_code: 400
body: {'error': {'message': '{"message":"cache_control: Extra inputs are not permitted"}'}}

Steps to Reproduce

Point pydantic-ai's AnthropicProvider at a LiteLLM proxy and configure the model with:

AnthropicModelSettings(anthropic_cache="1h")

This results in the following request being sent:

POST /v1/messages?beta=true
{
  "model": "us.anthropic.claude-sonnet-4-6",
  "messages": [
    {"role": "user", "content": "Hello"}
  ],
  "cache_control": {
    "type": "ephemeral",
    "ttl": "1h"
  }
}

This request works when sent directly to Anthropic, but fails via LiteLLM proxy.
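
For readers who want to reproduce this without pydantic-ai, the request above can be built with the Python standard library alone. This is a minimal sketch: the proxy base URL, the API key, the x-api-key header, and the helper name build_repro_request are placeholders and assumptions, not values from the issue. The function only constructs the request, so sending it is left to the caller.

```python
import json
import urllib.request

# Assumed placeholders: adjust both to match your own proxy deployment.
PROXY_BASE_URL = "http://localhost:4000"
API_KEY = "sk-placeholder"


def build_repro_request(
    base_url: str = PROXY_BASE_URL, api_key: str = API_KEY
) -> urllib.request.Request:
    """Build (but do not send) the /v1/messages request from the issue."""
    payload = {
        "model": "us.anthropic.claude-sonnet-4-6",
        "messages": [{"role": "user", "content": "Hello"}],
        # Top-level cache_control, as shown in the issue's request body.
        "cache_control": {"type": "ephemeral", "ttl": "1h"},
        # The issue's excerpt omits other fields a real request would
        # carry (for example max_tokens); they are left out here too.
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/messages?beta=true",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


if __name__ == "__main__":
    req = build_repro_request()
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the returned object to urllib.request.urlopen would send it. Against a proxy showing the behavior described here, the expectation is the 400 response quoted in the Error section.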

LiteLLM version: >=1.83.7-stable (tested on latest as of April 2026)


Root Cause

map_openai_params() (updated in #22442) is only invoked for the OpenAI-compatible path (/v1/chat/completions).

The native /v1/messages endpoint uses a separate Pydantic request model which does not include cache_control as an allowed top-level field. As a result, the request fails validation before reaching the transformation or provider layers.
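
The mechanism described above can be illustrated with a small Pydantic v2 model. This is a sketch under the assumption that the request model forbids undeclared fields; StrictMessagesRequest and validation_messages are hypothetical stand-ins, not LiteLLM's actual classes or functions.

```python
from typing import Any, Dict, List

from pydantic import BaseModel, ConfigDict, ValidationError


class StrictMessagesRequest(BaseModel):
    """Hypothetical stand-in for the /v1/messages request model."""

    # extra="forbid" makes Pydantic reject any field not declared below.
    model_config = ConfigDict(extra="forbid")

    model: str
    messages: List[Dict[str, Any]]


def validation_messages(payload: Dict[str, Any]) -> List[str]:
    """Return one "<field>: <message>" string per validation error."""
    try:
        StrictMessagesRequest.model_validate(payload)
    except ValidationError as exc:
        return [
            f"{'.'.join(str(part) for part in err['loc'])}: {err['msg']}"
            for err in exc.errors()
        ]
    return []


payload = {
    "model": "us.anthropic.claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello"}],
    "cache_control": {"type": "ephemeral", "ttl": "1h"},
}
print(validation_messages(payload))
# → ['cache_control: Extra inputs are not permitted']
```

Because validation runs before any transformation code, a change to map_openai_params() alone would not be reached by this request.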


Expected Behavior

cache_control should be accepted in the /v1/messages request schema and forwarded unchanged to the Anthropic provider, consistent with Anthropic's automatic prompt caching API:

https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#automatic-caching


Workaround

Block-level caching via:

  • anthropic_cache_instructions
  • anthropic_cache_tool_definitions

works correctly, since these embed cache_control inside content blocks that are already supported by the /v1/messages schema.
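
On the wire, block-level caching means that cache_control sits inside a content block instead of at the top level of the request. The sketch below shows that shape by moving a top-level cache_control onto the last content block of the last message. The helper name is hypothetical, and this is only an illustration of the block-level form: it is not claimed to be equivalent to Anthropic's automatic caching, nor to be what pydantic-ai's settings emit.

```python
import copy
from typing import Any, Dict


def move_cache_control_into_last_block(request: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the request with top-level cache_control moved
    onto the last content block of the last message."""
    result = copy.deepcopy(request)
    cache_control = result.get("cache_control")
    messages = result.get("messages")
    if cache_control is None or not messages:
        return result  # nothing to move

    last_message = messages[-1]
    content = last_message.get("content")
    # A plain string is shorthand for a single text block.
    if isinstance(content, str):
        content = [{"type": "text", "text": content}]
    if not content:
        return result  # no block to attach the marker to

    content[-1]["cache_control"] = cache_control
    last_message["content"] = content
    del result["cache_control"]
    return result


original = {
    "model": "us.anthropic.claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello"}],
    "cache_control": {"type": "ephemeral", "ttl": "1h"},
}
print(move_cache_control_into_last_block(original))
```

The input dictionary is deep-copied, so the caller's request is left untouched.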


Suggested Fix

Add cache_control as an optional field to the /v1/messages request model (e.g., AnthropicMessagesRequest) and ensure it is forwarded to the Anthropic provider call.

Example:

cache_control: Optional[Dict[Literal["type", "ttl"], str]] = None

This should make behavior consistent across both /v1/chat/completions and /v1/messages.
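
A minimal sketch of the suggested change, assuming a strict Pydantic v2 request model: once the optional field is declared, the same payload validates and the value survives serialization unchanged. MessagesRequestWithCacheControl and to_provider_payload are hypothetical names used for illustration, not LiteLLM's actual model or forwarding code.

```python
from typing import Any, Dict, List, Literal, Optional

from pydantic import BaseModel, ConfigDict


class MessagesRequestWithCacheControl(BaseModel):
    """Hypothetical /v1/messages request model with the proposed field."""

    model_config = ConfigDict(extra="forbid")

    model: str
    messages: List[Dict[str, Any]]
    # Field declaration copied from the suggestion in this issue.
    cache_control: Optional[Dict[Literal["type", "ttl"], str]] = None


def to_provider_payload(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Validate the request and return the body to forward upstream."""
    request = MessagesRequestWithCacheControl.model_validate(raw)
    # exclude_none drops cache_control when the caller did not send it.
    return request.model_dump(exclude_none=True)


raw = {
    "model": "us.anthropic.claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello"}],
    "cache_control": {"type": "ephemeral", "ttl": "1h"},
}
print(to_provider_payload(raw)["cache_control"])
# → {'type': 'ephemeral', 'ttl': '1h'}
```

Using exclude_none keeps requests without cache_control byte-for-byte free of the new key, which limits the change to callers that opt in.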

Extended Analysis

TL;DR

Add cache_control as an optional field to the /v1/messages request model to resolve the Pydantic validation error.

Guidance

  • Identify the Pydantic request model used for the /v1/messages endpoint and add cache_control as an optional field.
  • Verify that the updated model correctly validates requests with cache_control and forwards it to the Anthropic provider.
  • Test the workaround using anthropic_cache_instructions or anthropic_cache_tool_definitions to ensure block-level caching works as expected.
  • Review the Anthropic API documentation to ensure the cache_control field is correctly formatted and supported.

Example

cache_control: Optional[Dict[Literal["type", "ttl"], str]] = None

This example illustrates how to add cache_control as an optional field to the request model.

Notes

The suggested fix assumes that the cache_control field is not already included in the /v1/messages request model. If it is, the issue may lie in the model's configuration or the validation process.

Recommendation

Apply the suggested fix by adding cache_control as an optional field to the /v1/messages request model, as this will make the behavior consistent across both /v1/chat/completions and /v1/messages endpoints.
