litellm - ✅(Solved) Fix [Feature]: DashScope provider should preserve cache_control for explicit prompt caching [1 pull requests, 1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
BerriAI/litellm#25330Fetched 2026-04-09 07:52:45
View on GitHub
Comments
0
Participants
1
Timeline
5
Reactions
0
Participants
Timeline (top)
referenced ×3cross-referenced ×1labeled ×1

Fix Action

Fixed

PR fix notes

PR #25331: feat(dashscope): preserve cache_control for explicit prompt caching

Description (problem / solution / changelog)

Relevant issues

Fixes #25330

Pre-Submission checklist

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

The DashScope provider inherits OpenAIGPTConfig, which strips all cache_control fields from messages and tools by default via remove_cache_control_flag_from_messages_and_tools(). This prevents users from using explicit prompt caching with DashScope-hosted models that support it.

This PR overrides remove_cache_control_flag_from_messages_and_tools() in DashScopeChatConfig to preserve cache_control fields, following the exact same pattern already used by:

  • ZAI (litellm/llms/zai/chat/transformation.py)
  • MiniMax (litellm/llms/minimax/chat/transformation.py)
  • Databricks (litellm/llms/databricks/chat/transformation.py)

This change is safe for models that don't use cache_control — if no cache_control field is present, the behavior is identical to before.

Files changed

  • litellm/llms/dashscope/chat/transformation.py — Added override method
  • tests/test_litellm/llms/dashscope/test_dashscope_chat_transformation.py — Added 2 tests for cache_control preservation in messages and tools

Verification

Beyond the unit tests, this change was validated with live 10-round multi-turn conversation tests against the DashScope API:

Explicit caching works correctly:

  • With cache_control markers on user messages, cached_tokens grows each round as conversation history accumulates, and cache_creation_input_tokens is reported on initial cache build.
  • Cache hits begin from the first round after prompt tokens exceed the 1024-token threshold.

Implicit caching is not affected:

  • Models relying on implicit prefix-matching caching produce identical cached_tokens values with and without this change, confirmed by running the same conversation against both the reverted codebase and the patched codebase.
  • Results were further cross-validated by comparing litellm output against direct API calls (bypassing litellm entirely via raw HTTP requests to the same /compatible-mode/v1/chat/completions endpoint). The cached_tokens matched on every round.

No regressions on non-caching models:

  • Models that do not support explicit caching were tested with cache_control present in the request. The DashScope API silently ignores the unrecognized field — no errors or behavioral changes observed.

Changed files

  • litellm/llms/dashscope/chat/transformation.py (modified, +14/-0)
  • tests/test_litellm/llms/dashscope/test_dashscope_chat_transformation.py (modified, +44/-0)
RAW_BUFFERClick to expand / collapse

The Feature

The DashScope provider currently inherits the default remove_cache_control_flag_from_messages_and_tools() from OpenAIGPTConfig, which strips all cache_control fields from messages and tools before sending requests to the API. This prevents users from leveraging explicit prompt caching with DashScope-hosted models that support it.

The fix is to override remove_cache_control_flag_from_messages_and_tools() in DashScopeChatConfig to preserve cache_control fields, following the same pattern already used by ZAI, MiniMax, and Databricks providers.

Motivation

DashScope hosts models from multiple providers, some of which support explicit prompt caching via cache_control fields (Anthropic-style). Currently, even if a user passes cache_control in their messages or tools, it gets silently stripped by the base class before the request reaches the API, making explicit prompt caching impossible on the DashScope provider.

Other OpenAI-compatible providers (ZAI, MiniMax, Databricks) have already solved this by overriding the method to preserve cache_control. DashScope should follow the same pattern for consistency.

extent analysis

TL;DR

Override the remove_cache_control_flag_from_messages_and_tools() method in DashScopeChatConfig to preserve cache_control fields.

Guidance

  • Identify the remove_cache_control_flag_from_messages_and_tools() method in the OpenAIGPTConfig class and understand its current behavior.
  • Override this method in DashScopeChatConfig to allow cache_control fields to be sent with requests to the API.
  • Review the implementations of remove_cache_control_flag_from_messages_and_tools() in ZAI, MiniMax, and Databricks providers for reference.
  • Test the updated DashScopeChatConfig to ensure that cache_control fields are no longer stripped from messages and tools.

Example

class DashScopeChatConfig(OpenAIGPTConfig):
    def remove_cache_control_flag_from_messages_and_tools(self):
        # Custom implementation to preserve cache_control fields
        pass

Notes

This solution assumes that the remove_cache_control_flag_from_messages_and_tools() method is the only obstacle to using explicit prompt caching with DashScope-hosted models.

Recommendation

Apply workaround: Override the remove_cache_control_flag_from_messages_and_tools() method in DashScopeChatConfig to preserve cache_control fields, as this will allow users to leverage explicit prompt caching with DashScope-hosted models that support it.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING