litellm - 💡(How to fix) Fix Anthropic reasoning_effort silently dropped when passed as Reasoning(effort, summary) dict (regression in v1.85.0) [2 pull requests]

Fix Action

Fixed


Bug

When calling litellm.aresponses(reasoning=Reasoning(effort="low", summary="concise"), model="anthropic/..."), the thinking parameter is silently dropped before the request reaches Anthropic. The model produces 0 reasoning_tokens and no thinking_blocks, even though a reasoning effort was requested.

Introduced by #25359 — that PR added an if "summary" in reasoning_param branch to the Responses→Chat parser, but the downstream Anthropic transformation still guards on isinstance(value, str).

Affected versions

v1.85.0+ (verified). v1.83.0 works correctly (the dict-keeping branch didn't exist yet).

Root cause

Step 1 — Responses→Chat parser keeps the full dict when summary is set (litellm/responses/litellm_completion_transformation/transformation.py:184-200):

reasoning_param = responses_api_request.get("reasoning")
if reasoning_param:
    if isinstance(reasoning_param, dict):
        if "summary" in reasoning_param:
            reasoning_effort = reasoning_param        # ← keeps WHOLE DICT
        elif "effort" in reasoning_param:
            reasoning_effort = reasoning_param.get("effort")  # ← extracts string

Step 2 — Anthropic transformation's guard fails for dict (litellm/llms/anthropic/chat/transformation.py:1509):

elif param == "reasoning_effort" and isinstance(value, str):  # ← isinstance fails
    mapped_thinking = AnthropicConfig._map_reasoning_effort(...)

As a result, the thinking parameter is never set on the Anthropic request, so the model receives no thinking budget and emits 0 reasoning_tokens.
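The interaction of the two steps can be reproduced without LiteLLM installed. The sketch below uses two hypothetical stand-in functions, parse_reasoning and map_anthropic_params, which only mirror the branches quoted above. They are not the library's actual functions, and the "thinking" payload is a placeholder rather than a real Anthropic budget.

```python
from typing import Any, Dict, Optional, Union


def parse_reasoning(request: Dict[str, Any]) -> Optional[Union[str, Dict[str, Any]]]:
    """Stand-in for the Step 1 branch (hypothetical name)."""
    reasoning_param = request.get("reasoning")
    reasoning_effort = None
    if reasoning_param and isinstance(reasoning_param, dict):
        if "summary" in reasoning_param:
            reasoning_effort = reasoning_param  # whole dict is kept
        elif "effort" in reasoning_param:
            reasoning_effort = reasoning_param.get("effort")  # string is extracted
    return reasoning_effort


def map_anthropic_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for the Step 2 guard (hypothetical name): only strings are mapped."""
    mapped: Dict[str, Any] = {}
    for param, value in params.items():
        if param == "reasoning_effort" and isinstance(value, str):
            # Placeholder payload, not the real Anthropic thinking config.
            mapped["thinking"] = {"requested_effort": value}
    return mapped


effort_only = map_anthropic_params(
    {"reasoning_effort": parse_reasoning({"reasoning": {"effort": "low"}})}
)
with_summary = map_anthropic_params(
    {"reasoning_effort": parse_reasoning({"reasoning": {"effort": "low", "summary": "concise"}})}
)

print(effort_only)   # {'thinking': {'requested_effort': 'low'}}
print(with_summary)  # {}
```

The second result is empty because the dict never satisfies the isinstance(value, str) check, and no other branch picks it up.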

Why GPT-5 isn't affected

OpenAI's _normalize_reasoning_effort_for_chat_completion accepts both string and dict:

if isinstance(value, str):
    return value
if isinstance(value, dict) and "effort" in value:
    return value["effort"]

The Anthropic side never got the same shape-tolerance update.
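For illustration, a shape-tolerant normalizer along the lines of the snippet above might look like this. The function name normalize_reasoning_effort is hypothetical, and returning None for unrecognised input is an assumption of this sketch, not documented library behaviour.

```python
from typing import Any, Optional


def normalize_reasoning_effort(value: Any) -> Optional[str]:
    """Return the effort string from either a plain string or a dict with "effort".

    Hypothetical helper; anything else yields None so the caller can skip it.
    """
    if isinstance(value, str):
        return value
    if isinstance(value, dict):
        effort = value.get("effort")
        if isinstance(effort, str):
            return effort
    return None


print(normalize_reasoning_effort("low"))                                    # low
print(normalize_reasoning_effort({"effort": "low", "summary": "concise"}))  # low
print(normalize_reasoning_effort({"summary": "concise"}))                   # None
```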

Minimal repro

import asyncio
import litellm

async def main():
    # Both calls request reasoning_effort="low" against the same model.
    # Only the first produces reasoning_tokens.

    r1 = await litellm.aresponses(
        model="anthropic/claude-sonnet-4-6",
        input=[{"role": "user", "content": "Think carefully: what is 17 * 23?"}],
        reasoning_effort="low",                          # string ← works
    )
    print("string form:", r1.usage.output_tokens_details)
    # → reasoning_tokens > 0, "reasoning" item in r1.output

    r2 = await litellm.aresponses(
        model="anthropic/claude-sonnet-4-6",
        input=[{"role": "user", "content": "Think carefully: what is 17 * 23?"}],
        reasoning=litellm.Reasoning(effort="low", summary="concise"),  # dict ← broken
    )
    print("Reasoning(summary) form:", r2.usage.output_tokens_details)
    # → reasoning_tokens == 0, no "reasoning" item in r2.output

asyncio.run(main())

Output on v1.85.0:

string form:                 OutputTokensDetails(reasoning_tokens=187, ...)
Reasoning(summary) form:     OutputTokensDetails(reasoning_tokens=0, ...)
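To catch this symptom in logs or tests, a small check on the usage object can help. The sketch below assumes only the attribute path used in the repro (usage.output_tokens_details.reasoning_tokens). The helper name reasoning_tokens_of is hypothetical, and the SimpleNamespace objects are fakes standing in for real responses.

```python
from types import SimpleNamespace
from typing import Any


def reasoning_tokens_of(response: Any) -> int:
    """Read usage.output_tokens_details.reasoning_tokens, defaulting to 0 if absent."""
    usage = getattr(response, "usage", None)
    details = getattr(usage, "output_tokens_details", None)
    return getattr(details, "reasoning_tokens", 0) or 0


def fake_response(tokens: int) -> SimpleNamespace:
    """Build a fake response exposing only the fields this check reads."""
    details = SimpleNamespace(reasoning_tokens=tokens)
    return SimpleNamespace(usage=SimpleNamespace(output_tokens_details=details))


string_form = fake_response(187)  # value taken from the v1.85.0 output above
summary_form = fake_response(0)

for label, resp in [("string form", string_form), ("Reasoning(summary) form", summary_form)]:
    status = "ok" if reasoning_tokens_of(resp) > 0 else "thinking appears to have been dropped"
    print(f"{label}: {status}")
```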

Suggested fix

In litellm/llms/anthropic/chat/transformation.py, make Anthropic's reasoning_effort handler tolerant of dict input — same shape-permissiveness as OpenAI:

-            elif param == "reasoning_effort" and isinstance(value, str):
+            elif param == "reasoning_effort":
+                effort_value = value
+                if isinstance(effort_value, dict):
+                    effort_value = effort_value.get("effort")
+                if not isinstance(effort_value, str):
+                    continue
                 mapped_thinking = AnthropicConfig._map_reasoning_effort(
-                    reasoning_effort=value,
+                    reasoning_effort=effort_value,
                     model=model,
                     llm_provider=self.custom_llm_provider or "anthropic",
                 )
                 ...
                     if AnthropicConfig._is_adaptive_thinking_model(model):
                         mapped_effort = REASONING_EFFORT_TO_OUTPUT_CONFIG_EFFORT.get(
-                            value
+                            effort_value
                         )

This is the same shape coercion that gpt_5_transformation already does. No downstream changes are needed; summary is irrelevant for Anthropic's thinking_blocks anyway.

Impact

Anyone calling litellm.aresponses against Claude with the OpenAI-shaped Reasoning(effort, summary) object (the standard shape for OpenAI's Responses API) silently loses thinking on Claude after upgrading from 1.83.x → 1.85.0. Trace-level symptom: output_tokens_details.reasoning_tokens drops to 0 even though reasoning_effort was set.
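Until a patched version is installed, one possible caller-side workaround follows from the Step 1 branch logic: if the reasoning dict has no "summary" key, the parser takes the elif branch and extracts the effort string. The helper below, effort_only_reasoning, is a hypothetical name. The approach is inferred from the snippets in this report and has not been verified against a LiteLLM release.

```python
from typing import Any, Dict, Mapping, Optional


def effort_only_reasoning(reasoning: Optional[Mapping[str, Any]]) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: keep only the "effort" key of a reasoning mapping.

    Dropping "summary" should avoid the dict-keeping branch described in Step 1.
    """
    if not reasoning:
        return None
    effort = reasoning.get("effort")
    if not isinstance(effort, str):
        return None
    return {"effort": effort}


workaround = effort_only_reasoning({"effort": "low", "summary": "concise"})
print(workaround)  # {'effort': 'low'}
```

The result would be passed as reasoning=workaround to litellm.aresponses. The summary setting is lost, which matters only for providers that honour it.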

Other transformers may need the same audit (Bedrock, Vertex, Databricks — they share AnthropicConfig._is_adaptive_thinking_model but I didn't verify their reasoning_effort handlers).
