litellm - 💡(How to fix) Fix Anthropic → OpenAI conversion drops reasoning_content, breaks multi-turn with reasoning models

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

When LiteLLM converts Anthropic /v1/messages assistant responses (with thinking blocks) to OpenAI Chat Completions format, the thinking blocks are stored in a custom thinking_blocks field. The standard reasoning_content field is not set.

This causes multi-turn requests to reasoning models to fail with:

The `reasoning_content` in the thinking mode must be passed back to the API.

Root Cause

In litellm/llms/anthropic/experimental_pass_through/adapters/transformation.py:

if len(thinking_blocks) > 0:
    assistant_message["thinking_blocks"] = thinking_blocks

The upstream API (DeepSeek, OpenAI o-series) expects reasoning_content at the top level of the assistant message dict in multi-turn conversations.

Fix Action

Fix

if len(thinking_blocks) > 0:
    assistant_message["thinking_blocks"] = thinking_blocks
    first_thinking = thinking_blocks[0]
    assistant_message["reasoning_content"] = first_thinking.get("thinking", "")

Code Example

The `reasoning_content` in the thinking mode must be passed back to the API.

---

if len(thinking_blocks) > 0:
    assistant_message["thinking_blocks"] = thinking_blocks

---

if len(thinking_blocks) > 0:
    assistant_message["thinking_blocks"] = thinking_blocks
    first_thinking = thinking_blocks[0]
    assistant_message["reasoning_content"] = first_thinking.get("thinking", "")
RAW_BUFFERClick to expand / collapse

Description

When LiteLLM converts Anthropic /v1/messages assistant responses (with thinking blocks) to OpenAI Chat Completions format, the thinking blocks are stored in a custom thinking_blocks field. The standard reasoning_content field is not set.

This causes multi-turn requests to reasoning models to fail with:

The `reasoning_content` in the thinking mode must be passed back to the API.

Root cause

In litellm/llms/anthropic/experimental_pass_through/adapters/transformation.py:

if len(thinking_blocks) > 0:
    assistant_message["thinking_blocks"] = thinking_blocks

The upstream API (DeepSeek, OpenAI o-series) expects reasoning_content at the top level of the assistant message dict in multi-turn conversations.

Fix

if len(thinking_blocks) > 0:
    assistant_message["thinking_blocks"] = thinking_blocks
    first_thinking = thinking_blocks[0]
    assistant_message["reasoning_content"] = first_thinking.get("thinking", "")

Steps to reproduce

  1. LiteLLM proxy with Anthropic /v1/messages + LITELLM_USE_CHAT_COMPLETIONS_URL_FOR_ANTHROPIC_MESSAGES=true
  2. Model: deepseek/* pointing to a reasoning endpoint via api_base
  3. Client sends a multi-turn conversation
  4. First turn succeeds, second turn fails

Environment

  • LiteLLM: main (commit 867470f)
  • Model: DeepSeek reasoning model via OpenAI-compatible proxy

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

litellm - 💡(How to fix) Fix Anthropic → OpenAI conversion drops reasoning_content, breaks multi-turn with reasoning models