litellm - ✅ (Solved) Fix [Bug]: Anthropic-to-OpenAI adapter does not round-trip thinking blocks in multi-turn conversations [1 pull request, 1 participant]

litellm • 2026-04-02 10:32:18

GitHub stats

BerriAI/litellm#24985 • Fetched 2026-04-08 02:35:14

Author

yulangz

Participants

yulangz

Timeline (top)

labeled ×4, subscribed ×2, cross-referenced ×1, referenced ×1

Fix Action

Fix proposed

Proposed fix in PR (open, not merged at fetch time): fix(anthropic): round-trip thinking blocks in multi-turn conversations (https://github.com/BerriAI/litellm/pull/25013)

PR fix notes

PR #25013: fix(anthropic): round-trip thinking blocks in multi-turn conversations

Repository: BerriAI/litellm
Author: voidborne-d
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/25013

Description (problem / solution / changelog)

Summary

Fixes #24985. Two bugs in the Anthropic pass-through adapter cause thinking blocks to be lost or misformatted in multi-turn conversations.

Bug 1: Responses API path — thinking becomes `output_text`

translate_messages_to_responses_input() converts Anthropic thinking blocks into {"type": "output_text"} inside the assistant message. This means the downstream provider sees two output_text blocks and loses the thinking/response boundary entirely.

Fix: Convert thinking blocks to top-level {"type": "reasoning"} items with signature as id and thinking text as summary_text. This matches the Responses API spec where reasoning is a separate item, not embedded in the message.
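As a minimal, self-contained sketch of this conversion (not the actual LiteLLM code; the helper name `assistant_blocks_to_responses_input` is hypothetical, and the block shapes are assumed from the examples in the issue), thinking blocks can be emitted as top-level reasoning items ahead of the assistant message:

```python
from typing import Any, Dict, List


def assistant_blocks_to_responses_input(
    blocks: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Hypothetical helper: map one Anthropic assistant turn to Responses API input items."""
    input_items: List[Dict[str, Any]] = []
    asst_parts: List[Dict[str, Any]] = []
    for block in blocks:
        btype = block.get("type")
        if btype == "text":
            asst_parts.append({"type": "output_text", "text": block.get("text", "")})
        elif btype == "thinking":
            thinking_text = block.get("thinking", "")
            if thinking_text:  # blocks with empty thinking text are skipped
                input_items.append({
                    "type": "reasoning",
                    "id": block.get("signature") or "",
                    "summary": [{"type": "summary_text", "text": thinking_text}],
                })
    # The assistant message is appended last, so reasoning items precede it.
    if asst_parts:
        input_items.append(
            {"type": "message", "role": "assistant", "content": asst_parts}
        )
    return input_items


items = assistant_blocks_to_responses_input([
    {"type": "thinking", "thinking": "my thinking...", "signature": "sig-1"},
    {"type": "text", "text": "my response"},
])
```

Here `items` holds a reasoning item followed by the assistant message, so the thinking/response boundary is preserved rather than flattened into two output_text parts.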

Bug 2: Chat Completions path — missing `reasoning_content`

translate_anthropic_messages_to_chat_completion() sets thinking_blocks but not reasoning_content. Backends that rely on reasoning_content (e.g. Kimi) reject the second turn with: "thinking is enabled but reasoning_content is missing".

Fix: After setting thinking_blocks, also compute and set reasoning_content by joining all thinking block texts.
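The same idea can be sketched as a standalone function. This is an illustration under stated assumptions, not the adapter's real code: `attach_thinking` is a hypothetical name, and the `redacted_thinking` block in the sample data is included only to show why the type filter matters.

```python
from typing import Any, Dict, List


def attach_thinking(
    assistant_message: Dict[str, Any],
    thinking_blocks: List[Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical helper: set thinking_blocks and a flattened reasoning_content."""
    if thinking_blocks:
        assistant_message["thinking_blocks"] = thinking_blocks
        # Only plain "thinking" blocks carry readable text; other block types
        # (for example redacted ones) are left out of reasoning_content.
        reasoning_text = "".join(
            block.get("thinking", "")
            for block in thinking_blocks
            if block.get("type") == "thinking"
        )
        if reasoning_text:
            assistant_message["reasoning_content"] = reasoning_text
    return assistant_message


msg = attach_thinking(
    {"role": "assistant", "content": "my response"},
    [
        {"type": "thinking", "thinking": "step one. ", "signature": "sig-1"},
        {"type": "redacted_thinking", "data": "opaque"},
        {"type": "thinking", "thinking": "step two."},
    ],
)
```

With this input, `msg` keeps all three entries under `thinking_blocks`, while `reasoning_content` contains only the concatenated text of the two plain thinking blocks.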

Changes

responses_adapters/transformation.py: thinking blocks → top-level reasoning items (not output_text)
adapters/transformation.py: also set reasoning_content when thinking_blocks are present
5 new test cases covering both paths

Testing

# Responses API path
def test_thinking_becomes_reasoning_item()
def test_thinking_not_inside_assistant_message()
def test_empty_thinking_text_skipped()

# Chat Completions path
def test_reasoning_content_set_when_thinking_blocks_present()
def test_no_reasoning_content_without_thinking()

Changed files

litellm/llms/anthropic/experimental_pass_through/adapters/transformation.py (modified, +10/-0)
litellm/llms/anthropic/experimental_pass_through/responses_adapters/transformation.py (modified, +16/-2)
tests/test_litellm/llms/anthropic/experimental_pass_through/adapters/test_anthropic_experimental_pass_through_adapters_transformation.py (modified, +40/-0)
tests/test_litellm/llms/anthropic/experimental_pass_through/responses_adapters/test_responses_adapters_transformation.py (modified, +66/-0)

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When using LiteLLM to proxy Anthropic /v1/messages to OpenAI-compatible backends, thinking content from previous turns is not correctly converted back on subsequent requests. This affects both the Responses API path and the Chat Completions path.

Issue 1: Responses API path — thinking becomes `output_text`

In translate_messages_to_responses_input (responses_adapters/transformation.py ~line 168), Anthropic thinking blocks in assistant message history are converted to {"type": "output_text"} instead of a top-level {"type": "reasoning"} item.

What happens:

{"type": "message", "role": "assistant", "content": [
  {"type": "output_text", "text": "my thinking..."},
  {"type": "output_text", "text": "my response"}
]}

What should happen:

{"type": "reasoning", "id": "", "summary": [{"type": "summary_text", "text": "my thinking..."}]},
{"type": "message", "role": "assistant", "content": [{"type": "output_text", "text": "my response"}]}

Fix — replace asst_parts.append(output_text) with input_items.append(reasoning item):

elif btype == "thinking":
    thinking_text = block.get("thinking", "")
    if thinking_text:
        input_items.append({
            "type": "reasoning",
            "id": block.get("signature") or "",
            "summary": [{"type": "summary_text", "text": thinking_text}],
        })

Issue 2: Chat Completions path — missing `reasoning_content`

In translate_messages_to_chat_completion_messages (adapters/transformation.py ~line 658), assistant messages get thinking_blocks but not reasoning_content. Backends that use reasoning_content (e.g. Kimi) reject the second turn with: "thinking is enabled but reasoning_content is missing".

Fix — after setting thinking_blocks, also set reasoning_content:

if len(thinking_blocks) > 0:
    assistant_message["thinking_blocks"] = thinking_blocks
    reasoning_text = "".join(
        block.get("thinking", "") for block in thinking_blocks
        if block.get("type") == "thinking"
    )
    if reasoning_text:
        assistant_message["reasoning_content"] = reasoning_text

Environment

litellm 1.82.1 ~ 1.83.0, both affected
Tested with Claude Code → LiteLLM → qwen3.6-plus (Responses API) and kimi-k2.5 (Chat Completions)

Steps to Reproduce

as above

Relevant log output

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

v1.82.1 ~ v1.83.0

Twitter / LinkedIn details

No response

Extent analysis

TL;DR

Thinking content from previous turns is lost or misformatted when LiteLLM translates Anthropic messages for OpenAI-compatible backends. To fix it, make translate_messages_to_responses_input emit thinking blocks as top-level reasoning items, and make translate_messages_to_chat_completion_messages set reasoning_content alongside thinking_blocks.

Guidance

In responses_adapters/transformation.py, replace asst_parts.append(output_text) with input_items.append(reasoning item) to correctly convert thinking blocks to reasoning items.
In adapters/transformation.py, after setting thinking_blocks, also set reasoning_content to include the thinking text.
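Verify the fix by inspecting the translated request: on the Responses API path, thinking should appear as a top-level reasoning item rather than an output_text part; on the Chat Completions path, assistant messages with thinking_blocks should also carry reasoning_content.
Test the fix against different backends, such as qwen3.6-plus (Responses API) and kimi-k2.5 (Chat Completions), using a client such as Claude Code, to ensure compatibility.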
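One way to make the verification step concrete is a pair of small checks over the translated payloads. This is a rough sketch only: the function names `check_responses_input` and `check_chat_messages` are hypothetical, and the payload shapes are assumed from the examples in the issue rather than taken from LiteLLM itself.

```python
from typing import Any, Dict, List, Set


def check_responses_input(
    items: List[Dict[str, Any]], thinking_texts: Set[str]
) -> List[str]:
    """Flag thinking text that leaked into an assistant message as output_text."""
    problems: List[str] = []
    for index, item in enumerate(items):
        if item.get("type") != "message" or item.get("role") != "assistant":
            continue
        content = item.get("content")
        if not isinstance(content, list):
            continue
        for part in content:
            if part.get("type") == "output_text" and part.get("text") in thinking_texts:
                problems.append(f"item {index}: thinking text sent as output_text")
    return problems


def check_chat_messages(messages: List[Dict[str, Any]]) -> List[str]:
    """Flag assistant messages with readable thinking but no reasoning_content."""
    problems: List[str] = []
    for index, message in enumerate(messages):
        if message.get("role") != "assistant":
            continue
        has_thinking_text = any(
            block.get("type") == "thinking" and block.get("thinking")
            for block in message.get("thinking_blocks") or []
        )
        if has_thinking_text and not message.get("reasoning_content"):
            problems.append(f"message {index}: reasoning_content is missing")
    return problems


# Payloads mirroring the "what happens" and "what should happen" shapes.
buggy_input = [{"type": "message", "role": "assistant", "content": [
    {"type": "output_text", "text": "my thinking..."},
    {"type": "output_text", "text": "my response"},
]}]
fixed_input = [
    {"type": "reasoning", "id": "",
     "summary": [{"type": "summary_text", "text": "my thinking..."}]},
    {"type": "message", "role": "assistant",
     "content": [{"type": "output_text", "text": "my response"}]},
]
```

Running `check_responses_input` on `buggy_input` with `{"my thinking..."}` reports one problem, while `fixed_input` reports none. Checks like these could be applied to request bodies captured from the proxy before they are sent to the backend.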

Example

elif btype == "thinking":
    thinking_text = block.get("thinking", "")
    if thinking_text:
        input_items.append({
            "type": "reasoning",
            "id": block.get("signature") or "",
            "summary": [{"type": "summary_text", "text": thinking_text}],
        })

Notes

The provided fixes assume that the issue is specific to the translate_messages_to_responses_input and translate_messages_to_chat_completion_messages functions. If the issue persists after applying these fixes, further investigation may be necessary.

Recommendation

Apply the provided fixes to the translate_messages_to_responses_input and translate_messages_to_chat_completion_messages functions, as they directly address how thinking blocks are handled on each path.
