litellm: [Bug]: DeepSeek V4 Pro (deepseek-v4-pro) fails in multi-turn conversations - reasoning_content stripped from assistant messages

BerriAI/litellm#26395 (fetched 2026-04-24 10:36:31)

What Happened?

When using deepseek/deepseek-v4-pro (or the Anthropic-compatible endpoint anthropic/deepseek-v4-pro with api_base: https://api.deepseek.com/anthropic) through LiteLLM proxy, the first turn succeeds but every subsequent turn fails with:

litellm.BadRequestError: DeepseekException - {"error":{"message":"The `reasoning_content` in the thinking mode must be passed back to the API.","type":"invalid_request_error","param":null,"code":"invalid_request_error"}}

Expected behavior: Multi-turn conversations should work correctly with DeepSeek V4 Pro.

Root Cause Analysis

DeepSeek V4 Pro enables thinking mode by default and has a new API requirement compared to older DeepSeek reasoning models:

  • Old behavior (deepseek-reasoner / R1): reasoning_content must NOT be passed back in subsequent turns — passing it causes a 400 error.
  • New behavior (deepseek-v4-pro): reasoning_content MUST be passed back in subsequent turns — omitting it causes a 400 error.

LiteLLM currently handles the old R1 behavior correctly: in litellm/types/utils.py (around line 1224), the Message.__init__ method deletes reasoning_content when it is None during serialization. This means reasoning_content from the first-turn response is stripped before being sent as the assistant message in the second turn.

This worked for R1 but breaks V4 Pro, which requires the field to be preserved.
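
The two rules above can be written down as a small toy check. This is only a sketch built from the behavior described in this issue, not DeepSeek's actual validation logic; the `REASONING_POLICY` table and `check_history` function are hypothetical names introduced for illustration.

```python
from typing import Optional

# Hypothetical policy table built only from the two rules stated above.
REASONING_POLICY = {
    "deepseek-reasoner": "forbid",  # old behavior: must NOT be passed back
    "deepseek-v4-pro": "require",   # new behavior: MUST be passed back
}


def check_history(model: str, messages: list[dict]) -> Optional[str]:
    """Return a description of the first rule violation, or None if there is none."""
    policy = REASONING_POLICY.get(model)
    for index, message in enumerate(messages):
        if message.get("role") != "assistant":
            continue
        has_reasoning = message.get("reasoning_content") is not None
        if policy == "forbid" and has_reasoning:
            return f"message {index}: reasoning_content must not be passed back"
        if policy == "require" and not has_reasoning:
            return f"message {index}: reasoning_content must be passed back"
    return None


# The same stripped history is fine for R1 but rejected for V4 Pro.
stripped_history = [
    {"role": "user", "content": "What is 1+1?"},
    {"role": "assistant", "content": "2"},
    {"role": "user", "content": "And 2+2?"},
]
print(check_history("deepseek-reasoner", stripped_history))
print(check_history("deepseek-v4-pro", stripped_history))
```

The point of the sketch is that no single stripping rule satisfies both models, so the decision has to depend on the target model.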

Steps to Reproduce

config.yaml:

model_list:
  - model_name: deepseek-v4-pro
    litellm_params:
      model: deepseek/deepseek-v4-pro
      api_key: os.environ/DEEPSEEK_API_KEY

curl — first turn (succeeds):

curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer $LITELLM_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "messages": [{"role": "user", "content": "What is 1+1?"}]
  }'

curl — second turn (fails with 400):

curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer $LITELLM_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "messages": [
      {"role": "user", "content": "What is 1+1?"},
      {"role": "assistant", "content": "2"},
      {"role": "user", "content": "And 2+2?"}
    ]
  }'

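The second turn fails because the assistant message that reaches DeepSeek has no reasoning_content. The request above carries only role and content for the assistant turn, since LiteLLM strips the field when serializing the response from turn 1.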
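
For comparison, here is a minimal sketch of a second-turn request body that keeps reasoning_content on the assistant turn. The `build_second_turn` helper and the sample reasoning text are made up for illustration. The sketch assumes the first response exposes the field on its message and that nothing between client and DeepSeek removes it again, which is exactly what this issue calls into question for the LiteLLM proxy.

```python
import json


def build_second_turn(first_question: str, assistant_message: dict, next_question: str) -> dict:
    """Build a follow-up request body that keeps reasoning_content on the assistant turn."""
    assistant_turn = {
        "role": "assistant",
        "content": assistant_message.get("content"),
    }
    # Carry the reasoning over only when the first response actually contained it.
    if assistant_message.get("reasoning_content") is not None:
        assistant_turn["reasoning_content"] = assistant_message["reasoning_content"]
    return {
        "model": "deepseek-v4-pro",
        "messages": [
            {"role": "user", "content": first_question},
            assistant_turn,
            {"role": "user", "content": next_question},
        ],
    }


# Made-up first-turn message for illustration; real reasoning text will differ.
first_reply = {"role": "assistant", "content": "2", "reasoning_content": "1 + 1 equals 2."}
body = build_second_turn("What is 1+1?", first_reply, "And 2+2?")
print(json.dumps(body, indent=2))
```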

Relevant Log Output

litellm.BadRequestError: DeepseekException - {"error":{"message":"The `reasoning_content` in the thinking mode must be passed back to the API.","type":"invalid_request_error","param":null,"code":"invalid_request_error"}}. Received Model Group=deepseek-v4-pro

Suggested Fix

For deepseek-v4-pro (and potentially any DeepSeek model with thinking mode enabled), LiteLLM should:

  1. Preserve reasoning_content in the assistant message when building the conversation history for subsequent requests, OR
  2. Detect that the model requires thinking mode and automatically inject reasoning_content back into the messages array before forwarding to DeepSeek.
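
Option 2 could look roughly like the sketch below: remember the reasoning returned with each assistant reply and re-attach it when a later request arrives without it. `ReasoningCache` is a hypothetical class, not part of LiteLLM, and keying on the assistant content alone is a simplifying assumption made for brevity.

```python
import hashlib
from typing import Optional


class ReasoningCache:
    """Hypothetical in-memory store mapping assistant content to its reasoning_content."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    @staticmethod
    def _key(content: str) -> str:
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def remember(self, content: str, reasoning_content: Optional[str]) -> None:
        """Record the reasoning that came back with an assistant reply, if any."""
        if reasoning_content is not None:
            self._store[self._key(content)] = reasoning_content

    def inject(self, messages: list[dict]) -> list[dict]:
        """Return a copy of messages with cached reasoning re-attached where missing."""
        patched = []
        for message in messages:
            message = dict(message)  # copy so the caller's list is not mutated
            if message.get("role") == "assistant" and "reasoning_content" not in message:
                cached = self._store.get(self._key(message.get("content") or ""))
                if cached is not None:
                    message["reasoning_content"] = cached
            patched.append(message)
        return patched


cache = ReasoningCache()
cache.remember("2", "1 + 1 equals 2.")  # made-up reasoning text
history = [
    {"role": "user", "content": "What is 1+1?"},
    {"role": "assistant", "content": "2"},
    {"role": "user", "content": "And 2+2?"},
]
patched_history = cache.inject(history)
print(patched_history[1])
```

Two different replies with identical content would collide under this key, so a real implementation would likely need a conversation-scoped or response-id key, plus eviction. Option 1 avoids that state entirely, at the cost of relying on clients to send the field back.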

Component

  • Proxy
  • SDK (litellm Python package)

Version

  • LiteLLM: v1.82.3 (v1.83.12-nightly also appears affected, based on source code review rather than a live reproduction)
  • DeepSeek V4 Pro API (released 2026-04-24)

Extent Analysis

TL;DR

To fix the issue, modify LiteLLM to preserve reasoning_content in the assistant message for subsequent requests when using DeepSeek V4 Pro.

Guidance

  • Identify the Message.__init__ method in litellm/types/utils.py and modify it to conditionally preserve reasoning_content based on the model type.
  • Consider adding a model-specific configuration option to control the preservation of reasoning_content.
  • Verify the fix by testing multi-turn conversations with DeepSeek V4 Pro and checking that the reasoning_content is correctly passed back to the API.
  • Review the LiteLLM documentation to ensure that the new behavior is properly documented and understood by users.
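
The model-specific option mentioned above might be sketched as follows. `ReasoningOptions`, `MODEL_OPTIONS`, and `prepare_messages` are hypothetical names, not existing LiteLLM settings, and defaulting unknown models to the stripping behavior is an assumption chosen to leave current behavior unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReasoningOptions:
    # Hypothetical per-model flag; not an existing LiteLLM setting.
    preserve_reasoning_content: bool = False


# Hypothetical per-model settings keyed by model name.
MODEL_OPTIONS = {
    "deepseek-v4-pro": ReasoningOptions(preserve_reasoning_content=True),
    "deepseek-reasoner": ReasoningOptions(preserve_reasoning_content=False),
}


def prepare_messages(model: str, messages: list[dict]) -> list[dict]:
    """Copy messages, dropping reasoning_content unless the model opts in to keeping it."""
    options = MODEL_OPTIONS.get(model, ReasoningOptions())
    prepared = []
    for message in messages:
        message = dict(message)  # copy so the caller's history is untouched
        if not options.preserve_reasoning_content:
            message.pop("reasoning_content", None)
        prepared.append(message)
    return prepared


history_with_reasoning = [
    {"role": "user", "content": "What is 1+1?"},
    {"role": "assistant", "content": "2", "reasoning_content": "1 + 1 equals 2."},
    {"role": "user", "content": "And 2+2?"},
]
print(prepare_messages("deepseek-v4-pro", history_with_reasoning)[1])
print(prepare_messages("deepseek-reasoner", history_with_reasoning)[1])
```

A check like the two print calls above, turned into assertions, is one way to cover the verification step in the guidance without calling the DeepSeek API.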

Example

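# Hypothetical sketch for Message.__init__ in litellm/types/utils.py.
# Assumption: model_name is available here (see Notes); the issue does not confirm this.
if model_name == "deepseek-v4-pro":
    # Preserve reasoning_content so it can be passed back on the next turn
    self.reasoning_content = reasoning_content
elif reasoning_content is None:
    # Existing behavior as described in the issue: drop the field only when it is None,
    # so other models that return reasoning_content still expose it
    del self.reasoning_content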

Notes

The fix assumes that the model_name is available in the Message.__init__ method. If not, additional modifications may be necessary to determine the model type.

Recommendation

Apply the workaround by modifying the LiteLLM code to preserve reasoning_content for DeepSeek V4 Pro, as this is a specific requirement for this model and does not affect other models.
