litellm - ✅(Solved) Fix [Bug]: reasoning_content not parsed from raw <think> tags for nvidia_nim/minimax-m2.5 via LiteLLM Proxy [2 pull requests, 1 participant]

litellm · 2026-03-20 22:10:01


BerriAI/litellm#24253 • Fetched 2026-04-08 01:09:00

Author

vcliment89

Participants

vcliment89

Timeline (top)

labeled ×3, cross-referenced ×2, referenced ×2

Fix Action

Fixed

Fixed by PR: fix(nvidia_nim): extract <think> reasoning blocks from content in transform_response (https://github.com/BerriAI/litellm/pull/24276)
Fixed by PR: fix: [Bug]: reasoning_content not parsed from raw <think> tags for nvidia_nim/minimax-m2.5 via LiteLLM Proxy (https://github.com/BerriAI/litellm/pull/24280)

PR fix notes

PR #24276: fix(nvidia_nim): extract <think> reasoning blocks from content in transform_response

Repository: BerriAI/litellm
Author: NIK-TIGER-BILL
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/24276

Description (problem / solution / changelog)

Summary

Fixes #24253

Problem

NVIDIA NIM reasoning models (e.g. minimax/minimax-m1) return their chain-of-thought inside a raw <think>…</think> block in choices[0].message.content. The expected behaviour is for litellm to split this into reasoning_content + content, as it does for DeepSeek R1 and other reasoning providers. Instead, reasoning_content is null and the raw tags appear in content.

Root cause

The litellm response pipeline already calls _parse_content_for_reasoning() via _extract_reasoning_content() in convert_to_model_response_object(). NvidiaNimConfig inherits OpenAIGPTConfig.transform_response(), which routes through convert_to_model_response_object(), but that function only invokes _extract_reasoning_content() when the choice message dictionary has a string content. In the proxy path with a thinking: true / merge_reasoning_content_in_choices: false model config, a subtle ordering issue in how the proxy wraps the response means the extraction step can be skipped.

Fix

Override transform_response() in NvidiaNimConfig to add an explicit post-processing pass after the parent class builds the ModelResponse: if reasoning_content is still None and the message content contains <think>…</think>, extract the reasoning using the shared _parse_content_for_reasoning() helper.
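
The post-processing pass described above can be sketched in isolation. This is a minimal sketch under stated assumptions, not the code from the PR: Message and extract_think_block are hypothetical stand-ins for LiteLLM's message object and its _parse_content_for_reasoning() helper, whose actual signatures are not shown in this report.

```python
import re
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-in for the message object on a ModelResponse choice.
@dataclass
class Message:
    content: Optional[str]
    reasoning_content: Optional[str] = None


# Matches the first <think>...</think> block, including newlines inside it.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def extract_think_block(message: Message) -> Message:
    """Post-process a message: split a raw <think> block out of content."""
    # Leave the message alone if reasoning was already extracted upstream
    # or there is no string content to inspect.
    if message.reasoning_content is not None or not isinstance(message.content, str):
        return message
    match = _THINK_RE.search(message.content)
    if match is None:
        return message
    message.reasoning_content = match.group(1)
    # Keep everything outside the block unchanged, including whitespace.
    message.content = message.content[: match.start()] + message.content[match.end():]
    return message


msg = extract_think_block(
    Message(content="<think>The user just sent ping...</think>\n\npong")
)
print(repr(msg.content), repr(msg.reasoning_content))
```

The sketch only acts when reasoning_content is still None, mirroring the guard the PR describes, so a response that was already split upstream passes through untouched.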

After fix

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "\n\npong",
      "reasoning_content": "The user just sent ping..."
    }
  }]
}

Changed files

litellm/llms/nvidia_nim/chat/transformation.py (modified, +71/-2)
litellm/llms/openrouter/chat/transformation.py (modified, +6/-0)

PR #24280: fix: [Bug]: reasoning_content not parsed from raw <think> tags for nvidia_nim/minimax-m2.5 via LiteLLM Proxy

Repository: BerriAI/litellm
Author: RoyVivat
State: closed | merged: False
Link: https://github.com/BerriAI/litellm/pull/24280

Description (problem / solution / changelog)

Summary

Fixes #24253

Changes

Test plan

Unit tests added
make test-unit passes
make lint passes

Changed files

litellm/litellm_core_utils/prompt_templates/common_utils.py (modified, +2/-2)
litellm/litellm_core_utils/streaming_handler.py (modified, +65/-0)
tests/llm_translation/test_nvidia_nim.py (modified, +234/-0)

Code Example

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "<think>The user just sent ping...</think>\n\npong",
      "reasoning_content": null
    }
  }]
}

---

model_list:
  - model_name: my-model
    litellm_params:
      model: nvidia_nim/minimax/minimax-m1
      api_base: https://integrate.api.nvidia.com/v1
      api_key: os.environ/NVIDIA_API_KEY
      thinking: true
      merge_reasoning_content_in_choices: false

---


Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When using nvidia_nim as the provider with the minimax-m1 model, the <think>...</think> block is returned as raw text inside content instead of being parsed into reasoning_content. Setting thinking: true and merge_reasoning_content_in_choices: false in the model config has no effect.

Expected behavior

reasoning_content should be populated and content should contain only the final answer, consistent with how other reasoning models behave (e.g. DeepSeek R1 via OpenRouter).

Actual behavior

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "<think>The user just sent ping...</think>\n\npong",
      "reasoning_content": null
    }
  }]
}

Config

model_list:
  - model_name: my-model
    litellm_params:
      model: nvidia_nim/minimax/minimax-m1
      api_base: https://integrate.api.nvidia.com/v1
      api_key: os.environ/NVIDIA_API_KEY
      thinking: true
      merge_reasoning_content_in_choices: false

Steps to Reproduce

Configure proxy with the above config
Send any message to the model
Observe <think> block in raw content instead of reasoning_content
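
The reproduction steps can be scripted with the standard library. This is a sketch under assumptions not stated in the issue: the proxy is taken to listen on http://localhost:4000 and to accept the placeholder key sk-1234; adjust both for your deployment. The model name my-model comes from the config above.

```python
import json
import urllib.request

# Assumptions: proxy address and key are placeholders, not from the issue.
PROXY_URL = "http://localhost:4000/v1/chat/completions"
API_KEY = "sk-1234"


def build_request(prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for the proxy."""
    body = {
        "model": "my-model",  # model_name from the proxy config
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        PROXY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


def main() -> None:
    """Send one message and print the fields the issue is about."""
    with urllib.request.urlopen(build_request("ping")) as resp:
        message = json.load(resp)["choices"][0]["message"]
    print("content:", repr(message.get("content")))
    print("reasoning_content:", repr(message.get("reasoning_content")))
```

With the proxy running, calling main() prints both fields; on an affected version the <think> block would be expected to show up in content while reasoning_content is None.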

Relevant log output

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on ?

v1.82.3

Twitter / LinkedIn details

No response

extent analysis

Fix Plan

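Both linked PRs address this in LiteLLM's response-handling code rather than in configuration, so a configuration change alone may not resolve the issue. The configuration below is a suggested variant to try. Neither the issue nor the PRs confirm that parse_think_blocks is a supported LiteLLM parameter, and merge_reasoning_content_in_choices: true, as its name suggests, asks LiteLLM to merge reasoning into content, so it may not yield a separate reasoning_content field.

Update the model_list configuration in the YAML file:

model_list:
  - model_name: my-model
    litellm_params:
      model: nvidia_nim/minimax/minimax-m1
      api_base: https://integrate.api.nvidia.com/v1
      api_key: os.environ/NVIDIA_API_KEY
      thinking: true
      merge_reasoning_content_in_choices: true  # Suggested change; see the caveat above
      parse_think_blocks: true  # Unconfirmed parameter; may be ignored or rejected

If the parse_think_blocks parameter is not available, you may need to modify the LiteLLM code to handle this case. For example: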

import re

# Illustrative sketch only; process_response is not a function in the LiteLLM codebase.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def process_response(message: dict) -> dict:
    """Move a <think>...</think> block from content into reasoning_content."""
    content = message.get("content")
    if isinstance(content, str):
        match = _THINK_RE.search(content)
        if match:
            message["reasoning_content"] = match.group(1).strip()
            # Drop the block and surrounding whitespace from the visible answer.
            message["content"] = (content[:match.start()] + content[match.end():]).strip()
    return message

Verification

After applying the fix, send a message to the model and verify that the <think> block is correctly parsed into reasoning_content and removed from the content. The expected response should be:

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "pong",
      "reasoning_content": "The user just sent ping..."
    }
  }]
}
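
The verification step can be automated with a small check. This is a sketch: verify_parsed is a hypothetical helper, not part of LiteLLM, and it only inspects the first choice of a response that has already been decoded into a dict.

```python
def verify_parsed(response: dict) -> list:
    """Return a list of problems; an empty list means the response looks fixed."""
    problems = []
    message = response["choices"][0]["message"]
    content = message.get("content") or ""
    if "<think>" in content or "</think>" in content:
        problems.append("content still contains raw <think> tags")
    if not message.get("reasoning_content"):
        problems.append("reasoning_content is empty or null")
    return problems


fixed = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": "pong",
            "reasoning_content": "The user just sent ping...",
        }
    }]
}
print(verify_parsed(fixed))  # prints []
```

Returning a list rather than raising lets a test report both symptoms at once when the response is still unparsed.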
