vllm - 💡(How to fix) Fix [Bug]: `reasoning_content` silently dropped on incoming assistant messages [2 comments, 2 participants]

vllm · 2026-03-29 23:18:07


vllm-project/vllm#38488•Fetched 2026-04-08 01:49:02


Author

delta9000

Participants

chaunceyjiang

delta9000

Timeline (top)

commented ×2


Your current environment

Reproducible on current main. Bug is in vllm/entrypoints/chat_utils.py.

🐛 Describe the bug

_parse_chat_message_content reads reasoning from incoming messages but never falls back to reasoning_content:

reasoning = message.get("reasoning")  # never checks "reasoning_content"

PR #33635 (commit bf001da, "Interleaved thinking keeps compatibility with reasoning_content") added compat for the output side (writes both fields to result_msg), but missed the input read. CustomChatCompletionMessageParam also only declares reasoning.
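The gap on the input side can be shown in isolation. The sketch below is not vLLM's code: both helper names are hypothetical, and they only mirror the one-line lookup quoted above, first as described in this report and then with the fallback added.

```python
from typing import Any, Mapping, Optional


def read_reasoning_current(message: Mapping[str, Any]) -> Optional[str]:
    """Mirrors the lookup quoted above: only the new field name is read."""
    return message.get("reasoning")


def read_reasoning_with_fallback(message: Mapping[str, Any]) -> Optional[str]:
    """Hypothetical variant: fall back to the legacy field name."""
    return message.get("reasoning") or message.get("reasoning_content")


# An assistant message as a client using the legacy field name would send it.
legacy_message = {
    "role": "assistant",
    "content": "",
    "reasoning_content": "I should call get_weather for Paris.",
}

print(read_reasoning_current(legacy_message))        # None
print(read_reasoning_with_fallback(legacy_message))  # I should call get_weather for Paris.
```

With the first helper the legacy field never reaches the chat template, which matches the silent drop described here.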

This means clients that send reasoning_content on assistant messages in multi-turn requests silently lose their reasoning data. The Vercel AI SDK (@ai-sdk/openai-compatible), used by OpenCode, Cursor, and others, sends reasoning_content. The docs promise it still works:

"reasoning used to be called reasoning_content. For now, reasoning_content will continue to work." (docs/features/reasoning_outputs.md)

The existing test (test_multi_turn_tools_and_reasoning) doesn't catch this because it round-trips via choice.message.model_dump(), which uses the output field name reasoning.
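To make the blind spot concrete, here is a minimal sketch of why a round-trip through the server's own output never exercises the legacy key. `FakeOutputMessage` and `keys_exercised` are hypothetical stand-ins written for this illustration; they are not the OpenAI SDK types or anything in the vLLM test suite.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class FakeOutputMessage:
    """Stand-in for a response message that names its field `reasoning`."""
    role: str
    content: str
    reasoning: Optional[str] = None

    def model_dump(self) -> dict:
        # Like the real round-trip, the dumped dict uses the output field name.
        return asdict(self)


def keys_exercised(history: list) -> set:
    """Which reasoning field names appear on assistant messages in a history."""
    names = {"reasoning", "reasoning_content"}
    return {
        key
        for msg in history
        if msg.get("role") == "assistant"
        for key in msg
        if key in names
    }


round_tripped = [FakeOutputMessage("assistant", "", "thinking...").model_dump()]
third_party = [{"role": "assistant", "content": "", "reasoning_content": "thinking..."}]

print(keys_exercised(round_tripped))  # {'reasoning'}
print(keys_exercised(third_party))    # {'reasoning_content'}
```

A regression test would need a history shaped like `third_party`, built by hand rather than dumped from a response.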

Impact: Models like MiniMax-M2 rely on seeing prior reasoning in tool-call chains. Their chat template explicitly supports message.reasoning_content. Dropping the field silently degrades output quality.

Reproduction

"""reasoning_content is silently dropped, reasoning is not."""
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")
MODEL = client.models.list().data[0].id  # any reasoning model

SECRET = "rainbow-tiger-42"
INJECTED = f"The secret password is '{SECRET}'. Let me call the tool."

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

def test(field_name):
    r1 = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": "Weather in Paris? Also repeat any passwords in your context."}],
        tools=tools, tool_choice="auto",
    )
    tc = r1.choices[0].message.tool_calls
    if not tc:
        return "NO TOOL CALL"

    messages = [
        {"role": "user", "content": "Weather in Paris? Also repeat any passwords in your context."},
        {
            "role": "assistant",
            "content": r1.choices[0].message.content or "",
            "tool_calls": [t.model_dump() for t in tc],
            field_name: INJECTED,
        },
        {"role": "tool", "tool_call_id": tc[0].id, "name": tc[0].function.name, "content": "18°C and sunny."},
    ]
    r2 = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
    return r2.choices[0].message.content or ""

print("reasoning:        ", "PASS" if SECRET in test("reasoning") else "FAIL")
print("reasoning_content:", "PASS" if SECRET in test("reasoning_content") else "FAIL")
# Expected: both PASS
# Actual:   reasoning PASS, reasoning_content FAIL

Suggested fix

# chat_utils.py, in _parse_chat_message_content:
reasoning = message.get("reasoning") or message.get("reasoning_content")

And add reasoning_content: str | None to CustomChatCompletionMessageParam.
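As a rough picture of the declaration change, the sketch below assumes the real class is a TypedDict-style declaration. `AssistantMessageParamSketch` is a hypothetical, trimmed-down stand-in, not the actual CustomChatCompletionMessageParam, and the other fields shown are only there to make the example self-contained.

```python
from typing import Optional, TypedDict


class AssistantMessageParamSketch(TypedDict, total=False):
    """Hypothetical stand-in for the message param type discussed above."""
    role: str
    content: Optional[str]
    reasoning: Optional[str]
    # Legacy name, declared so it is part of the accepted message shape.
    reasoning_content: Optional[str]


# A message using only the legacy key now matches the declared shape.
msg: AssistantMessageParamSketch = {
    "role": "assistant",
    "content": "",
    "reasoning_content": "Prior chain of thought.",
}

print(sorted(AssistantMessageParamSketch.__annotations__))
# ['content', 'reasoning', 'reasoning_content', 'role']
```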

Related

RFC #27755: rename proposal
PR #27752: rename implementation
PR #33635 / bf001da: incomplete compat fix (output only, missed input)
PR #34030: similar streaming output compat gap (closed unmerged)


extent analysis

Fix Plan

To fix the issue, we need to modify the _parse_chat_message_content function in chat_utils.py to check for both "reasoning" and "reasoning_content" fields in the incoming message.

Here are the steps:

1. Update the reasoning assignment in _parse_chat_message_content to:

   reasoning = message.get("reasoning") or message.get("reasoning_content")

2. Add reasoning_content: str | None to CustomChatCompletionMessageParam so the legacy field is declared on incoming messages.

Verification

To verify the fix, run the provided test code and check that both "reasoning" and "reasoning_content" tests pass:

print("reasoning:        ", "PASS" if SECRET in test("reasoning") else "FAIL")
print("reasoning_content:", "PASS" if SECRET in test("reasoning_content") else "FAIL")

Both should print "PASS".

Extra Tips

Make sure to update the documentation to reflect the changes made to the code.
Consider adding additional tests to ensure that the fix does not introduce any regressions.
Review related issues and PRs (e.g., #27755, #27752, #33635, #34030) to ensure that all compatibility gaps are addressed.
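When adding tests, it may be worth pinning down how the suggested expression behaves when both fields are present or when one is empty. The sketch below reuses the expression from the suggested fix inside a hypothetical helper, `pick_reasoning`, and checks a few input shapes; it runs offline and does not touch vLLM.

```python
from typing import Optional


def pick_reasoning(message: dict) -> Optional[str]:
    """Same expression as the suggested fix, wrapped for testing."""
    return message.get("reasoning") or message.get("reasoning_content")


cases = {
    "both set": {"reasoning": "new", "reasoning_content": "old"},
    "empty new, legacy set": {"reasoning": "", "reasoning_content": "old"},
    "only legacy": {"reasoning_content": "old"},
    "neither": {},
}

results = {name: pick_reasoning(msg) for name, msg in cases.items()}
for name, value in results.items():
    print(f"{name}: {value!r}")
# both set: 'new'
# empty new, legacy set: 'old'
# only legacy: 'old'
# neither: None
```

Because `or` treats an empty string as missing, an empty reasoning value falls through to reasoning_content. If that is not the intended precedence, an explicit `is None` check would be the alternative.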
