vllm - ✅(Solved) Fix MiniMaxM2ReasoningParser broken for M2.5: extract_reasoning_streaming assumes no <think> start tag [1 pull requests, 5 comments, 5 participants]

SilviuSavu · 2026-03-26T08:43:36Z


vllm-project/vllm#38212•Fetched 2026-04-08 01:31:36


`MiniMaxM2ReasoningParser` overrides `extract_reasoning_streaming` with logic that assumes the model only generates `</think>` (no opening `<think>` tag). This was true for the original M2, but M2.5 generates both `<think>` and `</think>`.

The result: reasoning content (including raw `<think>...</think>` tags) leaks into the `content` field, and `reasoning` / `reasoning_content` is always `null`, even with `include_reasoning: true`.

The base class `BaseThinkingReasoningParser` already handles both tags correctly. The override in `MiniMaxM2ReasoningParser` is the sole cause of the bug.

Root Cause

`MiniMaxM2ReasoningParser.extract_reasoning_streaming` overrides the base class with M2-specific logic:

- It treats all content as reasoning until `</think>`
- It never checks for or strips the `<think>` start tag
- This causes the `<think>` tag itself to be included in the reasoning text, and downstream the entire block ends up in `content`

The base class `BaseThinkingReasoningParser.extract_reasoning_streaming` already handles both `<think>` and `</think>` correctly (checks for `start_token_id` in previous/delta tokens, strips both tags).
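To make the difference concrete, here is a minimal, non-streaming sketch in plain Python. Both functions are hypothetical illustrations written for this note, not vLLM code: `split_end_tag_only` mimics the M2-style assumption described above, and `split_both_tags` additionally strips an opening tag when one is present.

```python
from __future__ import annotations

START, END = "<think>", "</think>"


def split_end_tag_only(text: str) -> tuple[str | None, str | None]:
    """M2-style assumption: everything before </think> is reasoning."""
    if END not in text:
        return text, None
    reasoning, _, content = text.partition(END)
    return reasoning, content or None


def split_both_tags(text: str) -> tuple[str | None, str | None]:
    """Same split, but also strip a leading <think> if the model emitted one."""
    reasoning, content = split_end_tag_only(text)
    if reasoning is not None and reasoning.startswith(START):
        reasoning = reasoning[len(START):]
    return reasoning, content


m25_output = "<think>thinking</think>\n\nanswer"  # M2.5 style: both tags
m2_output = "thinking</think>\n\nanswer"          # M2 style: end tag only

print(split_end_tag_only(m25_output))  # ('<think>thinking', '\n\nanswer')
print(split_both_tags(m25_output))     # ('thinking', '\n\nanswer')
print(split_both_tags(m2_output))      # ('thinking', '\n\nanswer')
```

The first call shows the stray `<think>` that the end-tag-only logic leaves in the reasoning text; the other two show that stripping an optional start tag covers both output styles in this simplified model.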

Workaround

Use `--reasoning-parser deepseek_r1` instead, which handles both tags correctly.

PR fix notes

PR #38213: fix(reasoning): MiniMaxM2ReasoningParser broken for M2.5

- Repository: vllm-project/vllm
- Author: SilviuSavu
- State: open | merged: False
- Link: https://github.com/vllm-project/vllm/pull/38213

Description (problem / solution / changelog)

Summary

- Remove broken `extract_reasoning_streaming` override from `MiniMaxM2ReasoningParser`
- M2.5 generates both `<think>` and `</think>`, but the override assumed only `</think>` (M2 behavior)
- The base class `BaseThinkingReasoningParser` already handles both tags correctly

Problem

With `--reasoning-parser minimax_m2`, reasoning content (including raw `<think>...</think>` tags) leaks into the `content` field. The `reasoning` field is always `null`, even with `include_reasoning: true`.

Root cause: `MiniMaxM2ReasoningParser.extract_reasoning_streaming` treats everything as reasoning until `</think>`, never checking for or stripping `<think>`. The `<think>` tag gets included in the reasoning text, and downstream the entire block ends up in `content`.

Fix

Delete the override. `BaseThinkingReasoningParser.extract_reasoning_streaming` checks for `start_token_id` in previous/delta tokens, handles both tags, and correctly splits reasoning from content. This also maintains backward compatibility with M2 (which only generates `</think>`) since the base class handles that case too.
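The "checks for `start_token_id` in previous/delta tokens" idea can be sketched as a small classifier over token IDs. This is a simplified illustration, not the vLLM implementation: the token IDs are made up, each delta is assumed to carry exactly one token, and only the case where both tags are present is modelled.

```python
from __future__ import annotations

# Hypothetical token IDs standing in for <think> and </think>.
START_ID, END_ID = 1, 2


def classify_delta(previous_ids: list[int], delta_ids: list[int]) -> str | None:
    """Label one streamed delta as 'reasoning', 'content', or None (tag only)."""
    # A delta that is just one of the tags is dropped, which strips the tag.
    if len(delta_ids) == 1 and delta_ids[0] in (START_ID, END_ID):
        return None
    started = START_ID in previous_ids or START_ID in delta_ids
    ended = END_ID in previous_ids
    # Between the start tag and the end tag, text is reasoning.
    if started and not ended:
        return "reasoning"
    return "content"


# Simulated stream: <think>, two reasoning tokens, </think>, one answer token.
stream = [[START_ID], [10], [11], [END_ID], [20]]
previous: list[int] = []
labels = []
for delta in stream:
    labels.append(classify_delta(previous, delta))
    previous += delta

print(labels)  # [None, 'reasoning', 'reasoning', None, 'content']
```

Because the decision depends on which tags have already been seen, neither tag ever reaches the `reasoning` or `content` output in this sketch.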

Test plan

- [x] Verified `extract_reasoning` correctly returns `reasoning='thinking'`, `content='\n\nanswer'` for input `<think>thinking</think>\n\nanswer`
- [x] Verified end-to-end with vLLM v0.17.1rc1.dev150 serving MiniMax-M2.5-REAP-172B: `reasoning` field populated, `content` clean
- [ ] Existing tests continue to pass

Fixes #38212

🤖 Generated with Claude Code

Changed files

vllm/reasoning/minimax_m2_reasoning_parser.py (modified, +4/-43)

Code Example

```bash
vllm serve MiniMaxAI/MiniMax-M2.5 \
  --enable-auto-tool-choice \
  --tool-call-parser minimax_m2 \
  --reasoning-parser minimax_m2
```

```python
from openai import OpenAI

# Assumes the server started above is listening on the default local port.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# reasoning is None, <think> tags leak into content
response = client.chat.completions.create(
    model="MiniMaxAI/MiniMax-M2.5",
    messages=[{"role": "user", "content": "What is 2+2?"}],
    extra_body={"include_reasoning": True},
)
print(response.choices[0].message.reasoning)  # None (should have content)
print(response.choices[0].message.content)    # "<think>...</think>\n\n4"
```

Bug: `MiniMaxM2ReasoningParser` doesn't handle M2.5's `<think>` start tag

Proposed fix

Remove the extract_reasoning_streaming override from MiniMaxM2ReasoningParser so it inherits the correct implementation from BaseThinkingReasoningParser.

Related

- #34625: users report `minimax_m2_append_think` doesn't separate reasoning for M2.5, workaround is `deepseek_r1`
- The `minimax_m2_append_think` parser has a separate but related issue: its `extract_reasoning` intentionally returns `(None, "<think>" + model_output)`, making reasoning separation impossible by design
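A tiny sketch shows why that return shape rules out reasoning separation. The function below is a hypothetical stand-in that only mirrors the return value quoted above; it is not the parser's actual code.

```python
def append_think_extract(model_output: str) -> tuple[None, str]:
    """Stand-in for the described behaviour: prepend <think>, return no reasoning."""
    return None, "<think>" + model_output


reasoning, content = append_think_extract("thinking</think>\n\nanswer")
print(reasoning)      # None
print(repr(content))  # '<think>thinking</think>\n\nanswer'
```

Whatever the model emits, the first element is always `None`, so a client that reads the reasoning field never receives anything from this parser.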

Environment

- vLLM: v0.17.1rc1.dev150
- Model: MiniMax-M2.5 (also reproduced with REAP-172B quantized variant)

extent analysis

Fix Plan

To resolve the issue, we need to remove the extract_reasoning_streaming override from MiniMaxM2ReasoningParser. This will allow it to inherit the correct implementation from BaseThinkingReasoningParser, which handles both <think> and </think> tags correctly.

Steps:

1. Remove the `extract_reasoning_streaming` method from `MiniMaxM2ReasoningParser`.
2. Confirm that `MiniMaxM2ReasoningParser` still inherits from `BaseThinkingReasoningParser`, so the base implementation is the one used.

Example Code:

```python
# Import path as of recent vLLM releases; it may differ in other versions.
from vllm.reasoning.basic_parsers import BaseThinkingReasoningParser

class MiniMaxM2ReasoningParser(BaseThinkingReasoningParser):
    # Delete only the extract_reasoning_streaming override; the rest of
    # the class body (elided here) stays unchanged.
    ...
```

Alternatively, if you want to keep the method as a hook for future customization, delegate to the parent class:

```python
class MiniMaxM2ReasoningParser(BaseThinkingReasoningParser):
    def extract_reasoning_streaming(self, *args, **kwargs):
        # Pure pass-through: behaves exactly like the inherited method.
        return super().extract_reasoning_streaming(*args, **kwargs)
```

Verification

To verify the fix, re-run the reproduction request against the patched server:

```python
from openai import OpenAI

# Assumes a local vLLM server on the default port.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="MiniMaxAI/MiniMax-M2.5",
    messages=[{"role": "user", "content": "What is 2+2?"}],
    extra_body={"include_reasoning": True},
)
print(response.choices[0].message.reasoning)  # Should have content
print(response.choices[0].message.content)    # Should not have <think> tags
```
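Since the bug lives in `extract_reasoning_streaming`, it may also be worth checking a streamed response. The helper below is a sketch under assumptions: `collect_stream` and `fake_chunk` are names invented here, the chunks are assumed to have the usual `choices[0].delta` shape, and the reasoning field is looked up under both names the issue mentions (`reasoning` and `reasoning_content`). The demo uses fake chunks so it runs without a server.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Accumulate reasoning and content text from streamed chat chunks."""
    reasoning_parts, content_parts = [], []
    for chunk in chunks:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        # The field name is checked under both spellings mentioned in the issue.
        reasoning = getattr(delta, "reasoning", None) or getattr(
            delta, "reasoning_content", None
        )
        if reasoning:
            reasoning_parts.append(reasoning)
        content = getattr(delta, "content", None)
        if content:
            content_parts.append(content)
    return "".join(reasoning_parts), "".join(content_parts)


def fake_chunk(**fields):
    """Build a minimal chunk-like object for the offline demo."""
    delta = SimpleNamespace(**fields)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


chunks = [
    fake_chunk(reasoning="2+2 "),
    fake_chunk(reasoning="is 4"),
    fake_chunk(content="4"),
]
streamed_reasoning, streamed_content = collect_stream(chunks)
assert streamed_reasoning, "reasoning should be populated after the fix"
assert "<think>" not in streamed_content, "tags should not leak into content"
```

Against a live server, the same helper could be fed the iterator returned by `client.chat.completions.create(..., stream=True)` with the request arguments shown above.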

Extra Tips

- Test the fix with both M2 (end tag only) and M2.5 (both tags), and with varied inputs, to confirm reasoning and content are split as expected.
- Consider adding a regression test for `<think>...</think>` output so a similar override cannot reintroduce the bug.
- Do not fall back to the `minimax_m2_append_think` parser if you need reasoning separated from content: as noted in the issue, its `extract_reasoning` makes reasoning separation impossible by design. The reported workaround is `deepseek_r1`.
