vllm - ✅ (Solved) Fix [Bug]: forced tool_choice asserts when reasoning extraction returns content=None [2 pull requests, 1 participant]

GitHub stats
vllm-project/vllm#40147 (fetched 2026-04-18 05:52:20)
Comments: 0 · Participants: 1 · Timeline events: 4 · Reactions: 0
Timeline (top): cross-referenced ×2, referenced ×2

Forced tool choice can crash with AssertionError when a reasoning parser returns content=None.

This is reachable for parsers such as glm45 / DeepSeekV3ReasoningWithThinkingParser, whose full-output reasoning extraction returns (reasoning, None) when the model output contains only reasoning text or ends immediately after </think>.

Root Cause

The forced tool-choice branches in both the chat-completions and responses parsing paths assert that content is not None before constructing FunctionCall(...).

Reasoning extraction is allowed to consume the entire model output and return (reasoning, None), as the glm45 / DeepSeekV3ReasoningWithThinkingParser path does when the output contains only reasoning text or ends immediately after </think>. That makes the assertion reachable, and it surfaces as a server-side AssertionError.

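Fix Action

Fix / Workaround

Replace the assertions in both forced-tool-choice branches with content = content or "". The reporter has a local patch with two regression tests that cover chat-completions named tool_choice with content=None and responses named tool_choice with content=None.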

PR fix notes

PR #40148: fix(openai): tolerate empty content in forced tool choice

Description (problem / solution / changelog)

Summary

This fixes a forced-tool-choice crash when reasoning extraction returns content=None.

Some reasoning parsers, including the glm45 / DeepSeekV3ReasoningWithThinkingParser path, may consume the full model output and return (reasoning, None). The forced tool-choice branches in both chat-completions and responses parsing previously asserted that content is not None, which made this a server-side failure instead of a valid function call with empty arguments.

Changes

  • normalize content=None to "" in OpenAIServing._parse_tool_calls_from_content(...)
  • normalize content=None to "" in DelegatingParser._parse_tool_calls(...)
  • add regression tests for:
    • chat-completions named tool_choice with content=None
    • responses named tool_choice with content=None

Fixes #40147.
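
The change described above can be illustrated with a minimal sketch. The FunctionCall dataclass and the two helper functions below are simplified, hypothetical stand-ins and are not vLLM's actual classes or method bodies; they only contrast the asserting behavior with the normalizing one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FunctionCall:
    # Simplified stand-in for illustration; not vLLM's protocol class.
    name: str
    arguments: str


def forced_call_before(name: str, content: Optional[str]) -> FunctionCall:
    # Old behavior: content=None trips the assertion.
    assert content is not None
    return FunctionCall(name=name, arguments=content)


def forced_call_after(name: str, content: Optional[str]) -> FunctionCall:
    # New behavior: None is normalized to an empty argument string.
    content = content or ""
    return FunctionCall(name=name, arguments=content)
```

With content=None, the first helper raises AssertionError while the second returns a call whose arguments field is the empty string. Non-empty content passes through both unchanged.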

Testing

pytest -sv tests/entrypoints/openai/test_tool_choice_content_none.py

Changed files

  • tests/entrypoints/openai/test_tool_choice_content_none.py (added, +98/-0)
  • vllm/entrypoints/openai/engine/serving.py (modified, +2/-2)
  • vllm/parser/abstract_parser.py (modified, +2/-2)

PR #8400: [releases/v0.18.0][Platform][BugFix] Guard forced tool choice with empty content

Description (problem / solution / changelog)

What this PR does / why we need it?

This backports the forced-tool-choice content=None guard to the releases/v0.18.0 compatibility layer.

Upstream vLLM still has forced named tool-choice branches that assert content is not None after reasoning extraction. Some reasoning parsers can legally consume the full output and return (reasoning, None), which makes the assert reachable and can surface as a server-side failure.

This PR follows the same compatibility-patch pattern used by:

  • 7314bbe2 fix(platform): reimplement MiniMax usage accounting patch (#7835)
  • f83cb0e6 [Bugfix][Platform] Fix GLM47 tool-call finish backfill (#7710)

The patch is intentionally narrow:

  • normalize content=None to "" only for forced named tool choice
  • patch both chat-completions and responses parser entry points
  • keep the rest of upstream behavior unchanged
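
As a rough illustration of such a narrow compatibility patch, the sketch below wraps an existing parse function so that content=None is normalized only for a forced named tool choice. All names here (is_forced_named, guard_none_content, upstream_parse) and the two-argument signature are hypothetical; the real patch module may be structured quite differently.

```python
import functools
from typing import Any, Callable, Optional


def is_forced_named(tool_choice: Any) -> bool:
    # Assumption: a forced named choice is a dict with type == "function".
    return isinstance(tool_choice, dict) and tool_choice.get("type") == "function"


def guard_none_content(parse_fn: Callable[[Any, Optional[str]], Any]):
    """Wrap parse_fn so None content is tolerated for forced named choices only."""

    @functools.wraps(parse_fn)
    def wrapper(tool_choice: Any, content: Optional[str]) -> Any:
        if content is None and is_forced_named(tool_choice):
            content = ""
        # Every other case is passed through to the original unchanged.
        return parse_fn(tool_choice, content)

    return wrapper


def upstream_parse(tool_choice: Any, content: Optional[str]) -> Optional[str]:
    # Hypothetical stand-in for the upstream branch that asserts.
    if is_forced_named(tool_choice):
        assert content is not None
    return content


patched_parse = guard_none_content(upstream_parse)
```

Because the wrapper touches only the one case that would otherwise assert, calls with tool_choice="auto" or with non-None content behave exactly as they did before wrapping.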

Upstream tracking:

  • issue: vllm-project/vllm#40147
  • PR: vllm-project/vllm#40148

Does this PR introduce any user-facing change?

Yes.

Forced named tool choice becomes robust when the reasoning parser returns no post-reasoning content, avoiding an internal assertion failure and emitting an empty-argument function call instead.

How was this patch tested?

Unit tests:

pytest -sv tests/ut/patch/platform/test_patch_tool_choice_none_content.py \
  tests/ut/patch/platform/test_patch_glm_tool_call_parser.py \
  tests/ut/patch/platform/test_patch_minimax_usage_accounting.py

Result: 22 passed.

Changed files

  • tests/ut/patch/platform/test_patch_tool_choice_none_content.py (added, +95/-0)
  • vllm_ascend/patch/__init__.py (modified, +20/-0)
  • vllm_ascend/patch/platform/__init__.py (modified, +1/-0)
  • vllm_ascend/patch/platform/patch_tool_choice_none_content.py (added, +86/-0)

Original issue

Summary

Forced tool choice can crash with AssertionError when a reasoning parser returns content=None.

This is reachable for parsers such as glm45 / DeepSeekV3ReasoningWithThinkingParser, whose full-output reasoning extraction returns (reasoning, None) when the model output contains only reasoning text or ends immediately after </think>.

Affected code

  • vllm/entrypoints/openai/engine/serving.py
  • vllm/parser/abstract_parser.py

Both paths handle forced function calls, which take one of these shapes:

  • named chat tool choice: {"type":"function","function":{"name":"..."}}
  • responses tool choice: {"type":"function","name":"..."}

They assert that content is not None before constructing FunctionCall(...).
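
The two request shapes listed above differ only in where the function name lives. A small sketch of reading the name from either shape follows; forced_tool_name is a hypothetical helper written for illustration and is not part of vLLM.

```python
from typing import Any, Optional


def forced_tool_name(tool_choice: Any) -> Optional[str]:
    """Return the forced function name, or None if no function is forced."""
    if not isinstance(tool_choice, dict) or tool_choice.get("type") != "function":
        return None  # string choices such as "auto" or "required"
    nested = tool_choice.get("function")
    if isinstance(nested, dict):
        return nested.get("name")  # chat-completions shape
    return tool_choice.get("name")  # responses shape
```

Both forced shapes resolve to the same name, which is why the same content=None problem shows up in both entry points.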

Why this is a bug

Reasoning extraction is allowed to consume the entire model output and return content=None. For example, the basic thinking parser returns (reasoning, None) when no post-reasoning content exists.

That makes the assertion in the forced-tool-choice branches reachable, and it bubbles up as a server-side failure instead of a valid tool-call response with empty arguments.
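
To make the (reasoning, None) case concrete, here is a minimal sketch of a full-output reasoning split. The split_reasoning function and the THINK_END constant are assumptions made for illustration; vLLM's reasoning parsers are more involved, and this only mirrors the two cases named in the issue.

```python
from typing import Optional, Tuple

THINK_END = "</think>"  # assumed end-of-reasoning marker


def split_reasoning(model_output: str) -> Tuple[Optional[str], Optional[str]]:
    """Hypothetical stand-in for a full-output reasoning extractor."""
    reasoning, sep, rest = model_output.partition(THINK_END)
    if not sep:
        # No end marker: the whole output is treated as reasoning.
        return model_output, None
    # An empty remainder means there is no post-reasoning content.
    return reasoning, (rest or None)
```

Both an output with no end marker and an output that stops right after the marker yield None as the content, which is the value the forced-tool-choice branches then receive.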

Minimal reproducer

Chat-completions code path

from vllm.entrypoints.openai.chat_completion.protocol import ChatCompletionRequest
from vllm.entrypoints.openai.engine.serving import OpenAIServing

req = ChatCompletionRequest.model_validate({
    "model": "test-model",
    "messages": [{"role": "user", "content": "test"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "parameters": {"type": "object", "properties": {}}
        }
    }],
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
})

OpenAIServing._parse_tool_calls_from_content(
    request=req,
    tokenizer=None,
    enable_auto_tools=True,
    tool_parser_cls=None,
    content=None,
)

Responses code path

from vllm.entrypoints.openai.responses.protocol import ResponsesRequest
from vllm.parser.abstract_parser import DelegatingParser

# Any DelegatingParser instance reproduces this in _parse_tool_calls(...).
req = ResponsesRequest.model_validate({
    "model": "test-model",
    "input": "test",
    "tools": [{
        "type": "function",
        "name": "get_weather",
        "parameters": {"type": "object", "properties": {}}
    }],
    "tool_choice": {"type": "function", "name": "get_weather"},
})

Expected behavior

Forced tool choice should not assert when content is None.

The parser should normalize None to "" and emit a function call with empty arguments, matching the existing tolerant handling already used by the tool_choice="required" branch.

Proposed fix

Replace the assertions with:

content = content or ""

in both forced-tool-choice branches.

Validation

I have a local patch with two regression tests that cover:

  • chat-completions named tool_choice with content=None
  • responses named tool_choice with content=None

Both fail with AssertionError before the fix and pass after the fix.

extent analysis

TL;DR

Replace the assertions in the forced-tool-choice branches with content = content or "" to handle None values.

Guidance

  • Identify the affected code paths in vllm/entrypoints/openai/engine/serving.py and vllm/parser/abstract_parser.py where the assertions are made.
  • Replace the assertions with content = content or "" to normalize None values to empty strings.
  • Verify that the fix works by running the provided regression tests for chat-completions and responses code paths.
  • Consider adding additional tests to cover other scenarios where content might be None.

Example

content = content or ""

This line of code replaces the assertion and ensures that content is never None, preventing the AssertionError.

Notes

The proposed fix assumes that normalizing None to an empty string is the desired behavior, as indicated by the existing tolerant handling in the tool_choice="required" branch.

Recommendation

Apply the fix by replacing the assertions with content = content or "" in both forced-tool-choice branches. The change is narrow and matches the tolerant handling already used by the tool_choice="required" branch.
