vllm - ✅(Solved) Fix [Bug]: Streaming last chunk contains non-empty tool_calls with empty fields "type" causing type validation error [2 pull requests, 17 comments, 4 participants]

GitHub stats
vllm-project/vllm#38603 (fetched 2026-04-08 01:58:59)
Comments: 17
Participants: 4
Timeline: 34
Reactions: 0
Timeline (top): commented ×17, subscribed ×6, mentioned ×5, cross-referenced ×2

Error Message

[2026-03-27 16:33:36.770] [Info] [BaseAgent:craft] _handleStepError: state changed from running to error, messageId: 9f27290b4e514e18bbbc1f026380a951, stack: Error message: [

Fix Action

Fixed

PR fix notes

PR #38609: [Bugfix] Fix streaming tool call type field defaulting to None instead of "function"

Description (problem / solution / changelog)

Purpose

Fixes https://github.com/vllm-project/vllm/issues/38603 by making the type field in streaming tool call delta chunks default to "function" when no matching original tool call is found, per the OpenAI API spec.
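
The fallback described above can be illustrated with a minimal sketch. It is not the code from the PR: ToolCallStub and resolve_type are hypothetical names invented for illustration, and the real logic lives in vllm/entrypoints/openai/chat_completion/serving.py.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolCallStub:
    """Hypothetical stand-in for a streamed tool call delta (not a vLLM class)."""

    index: int
    id: Optional[str] = None
    type: Optional[str] = None


def resolve_type(original_calls: list[ToolCallStub], index: int) -> str:
    """Return the type of the original call at `index`, else "function"."""
    for call in original_calls:
        if call.index == index and call.type is not None:
            return call.type
    # No matching original tool call: fall back to the default named in the PR title.
    return "function"


originals = [ToolCallStub(index=0, id="call_1", type="function")]
matched = resolve_type(originals, 0)
unmatched = resolve_type(originals, 1)
```

Both lookups yield "function" here; the point is that the unmatched case no longer yields None.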

Test Plan

pytest tests/entrypoints/openai/chat_completion/test_serving_chat.py::TestCreateRemainingArgsDelta -v

Changed files

  • tests/entrypoints/openai/chat_completion/test_serving_chat.py (modified, +1/-1)
  • vllm/entrypoints/openai/chat_completion/serving.py (modified, +1/-1)

PR #38640: [BugFix] Fix streaming tool call with null type or id in final chunk

Description (problem / solution / changelog)

Purpose

Fix #38603

The final chunk of a streaming tool call sends {"type": null, "id": null, "name": null} in the tool_calls delta, causing AI_TypeValidationError in strict OpenAI-compatible clients. It generally happens with MTP (multi-token prediction) speculative decoding.

Root cause

_create_remaining_args_delta in vllm/entrypoints/openai/chat_completion/serving.py is called when the final engine step also produces tool call content. It creates a replacement DeltaToolCall to stream the remaining arguments. For a continuation chunk, where the tool parser's last delta carries only arguments, the original code explicitly passed None for id, type, and name. The streaming loop serializes chunks with model_dump_json(exclude_unset=True), which still includes fields that were set to None and emits them as JSON null. Clients therefore receive a null type.
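
The set-versus-unset distinction is the crux of the bug, so here is a minimal stdlib-only sketch of it. TrackedModel is a hypothetical stand-in that only mimics how an exclude_unset-style dump treats explicitly passed fields. It is neither Pydantic nor vLLM's DeltaToolCall.

```python
import json
from typing import Any


class TrackedModel:
    """Hypothetical stand-in that records which fields were passed explicitly."""

    defaults = {"index": None, "id": None, "type": None}

    def __init__(self, **fields: Any) -> None:
        # Remember the names the caller actually passed, even if the value is None.
        self.fields_set = set(fields)
        self.values = {**self.defaults, **fields}

    def dump(self, exclude_unset: bool = False) -> dict[str, Any]:
        if exclude_unset:
            return {k: v for k, v in self.values.items() if k in self.fields_set}
        return dict(self.values)


# Before the fix: None is passed explicitly, so id and type count as "set".
before = TrackedModel(index=0, id=None, type=None)
# After the fix: the fields are simply not passed.
after = TrackedModel(index=0)

before_json = json.dumps(before.dump(exclude_unset=True))
after_json = json.dumps(after.dump(exclude_unset=True))
print(before_json)  # {"index": 0, "id": null, "type": null}
print(after_json)   # {"index": 0}
```

Excluding unset fields does not help when the caller sets them to None; the constructor call itself has to leave them out.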

The OpenAI OpenAPI spec ChatCompletionMessageToolCallChunk defines id and type as optional but not nullable. Only index is required. Sending null for a non-nullable field violates the schema.
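
To make "optional but not nullable" concrete, the following sketch checks a parsed tool call chunk against the rule as stated above. The function name tool_call_chunk_errors is made up for illustration, and the check covers only index, id, and type rather than the full schema.

```python
from typing import Any


def tool_call_chunk_errors(chunk: dict[str, Any]) -> list[str]:
    """Return the rule violations found in one tool_calls delta entry."""
    errors = []
    # index is the only required field.
    if not isinstance(chunk.get("index"), int):
        errors.append("index is required and must be an integer")
    for key in ("id", "type"):
        # Optional: the key may be absent. Not nullable: if present, it must be a string.
        if key in chunk and not isinstance(chunk[key], str):
            errors.append(f"{key} must be a string when present, got {chunk[key]!r}")
    return errors


omitted = tool_call_chunk_errors({"index": 0, "function": {"arguments": 'is"}'}})
nulled = tool_call_chunk_errors({"index": 0, "id": None, "type": None})
```

Under this rule the chunk that omits id and type is valid, while the one that sends them as null produces two errors.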

Test Plan

I have added a new test for streaming output.

pytest tests/entrypoints/openai/chat_completion/test_serving_chat.py::TestCreateRemainingArgsDelta -v

Test Result

Before fix

============================================================================ test session starts =============================================================================
platform linux -- Python 3.12.3, pytest-9.0.2, pluggy-1.6.0 -- /root/miniconda3/bin/python
cachedir: .pytest_cache
rootdir: /root/autodl-tmp/vllm
configfile: pyproject.toml
plugins: asyncio-1.3.0, anyio-4.9.0
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 5 items                                                                                                                                                            

tests/entrypoints/openai/chat_completion/test_serving_chat.py::TestCreateRemainingArgsDelta::test_preserves_id_type_name PASSED                                        [ 20%]
tests/entrypoints/openai/chat_completion/test_serving_chat.py::TestCreateRemainingArgsDelta::test_matches_by_index PASSED                                              [ 40%]
tests/entrypoints/openai/chat_completion/test_serving_chat.py::TestCreateRemainingArgsDelta::test_no_matching_tool_call PASSED                                         [ 60%]
tests/entrypoints/openai/chat_completion/test_serving_chat.py::TestCreateRemainingArgsDelta::test_function_is_none PASSED                                              [ 80%]
tests/entrypoints/openai/chat_completion/test_serving_chat.py::TestCreateRemainingArgsDelta::test_continuation_chunk_no_null_type FAILED                               [100%]
================================================================================== FAILURES ==================================================================================
_____________________________________________________ TestCreateRemainingArgsDelta.test_continuation_chunk_no_null_type ______________________________________________________

self = <tests.entrypoints.openai.chat_completion.test_serving_chat.TestCreateRemainingArgsDelta object at 0x7f6ffe9021e0>

    def test_continuation_chunk_no_null_type(self):
        """Test that continuation chunks (no id/type) don't serialize null type.
    
        Regression test for https://github.com/vllm-project/vllm/issues/38603:
        the final streaming chunk with finish_reason='tool_calls' contained
        type=null in the tool_calls delta, causing AI_TypeValidationError in
        strict clients such as Vercel AI SDK.
    
        Per the OpenAI streaming spec, type/id only appear in the first chunk
        for each tool call; continuation chunks must omit them entirely (not
        send null) so that exclude_unset=True serialization excludes them.
        """
        from vllm.entrypoints.openai.chat_completion.serving import OpenAIServingChat
        from vllm.entrypoints.openai.engine.protocol import (
            DeltaFunctionCall,
            DeltaMessage,
            DeltaToolCall,
        )
    
        # Simulate a continuation chunk: the tool parser's last delta has
        # arguments but no id/type/name (those appeared in the first chunk).
        original_delta = DeltaMessage(
            tool_calls=[
                DeltaToolCall(
                    index=0,
                    function=DeltaFunctionCall(arguments='{"location": "Par'),
                )
            ]
        )
    
        result = OpenAIServingChat._create_remaining_args_delta(
            original_delta, 'is"}', 0
        )
    
        assert len(result.tool_calls) == 1
        tc = result.tool_calls[0]
        assert tc.index == 0
        assert tc.id is None
        assert tc.type is None
        assert tc.function.name is None
        assert tc.function.arguments == 'is"}'
    
        # The key assertion: null fields must NOT appear in serialized output.
        # Before the fix, id/type/name were explicitly set to None and would
        # serialize as {"id": null, "type": null, ...} failing strict clients.
        serialized = tc.model_dump(exclude_unset=True)
>       assert "id" not in serialized, (
            "id should not appear in serialized continuation chunk"
        )
E       AssertionError: id should not appear in serialized continuation chunk
E       assert 'id' not in {'function': {'arguments': 'is"}', 'name': None}, 'id': None, 'index': 0, 'type': None}

tests/entrypoints/openai/chat_completion/test_serving_chat.py:264: AssertionError
============================================================================== warnings summary ==============================================================================
<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyPacked has no __module__ attribute

<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyObject has no __module__ attribute

../../miniconda3/lib/python3.12/site-packages/torch/jit/_script.py:362: 14 warnings
  /root/miniconda3/lib/python3.12/site-packages/torch/jit/_script.py:362: DeprecationWarning: `torch.jit.script_method` is deprecated. Please switch to `torch.compile` or `torch.export`.
    warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
========================================================================== short test summary info ===========================================================================
FAILED tests/entrypoints/openai/chat_completion/test_serving_chat.py::TestCreateRemainingArgsDelta::test_continuation_chunk_no_null_type - AssertionError: id should not appear in serialized continuation chunk

After fix — all tests in TestCreateRemainingArgsDelta pass:

============================================================================ test session starts =============================================================================
platform linux -- Python 3.12.3, pytest-9.0.2, pluggy-1.6.0 -- /root/miniconda3/bin/python
cachedir: .pytest_cache
rootdir: /root/autodl-tmp/vllm
configfile: pyproject.toml
plugins: asyncio-1.3.0, anyio-4.9.0
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 5 items                                                                                                                                                            

tests/entrypoints/openai/chat_completion/test_serving_chat.py::TestCreateRemainingArgsDelta::test_preserves_id_type_name PASSED                                        [ 20%]
tests/entrypoints/openai/chat_completion/test_serving_chat.py::TestCreateRemainingArgsDelta::test_matches_by_index PASSED                                              [ 40%]
tests/entrypoints/openai/chat_completion/test_serving_chat.py::TestCreateRemainingArgsDelta::test_no_matching_tool_call PASSED                                         [ 60%]
tests/entrypoints/openai/chat_completion/test_serving_chat.py::TestCreateRemainingArgsDelta::test_function_is_none PASSED                                              [ 80%]
tests/entrypoints/openai/chat_completion/test_serving_chat.py::TestCreateRemainingArgsDelta::test_continuation_chunk_no_null_type PASSED                               [100%]

============================================================================== warnings summary ==============================================================================
<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyPacked has no __module__ attribute

<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyObject has no __module__ attribute

../../miniconda3/lib/python3.12/site-packages/torch/jit/_script.py:362: 14 warnings
  /root/miniconda3/lib/python3.12/site-packages/torch/jit/_script.py:362: DeprecationWarning: `torch.jit.script_method` is deprecated. Please switch to `torch.compile` or `torch.export`.
    warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================================================================= 5 passed, 16 warnings in 3.25s =======================================================================
sys:1: DeprecationWarning: builtin type swigvarlink has no __module__ attribute

Changed files

  • tests/entrypoints/openai/chat_completion/test_serving_chat.py (modified, +43/-0)
  • vllm/entrypoints/openai/chat_completion/serving.py (modified, +11/-13)

Your current environment

vLLM version: v0.17.0.rc1 (with vLLM ascend plugin)

Model: Qwen3.5-397B w8a8

Hardware: Atlas800I A2 * 2

Deployment: vllm serve with tool calling enabled

vllm serve /data/Qwen3__5-397B-A17B-w8a8-mtp --host 0.0.0.0 --port 8005 --headless --data-parallel-size 2 --data-parallel-size-local 1 --data-parallel-start-rank 1 --data-parallel-address $node0_ip --data-parallel-rpc-port 13389 --seed 1024 --tensor-parallel-size 8 --served-model-name qwen3.5 --max-num-seqs 16 --max-model-len 196608 --max-num-batched-tokens 8096 --reasoning-parser qwen3 --enable-auto-tool-choice --tool-call-parser qwen3_coder --enable-expert-parallel --trust-remote-code --async-scheduling --gpu-memory-utilization 0.9 --no-enable-prefix-caching --speculative-config '{"method": "qwen3_5_mtp", "num_speculative_tokens": 3, "enforce_eager": true}' --compilation-config '{"cudagraph_mode":"FULL_DECODE_ONLY"}' --additional-config '{"enable_cpu_binding":true, "multistream_overlap_shared_expert": true}'

🐛 Describe the bug

Problem: When streaming tool calls, the last chunk (with finish_reason: "tool_calls") contains a non-empty tool_calls array in which every field is an empty string. This causes client-side type validation to fail with AI_TypeValidationError.

[["id":"chatcmpl-9f27290b4e514e18bbbc1f026380a951","model":"qwen3.5","object":"chat.completion.chunk","created":1774600400,"choices":[{"index":0,"delta":{"role":"assistant","content":"","reasoning_content":"","function_call":{"name":"","arguments":""},"refusal":"","tool_calls":[{"id":"","type":"","function":{"name":"","arguments":""}}],"index":1}},{"extra_fields":null},"logprobs":null,"finish_reason":"tool_calls"}],"usage":{"prompt_tokens":44617,"completion_tokens":167,"total_tokens":44784,"completion_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":0,"rejected_prediction_tokens":0,"cached_tokens":0,"prompt_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":0,"rejected_prediction_tokens":0,"cached_tokens":0},"prompt_cache_hit_tokens":0,"prompt_cache_miss_tokens":44617,"cache_read_input_tokens":0,"cache_creation_input_tokens":0,"prompt_cache_write_tokens":0,"completion_thinking_tokens":0,"credit":0.45}}}
[2026-03-27 16:33:36.767] [Info] [BaseAgent:craft] agent model: {"id":"qwen3.5","name":"qwen3.5","baseUrl":"****","apiKey":"","supportsImages":true,"disabledMultimodal":false,"maxInputTokens":64000,"maxAllowedSize":192000,"maxOutputTokens":128000,"maxImageCount":20}
[2026-03-27 16:33:36.770] [Info] [BaseAgent:craft] _handleStepError: state changed from running to error, messageId: 9f27290b4e514e18bbbc1f026380a951, stack: AI_TypeValidationError: Type validation failed: Value 
{"id":"chatcmpl-9f27290b4e514e18bbbc1f026380a951","model":"qwen3.5","object":"chat.completion.chunk","created":1774600400,"choices":[{"index":0,"delta":{"role":"assistant","content":"","reasoning_content":"","function_call":{"name":"","arguments":""},"refusal":"","tool_calls":[{"id":"","type":"","function":{"name":"","arguments":""}}],"index":1}},{"extra_fields":null},"logprobs":null,"finish_reason":"tool_calls"}],"usage":{"prompt_tokens":44617,"completion_tokens":167,"total_tokens":44784,"completion_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":0,"rejected_prediction_tokens":0,"cached_tokens":0,"prompt_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":0,"rejected_prediction_tokens":0,"cached_tokens":0},"prompt_cache_hit_tokens":0,"prompt_cache_miss_tokens":44617,"cache_read_input_tokens":0,"cache_creation_input_tokens":0,"prompt_cache_write_tokens":0,"completion_thinking_tokens":0,"credit":0.45}}). Error message: [ "code": "invalid_union", "unionErrors": [

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

extent analysis

TL;DR

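The final streaming chunk carries id, type, and name in the tool_calls delta as null (the report shows empty strings), which strict clients reject. The linked PRs address this server-side in _create_remaining_args_delta: PR #38640 omits those fields on continuation chunks, and PR #38609 defaults type to "function" when no matching original tool call is found.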

Guidance

  • Inspect the final chunk (finish_reason: "tool_calls") and check whether its tool_calls delta carries id, type, or function.name as null or as an empty string.
  • On the server, the relevant code is _create_remaining_args_delta in vllm/entrypoints/openai/chat_completion/serving.py, which builds the replacement DeltaToolCall for the remaining arguments.
  • Continuation chunks should omit id, type, and name entirely rather than set them to None, so that exclude_unset=True serialization drops them.
  • If the server cannot be updated, pre-process on the client: drop null or empty id, type, and name fields from tool_calls deltas before type validation.
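
The client-side option in the last bullet can be sketched as a small pre-processing step applied to each parsed chunk before validation. This is a workaround sketch under stated assumptions, not part of either PR: the function name strip_empty_tool_call_fields is hypothetical, and it assumes the chunk has already been parsed from JSON into a dict.

```python
import copy
from typing import Any


def strip_empty_tool_call_fields(chunk: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `chunk` with null or empty id/type/name removed
    from every tool_calls delta entry."""
    cleaned = copy.deepcopy(chunk)
    for choice in cleaned.get("choices", []):
        delta = choice.get("delta") or {}
        for call in delta.get("tool_calls") or []:
            for key in ("id", "type"):
                if call.get(key) in (None, ""):
                    call.pop(key, None)
            function = call.get("function") or {}
            if function.get("name") in (None, ""):
                function.pop("name", None)
    return cleaned


final_chunk = {
    "choices": [
        {
            "index": 0,
            "delta": {
                "tool_calls": [
                    {
                        "index": 0,
                        "id": None,
                        "type": "",
                        "function": {"name": None, "arguments": 'is"}'},
                    }
                ]
            },
            "finish_reason": "tool_calls",
        }
    ]
}
cleaned_chunk = strip_empty_tool_call_fields(final_chunk)
```

The function works on a deep copy, so the original chunk stays available for logging. The arguments field is left untouched because continuation chunks legitimately carry it.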

Example

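The regression test test_continuation_chunk_no_null_type in PR #38640 reproduces the problem. It builds a continuation DeltaToolCall with only index and arguments, passes it through _create_remaining_args_delta, and asserts that id and type do not appear in model_dump(exclude_unset=True).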

Notes

Per PR #38640, the root cause is that _create_remaining_args_delta explicitly passed None for id, type, and name on continuation chunks, and serialization with exclude_unset=True still emits those fields as JSON null. The issue report shows empty strings where the PR describes null; the source does not explain the difference.

Recommendation

Use a vLLM build that includes the fix from the linked PRs, so that continuation chunks no longer carry null id, type, or name. Until then, a client-side workaround is to drop null or empty id, type, and name fields from tool_calls deltas before type validation.
