vllm - 💡(How to fix) Fix [Bug]: Chat completions emits empty tool_calls arrays after tool results [3 pull requests]

Fix Action

Fixed

Your current environment

Observed on both:

  • vLLM 0.20.2 pinned checkout: bc150f5
  • vLLM main/latest checkout tested locally: 7bd738988

🐛 Describe the bug

The OpenAI-compatible chat completions API can serialize a normal assistant response with an empty tool_calls array after the client returns a tool result.

The response is semantically a final assistant message:

  • finish_reason is "stop"
  • message.content / stream delta.content contains normal text
  • there is no tool call to execute

However, the JSON payload still contains "tool_calls": []. The OpenAI Python SDK parses this into an empty list rather than None, so client loops that test message.tool_calls is not None treat the response as another tool-call turn and then raise IndexError when they index the empty list.

Example client failure:

while assistant_output.tool_calls is not None:
    tool_call = assistant_output.tool_calls[0]
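
A client can guard against this by testing truthiness rather than `is not None`, since both `None` and an empty list are falsy in Python. The following is a minimal sketch of that workaround, not the server-side fix. `Message` is a hypothetical stand-in for the SDK message type, and `has_tool_calls`, `run_tool_loop`, and `next_message` are illustrative names, not part of any library.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Message:
    """Hypothetical stand-in for an SDK assistant message (not the real SDK type)."""
    content: Optional[str] = None
    tool_calls: Optional[list] = None


def has_tool_calls(message: Message) -> bool:
    # Both None and [] are falsy, so an empty array counts as "no tool calls".
    return bool(message.tool_calls)


def run_tool_loop(first: Message, next_message: Callable[[object], Message]) -> Message:
    """Execute tool calls until the assistant returns a message without any.

    `next_message` is an assumed callback that runs one tool call and
    returns the assistant's next message.
    """
    message = first
    while has_tool_calls(message):
        tool_call = message.tool_calls[0]  # safe: the list is non-empty here
        message = next_message(tool_call)
    return message
```

With this check, a final message carrying `tool_calls=[]` ends the loop instead of raising `IndexError`.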

Actual vLLM payload shape after the tool result:

{
  "choices": [
    {
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "...final answer...",
        "tool_calls": []
      }
    }
  ]
}

Expected payload shape:

{
  "choices": [
    {
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "...final answer..."
      }
    }
  ]
}
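
Until the server omits the field, a client that works with raw JSON could normalize the actual shape into the expected one. This is a sketch under the assumption that the payload is a plain dict shaped like the examples above. `drop_empty_tool_calls` is a hypothetical helper, not a vLLM or OpenAI SDK function.

```python
import copy


def drop_empty_tool_calls(response: dict) -> dict:
    """Return a copy of a chat completion payload without empty tool_calls arrays."""
    cleaned = copy.deepcopy(response)
    for choice in cleaned.get("choices", []):
        message = choice.get("message")
        # Remove the key only when it holds an empty list; real tool calls are kept.
        if isinstance(message, dict) and message.get("tool_calls") == []:
            del message["tool_calls"]
    return cleaned
```

The helper copies the payload first so the original response stays available for logging or debugging.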

The same issue appears in streaming chunks as delta.tool_calls: [] alongside normal text deltas.
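
For streaming, a client that accumulates deltas can treat an empty array the same as a missing field. The sketch below assumes each chunk is a plain dict with a `choices[].delta` layout like the non-streaming examples. `collect_stream` is a hypothetical helper, and it gathers raw tool-call deltas without merging them by index.

```python
from typing import Iterable, List, Tuple


def collect_stream(chunks: Iterable[dict]) -> Tuple[str, List[dict]]:
    """Accumulate text and tool-call deltas, ignoring empty tool_calls arrays."""
    text_parts: List[str] = []
    tool_call_deltas: List[dict] = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("content"):
                text_parts.append(delta["content"])
            # `or []` covers a missing key, an explicit null, and an empty array.
            tool_call_deltas.extend(delta.get("tool_calls") or [])
    return "".join(text_parts), tool_call_deltas
```

A caller can then decide whether the turn was a tool-call turn by checking whether the returned list is non-empty.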

This is not a DeepSeek V4 tool-call parser failure and not a model accuracy issue. The model returns the expected final natural-language answer after the tool result; the API response serializer emits a misleading empty tool-call field.

Before submitting a new issue...

  • I have searched the existing issues and did not find a duplicate.
