vllm - 💡(How to fix) Fix [Bug]: input_audio content with uuid is parsed incorrectly [2 pull requests]

Error Message

AssertionError: Expected code to be unreachable, but got: None

Root Cause

This happens because the uuid case enters the multimodal compatibility parsing path. In that path, vLLM passes the whole content part to the audio parser instead of the nested input_audio object.

Fix Action

Fixed


Your current environment

<details> <summary>The output of <code>python collect_env.py</code></summary>
Collecting environment information...

  System Info
  OS                           : Ubuntu 22.04.5 LTS (x86_64)
  GCC version                  : 11.4.0
  Libc version                 : glibc-2.35

  PyTorch Info
  PyTorch version              : 2.11.0+cu130
  CUDA used to build PyTorch   : 13.0

  Python Environment
  Python version               : 3.12.13
  Python platform              : Linux-6.8.0-107-generic-x86_64-with-glibc2.35

  CUDA / GPU Info
  Is CUDA available            : True
  GPU models and configuration :
  GPU 0-7                      : NVIDIA A100-SXM4-80GB
  Nvidia driver version        : 580.126.20

  CPU Info
  CPU(s)                       : 176
  Model name                   : Intel(R) Xeon(R) Platinum 8458P
  Socket(s)                    : 2
  Core(s) per socket           : 44
  Thread(s) per core           : 2

  Versions of relevant libraries
  flashinfer-python            : 0.6.8.post1
  numpy                        : 2.2.6
  torch                        : 2.11.0+cu130
  torchaudio                   : 2.11.0+cu130
  torchvision                  : 0.26.0+cu130
  transformers                 : 5.8.1
  triton                       : 3.6.0

  vLLM Info
  vLLM Version                 : 0.21.0
  CUDA Archs                   : 7.5 8.0 8.6 8.9 9.0 10.0 12.0+PTX
  ROCm                         : Disabled
  XPU                          : Disabled

  Environment Variables
  NVIDIA_VISIBLE_DEVICES       : all
  CUDA_VERSION                 : 13.0.2
  VLLM_USAGE_SOURCE            : production-docker-image
  VLLM_ENABLE_CUDA_COMPATIBILITY: 0
  LD_LIBRARY_PATH              : /usr/local/nvidia/lib64:/usr/local/cuda/lib64:...
  TORCHINDUCTOR_CACHE_DIR      : /tmp/torchinductor_root
</details>

🐛 Describe the bug

When a request to the OpenAI-compatible /v1/chat/completions endpoint contains an input_audio content part with a uuid, vLLM returns HTTP 500:

AssertionError: Expected code to be unreachable, but got: None

This happens because the uuid case enters the multimodal compatibility parsing path. In that path, vLLM passes the whole content part to the audio parser instead of the nested input_audio object.

Current behavior:

input_audio_params = cast(dict[str, str], part)

Expected behavior:

input_audio_params = cast(InputAudio, part["input_audio"])

The non-uuid input_audio path already returns part["input_audio"], so the uuid path should use the same payload shape.
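The shape mismatch described above can be sketched in isolation. This is a minimal illustration, not vLLM's code: `InputAudio` here is a local stand-in type, and `load_audio`, `params_from_uuid_path_buggy`, and `params_from_uuid_path_fixed` are hypothetical names. In this sketch the wrong shape surfaces as a `KeyError`; in vLLM the report shows it surfacing as the `AssertionError` instead.

```python
from typing import Any, TypedDict, cast


class InputAudio(TypedDict):
    """Local stand-in for the nested audio payload (not vLLM's own class)."""
    data: str
    format: str


def load_audio(params: InputAudio) -> tuple[str, str]:
    """Hypothetical downstream parser: expects 'data' and 'format' at top level."""
    return params["data"], params["format"]


def params_from_uuid_path_buggy(part: dict[str, Any]) -> dict[str, str]:
    # Mirrors the reported current behavior: the whole content part is forwarded.
    return cast(dict[str, str], part)


def params_from_uuid_path_fixed(part: dict[str, Any]) -> InputAudio:
    # Mirrors the expected behavior: only the nested object is forwarded.
    return cast(InputAudio, part["input_audio"])


part = {
    "type": "input_audio",
    "input_audio": {"data": "<base64_audio>", "format": "wav"},
    "uuid": "audio-smoke-uuid-001",
}

buggy = params_from_uuid_path_buggy(part)
fixed = params_from_uuid_path_fixed(part)

print(sorted(buggy))      # ['input_audio', 'type', 'uuid']
print(load_audio(fixed))  # ('<base64_audio>', 'wav')
```

Note that `cast` performs no runtime conversion or validation, so the buggy variant type-checks while still handing the parser a dictionary that has no top-level `data` or `format` keys.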

Minimal Reproducible Example

Send this request to /v1/chat/completions:

{
  "model": "gemma-4-E2B-it",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Please describe this audio in one sentence."
        },
        {
          "type": "input_audio",
          "input_audio": {
            "data": "<base64_audio>",
            "format": "wav"
          },
          "uuid": "audio-smoke-uuid-001"
        }
      ]
    }
  ],
  "max_tokens": 16,
  "temperature": 0
}

The key part is that the same input_audio content part carries a uuid:

{
  "type": "input_audio",
  "input_audio": {
    "data": "<base64_audio>",
    "format": "wav"
  },
  "uuid": "audio-smoke-uuid-001"
}
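For convenience, the request above could be built and sent from Python roughly as follows. This is a sketch using only the standard library; the helper names `build_body` and `post_chat`, the base URL `http://localhost:8000`, and the idea of reading a local WAV file are assumptions for illustration and are not part of the report.

```python
import base64
import json
import urllib.error
import urllib.request


def build_body(audio_bytes: bytes, uuid: str, model: str = "gemma-4-E2B-it") -> dict:
    """Build the chat-completions body from the report, with a uuid on the audio part."""
    audio_b64 = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Please describe this audio in one sentence."},
                {
                    "type": "input_audio",
                    "input_audio": {"data": audio_b64, "format": "wav"},
                    "uuid": uuid,  # the field that triggers the reported failure
                },
            ],
        }],
        "max_tokens": 16,
        "temperature": 0,
    }


def post_chat(body: dict, base_url: str = "http://localhost:8000") -> tuple[int, str]:
    """POST the body and return (status, response text). base_url is an assumption."""
    req = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Error responses (such as the HTTP 500 in this report) arrive as HTTPError.
        return err.code, err.read().decode("utf-8")
```

A typical call would read the audio bytes from a file of your choosing and run `post_chat(build_body(audio_bytes, "audio-smoke-uuid-001"))`, then inspect the returned status and text.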

Expected behavior

vLLM should pass part["input_audio"] to the downstream audio parser and should not fail with:

Expected code to be unreachable, but got: None

If the audio payload itself is invalid or unsupported, the request should continue into the normal audio loading path and return the appropriate audio validation error.

Actual behavior

The request returns HTTP 500:

{
  "message": "Expected code to be unreachable, but got: None",
  "type": "InternalServerError",
  "param": null,
  "code": 500
}
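When checking whether a given deployment is affected, it may help to tell this specific failure apart from an ordinary audio validation error. The helper below is a hypothetical sketch: `is_uuid_parsing_bug` is not a vLLM function, and the handling of an optional `"error"` wrapper around the body is an assumption, since the report only shows the flat form.

```python
import json

UNREACHABLE_MESSAGE = "Expected code to be unreachable"


def is_uuid_parsing_bug(status: int, body_text: str) -> bool:
    """Return True when a response matches the HTTP 500 shown in this report."""
    if status != 500:
        return False
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    # Assumption: accept both the flat body shown above and a body
    # nested under an "error" key.
    error = payload.get("error", payload)
    message = error.get("message", "") if isinstance(error, dict) else ""
    return isinstance(message, str) and UNREACHABLE_MESSAGE in message


reported = json.dumps({
    "message": "Expected code to be unreachable, but got: None",
    "type": "InternalServerError",
    "param": None,
    "code": 500,
})
print(is_uuid_parsing_bug(500, reported))  # True
```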


