vllm

1877 issues found

[Bug]: explicit `parallel_tool_calls: null` is filtered like false instead of the documented "true" default

6/9/2026

[Bug]: vLLM v0.22.1 in 8 * H20-3e 141G environment deploying DeepSeek-V4-Pro reports error "mlir_global_dtors() missing 1 required positional argument: 'data'".

6/9/2026

[Bug]: [V1] `prompt_tokens_details` always `null` in API response despite `--enable-prompt-tokens-details` (14+ months, incomplete fix in PR #18149)

6/9/2026

[Bug]: [ROCm][FLA] chunk_gated_delta_rule Triton compilation fails on MI210/gfx90a with num_stages=4

6/9/2026

[Bug]: vllm-0.22.0 fail to run "Qwen/Qwen3.5-9B" in offline `LLM` mode

6/9/2026

[Bug]: vllm serve allendou/Fun-ASR-Nano-2512-vllm reports an error in version 0.22.1

6/9/2026

[RFC]: Fast Runtime Resize of a Pre-Built EP Topology

6/9/2026

[Usage]: vLLM logs orjson warning

6/9/2026

[Feature]: triton_scaled_mm uses an AMD-tuned fixed tile heuristic — leaves up to 1.82x on NVIDIA H800 (Hopper sm_90)

6/9/2026

[Bug]: Structured outputs + pipeline parallelism on the V1 model runner: xgrammar FSM desync ("Failed to advance FSM")

6/9/2026

[Feature Request] SLEM (String-Level Exact Match) speculative decoding for heterogeneous vocabularies

6/9/2026

[Bug]: nccl error: invalid usage

6/9/2026

[RFC] Add Top-nσ logit truncation example via custom logits processor

6/9/2026

[Installation]: Blocked by CVE-2025-30165 & CVE-2024-11041 (Legacy V0 Engine)

6/9/2026

[Bug]: TRITON_MLA grouped-decode kernel fails for Mistral Small 4 (kv_lora_rank=256): 'Cannot make_shape_compatible: 256 and 512'

6/9/2026

[Bug]: Kimi-K2.5 `AssertionError: Workspace is locked` On MI355X

6/9/2026

[RFC]: [Roadmap] Mooncake Store Connector Feature Enrichment Roadmap

6/9/2026

[Bug]: 'Gemma4UnifiedVisionConfig' object has no attribute 'num_soft_tokens'

6/9/2026

[Feature]: Restore support for 2-bit and 3-bit GPTQ

6/9/2026

[RFC]: Profiler support for CUDA graph capture tracing and roofline trace annotations

6/9/2026

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

vllm

[Bug]: explicit `parallel_tool_calls: null` is filtered like false instead of the documented "true" default

[Bug]: vLLM v0.22.1 in 8 * H20-3e 141G environment deploying DeepSeek-V4-Pro reports error "mlir_global_dtors() missing 1 required positional argument: 'data'".

[Bug]: [V1] `prompt_tokens_details` always `null` in API response despite `--enable-prompt-tokens-details` (14+ months, incomplete fix in PR #18149)

[Bug]: [ROCm][FLA] chunk_gated_delta_rule Triton compilation fails on MI210/gfx90a with num_stages=4

[Bug]: vllm-0.22.0 fail to run "Qwen/Qwen3.5-9B" in offline `LLM` mode

[Bug]: vllm serve allendou/Fun-ASR-Nano-2512-vllm reports an error in version 0.22.1

[RFC]: Fast Runtime Resize of a Pre-Built EP Topology

[Usage]: vLLM logs orjson warning

[Feature]: triton_scaled_mm uses an AMD-tuned fixed tile heuristic — leaves up to 1.82x on NVIDIA H800 (Hopper sm_90)

[Bug]: Structured outputs + pipeline parallelism on the V1 model runner: xgrammar FSM desync ("Failed to advance FSM")

[Feature Request] SLEM (String-Level Exact Match) speculative decoding for heterogeneous vocabularies

[Bug]: nccl error: invalid usage

[RFC] Add Top-nσ logit truncation example via custom logits processor

[Installation]: Blocked by CVE-2025-30165 & CVE-2024-11041 (Legacy V0 Engine)

[Bug]: TRITON_MLA grouped-decode kernel fails for Mistral Small 4 (kv_lora_rank=256): 'Cannot make_shape_compatible: 256 and 512'

[Bug]: Kimi-K2.5 `AssertionError: Workspace is locked` On MI355X

[RFC]: [Roadmap] Mooncake Store Connector Feature Enrichment Roadmap

[Bug]: 'Gemma4UnifiedVisionConfig' object has no attribute 'num_soft_tokens'

[Feature]: Restore support for 2-bit and 3-bit GPTQ

[RFC]: Profiler support for CUDA graph capture tracing and roofline trace annotations