[Bug]: `vllm serve --config=...` accepts the config path but does not expand the YAML

Your current environment

Environment

Reproduced against the latest official CPU release wheel: vllm==0.21.0+cpu.

Collecting environment information...
uv is set
==============================
        System Info
==============================
OS                           : Ubuntu 24.04.4 LTS (x86_64)
GCC version                  : (Ubuntu 13.3.0-6ubuntu2~24.04.1) 13.3.0
Clang version                : 18.1.3 (1ubuntu1)
CMake version                : Could not collect
Libc version                 : glibc-2.39

==============================
       PyTorch Info
==============================
PyTorch version              : 2.11.0+cpu
Is debug build               : False
CUDA used to build PyTorch   : None
ROCM used to build PyTorch   : N/A
XPU used to build PyTorch    : N/A

==============================
      Python Environment
==============================
Python version               : 3.12.3 (main, Mar 23 2026, 19:04:32) [GCC 13.3.0] (64-bit runtime)
Python platform              : Linux-6.8.0-111-generic-x86_64-with-glibc2.39


==============================
          CPU Info
==============================
Architecture:                            x86_64
CPU op-mode(s):                          32-bit, 64-bit
Address sizes:                           48 bits physical, 48 bits virtual
Byte Order:                              Little Endian
CPU(s):                                  6
On-line CPU(s) list:                     0-5
Vendor ID:                               AuthenticAMD
Model name:                              AMD EPYC Processor (with IBPB)
CPU family:                              23
Model:                                   1
Thread(s) per core:                      1
Core(s) per socket:                      6
Socket(s):                               1
Stepping:                                2
BogoMIPS:                                5589.49
Flags:                                   fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext ssbd ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 arat
Hypervisor vendor:                       KVM
Virtualization type:                     full
L1d cache:                               192 KiB (6 instances)
L1i cache:                               384 KiB (6 instances)
L2 cache:                                3 MiB (6 instances)
L3 cache:                                8 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-5
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Mitigation; untrained return thunk; SMT disabled
Vulnerability Spec rstack overflow:      Vulnerable: Safe RET, no microcode
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; Retpolines; IBPB conditional; STIBP disabled; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

==============================
Versions of relevant libraries
==============================
[pip3] numpy==2.3.5
[pip3] pyzmq==27.1.0
[pip3] torch==2.11.0+cpu
[pip3] transformers==5.9.0
[pip3] triton==3.7.0
[conda] Could not collect

==============================
         vLLM Info
==============================
ROCM Version                 : Could not collect
vLLM Version                 : 0.21.0
vLLM Build Flags:
  CUDA Archs: Not Set; ROCm: Disabled; XPU: Disabled
GPU Topology:
  Could not collect

==============================
     Environment Variables
==============================
PYTORCH_NVML_BASED_CUDA_CHECK=1
TORCHINDUCTOR_COMPILE_THREADS=1

🐛 Describe the bug

`vllm serve` treats `--config=serve.yaml` as evidence that a config was supplied, but the YAML config is not expanded into CLI arguments.

The parser has two different checks:

not any(re.match(r"^--config(=.+|$)", arg) for arg in args)

recognizes both `--config path` and `--config=path`, while:

if "--config" in args:
    args = self._pull_args_from_config(args)

matches only the exact token `--config`, so it expands only the two-token spelling. As a result, `--config=serve.yaml` is parsed as an ordinary option value, but the YAML file contents are never inserted into the argument stream.
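The mismatch can be seen without vLLM installed. The sketch below copies the two expressions quoted above into standalone functions; the function names `config_flag_detected` and `config_expansion_triggered` are made up for illustration and do not exist in vLLM.

```python
import re


def config_flag_detected(args):
    # Same regex as the first check: matches "--config" and "--config=<value>".
    return any(re.match(r"^--config(=.+|$)", arg) for arg in args)


def config_expansion_triggered(args):
    # Same membership test as the second check: only an exact "--config" token counts.
    return "--config" in args


spaced = ["serve", "--config", "serve.yaml"]
equals = ["serve", "--config=serve.yaml"]

for argv in (spaced, equals):
    print(argv, config_flag_detected(argv), config_expansion_triggered(argv))
```

For the `--config=serve.yaml` form the first function returns `True` and the second returns `False`, which is the gap described in this report.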

Reproduction

The E2E repro uses the installed vLLM release wheel and the real `vllm.utils.argparse_utils.FlexibleArgumentParser`. It builds a minimal `serve` parser so no model is loaded and no server is started.

from pathlib import Path
from tempfile import TemporaryDirectory

from vllm.utils.argparse_utils import FlexibleArgumentParser


def make_serve_parser():
    parser = FlexibleArgumentParser(prog="vllm", add_json_tip=False)
    subparsers = parser.add_subparsers(dest="subparser")
    serve_parser = subparsers.add_parser("serve")
    serve_parser.add_argument("model_tag", nargs="?")
    serve_parser.add_argument("--config")
    serve_parser.add_argument("--model")
    serve_parser.add_argument("--port", type=int, default=8000)
    return parser


with TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "serve.yaml"
    config_path.write_text("model: model-from-yaml\nport: 8123\n")

    equals_ns = make_serve_parser().parse_args(
        ["serve", f"--config={config_path}"]
    )
    spaced_ns = make_serve_parser().parse_args(
        ["serve", "--config", str(config_path)]
    )

    print("--config=... model:", equals_ns.model)
    print("--config=... port:", equals_ns.port)
    print("--config path model:", spaced_ns.model)
    print("--config path port:", spaced_ns.port)

Actual behavior

`--config=...` retains the config path but does not load `model` or `port` from the YAML file. The equivalent `--config path` spelling loads both values.

Expected behavior

Both standard argparse spellings should load and expand the same YAML config:

vllm serve --config serve.yaml
vllm serve --config=serve.yaml

Latest-release E2E log

Command:

<REPRO_DIR>/.venv/bin/python <REPRO_DIR>/001_config_equals_bypasses_yaml_expansion_repro.py

Output:

parser class: vllm.utils.argparse_utils.FlexibleArgumentParser
installed vLLM version: 0.21.0+cpu
--config=... retained CLI config path: <TMPDIR>/serve.yaml
--config=... model from YAML: None
--config=... port from YAML: 8000
--config path model from YAML: model-from-yaml
--config path port from YAML: 8123
bug reproduced: True

Suggested fix

Normalize and expand both config spellings before normal argparse parsing. `_pull_args_from_config()` should accept both `--config path.yaml` and `--config=path.yaml`, reject duplicate config specifications across both forms, and preserve the existing CLI-over-config precedence.
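One possible shape for the normalization step is sketched below. `normalize_config_args` is a hypothetical helper, not an existing vLLM function; it only rewrites the argument list so that the existing `"--config" in args` check would see the two-token form, and it leaves YAML loading and CLI-over-config precedence to the current code. It is a sketch under those assumptions and does not handle edge cases such as a literal `--` separator.

```python
def normalize_config_args(args):
    """Rewrite "--config=<path>" as ["--config", "<path>"] and reject duplicates.

    Hypothetical helper for illustration only; not part of vLLM.
    """
    normalized = []
    config_count = 0
    for arg in args:
        if arg == "--config":
            config_count += 1
            normalized.append(arg)
        elif arg.startswith("--config="):
            config_count += 1
            path = arg.split("=", 1)[1]
            if not path:
                raise ValueError("--config= requires a path")
            normalized.extend(["--config", path])
        else:
            normalized.append(arg)
    if config_count > 1:
        raise ValueError("--config was specified more than once")
    return normalized


print(normalize_config_args(["serve", "--config=serve.yaml", "--port", "9000"]))
```

Running the normalization before the expansion check would let both spellings follow the same code path, and counting occurrences across both forms covers the duplicate-specification case mentioned above.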

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
