vllm - ✅(Solved) Fix [Bug]: 0.19.0 rocm+7900xtx: Failed to infer device type [1 pull request, 4 comments, 3 participants]

GitHub stats
vllm-project/vllm#39378 · Fetched 2026-04-09 07:51:27
Comments: 4
Participants: 3
Timeline: 18
Reactions: 0
Timeline (top): mentioned ×5, subscribed ×5, commented ×4, labeled ×2

Error Message

Traceback (most recent call last):

PR fix notes

PR #39653: [ROCm] Improve failed device detection diagnostics

Description (problem / solution / changelog)

Summary

Improve the Failed to infer device type error path so it reports the detected torch runtime and, when the environment looks like ROCm but torch does not expose HIP support, points users to the likely wheel/runtime mismatch instead of only telling them to enable debug logging.

This adds a small pure-Python helper for the diagnostic text and narrow regression coverage for the helper logic.
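As a rough illustration of what such a helper might look like, here is a minimal sketch. It is not the code from the PR: the function name `build_device_diagnostic`, its parameters, and the message wording are all hypothetical, chosen only to show the idea of reporting the detected torch runtime and hinting at a wheel/runtime mismatch.

```python
from typing import Optional


def build_device_diagnostic(
    torch_version: str,
    hip_version: Optional[str],
    cuda_version: Optional[str],
    looks_like_rocm: bool,
) -> str:
    """Hypothetical helper: compose the text for a failed device detection.

    All names here are assumptions for illustration, not vLLM's real API.
    """
    lines = [
        "Failed to infer device type.",
        # Report what the torch runtime says about itself.
        f"Detected torch {torch_version} "
        f"(HIP: {hip_version or 'not available'}, "
        f"CUDA: {cuda_version or 'not available'}).",
    ]
    if looks_like_rocm and hip_version is None:
        # ROCm appears to be installed, but torch was built without HIP.
        lines.append(
            "The environment looks like ROCm, but this torch build does not "
            "expose HIP support; the installed wheel may not match the ROCm "
            "runtime."
        )
    else:
        # Fall back to the original advice when no specific hint applies.
        lines.append("Set VLLM_LOGGING_LEVEL=DEBUG to turn on verbose logging.")
    return "\n".join(lines)
```

Keeping the helper free of any torch import, as in this sketch, is what makes it straightforward to cover with narrow unit tests.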

Why this is not duplicate work

I checked:

  • issue comments on #39378
  • open PRs referencing 39378
  • open PRs searching for Failed to infer device type
  • open PRs searching for device type ROCm wheel

The only open overlap I found is my docs PR #39639, which updates installation and troubleshooting docs for the same failure mode. This PR is the runtime-side complement: it changes the actual error users see when platform detection fails.

Tests run

.venv/bin/python -m pytest --noconftest tests/config/test_device_diagnostics.py -q
.venv/bin/python -m py_compile vllm/config/device.py vllm/config/device_diagnostics.py tests/config/test_device_diagnostics.py
git diff --check

Results:

  • tests/config/test_device_diagnostics.py: 3 passed
  • py_compile: passed
  • git diff --check: passed, aside from CRLF warnings from this Windows checkout

AI assistance

This PR was prepared with AI assistance. I reviewed the changes and test results before submission.

Changed files

  • tests/config/test_device_diagnostics.py (added, +59/-0)
  • vllm/config/device.py (modified, +20/-8)
  • vllm/config/device_diagnostics.py (added, +78/-0)

Your current environment

vLLM:0.19.0 rocm:7.2

🐛 Describe the bug

(base) root@kittyzero:# dpkg -l|grep rocm
ii  rocm 7.2.1.70201-8124.04 amd64 Radeon Open Compute (ROCm) software stack meta package
ii  rocm-cmake 0.14.0.70201-8124.04 amd64 rocm-cmake built using CMake
ii  rocm-core 7.2.1.70201-8124.04 amd64 ROCm Runtime software stack
ii  rocm-dbgapi 0.77.4.70201-8124.04 amd64 Library to provide AMD GPU debugger API
ii  rocm-debug-agent 2.1.0.70201-8124.04 amd64 Radeon Open Compute Debug Agent (ROCdebug-agent)
ii  rocm-developer-tools 7.2.1.70201-8124.04 amd64 Radeon Open Compute (ROCm) Runtime software stack
ii  rocm-device-libs 1.0.0.70201-8124.04 amd64 Radeon Open Compute - device libraries
ii  rocm-gdb 16.3.70201-8124.04 amd64 ROCgdb
ii  rocm-hip 7.2.1.70201-8124.04 amd64 Radeon Open Compute (ROCm) Runtime software stack
ii  rocm-llvm 22.0.0.26084.70201-8124.04 amd64 ROCm core compiler
ii  rocm-opencl 2.0.0.70201-8124.04 amd64 clr built using CMake
ii  rocm-opencl-dev 2.0.0.70201-8124.04 amd64 clr built using CMake
ii  rocm-opencl-sdk 7.2.1.70201-8124.04 amd64 Radeon Open Compute (ROCm) Runtime software stack
ii  rocm-openmp 7.2.1.70201-8124.04 amd64 Radeon Open Compute (ROCm) OpenMP Software development Kit.
ii  rocm-smi-lib 7.8.0.70201-8124.04 amd64 AMD System Management libraries
ii  rocminfo 1.0.0.70201-81~24.04 amd64 Radeon Open Compute (ROCm) Runtime rocminfo tool

(vLLM) root@kittyzero:~# python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen3.5-9B-Instruct-AWQ \
  --quantization awq \
  --gpu-memory-utilization 0.85 \
  --max-model-len 22000 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/root/miniconda3/envs/vLLM/lib/python3.13/site-packages/vllm/entrypoints/openai/api_server.py", line 706, in <module>
    parser = make_arg_parser(parser)
  File "/root/miniconda3/envs/vLLM/lib/python3.13/site-packages/vllm/entrypoints/openai/cli_args.py", line 349, in make_arg_parser
    parser = AsyncEngineArgs.add_cli_args(parser)
  File "/root/miniconda3/envs/vLLM/lib/python3.13/site-packages/vllm/engine/arg_utils.py", line 2264, in add_cli_args
    parser = EngineArgs.add_cli_args(parser)
  File "/root/miniconda3/envs/vLLM/lib/python3.13/site-packages/vllm/engine/arg_utils.py", line 1294, in add_cli_args
    vllm_kwargs = get_kwargs(VllmConfig)
  File "/root/miniconda3/envs/vLLM/lib/python3.13/site-packages/vllm/engine/arg_utils.py", line 369, in get_kwargs
    return copy.deepcopy(_compute_kwargs(cls))
                         ~~~~~~~~~~~~~~~^^^^^
  File "/root/miniconda3/envs/vLLM/lib/python3.13/site-packages/vllm/engine/arg_utils.py", line 280, in _compute_kwargs
    default = default.default_factory()  # type: ignore[call-arg]
  File "/root/miniconda3/envs/vLLM/lib/python3.13/site-packages/pydantic/_internal/_dataclasses.py", line 121, in __init__
    s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/vLLM/lib/python3.13/site-packages/vllm/config/device.py", line 56, in __post_init__
    raise RuntimeError(
    ...<3 lines>...
    )
RuntimeError: Failed to infer device type, please set the environment variable VLLM_LOGGING_LEVEL=DEBUG to turn on verbose logging to help debug the issue.

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

extent analysis

TL;DR

Set the environment variable VLLM_LOGGING_LEVEL=DEBUG to enable verbose logging and help debug the device type inference issue.

Guidance

  • Set VLLM_LOGGING_LEVEL=DEBUG before running the python -m vllm.entrypoints.openai.api_server command to gather more detailed logs.
  • Verify that the ROCm software stack is properly installed and configured by checking the output of dpkg -l|grep rocm.
  • Check the installation and troubleshooting documentation for ROCm-specific requirements; docs PR #39639 updates those pages for this failure mode.
  • Review the logs generated with the debug logging level to identify the root cause of the device type inference failure.

Example

export VLLM_LOGGING_LEVEL=DEBUG
python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen3.5-9B-Instruct-AWQ \
  --quantization awq \
  --gpu-memory-utilization 0.85 \
  --max-model-len 22000 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes

Notes

The error message alone does not identify the root cause, so debug logging is the first diagnostic step. Per PR #39653, when the environment looks like ROCm but the installed torch build does not expose HIP support, the likely cause is a mismatch between the torch wheel and the ROCm runtime.

Recommendation

Apply workaround: set VLLM_LOGGING_LEVEL=DEBUG and rerun the command to collect detailed logs showing why device type inference fails.
