vllm - ✅(Solved) Fix [RFC]: Replace Hardcoded Device Strings with current_platform and Implement Linting [2 pull requests, 3 comments, 3 participants]

wincent8 · 2026-04-15T05:21:37Z

# PR #37566: refactor hard coded device string in test files under tests/v1 and tests/lora

- Repository: vllm-project/vllm
- Author: wincent8
- State: closed | merged: True
- Link: https://github.com/vllm-project/vllm/pull/37566

## Description (problem / solution / changelog)

This PR replaces hardcoded "cuda" device strings with dynamic platform checks across tests/v1 and tests/lora. By utilizing `current_platform.device_type`, we enable these test suites to be reused across different hardware accelerators (e.g., ROCm, Gaudi, XPU) without manual modification.

Currently, many tests in the V1 engine and LoRA modules are coupled specifically to CUDA. This makes it difficult to verify feature parity on non-NVIDIA hardware. This PR generalizes the device handling to ensure that "cuda-centric" code becomes "accelerator-agnostic."

Proposed Changes

I have implemented the following systematic replacements:

- Use **`DEVICE_TYPE`** (inferred from `current_platform.device_type`) in place of the hardcoded **`cuda`**.
- Use **`DEVICES`** (`[f"{DEVICE_TYPE}:{i}" for i in range(1 if current_platform.device_count() == 1 else 2)]`) in place of **`CUDA_DEVICES`** (`[f"cuda:{i}" for i in range(1 if current_platform.device_count() == 1 else 2)]`).

Impact

Test Parity: Allows non-CUDA CI pipelines to run the exact same V1 and LoRA validation logic used for NVIDIA GPUs.

## Test Plan

CI

## Test Result

---

Essential Elements of an Effective PR Description Checklist

- [ ] The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
- [ ] The test plan, such as providing test command.
- [ ] The test results, such as pasting the results comparison before and after, or e2e results
- [ ] (Optional) The necessary documentation update, such as updating `supported_models.md` and `examples` for a new model.
- [ ] (Optional) Release notes update. If your change is user facing, please update the release notes draft in the [Google Doc](https://docs.google.com/document/d/1YyVqrgX4gHTtrstbq8oWUImOyPCKSGnJ7xtTpmXzlRs/edit?tab=t.0).

## Changed files

- `tests/lora/test_fused_moe_lora_kernel.py` (modified, +1/-1)
- `tests/lora/test_layers.py` (modified, +6/-2)
- `tests/lora/test_lora_manager.py` (modified, +2/-2)
- `tests/lora/test_moe_lora_align_sum.py` (modified, +16/-6)
- `tests/lora/test_punica_ops.py` (modified, +10/-3)
- `tests/lora/test_punica_ops_fp8.py` (modified, +3/-1)
- `tests/lora/test_worker.py` (modified, +4/-1)
- `tests/lora/utils.py` (modified, +6/-3)
- `tests/v1/attention/test_attention_backends.py` (modified, +3/-1)
- `tests/v1/attention/test_chunked_local_attention.py` (modified, +4/-1)
- `tests/v1/attention/test_mla_backends.py` (modified, +3/-1)
- `tests/v1/attention/test_sparse_mla_backends.py` (modified, +6/-4)
- `tests/v1/attention/test_trtllm_attention_integration.py` (modified, +2/-1)
- `tests/v1/cudagraph/test_cudagraph_dispatch.py` (modified, +11/-9)
- `tests/v1/determinism/test_rms_norm_batch_invariant.py` (modified, +10/-7)
- `tests/v1/e2e/general/test_mamba_prefix_cache.py` (modified, +15/-5)
- `tests/v1/kv_offload/test_cpu_gpu.py` (modified, +4/-2)
- `tests/v1/logits_processors/test_correctness.py` (modified, +4/-3)
- `tests/v1/sample/test_rejection_sampler.py` (modified, +60/-30)
- `tests/v1/sample/test_sampler.py` (modified, +8/-7)
- `tests/v1/sample/test_topk_topp_sampler.py` (modified, +8/-9)
- `tests/v1/spec_decode/test_eagle.py` (modified, +10/-9)
- `tests/v1/spec_decode/test_eagle_step_kernel.py` (modified, +6/-3)
- `tests/v1/spec_decode/test_extract_hidden_states.py` (modified, +6/-5)
- `tests/v1/spec_decode/test_mtp.py` (modified, +4/-3)
- `tests/v1/spec_decode/test_tree_attention.py` (modified, +5/-3)
- `tests/v1/worker/test_gpu_input_batch.py` (modified, +5/-7)
- `tests/v1/worker/test_gpu_model_runner.py` (modified, +17/-17)

---

# PR #38901: refactor hard coded device string in test files under tests/compile tests/quantization tests/models and tests/model_executor

- Repository: vllm-project/vllm
- Author: wincent8
- State: closed | merged: True
- Link: https://github.com/vllm-project/vllm/pull/38901

## Description (problem / solution / changelog)

This PR replaces hardcoded "cuda" device strings with dynamic platform checks across tests/compile, tests/quantization, tests/models, tests/model_executor and tests/basic_correctness. By utilizing `current_platform.device_type`, we enable these test suites to be reused across different hardware accelerators (e.g., ROCm, Gaudi, XPU) without manual modification.

Currently, many tests in the V1 engine and LoRA modules are coupled specifically to CUDA. This makes it difficult to verify feature parity on non-NVIDIA hardware. This PR generalizes the device handling to ensure that "cuda-centric" code becomes "accelerator-agnostic."

Proposed Changes

I have imple

GitHub stats

vllm-project/vllm#39871•Fetched 2026-04-17 08:24:04

Timeline (top): mentioned ×18, subscribed ×18, commented ×3, labeled ×1


Motivation.

Currently, the vLLM codebase contains numerous instances of hardcoded device strings such as "cuda", "cuda:0", and .to("cuda"). This hinders our goal of being a truly multi-platform LLM engine (supporting ROCm, TPU, Gaudi/HPU, etc.).

Hardcoding "cuda" creates several issues:

- Portability: Developers porting vLLM to new hardware must manually find and replace strings, which is error-prone.
- Consistency: Some parts of the code use cuda, while others use current_platform. We need a single source of truth.
- Future-Proofing: As we move toward better abstraction, the device type should be a property of the environment, not a static string in the logic.

For tests that should run only on CUDA or another specific platform, it is better to use decorators to skip them on other platforms than to hardcode a device string. See https://github.com/vllm-project/vllm/issues/39158
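A minimal sketch of the decorator approach, under stated assumptions: `FakePlatform` is a hypothetical stand-in for vLLM's `current_platform` so the snippet runs without vLLM installed, and `requires_device` is a hypothetical helper name, not an existing vLLM API. It uses the standard library's `unittest.skipUnless`; a pytest-based suite would more likely express the same condition with `pytest.mark.skipif`.

```python
import unittest
from dataclasses import dataclass


@dataclass
class FakePlatform:
    """Hypothetical stand-in for vllm.platforms.current_platform."""
    device_type: str


# Assumption: pretend the suite is running on an XPU machine.
current_platform = FakePlatform(device_type="xpu")


def requires_device(device_type: str):
    """Skip the decorated test unless the platform matches device_type."""
    return unittest.skipUnless(
        current_platform.device_type == device_type,
        f"requires {device_type}, running on {current_platform.device_type}",
    )


class DeviceTests(unittest.TestCase):
    @requires_device("cuda")
    def test_cuda_only_path(self):
        # Would exercise a CUDA-specific code path; skipped elsewhere.
        self.fail("should have been skipped on a non-CUDA platform")

    def test_portable_path(self):
        # Runs everywhere: the device string comes from the platform.
        device = f"{current_platform.device_type}:0"
        self.assertEqual(device, "xpu:0")


def run_suite() -> unittest.TestResult:
    """Load and run DeviceTests, returning the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DeviceTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The platform restriction lives in one visible place (the decorator), and the test body itself never needs a literal device string.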

Proposed Change.

We will make the following changes:

- Use DEVICE_TYPE (inferred from current_platform.device_type) in place of the hardcoded cuda.
- Use DEVICES ([f"{DEVICE_TYPE}:{i}" for i in range(1 if current_platform.device_count() == 1 else 2)]) in place of CUDA_DEVICES ([f"cuda:{i}" for i in range(1 if current_platform.device_count() == 1 else 2)]). For reference, see https://github.com/vllm-project/vllm/pull/37566 and https://github.com/vllm-project/vllm/pull/38901
- Implement a lint check for the strings "cuda" and "cuda*", exempting tests/kernel, vllm/platforms, etc.
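One way the lint check could look is sketched below. This is not the check vLLM ships; it is an illustration under assumptions: the names `HARDCODED_DEVICE`, `EXEMPT_DIRS`, `is_exempt`, and `find_hardcoded_devices` are hypothetical, "cuda*" is interpreted narrowly as "cuda" optionally followed by ":<index>", and the exempt list contains only the two directories the RFC names explicitly. It walks the Python AST so that only string literals are inspected, not identifiers such as `torch.cuda`.

```python
import ast
import re
from typing import List, Tuple

# Assumption: flag literals that are exactly "cuda" or "cuda:<index>".
# The "cuda:" fragment of f"cuda:{i}" is a string constant too, so it matches.
HARDCODED_DEVICE = re.compile(r"^cuda(:\d*)?$")

# Directories named as exempt in the RFC; the real list may be longer.
EXEMPT_DIRS = ("tests/kernel", "vllm/platforms")


def is_exempt(path: str) -> bool:
    """Return True if path lies under one of the exempt directories."""
    return any(path.startswith(d + "/") for d in EXEMPT_DIRS)


def find_hardcoded_devices(source: str, path: str) -> List[Tuple[int, str]]:
    """Return (line, literal) pairs for hardcoded CUDA device strings."""
    if is_exempt(path):
        return []
    hits = []
    for node in ast.walk(ast.parse(source, filename=path)):
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            if HARDCODED_DEVICE.match(node.value):
                hits.append((node.lineno, node.value))
    return sorted(hits)


SAMPLE = (
    "import torch\n"
    'x = torch.zeros(2, device="cuda:0")\n'
    'y = x.to("cuda")\n'
    'DEVICES = [f"cuda:{i}" for i in range(2)]\n'
    'name = "cpu"\n'
)

for lineno, literal in find_hardcoded_devices(SAMPLE, "tests/v1/test_example.py"):
    print(f"tests/v1/test_example.py:{lineno}: hardcoded device string {literal!r}")
```

The sample is only parsed, never executed, so `torch` does not need to be installed. A real check would likely be wired into the project's existing lint tooling and would need a policy for docstrings and comments, which this sketch does not address.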

Feedback Period.

No response

CC List.

No response

Any Other Things.

No response

Before submitting a new issue...

Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

extent analysis

TL;DR

Replace hardcoded "cuda" strings with DEVICE_TYPE inferred from current_platform.device_type to improve portability and consistency.

Guidance

Identify and replace all instances of hardcoded "cuda" strings with DEVICE_TYPE to create a single source of truth for device types.
Implement a lint check to detect and prevent future usage of hardcoded "cuda" strings, excluding specific directories like tests/kernel and vllm/platforms.
Use decorators to skip tests on specific platforms instead of hardcoding device strings, as suggested in the referenced GitHub issue.
Review the proposed changes and referenced pull requests (e.g., #37566 and #38901) to understand the implementation details.

Example

# Before
device = "cuda:0"

# After
from vllm.platforms import current_platform

DEVICE_TYPE = current_platform.device_type
device = f"{DEVICE_TYPE}:0"
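The same pattern extends to the `DEVICES` list from the RFC. The sketch below mirrors that expression against a hypothetical `FakePlatform` stand-in (an assumption so the snippet runs without vLLM or an accelerator); `build_devices` is likewise an illustrative helper name, not part of vLLM.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakePlatform:
    """Hypothetical stand-in for vllm.platforms.current_platform."""
    device_type: str
    num_devices: int

    def device_count(self) -> int:
        return self.num_devices


def build_devices(platform: FakePlatform) -> List[str]:
    """Mirror the RFC's DEVICES expression: one entry if exactly one
    device is visible, otherwise two."""
    count = 1 if platform.device_count() == 1 else 2
    return [f"{platform.device_type}:{i}" for i in range(count)]


print(build_devices(FakePlatform("xpu", 1)))
print(build_devices(FakePlatform("cuda", 8)))
```

Because the expression only special-cases a device count of exactly one, any other count (including eight) yields two entries, which keeps multi-device test parametrization small.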

Notes

The proposed change aims to improve the codebase's portability and consistency by replacing hardcoded device strings. Tests that genuinely depend on CUDA or another specific platform are the main edge case: per the RFC, they should be skipped on other platforms with decorators rather than keep a hardcoded device string.

Recommendation

Apply the proposed change: replace hardcoded "cuda" strings with DEVICE_TYPE. This is needed to support multiple platforms and devices, and it makes the code more future-proof.
