vllm - ✅(Solved) Fix [RFC][Test]: Unified Platform-Aware Test Skip Mechanism [3 pull requests, 2 comments, 3 participants]

jikunshang · 2026-04-07T06:51:49Z



GitHub stats

vllm-project/vllm#39158 • Fetched 2026-04-08 03:01:43


Timeline (top)

subscribed ×19, mentioned ×17, commented ×2, cross-referenced ×1


PR fix notes

PR #38608: [XPU] Enable sequence parallel support for XPU

Repository: vllm-project/vllm
Author: chaojun-zhang
State: open | merged: False
Link: https://github.com/vllm-project/vllm/pull/38608

Description (problem / solution / changelog)

Test Plan

Test Result

| Configuration | Command | Median latency |
| -- | -- | -- |
| Enable SP | `vllm bench latency --model meta-llama/Llama-3.1-8B -tp 2 --compilation-config '{"pass_config": {"enable_sp": true, "sp_min_token_num": 256}}'` | 3.813s |
| Disable SP | `vllm bench latency --model meta-llama/Llama-3.1-8B -tp 2` | 4.074s |
| Eager mode | `vllm bench latency --model meta-llama/Llama-3.1-8B -tp 2 --enforce-eager` | 5.199s |

UT: pytest -s -v tests/compile/correctness_e2e/test_sequence_parallel.py


Changed files

tests/compile/passes/distributed/test_sequence_parallelism.py (modified, +8/-5)
vllm/compilation/passes/fusion/sequence_parallelism.py (modified, +36/-16)
vllm/compilation/passes/pass_manager.py (modified, +3/-1)

PR #39957: skip fp8e4b15 on xpu

Repository: vllm-project/vllm
Author: xinyu-intel
State: closed | merged: True
Link: https://github.com/vllm-project/vllm/pull/39957

Description (problem / solution / changelog)

Purpose

re-land https://github.com/vllm-project/vllm/pull/38479/changes/a8d08c6b29b4fb9c600ad5f654183e1349747b02#diff-318fa76eea4526501ec8f06665dbcafb711753b6ab37735cf816992ca6768094R22-R23

Test Plan

Allow the test to run on XPU.

A later refactor will follow https://github.com/vllm-project/vllm/issues/39158.

Test Result


Changed files

tests/quantization/test_turboquant.py (modified, +17/-15)
vllm/v1/attention/ops/triton_turboquant_decode.py (modified, +9/-3)



Motivation.

vLLM currently supports multiple platform targets (CUDA, ROCm, TPU, XPU, CPU, OOT) and a growing matrix of device capabilities (SM75–SM120+). The test suite has accumulated at least 22 distinct patterns for platform-based skipping, totaling ~600+ scattered skip sites across `tests/`:

| Pattern | Approx. Count |
|---------|--------------|
| `@pytest.mark.skipif(not current_platform.is_cuda(), ...)` | ~75 |
| Module-level `pytest.skip(..., allow_module_level=True)` | ~24 |
| In-body `if not current_platform.is_cuda(): pytest.skip(...)` | ~362 |
| `torch.cuda.is_available()` (violates platform abstraction) | ~47 |
| `envs.VLLM_TARGET_DEVICE` checks | ~15 |
| `has_device_capability(N)` / `is_device_capability_family(N)` | ~63 |

Problems:

1. No single source of truth: each test author picks a different pattern.
2. Hard to query: CI cannot answer "which tests run on CUDA/ROCm?" without parsing arbitrary Python.
3. Verbose and error-prone: repetitive boilerplate in every test file.

Goal

Provide a single, declarative, composable way to express "this test requires platform X (and optionally capability Y)" that:

- Composes with existing markers (`distributed`, `cpu_test`, `large_gpu_test`).
- Requires minimal migration effort for the existing ~600 sites.
- Does not break any OOT platform plugin.
- Is discoverable by `pytest --collect-only` / CI tooling.

Proposed Change.

Option A: Decorator Factory in `tests/utils.py`

Idea: Add `requires_platform()` and `requires_capability()` decorator factories alongside existing helpers like `large_gpu_test`, `multi_gpu_test`.

# tests/utils.py
import pytest

from vllm.platforms import current_platform


def requires_platform(*platforms: str):
    """
    Skip test unless current_platform matches at least one of the given platforms.
    Accepted values: "cuda", "rocm", "cpu", "xpu", "tpu", "cuda_alike" (= cuda|rocm).
    """
    # Map each accepted name to the matching predicate on current_platform.
    _CHECKS = {
        "cuda": current_platform.is_cuda,
        "rocm": current_platform.is_rocm,
        "cpu": current_platform.is_cpu,
        "xpu": current_platform.is_xpu,
        "tpu": current_platform.is_tpu,
        "cuda_alike": current_platform.is_cuda_alike,
    }
    # Evaluated when the decorator is applied; an unknown name raises KeyError.
    match = any(_CHECKS[p]() for p in platforms)
    return pytest.mark.skipif(
        not match,
        reason=f"Requires platform(s) {platforms}, "
               f"got {current_platform._enum.name}",
    )


def requires_capability(min_capability: int):
    """Skip test unless device has compute capability >= min_capability."""
    return pytest.mark.skipif(
        not current_platform.has_device_capability(min_capability),
        reason=f"Requires compute capability >= {min_capability}",
    )

Usage:

from tests.utils import requires_platform, requires_capability

@requires_platform("cuda")
def test_cutlass_scaled_mm():
    ...

@requires_platform("cuda", "rocm")
@requires_capability(90)
def test_fp8_marlin():
    ...

# Module-level
pytestmark = requires_platform("cuda")

| Pros | Cons |
|------|------|
| Drop-in, no conftest hook changes needed | Not discoverable by `pytest -m`, needs execution to skip |
| Familiar pattern (like `@large_gpu_test`) | Multiple `requires_*` and `@pytest.mark.skipif` patterns still coexist |
| Very easy to implement | Cannot do `pytest --collect-only -m cuda` for CI filtering |
| Backward-compatible: old code keeps working | Essentially a nicer wrapper, doesn't unify the existing scattered `pytest.skip()` |
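
One detail the Option A sketch leaves open is what happens when a test author misspells a platform name: the `_CHECKS[p]` lookup would raise a bare `KeyError` while the test module is being imported. A minimal, dependency-free sketch of stricter name handling could look like the following. `ACCEPTED_PLATFORMS` and `normalize_platforms` are hypothetical names introduced here for illustration; they are not part of the RFC or of vLLM.

```python
# Hypothetical helper, not part of vLLM: validates the names passed to a
# requires_platform()-style decorator before any platform check runs.
ACCEPTED_PLATFORMS = ("cuda", "rocm", "cpu", "xpu", "tpu", "cuda_alike")


def normalize_platforms(*platforms: str) -> tuple[str, ...]:
    """Return validated platform names, expanding "cuda_alike" to cuda + rocm."""
    if not platforms:
        raise ValueError("at least one platform name is required")
    resolved: list[str] = []
    for name in platforms:
        if name not in ACCEPTED_PLATFORMS:
            raise ValueError(
                f"unknown platform {name!r}; accepted: {ACCEPTED_PLATFORMS}"
            )
        # The RFC documents "cuda_alike" as cuda|rocm.
        expanded = ("cuda", "rocm") if name == "cuda_alike" else (name,)
        for item in expanded:
            if item not in resolved:  # keep order, drop duplicates
                resolved.append(item)
    return tuple(resolved)
```

A decorator factory could call such a helper first, so that a typo like `"CUDA"` fails with a message listing the accepted values instead of a raw dictionary lookup error.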

Option B: Custom Pytest Plugin with `@target` Marker (Enum-Based)

Idea: A standalone pytest plugin (could live in `tests/plugins/` or `conftest.py`) that introduces a single `@pytest.mark.target(...)` marker accepting platform enum values and capability constraints.

@pytest.mark.target(platform="cuda", min_capability=90)
def test_fp8_marlin():
    ...

@pytest.mark.target(platform=["cuda", "rocm"])
def test_paged_attention():
    ...

# Exclude a platform
@pytest.mark.target(exclude_platform="rocm")
def test_cuda_only_feature():
    ...

The plugin hook reads each item's target marker and decides skip/run:

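# tests/plugins/platform_filter.py
import pytest

from vllm.platforms import current_platform


def _matches_any(platforms):
    # Hypothetical helper (not defined in the RFC): assumes each name maps
    # to a current_platform.is_<name>() predicate, e.g. "cuda" -> is_cuda().
    return any(getattr(current_platform, f"is_{p}")() for p in platforms)


def pytest_collection_modifyitems(config, items):
    # Skip reasons below are illustrative; the RFC elides them.
    for item in items:
        for m in item.iter_markers("target"):
            platform = m.kwargs.get("platform")
            exclude = m.kwargs.get("exclude_platform")
            min_cap = m.kwargs.get("min_capability")

            if platform:
                platforms = [platform] if isinstance(platform, str) else platform
                if not _matches_any(platforms):
                    item.add_marker(pytest.mark.skip(
                        reason=f"Requires platform(s) {platforms}"))
            if exclude:
                excludes = [exclude] if isinstance(exclude, str) else exclude
                if _matches_any(excludes):
                    item.add_marker(pytest.mark.skip(
                        reason=f"Excluded on platform(s) {excludes}"))
            if min_cap and not current_platform.has_device_capability(min_cap):
                item.add_marker(pytest.mark.skip(
                    reason=f"Requires compute capability >= {min_cap}"))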

| Pros | Cons |
|------|------|
| Single marker covers platform + capability + exclusion | Slightly more complex marker API |
| Queryable: `pytest -m "target"` | Migration from scattered patterns is still manual |
| Extensible: can add `min_mem_gb`, `num_gpus` to unify all skip logic | Over-engineering risk: one marker to rule them all |
| Supports exclusion natively (`exclude_platform="rocm"`) | Parametrized marker kwargs are harder to filter in CLI |
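
Because the hook above queries `current_platform` directly, its skip logic is hard to unit-test on a machine without the relevant hardware. One possible way around that, sketched here under assumptions, is to factor the decision into a pure function that receives the marker kwargs together with the current platform name and capability as plain values. `skip_reason`, `_as_list`, and the modelling of capability as a plain integer such as `90` are hypothetical choices for this sketch, not something the RFC specifies.

```python
from typing import Optional, Sequence, Union

PlatformArg = Union[str, Sequence[str], None]


def _as_list(value: PlatformArg) -> list[str]:
    """Normalize a marker kwarg that may be a string, a sequence, or None."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def skip_reason(
    kwargs: dict,
    current: str,
    capability: Optional[int],
) -> Optional[str]:
    """Return a skip reason for one `target` marker, or None to run the test.

    kwargs mirrors the marker keyword arguments named in the RFC:
    platform, exclude_platform, min_capability.
    """
    platforms = _as_list(kwargs.get("platform"))
    excludes = _as_list(kwargs.get("exclude_platform"))
    min_cap = kwargs.get("min_capability")

    if platforms and current not in platforms:
        return f"requires platform(s) {platforms}, got {current}"
    if current in excludes:
        return f"excluded on platform {current}"
    # A platform that reports no capability cannot satisfy a minimum.
    if min_cap is not None and (capability is None or capability < min_cap):
        return f"requires compute capability >= {min_cap}, got {capability}"
    return None
```

The collection hook would then only need to call `skip_reason(m.kwargs, ...)` for each marker and add `pytest.mark.skip(reason=...)` when the result is not `None`, which keeps the hardware-dependent part down to supplying the two current values.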

Feedback Period.

1-2 weeks

CC List.

@youkaichao @WoosukKwon @simon-mo @ywang96 @robertgshaw2-redhat @DarkLight1337 @Isotr0py @tjtanaa @gshtras @khluu @ProExpertProg @bigPYJ1151 @Yikun @wangxiyuan @yaochengji @PatrykWo

Any Other Things.

Suggested Migration Plan

| Phase | Scope | Action |
|-------|-------|--------|
| Phase 0 | Infra | Register markers, add conftest hook, add `requires_platform` / `requires_capability` to `tests/utils.py` |
| Phase 1 | New tests | Require new tests to use markers or decorators (enforced via PR review / lint) |
| Phase 2 | Module-level skips | Convert the ~24 `pytest.skip(allow_module_level=True)` to `pytestmark = pytest.mark.<platform>` |
| Phase 3 | `@pytest.mark.skipif` | Convert ~75 `skipif(not current_platform.is_*())` to `@pytest.mark.<platform>` |
| Phase 4 | In-body skips | Convert ~362 `if not current_platform.is_*(): pytest.skip()` to markers (largest effort) |
| Phase 5 | Cleanup | Remove `torch.cuda.is_available()` patterns (~47), replace with markers |
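
Phase 0 mentions registering markers. In pytest this is commonly done from a `pytest_configure` hook with `config.addinivalue_line("markers", ...)`, so that runs using `--strict-markers` do not reject the new marker. The sketch below assumes the Option B `target` marker; `TARGET_MARKER_DOC` and its description wording are illustrative and not taken from the RFC.

```python
# conftest.py (sketch)

# Illustrative description string; the wording is an assumption.
TARGET_MARKER_DOC = (
    "target(platform=None, exclude_platform=None, min_capability=None): "
    "declare the platforms and minimum device capability a test requires"
)


def pytest_configure(config):
    """Register the `target` marker so pytest recognizes it."""
    config.addinivalue_line("markers", TARGET_MARKER_DOC)
```

Once registered, the marker and its description should appear in the output of `pytest --markers`, which gives test authors a place to look up the accepted keyword arguments.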

Pre-commit Lint Rule (Phase 1+)

Add a hook that warns on new `current_platform.is_cuda()` + `pytest.skip` patterns in test files, suggesting the marker-based approach instead.


extent analysis

TL;DR

Implement a single, declarative, composable way to express platform and capability requirements for tests using either a decorator factory or a custom Pytest plugin.

Guidance

1. Choose an approach: Decide between the decorator factory (`requires_platform` and `requires_capability`) and the custom pytest plugin (`@pytest.mark.target` marker) based on the trade-offs outlined in the issue.
2. Implement the chosen approach: If using the decorator factory, add the `requires_platform` and `requires_capability` functions to `tests/utils.py`. If using the custom pytest plugin, create a new plugin file (e.g., `tests/plugins/platform_filter.py`) and implement the `pytest_collection_modifyitems` hook.
3. Update test code: Gradually migrate existing tests to use the new approach, following the suggested migration plan (Phases 0-5).
4. Add pre-commit lint rule: Create a hook to warn against new instances of `current_platform.is_cuda()` + `pytest.skip` patterns in test files, suggesting the marker-based approach instead.

Example

# Using the decorator factory
from tests.utils import requires_platform, requires_capability

@requires_platform("cuda")
def test_cutlass_scaled_mm():
    ...

# Using the custom Pytest plugin
@pytest.mark.target(platform="cuda", min_capability=90)
def test_fp8_marlin():
    ...

Notes

The choice between the decorator factory and the custom Pytest plugin depends on the specific needs and preferences of the project. The decorator factory provides a more straightforward implementation, while the custom Pytest plugin offers more flexibility and queryability.

Recommendation

Apply the decorator factory approach, as it is easier to implement and provides a more familiar pattern for test authors, while still addressing the main issues of verbosity and error-proneness.
