vllm - ✅(Solved) Fix [RFC][Test]: Unified Platform-Aware Test Skip Mechanism [3 pull requests, 2 comments, 3 participants]

GitHub stats
vllm-project/vllm#39158, fetched 2026-04-08 03:01:43
Comments: 2
Participants: 3
Timeline events: 40
Reactions: 6
Timeline (top): subscribed ×19, mentioned ×17, commented ×2, cross-referenced ×1

PR fix notes

PR #38608: [XPU] Enable sequence parallel support for XPU

Description (problem / solution / changelog)

Test Plan

Test Result

| Configuration | Command | Median latency |
| --- | --- | --- |
| Enable SP | vllm bench latency --model meta-llama/Llama-3.1-8B -tp 2 --compilation-config '{"pass_config": {"enable_sp": true, "sp_min_token_num": 256}}' | 3.813s |
| Disable SP | vllm bench latency --model meta-llama/Llama-3.1-8B -tp 2 | 4.074s |
| Eager mode | vllm bench latency --model meta-llama/Llama-3.1-8B -tp 2 --enforce-eager | 5.199s |

UT: pytest -s -v tests/compile/correctness_e2e/test_sequence_parallel.py



Changed files

  • tests/compile/passes/distributed/test_sequence_parallelism.py (modified, +8/-5)
  • vllm/compilation/passes/fusion/sequence_parallelism.py (modified, +36/-16)
  • vllm/compilation/passes/pass_manager.py (modified, +3/-1)

PR #39957: skip fp8e4b15 on xpu

Description (problem / solution / changelog)

Purpose

re-land https://github.com/vllm-project/vllm/pull/38479/changes/a8d08c6b29b4fb9c600ad5f654183e1349747b02#diff-318fa76eea4526501ec8f06665dbcafb711753b6ab37735cf816992ca6768094R22-R23

Test Plan

allow test on xpu

will follow https://github.com/vllm-project/vllm/issues/39158 for refactor later.

Test Result



Changed files

  • tests/quantization/test_turboquant.py (modified, +17/-15)
  • vllm/v1/attention/ops/triton_turboquant_decode.py (modified, +9/-3)


Motivation.

vLLM currently supports multiple platform targets (CUDA, ROCm, TPU, XPU, CPU, OOT) and a growing matrix of device capabilities (SM75–SM120+). The test suite has accumulated at least 22 distinct patterns for platform-based skipping, totaling ~600+ scattered skip sites across tests/:

| Pattern | Approx. count |
| --- | --- |
| @pytest.mark.skipif(not current_platform.is_cuda(), ...) | ~75 |
| Module-level pytest.skip(..., allow_module_level=True) | ~24 |
| In-body if not current_platform.is_cuda(): pytest.skip(...) | ~362 |
| torch.cuda.is_available() (violates platform abstraction) | ~47 |
| envs.VLLM_TARGET_DEVICE checks | ~15 |
| has_device_capability(N) / is_device_capability_family(N) | ~63 |

Problems:

  1. No single source of truth — each test author picks a different pattern.
  2. Hard to query — CI cannot answer "which tests run on CUDA/ROCm?" without parsing arbitrary Python.
  3. Verbose and error-prone — repetitive boilerplate in every test file.

Goal

Provide a single, declarative, composable way to express "this test requires platform X (and optionally capability Y)" that:

  • Composes with existing markers (distributed, cpu_test, large_gpu_test).
  • Requires minimal migration effort for the existing ~600 sites.
  • Does not break any OOT platform plugin.
  • Is discoverable by pytest --collect-only / CI tooling.

Proposed Change.

Option A: Decorator Factory in tests/utils.py

Idea: Add requires_platform() and requires_capability() decorator factories alongside existing helpers like large_gpu_test, multi_gpu_test.

# tests/utils.py
import pytest

from vllm.platforms import current_platform

def requires_platform(*platforms: str):
    """
    Skip test unless current_platform matches at least one of the given platforms.
    Accepted values: "cuda", "rocm", "cpu", "xpu", "tpu", "cuda_alike" (= cuda|rocm).
    """
    _CHECKS = {
        "cuda": current_platform.is_cuda,
        "rocm": current_platform.is_rocm,
        "cpu": current_platform.is_cpu,
        "xpu": current_platform.is_xpu,
        "tpu": current_platform.is_tpu,
        "cuda_alike": current_platform.is_cuda_alike,
    }
    match = any(_CHECKS[p]() for p in platforms)
    return pytest.mark.skipif(
        not match,
        reason=f"Requires platform(s) {platforms}, "
               f"got {current_platform._enum.name}",
    )


def requires_capability(min_capability: int):
    """Skip test unless device has compute capability >= min_capability."""
    return pytest.mark.skipif(
        not current_platform.has_device_capability(min_capability),
        reason=f"Requires compute capability >= {min_capability}",
    )

Usage:

from tests.utils import requires_platform, requires_capability

@requires_platform("cuda")
def test_cutlass_scaled_mm():
    ...

@requires_platform("cuda", "rocm")
@requires_capability(90)
def test_fp8_marlin():
    ...

# Module-level
pytestmark = requires_platform("cuda")

| Pros | Cons |
| --- | --- |
| Drop-in, no conftest hook changes needed | Not discoverable by pytest -m, needs execution to skip |
| Familiar pattern (like @large_gpu_test) | Multiple requires_* and @pytest.mark.skipif patterns still coexist |
| Very easy to implement | Cannot do pytest --collect-only -m cuda for CI filtering |
| Backward-compatible: old code keeps working | Essentially a nicer wrapper, doesn't unify the existing scattered pytest.skip() |
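
One detail the Option A helper leaves open is what happens when a test author passes a name that is not in the lookup table: as written, `_CHECKS[p]` would raise a bare `KeyError` at import time. The sketch below is one possible way to separate the decision logic from pytest so it can be validated and unit-tested on its own. The function name `platform_skip_reason` and the `fake_checks` table are hypothetical and not part of the RFC; the real helper would pass the `current_platform.is_*` methods instead.

```python
from __future__ import annotations

from typing import Callable, Mapping, Optional


def platform_skip_reason(
    required: tuple[str, ...],
    checks: Mapping[str, Callable[[], bool]],
    current_name: str,
) -> Optional[str]:
    """Return a skip reason, or None when the test should run.

    Hypothetical helper: `checks` plays the role of the RFC's _CHECKS table.
    """
    unknown = [name for name in required if name not in checks]
    if unknown:
        # Fail with a readable message instead of a bare KeyError.
        raise ValueError(
            f"Unknown platform(s) {unknown}; accepted: {sorted(checks)}"
        )
    if any(checks[name]() for name in required):
        return None
    return f"Requires platform(s) {required}, got {current_name}"


# Stand-in for current_platform on an assumed ROCm machine.
fake_checks = {
    "cuda": lambda: False,
    "rocm": lambda: True,
    "cuda_alike": lambda: True,
}

print(platform_skip_reason(("cuda",), fake_checks, "ROCM"))
print(platform_skip_reason(("cuda", "rocm"), fake_checks, "ROCM"))
```

A decorator factory could then wrap this as `pytest.mark.skipif(reason is not None, reason=reason or "")`, keeping the pytest-facing part to a single line.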

Option B: Custom Pytest Plugin with @target Marker (Enum-Based)

Idea: A standalone pytest plugin (could live in tests/plugins/ or conftest.py) that introduces a single @pytest.mark.target(...) marker accepting platform enum values and capability constraints.

@pytest.mark.target(platform="cuda", min_capability=90)
def test_fp8_marlin():
    ...

@pytest.mark.target(platform=["cuda", "rocm"])
def test_paged_attention():
    ...

# Exclude a platform
@pytest.mark.target(exclude_platform="rocm")
def test_cuda_only_feature():
    ...

The plugin hook reads each item's target marker and decides skip/run:

# tests/plugins/platform_filter.py

def pytest_collection_modifyitems(config, items):
    for item in items:
        for m in item.iter_markers("target"):
            platform = m.kwargs.get("platform")
            exclude = m.kwargs.get("exclude_platform")
            min_cap = m.kwargs.get("min_capability")

            if platform:
                platforms = [platform] if isinstance(platform, str) else platform
                if not _matches_any(platforms):
                    item.add_marker(pytest.mark.skip(...))
            if exclude:
                excludes = [exclude] if isinstance(exclude, str) else exclude
                if _matches_any(excludes):
                    item.add_marker(pytest.mark.skip(...))
            if min_cap and not current_platform.has_device_capability(min_cap):
                item.add_marker(pytest.mark.skip(...))

| Pros | Cons |
| --- | --- |
| Single marker covers platform + capability + exclusion | Slightly more complex marker API |
| Queryable: pytest -m "target" | Migration from scattered patterns is still manual |
| Extensible: can add min_mem_gb, num_gpus to unify all skip logic | Over-engineering risk: one marker to rule them all |
| Supports exclusion natively (exclude_platform="rocm") | Parametrized marker kwargs are harder to filter in CLI |
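
The Option B hook elides its skip reasons and relies on a `_matches_any` helper that the RFC does not define. As an illustration only, the decision it makes for one `target` marker could be factored into a pure function like the one below. `FakePlatform`, `_as_list`, and `target_skip_reason` are hypothetical names introduced for this sketch; the real hook would consult vLLM's `current_platform` rather than a stand-in.

```python
from __future__ import annotations

from typing import Any, Iterable, Optional


class FakePlatform:
    """Hypothetical stand-in for current_platform (not the vLLM class)."""

    def __init__(self, name: str, capability: int) -> None:
        self.name = name
        self.capability = capability

    def matches(self, platform: str) -> bool:
        return platform == self.name

    def has_device_capability(self, min_capability: int) -> bool:
        return self.capability >= min_capability


def _as_list(value: str | Iterable[str] | None) -> list[str]:
    """Accept a single name, a sequence of names, or None."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def target_skip_reason(
    kwargs: dict[str, Any], platform: FakePlatform
) -> Optional[str]:
    """Return why a test with these marker kwargs is skipped, or None."""
    wanted = _as_list(kwargs.get("platform"))
    excluded = _as_list(kwargs.get("exclude_platform"))
    min_cap = kwargs.get("min_capability")

    if wanted and not any(platform.matches(p) for p in wanted):
        return f"requires platform in {wanted}, got {platform.name}"
    if any(platform.matches(p) for p in excluded):
        return f"excluded on {platform.name}"
    if min_cap is not None and not platform.has_device_capability(min_cap):
        return f"requires capability >= {min_cap}"
    return None


sm80 = FakePlatform("cuda", 80)
print(target_skip_reason({"platform": "cuda", "min_capability": 90}, sm80))
```

Inside `pytest_collection_modifyitems`, the hook would call this once per marker and, when a reason comes back, apply `item.add_marker(pytest.mark.skip(reason=reason))`. Returning a string keeps the reason visible in pytest's skip summary.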

Feedback Period.

1-2 weeks

CC List.

@youkaichao @WoosukKwon @simon-mo @ywang96 @robertgshaw2-redhat @DarkLight1337 @Isotr0py @tjtanaa @gshtras @khluu @ProExpertProg @bigPYJ1151 @Yikun @wangxiyuan @yaochengji @PatrykWo

Any Other Things.

Suggested Migration Plan

| Phase | Scope | Action |
| --- | --- | --- |
| Phase 0 | Infra | Register markers, add conftest hook, add requires_platform / requires_capability to tests/utils.py |
| Phase 1 | New tests | Require new tests to use markers or decorators (enforced via PR review / lint) |
| Phase 2 | Module-level skips | Convert the ~24 pytest.skip(allow_module_level=True) to pytestmark = pytest.mark.<platform> |
| Phase 3 | @pytest.mark.skipif | Convert ~75 skipif(not current_platform.is_*()) to @pytest.mark.<platform> |
| Phase 4 | In-body skips | Convert ~362 if not current_platform.is_*(): pytest.skip() to markers (largest effort) |
| Phase 5 | Cleanup | Remove torch.cuda.is_available() patterns (~47), replace with markers |
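
For the "Register markers" step in Phase 0, pytest expects custom markers to be declared so that `--strict-markers` does not reject them. A minimal sketch, assuming the `target` marker from Option B, uses the `pytest_configure` hook with `config.addinivalue_line`. The `RecordingConfig` class is a hypothetical stand-in that only exists to make the snippet runnable without a pytest session, and the marker description text is illustrative.

```python
class RecordingConfig:
    """Hypothetical stand-in for pytest's Config object."""

    def __init__(self) -> None:
        self.lines: list = []

    def addinivalue_line(self, name: str, line: str) -> None:
        self.lines.append((name, line))


def pytest_configure(config) -> None:
    # Declares the marker so `pytest --markers` lists it and
    # `--strict-markers` accepts it.
    config.addinivalue_line(
        "markers",
        "target(platform, exclude_platform, min_capability): "
        "restrict a test to matching platforms and device capability",
    )


config = RecordingConfig()
pytest_configure(config)
print(config.lines[0][1])
```

In a real repository the hook would live in `conftest.py` or the plugin module, and pytest would pass its own `Config` instance.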

Pre-commit Lint Rule (Phase 1+)

Add a hook that warns on new current_platform.is_cuda() + pytest.skip patterns in test files, suggesting the marker-based approach instead.
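
The RFC does not specify how such a hook would detect the pattern. One possible approach, sketched here with the standard-library `ast` module, is to flag any `if` statement whose condition reads a `current_platform.is_*` attribute and whose body calls `pytest.skip`. The function name `find_inline_platform_skips` is hypothetical, and a real hook would also need file discovery and a way to ignore pre-existing sites.

```python
from __future__ import annotations

import ast


def find_inline_platform_skips(source: str) -> list[int]:
    """Return line numbers of `if <current_platform.is_*>: pytest.skip(...)`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.If):
            continue
        # Condition mentions current_platform.is_<something>.
        test_uses_platform = any(
            isinstance(n, ast.Attribute)
            and n.attr.startswith("is_")
            and isinstance(n.value, ast.Name)
            and n.value.id == "current_platform"
            for n in ast.walk(node.test)
        )
        # Body (not the else branch) contains a pytest.skip(...) call.
        body_skips = any(
            isinstance(n, ast.Call)
            and isinstance(n.func, ast.Attribute)
            and n.func.attr == "skip"
            and isinstance(n.func.value, ast.Name)
            and n.func.value.id == "pytest"
            for stmt in node.body
            for n in ast.walk(stmt)
        )
        if test_uses_platform and body_skips:
            hits.append(node.lineno)
    return sorted(hits)


SAMPLE = '''
import pytest
from vllm.platforms import current_platform

def test_a():
    if not current_platform.is_cuda():
        pytest.skip("CUDA only")

@pytest.mark.skipif(not current_platform.is_cuda(), reason="CUDA only")
def test_b():
    pass
'''

print(find_inline_platform_skips(SAMPLE))
```

Only the in-body skip in `test_a` is reported; the decorator on `test_b` is a different pattern and would need its own check if the project wants to flag it too.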

Extended analysis

TL;DR

Implement a single, declarative, composable way to express platform and capability requirements for tests using either a decorator factory or a custom Pytest plugin.

Guidance

  1. Choose an approach: Decide between the decorator factory (requires_platform and requires_capability) and the custom Pytest plugin (@target marker) based on the trade-offs outlined in the issue.
  2. Implement the chosen approach: If using the decorator factory, add the requires_platform and requires_capability functions to tests/utils.py. If using the custom Pytest plugin, create a new plugin file (e.g., tests/plugins/platform_filter.py) and implement the pytest_collection_modifyitems hook.
  3. Update test code: Gradually migrate existing tests to use the new approach, following the suggested migration plan (Phases 0-5).
  4. Add pre-commit lint rule: Create a hook to warn against new instances of current_platform.is_cuda() + pytest.skip patterns in test files, suggesting the marker-based approach instead.

Example

# Using the decorator factory
from tests.utils import requires_platform, requires_capability

@requires_platform("cuda")
def test_cutlass_scaled_mm():
    ...

# Using the custom Pytest plugin
@pytest.mark.target(platform="cuda", min_capability=90)
def test_fp8_marlin():
    ...

Notes

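The two options trade off differently. The decorator factory (Option A) is a drop-in wrapper that needs no conftest hook changes, but tests using it cannot be selected with pytest -m and older skip patterns continue to coexist. The custom Pytest plugin (Option B) covers platform, capability, and exclusion in one marker that is queryable with pytest -m "target", at the cost of a more complex marker API. Either way, migrating the existing skip sites remains manual.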

Recommendation

Apply the decorator factory approach, as it is easier to implement and provides a more familiar pattern for test authors, while still addressing the main issues of verbosity and error-proneness.
