vllm: [vLLM IR] Op test & benchmark infra (1 comment, 2 participants)

GitHub stats
vllm-project/vllm#38782 · Fetched 2026-04-08 02:22:44
Comments: 1 · Participants: 2 · Timeline: 5 · Reactions: 0
Timeline (top): assigned ×1, commented ×1, issue_type_added ×1, labeled ×1

We should create infra & utilities to make it very easy to add tests & benchmarks for new ops. I think we should have tests for every op that test all supported providers and check their semantics match native and that native semantics are roughly as expected (e.g. norm(2x) ~ norm(x)). Also we should opcheck each provider to check args are not mutated and outputs do not alias inputs (except in the maybe_inplace case).
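A minimal sketch of the three checks described above: provider-versus-native parity, a rough sanity property on the native semantics (norm(2x) ~ norm(x)), and a guard against input mutation or output aliasing. Everything here is an assumption made for illustration: the function names are hypothetical, and plain Python lists stand in for tensors so the sketch runs without torch. Real infra would presumably compare tensors with a tolerance-based helper and rely on an opcheck-style utility instead of these hand-written guards.

```python
import math

def rms_norm_native(x, eps=1e-6):
    """Reference semantics: x / sqrt(mean(x^2) + eps), returned as a new list."""
    scale = 1.0 / math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v * scale for v in x]

def rms_norm_fast(x, eps=1e-6):
    """Hypothetical alternative provider; must agree with native within tolerance."""
    mean_sq = math.fsum(v * v for v in x) / len(x)
    return [v / math.sqrt(mean_sq + eps) for v in x]

def check_provider(provider, native, x, rel_tol=1e-6):
    """Check output parity, no input mutation, and no output/input aliasing."""
    snapshot = list(x)
    out = provider(x)
    assert x == snapshot, "provider mutated its input"
    assert out is not x, "output aliases input"
    expected = native(list(snapshot))
    assert len(out) == len(expected), "output length differs from native"
    for got, want in zip(out, expected):
        assert math.isclose(got, want, rel_tol=rel_tol, abs_tol=1e-9)

def check_scale_invariance(op, x, rel_tol=1e-4):
    """norm(2x) should be close to norm(x); it is exact only when eps is 0."""
    base = op(list(x))
    doubled = op([2.0 * v for v in x])
    for got, want in zip(doubled, base):
        assert math.isclose(got, want, rel_tol=rel_tol, abs_tol=1e-9)

x = [0.5, -1.25, 2.0, 3.5]
check_provider(rms_norm_fast, rms_norm_native, x)
check_scale_invariance(rms_norm_native, x)
```

The maybe_inplace exception mentioned above would need a flag on check_provider that relaxes the aliasing assertion; that is left out here to keep the sketch short.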

Then we should also create utilities such that each provider can write their own tests comparing to native if they want, especially validating that supports_args works as intended. Not sure how to reduce duplication here yet.
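One way a shared helper for provider-owned tests might look, sketched under loud assumptions: the Provider record, its fields, and check_supports_args are all hypothetical and are not the vLLM IR API. The helper takes cases labelled with whether the provider is expected to support them, checks that supports_args agrees, and compares the provider with native only on the supported cases.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Provider:
    """Hypothetical provider record for illustration; not the real vLLM IR class."""
    name: str
    impl: Callable[[list], list]
    supports_args: Callable[[list], bool]

def native_double(x):
    """Stand-in native op: doubles every element."""
    return [2.0 * v for v in x]

def check_supports_args(provider, native, cases):
    """cases: list of (label, args, expect_supported). Returns supported labels."""
    supported = []
    for label, x, expect_supported in cases:
        claimed = provider.supports_args(x)
        assert claimed == expect_supported, (
            f"{provider.name}: supports_args returned {claimed} for {label}"
        )
        if claimed:
            # Only compare against native where the provider claims support.
            assert provider.impl(list(x)) == native(list(x)), (
                f"{provider.name} disagrees with native on {label}"
            )
            supported.append(label)
    return supported

# A toy provider that only handles even-length inputs.
even_only = Provider(
    name="even_only",
    impl=lambda x: [v + v for v in x],
    supports_args=lambda x: len(x) % 2 == 0,
)

cases = [
    ("len2", [1.0, 2.0], True),
    ("len3", [1.0, 2.0, 3.0], False),
    ("len4", [0.5, 0.5, 0.5, 0.5], True),
]
supported = check_supports_args(even_only, native_double, cases)
```

Keeping the parity comparison inside the shared helper means a provider only supplies its cases and expectations, which is one possible answer to the duplication question, not a settled design.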

For benchmarks, we should have a single driver file, and each op should register a get_inputs(case: BenchmarkCase) -> list[torch.Tensor] function (BenchmarkCase(num_tokens: int, hidden_size: int)). And then we should have a fixed set of benchmark cases corresponding to hidden_sizes from common models. This should be enough to start, and then later ops can extend BenchmarkCase to add more parameters (e.g. group_size for group quant).
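The single-driver idea could be sketched roughly as follows. The registry, decorator, and driver names are hypothetical, the hidden sizes are placeholders rather than values taken from specific models, and nested lists stand in for the torch.Tensor inputs the issue proposes so that the sketch runs without torch.

```python
import time
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class BenchmarkCase:
    num_tokens: int
    hidden_size: int

# Fixed case set. Placeholder sizes for illustration, not tied to real models.
CASES = [
    BenchmarkCase(num_tokens=n, hidden_size=h)
    for h in (1024, 4096)
    for n in (1, 16)
]

# Hypothetical registry: op name -> function that builds inputs for a case.
INPUT_BUILDERS: dict[str, Callable[[BenchmarkCase], list]] = {}

def register_inputs(op_name):
    """Decorator that registers an op's get_inputs function with the driver."""
    def decorator(fn):
        INPUT_BUILDERS[op_name] = fn
        return fn
    return decorator

@register_inputs("rms_norm")
def rms_norm_inputs(case):
    # A real builder would return list[torch.Tensor]; lists stand in here.
    x = [[1.0] * case.hidden_size for _ in range(case.num_tokens)]
    weight = [1.0] * case.hidden_size
    return [x, weight]

def run_benchmarks(impls, cases=CASES, repeats=3):
    """Time each implementation on every case; keep the best of `repeats` runs."""
    results = {}
    for op_name, impl in impls.items():
        build = INPUT_BUILDERS[op_name]
        for case in cases:
            inputs = build(case)
            best = float("inf")
            for _ in range(repeats):
                start = time.perf_counter()
                impl(*inputs)
                best = min(best, time.perf_counter() - start)
            results[(op_name, case)] = best
    return results
```

A caller would pass a mapping such as {"rms_norm": some_impl} to run_benchmarks and get back one timing per (op, case) pair. Timing GPU kernels would also need device synchronization, which this CPU-only sketch does not attempt.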

extent analysis

TL;DR

Create infrastructure and utilities to simplify adding tests and benchmarks for new operations, including tests for provider semantics and benchmarks with a standardized driver file.

Guidance

  • Develop a test framework that runs every op against all supported providers, checks that each provider matches the native implementation, and sanity-checks the native semantics themselves (e.g. norm(2x) ~ norm(x)).
  • Opcheck each provider to confirm that arguments are not mutated and outputs do not alias inputs, except in the maybe_inplace case.
  • Provide utilities so that providers can write their own tests against native, particularly for validating supports_args. How to avoid duplicating the shared tests is still an open question.
  • Establish a benchmarking system with a single driver file, where each op registers a get_inputs function that generates inputs from a BenchmarkCase.
  • Define a fixed set of benchmark cases whose hidden_size values come from common models, so that benchmarks are consistent across ops.

Example

from dataclasses import dataclass

import torch

@dataclass
class BenchmarkCase:
    num_tokens: int
    hidden_size: int

def get_inputs(case: BenchmarkCase) -> list[torch.Tensor]:
    # Op-specific: build the input tensors for this benchmark case.
    # Each op registers its own version of this function with the driver.
    raise NotImplementedError
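The issue also suggests that ops could later extend BenchmarkCase with extra parameters such as group_size for group quantization. A hedged sketch of that extension follows. The subclass name, the default value, and the divisibility check are assumptions made for illustration, not part of the proposal.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BenchmarkCase:
    num_tokens: int
    hidden_size: int

@dataclass(frozen=True)
class GroupQuantCase(BenchmarkCase):
    # Hypothetical extra parameter; the default value is an arbitrary placeholder.
    group_size: int = 128

    def __post_init__(self):
        # Assumption: groups tile the hidden dimension exactly.
        if self.hidden_size % self.group_size != 0:
            raise ValueError("hidden_size must be divisible by group_size")

    @property
    def num_groups(self) -> int:
        return self.hidden_size // self.group_size

case = GroupQuantCase(num_tokens=4, hidden_size=4096, group_size=128)
```

Because GroupQuantCase is still a BenchmarkCase, a get_inputs function written against the base class keeps working, while group-quant ops can read the extra field.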

Notes

The proposed solution focuses on creating a structured approach to testing and benchmarking operations. However, the exact implementation details, such as how to reduce duplication in provider tests, are left to be determined.

Recommendation

Start with the proposed infrastructure and utilities, then refine them iteratively as each op and provider needs. The issue itself suggests a phased approach: a basic BenchmarkCase and shared tests first, with extra parameters and provider-specific tests added later.
