vllm: [torch.compile] Proper PassManager infrastructure

GitHub stats
vllm-project/vllm#39356 (fetched 2026-04-09 07:51:41)
Comments: 0 · Participants: 1 · Timeline: 8 · Reactions: 0

Motivation

Currently, we bundle custom torch.compile passes using the PostGradPassManager. This manager is set as the current_platform.pass_key (default: "post_grad_custom_post_pass"), and individual passes within it are enabled/disabled using individual flags. This lacks scalability and control: every additional pass requires a new flag, which we often skip for utility passes (leaving no way to disable them). Platforms also have to choose a single pass_key as opposed to plugging into multiple points in the PyTorch compilation pipeline.

This inhibits debugging from the CLI, requiring manual code modifications. It also requires platforms to reimplement a lot of the pass management (enablement, cache hash, etc.). Another example is vLLM-Spyre, which currently plugs into "post_grad_custom_post_pass" for custom codegen transformations, colliding with the default vLLM pass manager.
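The collision described above can be illustrated with a small sketch in which each hook point holds an ordered list of passes from several owners instead of a single slot. This is only an illustration: `HookRegistry` and its methods are hypothetical names, and the graph is a plain Python list rather than a real FX graph. Only the two hook names come from the issue text.

```python
from collections import defaultdict
from typing import Callable

# Hook names taken from the issue; everything else here is hypothetical.
POST_GRAD = "post_grad_custom_post_pass"
PRE_GRAD = "pre_grad_custom_pass"


class HookRegistry:
    """Hypothetical registry: many owners may attach passes to one hook."""

    def __init__(self) -> None:
        self._passes: dict[str, list[tuple[str, Callable[[list], None]]]] = (
            defaultdict(list)
        )

    def register(self, hook: str, owner: str, fn: Callable[[list], None]) -> None:
        # Append instead of overwrite, so a plugin does not displace the default.
        self._passes[hook].append((owner, fn))

    def owners(self, hook: str) -> list[str]:
        return [owner for owner, _ in self._passes.get(hook, [])]

    def run(self, hook: str, graph: list) -> None:
        # Passes run in registration order.
        for _, fn in self._passes.get(hook, []):
            fn(graph)


registry = HookRegistry()
registry.register(POST_GRAD, "vllm", lambda g: g.append("vllm_default_passes"))
registry.register(POST_GRAD, "plugin", lambda g: g.append("plugin_codegen"))
registry.register(PRE_GRAD, "plugin", lambda g: g.append("plugin_pre_grad"))

graph: list = []
registry.run(POST_GRAD, graph)
```

With a single `pass_key` slot, the second registration on `POST_GRAD` would have to replace or wrap the first. Here both owners coexist, and the same plugin can also attach to a second hook point.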

Proposal

We should create a base PassManager class, which can handle invoking multiple passes and properly exposes a uuid() method to Inductor/AOTAutograd caching. We should also add a way to manually specify the full list of passes on the command line/in Python. Current flags can stick around and be used to produce this list by default. Passes should also be refactored to use 0-arg constructors by default (and read VllmConfig using get_current_vllm_config()) - that way OOT passes can integrate into this list more easily. This will also allow IR passes to move into the vllm/ir folder and remove their dependence on vLLM.
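A minimal sketch of what such a base class might look like follows. It is not vLLM's or PyTorch's implementation: `GraphPass`, `RemoveNoOps`, and `FuseAddMul` are hypothetical names, the graph is a plain list of op names standing in for a real FX graph, and the uuid scheme (hashing class names) is an assumption made for the demo. Only the overall shape, a callable that also exposes `uuid()`, comes from the proposal.

```python
import hashlib
from abc import ABC, abstractmethod


class GraphPass(ABC):
    """Hypothetical pass interface: a callable with a cache-stable uuid()."""

    @abstractmethod
    def __call__(self, graph: list[str]) -> None:
        """Mutate the graph in place (here, a plain list of op names)."""

    def uuid(self) -> str:
        # Assumption: hashing the qualified class name is enough for a demo.
        # A real pass would likely also hash its source and relevant config.
        return hashlib.sha256(type(self).__qualname__.encode()).hexdigest()


class RemoveNoOps(GraphPass):
    def __call__(self, graph: list[str]) -> None:
        graph[:] = [op for op in graph if op != "noop"]


class FuseAddMul(GraphPass):
    def __call__(self, graph: list[str]) -> None:
        fused, i = [], 0
        while i < len(graph):
            if graph[i:i + 2] == ["add", "mul"]:
                fused.append("fused_add_mul")
                i += 2
            else:
                fused.append(graph[i])
                i += 1
        graph[:] = fused


class PassManager(GraphPass):
    """Runs passes in order; its uuid() combines the uuids of its passes."""

    def __init__(self, passes=()) -> None:
        self.passes = list(passes)

    def __call__(self, graph: list[str]) -> None:
        for p in self.passes:
            p(graph)

    def uuid(self) -> str:
        # Order-sensitive: reordering or swapping passes changes the cache key.
        h = hashlib.sha256()
        for p in self.passes:
            h.update(p.uuid().encode())
        return h.hexdigest()


manager = PassManager([RemoveNoOps(), FuseAddMul()])
graph = ["add", "noop", "mul", "relu"]
manager(graph)
```

Because the manager is itself a pass with its own `uuid()`, it could be handed to a compiler hook as a single object, and any change to the pass list would change the cache key.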

Timeline

This is not urgent and should probably wait until torch 2.12 when pre_grad_custom_pass becomes a proper CustomGraphPass. Depending on interest from OOT backends, this can be prioritized earlier.

extent analysis

TL;DR

Refactor the pass management system by introducing a base PassManager class to improve scalability and control over custom torch.compile passes.

Guidance

  • Introduce a base PassManager class to handle invoking multiple passes and expose a uuid() method for caching.
  • Add a command-line or Python interface to manually specify the full list of passes.
  • Refactor passes to use 0-arg constructors and read VllmConfig using get_current_vllm_config() for easier integration.
  • Treat this as non-urgent: the issue suggests waiting for torch 2.12, when pre_grad_custom_pass becomes a proper CustomGraphPass, unless interest from OOT backends justifies prioritizing it earlier.
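The second and third points above can be sketched as follows: existing flags produce a default pass list, an explicit list overrides it, and each pass has a 0-arg constructor that reads the active config from context. Every name here is hypothetical (`DemoConfig`, `get_current_config`, `PASS_REGISTRY`, `register_pass`, `resolve_passes`, and the pass names). `get_current_config` only imitates the role the issue assigns to `get_current_vllm_config()`, and none of this reflects vLLM's actual API.

```python
from contextvars import ContextVar
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DemoConfig:
    """Hypothetical stand-in for the per-pass enable flags."""
    enable_noop_elimination: bool = True
    enable_fusion: bool = True


_current_config: ContextVar[DemoConfig] = ContextVar("current_config")


def get_current_config() -> DemoConfig:
    # Hypothetical analogue of a "current config" accessor.
    return _current_config.get()


PASS_REGISTRY: dict[str, type] = {}


def register_pass(name: str):
    """Register a pass class under a name, so in-tree and OOT passes look alike."""
    def decorator(cls: type) -> type:
        PASS_REGISTRY[name] = cls
        return cls
    return decorator


@register_pass("noop_elimination")
class NoOpEliminationPass:
    def __init__(self) -> None:  # 0-arg: config comes from context
        self.config = get_current_config()


@register_pass("fusion")
class FusionPass:
    def __init__(self) -> None:
        self.config = get_current_config()


def default_pass_names(config: DemoConfig) -> list[str]:
    """Existing flags keep working by producing the default list."""
    names = []
    if config.enable_noop_elimination:
        names.append("noop_elimination")
    if config.enable_fusion:
        names.append("fusion")
    return names


def resolve_passes(config: DemoConfig, explicit: Optional[list[str]] = None) -> list:
    """An explicit list (e.g. from the CLI) takes precedence over the flags."""
    names = explicit if explicit is not None else default_pass_names(config)
    unknown = [n for n in names if n not in PASS_REGISTRY]
    if unknown:
        raise ValueError(f"unknown passes: {unknown}")
    token = _current_config.set(config)
    try:
        return [PASS_REGISTRY[n]() for n in names]
    finally:
        _current_config.reset(token)


default = resolve_passes(DemoConfig())
only_fusion = resolve_passes(DemoConfig(), explicit=["fusion"])
```

Passing an explicit list that omits a utility pass is what would make that pass possible to disable without adding a dedicated flag for it.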

Example

The issue itself does not include a code snippet or implementation details.

Notes

The proposed solution aims to address the current limitations of the pass management system, but its implementation details and potential impact on existing codebases are not fully specified.

Recommendation

Refactor the pass management system as proposed. This is a planned refactor rather than a workaround: a base PassManager gives finer control over custom torch.compile passes and scales better than per-pass flags. The issue treats the work as non-urgent and suggests timing it with torch 2.12, or earlier if OOT backends show interest.
