vllm - 💡(How to fix) Fix [torch.compile] Proper PassManager infrastructure [1 participants]

vllm · 2026-04-08 22:23:27


GitHub stats

vllm-project/vllm#39356 • Fetched 2026-04-09 07:51:41


Author

ProExpertProg

Participants

ProExpertProg

Timeline (top)

project_v2_item_status_changed ×3, added_to_project_v2 ×2, labeled ×2, converted_from_draft ×1


Motivation

Currently, we bundle custom torch.compile passes using the PostGradPassManager. This manager is set as the current_platform.pass_key (default: "post_grad_custom_post_pass"), and individual passes within it are enabled/disabled using individual flags. This lacks scalability and control: every additional pass requires a new flag, which we often skip for utility passes (leaving no way to disable them). Platforms also have to choose a single pass_key as opposed to plugging into multiple points in the PyTorch compilation pipeline.

This inhibits debugging from the CLI, requiring manual code modifications. It also requires platforms to reimplement a lot of the pass management (enablement, cache hash, etc.). Another example is vLLM-Spyre, which currently plugs into "post_grad_custom_post_pass" for custom codegen transformations, colliding with the default vLLM pass manager.

Proposal

We should create a base PassManager class, which can handle invoking multiple passes and properly exposes a uuid() method to Inductor/AOTAutograd caching. We should also add a way to manually specify the full list of passes on the command line/in Python. Current flags can stick around and be used to produce this list by default. Passes should also be refactored to use 0-arg constructors by default (and read VllmConfig using get_current_vllm_config()) - that way OOT passes can integrate into this list more easily. This will also allow IR passes to move into the vllm/ir folder and remove their dependence on vLLM.
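As a rough illustration of the proposal, the sketch below shows one way a base manager could invoke several passes in order and derive a single uuid() from them for cache keying. Every name in it (GraphPass, PassManager, TagA, TagB) is hypothetical and not taken from vLLM or PyTorch, the "graph" is a plain Python list standing in for a real FX graph, and the hashing scheme is an assumption rather than what Inductor requires.

```python
import hashlib
from abc import ABC, abstractmethod
from typing import Any, Iterable, List, Optional


class GraphPass(ABC):
    """Hypothetical pass interface: 0-arg constructor, callable on a graph."""

    @abstractmethod
    def __call__(self, graph: Any) -> None:
        ...

    def uuid(self) -> str:
        # Assumption: identity is the qualified class name. A real pass
        # would likely also hash its source and the config it reads.
        cls = type(self)
        return f"{cls.__module__}.{cls.__qualname__}"


class PassManager:
    """Hypothetical base manager: runs passes in order, exposes one uuid()."""

    def __init__(self, passes: Optional[Iterable[GraphPass]] = None) -> None:
        self.passes: List[GraphPass] = list(passes or [])

    def add(self, graph_pass: GraphPass) -> None:
        self.passes.append(graph_pass)

    def __call__(self, graph: Any) -> None:
        for graph_pass in self.passes:
            graph_pass(graph)

    def uuid(self) -> str:
        # Combine per-pass ids in order, so that adding, removing, or
        # reordering passes changes the resulting cache key.
        digest = hashlib.sha256()
        for graph_pass in self.passes:
            digest.update(graph_pass.uuid().encode())
            digest.update(b"\0")
        return digest.hexdigest()


class TagA(GraphPass):
    def __call__(self, graph: Any) -> None:
        graph.append("A")  # stand-in for a real graph transformation


class TagB(GraphPass):
    def __call__(self, graph: Any) -> None:
        graph.append("B")


manager = PassManager([TagA(), TagB()])
graph: List[str] = []
manager(graph)
```

Because the passes take no constructor arguments, an out-of-tree pass could in principle be added to the list without the manager knowing how to configure it; in this sketch, configuration lookup is left to the pass itself.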

Timeline

This is not urgent and should probably wait until torch 2.12 when pre_grad_custom_pass becomes a proper CustomGraphPass. Depending on interest from OOT backends, this can be prioritized earlier.

extent analysis

TL;DR

Refactor the pass management system by introducing a base PassManager class to improve scalability and control over custom torch.compile passes.

Guidance

Introduce a base PassManager class to handle invoking multiple passes and expose a uuid() method for caching.
Add a command-line or Python interface to manually specify the full list of passes.
Refactor passes to use 0-arg constructors and read VllmConfig using get_current_vllm_config() for easier integration.
Treat the work as non-urgent: consider waiting for torch 2.12, when pre_grad_custom_pass is expected to become a proper CustomGraphPass, unless interest from OOT backends justifies prioritizing it earlier.

Example

No specific code snippet is provided due to the lack of implementation details in the issue.
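The issue itself contains no code. Purely as an illustration of the "explicit pass list, with flags producing the default" idea, here is a minimal sketch. The registry, the pass names ("fusion", "cleanup"), the enable_fusion flag, and the comma-separated format are all assumptions made for this example, not vLLM's actual flags or CLI.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical registry mapping a pass name to a 0-arg constructor.
PASS_REGISTRY: Dict[str, Callable[[], object]] = {}


def register_pass(name: str):
    def decorator(cls):
        PASS_REGISTRY[name] = cls
        return cls
    return decorator


@register_pass("fusion")
class FusionPass:
    pass


@register_pass("cleanup")
class CleanupPass:
    pass


def default_pass_names(flags: Dict[str, bool]) -> List[str]:
    """Derive the default list from existing per-pass flags."""
    names = []
    if flags.get("enable_fusion", False):
        names.append("fusion")
    names.append("cleanup")  # utility pass with no flag of its own
    return names


def resolve_passes(explicit: Optional[str], flags: Dict[str, bool]) -> List[object]:
    """An explicit comma-separated list overrides the flag-derived default."""
    if explicit is not None:
        names = [n.strip() for n in explicit.split(",") if n.strip()]
    else:
        names = default_pass_names(flags)
    unknown = [n for n in names if n not in PASS_REGISTRY]
    if unknown:
        raise ValueError(f"unknown passes: {unknown}")
    return [PASS_REGISTRY[n]() for n in names]


from_flags = resolve_passes(None, {"enable_fusion": True})
only_fusion = resolve_passes("fusion", {})
none_at_all = resolve_passes("", {})
```

With an explicit list, even a utility pass that has no flag can be left out, which is the debugging scenario the issue describes as requiring manual code modifications today.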

Notes

The proposed solution aims to address the current limitations of the pass management system, but its implementation details and potential impact on existing codebases are not fully specified.

Recommendation

Refactor the pass management system as proposed. This is a design change rather than a workaround: a base PassManager with an explicitly specified pass list is intended to scale better and give finer control over custom torch.compile passes than per-pass flags. Per the issue, the work is not urgent and can wait for torch 2.12 unless OOT backends need it sooner.
