pytorch - 💡(How to fix) Fix [vllm] [triton 3.7] PassManager::run failed in make_ttgir when compiling vLLM _penalties_kernel

pytorch, 2026-04-20 20:40:56

With triton==3.7.0 bundled in torch 2.12.0, compiling vLLM's _penalties_kernel (a Triton JIT kernel used in sampling) fails in Triton's make_ttgir pass:

RuntimeError: PassManager::run failed

Same kernel compiles on triton 3.6 (torch 2.11). Blocking the torch 2.12 upgrade for vLLM (vllm-project/vllm#40077).

Error Message

File ".../vllm/v1/worker/gpu/sample/penalties.py", line 207, in apply_penalties
    _penalties_kernel[(num_tokens, num_blocks)](...)
File ".../triton/runtime/jit.py", line 739, in run
    kernel = self._do_compile(key, signature, device, constexprs, options, attrs, warmup)
File ".../triton/runtime/jit.py", line 884, in _do_compile
    kernel = self.compile(src, target=target, options=options.__dict__)
File ".../triton/compiler/compiler.py", line 327, in compile
    next_module = compile_ir(module, metadata)
File ".../triton/backends/nvidia/compiler.py", line 549, in <lambda>
    stages["ttgir"] = lambda src, metadata: self.make_ttgir(src, metadata, options, capability)
File ".../triton/backends/nvidia/compiler.py", line 321, in make_ttgir
    pm.run(mod, 'make_ttgir')
RuntimeError: PassManager::run failed

Environment

torch: 2.12.0+cu130 (test channel)
triton: 3.7.0
CUDA: 13.0
Python: 3.12
GPU: H100

Reproduction / traceback

Observed in a vLLM CI run (Model Runner V2 Spec Decode, 2 H100); the traceback is the one shown under Error Message.

Request

PassManager prints no diagnostic; would be useful to know how to surface the failing MLIR pass. Happy to extract the offending TTIR dump if someone can point at the right TRITON_* env var for 3.7.

Source of the kernel: https://github.com/vllm-project/vllm/blob/main/vllm/v1/worker/gpu/sample/penalties.py
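Once per-stage IR files are available on disk, the offending TTIR can be picked out mechanically. The sketch below assumes that Triton was run with one of its IR dump options and that the dump directory holds files named by stage extension (`.ttir`, `.ttgir`, and so on), as the Triton cache does. That layout is an assumption for 3.7 and has not been checked against this failure. The function name `ttir_without_ttgir` is hypothetical. A `.ttir` file with no `.ttgir` beside it is a candidate for a kernel whose make_ttgir stage failed.

```python
from pathlib import Path


def ttir_without_ttgir(dump_dir):
    """Return .ttir files that have no .ttgir file with the same stem beside them.

    Assumption: the dump directory stores one file per compilation stage,
    named <kernel>.<stage>, so a missing .ttgir means that stage never finished.
    """
    orphans = []
    for ttir in sorted(Path(dump_dir).rglob("*.ttir")):
        if not ttir.with_suffix(".ttgir").exists():
            orphans.append(ttir)
    return orphans


# Hypothetical usage: print candidate TTIR files to attach to the issue.
# for path in ttir_without_ttgir("/tmp/triton_penalties_dump"):
#     print(path)
```

Any file this returns still needs a manual check, since a compile that was interrupted for another reason would leave the same footprint.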

Affected tests

tests/v1/spec_decode/* via "Model Runner V2 Spec Decode" — 3 failed tests, all on the same EngineCore init path.

Extent analysis

TL;DR

The kernel compiles on triton 3.6 (torch 2.11) and fails on triton 3.7.0 (torch 2.12.0). Falling back to triton 3.6 is the only workaround the report points to; the failing MLIR pass has not been identified.

Guidance

Verify that the issue is indeed caused by the triton version by testing with triton 3.6.0, as the same kernel compiles successfully with this version.
Check the TRITON_* environment variables to see if there are any options to increase the verbosity of the PassManager to get more diagnostic information about the failing MLIR pass.
Extract the offending TTIR dump using the correct TRITON_* env var to further investigate the issue.
Review the changes between triton 3.6.0 and 3.7.0 to identify potential breaking changes that could be causing the compilation failure.
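The second and third steps can be sketched in Python. The variables set below (`MLIR_ENABLE_DUMP`, `TRITON_KERNEL_DUMP`, `TRITON_DUMP_DIR`, `TRITON_ALWAYS_COMPILE`, `TRITON_REPRODUCER_PATH`) are the debugging knobs described in the Triton README for recent releases. That they behave the same way in triton 3.7.0 is an assumption and has not been verified against this failure. The helper names, the dump directory, and the pytest command in the usage comment are hypothetical.

```python
import os
import subprocess
import sys


def triton_debug_env(dump_dir, kernel_name=None, base_env=None):
    """Return a copy of the environment with Triton IR dumping enabled.

    Assumption: the installed Triton build honors these variables.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Dump the IR before every MLIR pass; a kernel name narrows the output.
    env["MLIR_ENABLE_DUMP"] = kernel_name or "1"
    # Write the IR of each compilation stage to files under dump_dir.
    env["TRITON_KERNEL_DUMP"] = "1"
    env["TRITON_DUMP_DIR"] = os.path.join(dump_dir, "ir")
    # Bypass the compilation cache so the failing compile actually reruns.
    env["TRITON_ALWAYS_COMPILE"] = "1"
    # Request an MLIR reproducer file captured before each compiler stage.
    env["TRITON_REPRODUCER_PATH"] = os.path.join(dump_dir, "reproducer.mlir")
    return env


def run_with_triton_debug(cmd, dump_dir, kernel_name=None):
    """Run cmd in a subprocess with the debug environment; return its exit code."""
    os.makedirs(dump_dir, exist_ok=True)
    env = triton_debug_env(dump_dir, kernel_name)
    return subprocess.run(cmd, env=env).returncode


# Hypothetical usage: rerun the failing spec-decode tests with dumps enabled.
# cmd = [sys.executable, "-m", "pytest", "tests/v1/spec_decode", "-x"]
# run_with_triton_debug(cmd, "/tmp/triton_penalties_dump", "_penalties_kernel")
```

Setting the variables in a child process rather than in the current interpreter avoids the question of whether Triton has already read its configuration by the time they are set.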

Example

The report contains no standalone reproducer. The failure was observed in vLLM CI, with all three failing tests hitting the same EngineCore init path.

Notes

The root cause has not been identified. The failure is raised from the make_ttgir stage of Triton 3.7.0's NVIDIA backend, and PassManager reports no diagnostic naming the pass that failed. Downgrading to triton 3.6.0 is at best a temporary workaround; a real fix requires finding which change between 3.6 and 3.7 breaks this kernel.
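A small check can flag whether an environment falls on the working or failing side of the report. This is a sketch built on the report's only two data points (triton 3.6 compiled the kernel, triton 3.7.0 did not). Treating every 3.7.x release as failing is an assumption, and the function names and status labels are hypothetical. Some PyTorch builds may ship Triton under a different distribution name, in which case the lookup reports it as not installed.

```python
from importlib import metadata


def triton_report_status(version):
    """Classify a Triton version string against what the report states."""
    try:
        parts = version.split("+")[0].split(".")
        major_minor = tuple(int(p) for p in parts[:2])
    except ValueError:
        return "unknown"
    if major_minor == (3, 6):
        return "reported-working"
    if major_minor == (3, 7):
        # Assumption: the report names 3.7.0 only; other 3.7.x are untested.
        return "reported-failing"
    return "unknown"


def installed_triton_status():
    """Look up the installed 'triton' distribution, if any, and classify it."""
    try:
        version = metadata.version("triton")
    except metadata.PackageNotFoundError:
        return None, "not-installed"
    return version, triton_report_status(version)
```

Calling `installed_triton_status()` in CI before the spec-decode tests would make the version dependence visible in the logs.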

Recommendation

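Until the failing pass is identified, the only configuration the report confirms as working is triton 3.6 with torch 2.11. Whether triton 3.6.0 can be installed alongside torch 2.12.0 is not stated in the report, so treat a standalone triton downgrade as something to test rather than a known fix. The torch 2.12 upgrade for vLLM (vllm-project/vllm#40077) remains blocked until the regression is resolved.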
