pytorch - 💡(How to fix) Fix [vllm] [triton 3.7] PassManager::run failed in make_ttgir when compiling vLLM _penalties_kernel


With triton==3.7.0 bundled in torch 2.12.0, compiling vLLM's _penalties_kernel (a Triton JIT kernel used in sampling) fails in Triton's make_ttgir pass:

RuntimeError: PassManager::run failed

Same kernel compiles on triton 3.6 (torch 2.11). Blocking the torch 2.12 upgrade for vLLM (vllm-project/vllm#40077).

Environment

  • torch: 2.12.0+cu130 (test channel)
  • triton: 3.7.0
  • CUDA: 13.0
  • Python: 3.12
  • GPU: H100

Reproduction / traceback

From a vLLM CI run (Model Runner V2 Spec Decode, 2 H100):

File ".../vllm/v1/worker/gpu/sample/penalties.py", line 207, in apply_penalties
    _penalties_kernel[(num_tokens, num_blocks)](...)
File ".../triton/runtime/jit.py", line 739, in run
    kernel = self._do_compile(key, signature, device, constexprs, options, attrs, warmup)
File ".../triton/runtime/jit.py", line 884, in _do_compile
    kernel = self.compile(src, target=target, options=options.__dict__)
File ".../triton/compiler/compiler.py", line 327, in compile
    next_module = compile_ir(module, metadata)
File ".../triton/backends/nvidia/compiler.py", line 549, in <lambda>
    stages["ttgir"] = lambda src, metadata: self.make_ttgir(src, metadata, options, capability)
File ".../triton/backends/nvidia/compiler.py", line 321, in make_ttgir
    pm.run(mod, 'make_ttgir')
RuntimeError: PassManager::run failed

Request

PassManager prints no diagnostic; would be useful to know how to surface the failing MLIR pass. Happy to extract the offending TTIR dump if someone can point at the right TRITON_* env var for 3.7.

Source of the kernel: https://github.com/vllm-project/vllm/blob/main/vllm/v1/worker/gpu/sample/penalties.py
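
One way to approach the request above is to re-run the failing test with Triton's debugging environment variables set. The sketch below is a minimal helper for that, not a confirmed recipe for 3.7: the variable names (TRITON_ALWAYS_COMPILE, MLIR_ENABLE_DUMP, TRITON_KERNEL_DUMP, TRITON_DUMP_DIR, TRITON_REPRODUCER_PATH) are taken from the debugging section of the Triton README, and the assumption is that they still behave the same way in triton 3.7.0. The function names, the dump directory, and the log file name are hypothetical.

```python
import os
import subprocess

# Variable names come from the Triton README's debugging section.
# Assumption: they are still honored unchanged by triton 3.7.0.
TRITON_DEBUG_ENV = {
    "TRITON_ALWAYS_COMPILE": "1",  # skip the compile cache so the kernel is rebuilt
    "MLIR_ENABLE_DUMP": "1",       # print the IR before every MLIR pass (to stderr)
    "TRITON_KERNEL_DUMP": "1",     # write per-stage IR files into TRITON_DUMP_DIR
}


def triton_debug_env(dump_dir, base_env=None):
    """Return a copy of the environment with the Triton debug variables added."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(TRITON_DEBUG_ENV)
    env["TRITON_DUMP_DIR"] = os.path.abspath(dump_dir)
    # Assumption: Triton writes an MLIR reproducer for a failing stage to this path.
    env["TRITON_REPRODUCER_PATH"] = os.path.join(env["TRITON_DUMP_DIR"], "reproducer.mlir")
    return env


def run_with_triton_debug(cmd, dump_dir="triton_dump"):
    """Run cmd with the debug variables set; stderr goes to a log file (hypothetical name)."""
    os.makedirs(dump_dir, exist_ok=True)
    log_path = os.path.join(dump_dir, "mlir_passes.log")
    with open(log_path, "w") as log:
        proc = subprocess.run(cmd, env=triton_debug_env(dump_dir), stderr=log)
    return proc.returncode, log_path


# Example call (the test path is the one named under "Affected tests"):
#   run_with_triton_debug(["python", "-m", "pytest", "tests/v1/spec_decode"])
```

Sending stderr to a file rather than capturing it in memory is deliberate: a per-pass IR dump for every compiled kernel can be very large.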

Affected tests

  • tests/v1/spec_decode/* via "Model Runner V2 Spec Decode" — 3 failed tests, all on the same EngineCore init path.

cc @chauhang @penguinwu

extent analysis

TL;DR

The only combination reported to work is triton 3.6 with torch 2.11. Staying on that pair avoids the _penalties_kernel compilation failure until the make_ttgir failure in triton 3.7.0 is diagnosed.

Guidance

  • Confirm that the Triton version is the trigger by compiling the same kernel under triton 3.6 (torch 2.11), where it is reported to compile.
  • Check the environment variables documented for Triton (TRITON_* and MLIR_*) for options that make the PassManager report which MLIR pass failed.
  • Once the right variable is identified, dump the TTIR for _penalties_kernel so the failing input can be attached to the issue.
  • Review the changes to the passes run by make_ttgir between triton 3.6.0 and 3.7.0 for a regression that affects this kernel.
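
If a per-pass IR dump is available (for example from MLIR_ENABLE_DUMP=1, which the Triton README describes as dumping the IR before every MLIR pass), the failing pass can be narrowed down by finding the last pass that started before the error. The sketch below assumes the dump uses MLIR's usual "IR Dump Before <PassName> (<pass-argument>)" header lines and that passes run one at a time; both are assumptions about the 3.7 output, and the function name and sample log are hypothetical.

```python
import re

# Header format assumed from MLIR's IR-printing instrumentation, e.g.
#   // -----// IR Dump Before SomePass (some-pass) ('builtin.module' operation) //----- //
_DUMP_HEADER = re.compile(r"IR Dump Before (\S+)(?: \(([^)]*)\))?")


def last_pass_started(log_text):
    """Return (pass_name, pass_argument) from the last dump header, or None if absent."""
    last = None
    for match in _DUMP_HEADER.finditer(log_text):
        last = (match.group(1), match.group(2))
    return last


# Illustrative log only; the pass names here are examples, not taken from the issue.
sample_log = (
    "// -----// IR Dump Before Inliner (inline) ('builtin.module' operation) //----- //\n"
    "module {}\n"
    "// -----// IR Dump Before TritonGPUCoalesce (tritongpu-coalesce) "
    "('builtin.module' operation) //----- //\n"
    "module {}\n"
    "RuntimeError: PassManager::run failed\n"
)
print(last_pass_started(sample_log))
```

Because the dump is printed before each pass runs, the last header in the log names the pass that was executing when PassManager::run failed, and the IR printed under it is the input that pass could not handle.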

Example

The issue does not include a standalone reproducer. The traceback and the linked kernel source are the only code artifacts; the failure was observed while compiling _penalties_kernel under triton 3.7.0 inside a vLLM CI run.

Notes

The root cause has not been identified. The error is raised from the make_ttgir stage in triton 3.7.0, and PassManager prints no diagnostic naming the failing pass. Staying on triton 3.6 is a temporary workaround; a permanent fix requires finding which change between the two versions breaks compilation of this kernel.

Recommendation

As a stopgap, keep vLLM on the known-good pairing of triton 3.6 with torch 2.11, where the kernel is reported to compile. This does not unblock the torch 2.12 upgrade: torch 2.12.0 bundles triton 3.7.0, and the issue does not report testing triton 3.6 under torch 2.12. The upgrade remains blocked until the make_ttgir failure is resolved.
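
To check whether a given environment has the affected pairing, the installed versions can be compared against the ones in the report. This is a minimal sketch: only triton 3.7.0 is reported as failing, so treating every 3.7.x release as suspect is an assumption, as is the distribution name "triton" (some PyTorch builds ship it under a different name). The helper names are hypothetical.

```python
from importlib import metadata


def parse_major_minor(version):
    """Extract (major, minor) from strings such as '3.7.0' or '2.12.0+cu130'."""
    parts = version.split("+")[0].split(".")
    return int(parts[0]), int(parts[1])


def is_affected_triton(version):
    # Assumption: any 3.7.x is suspect; the issue only confirms 3.7.0.
    return parse_major_minor(version) == (3, 7)


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


triton_version = installed_version("triton")  # distribution name is an assumption
if triton_version is None:
    print("triton is not installed under the name 'triton'")
elif is_affected_triton(triton_version):
    print(f"triton {triton_version}: matches the version reported as failing")
else:
    print(f"triton {triton_version}: not the version named in the report")
```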
