pytorch: [vllm] [2.12 regression][Inductor] warm_artifacts_saved == 0 and KeyError(None) in standalone_compile startup tests


Under torch 2.12.0, vLLM's H100 compile-startup tests fail because the inductor standalone-compile warm cache no longer saves artifacts:

  • compile/h100/test_startup.py::test_model_startup[deepseek_v3.2]: AssertionError: warm_artifacts_saved: got 0
  • compile/h100/test_startup.py::test_model_startup[kimi_k2.5]: AssertionError: warm_artifacts_saved: got 0
  • compile/h100/test_startup.py::test_moe_startup[0]: KeyError: None

Same tests pass on torch 2.11.0. Blocking the torch 2.12 upgrade for vLLM (vllm-project/vllm#40077).

Environment

  • torch: 2.12.0+cu130
  • triton: 3.7.0
  • CUDA: 13.0
  • Python: 3.12
  • GPU: H100

Question / diagnosis

In 2.11, vLLM's compiler_interface.standalone_compile(...) produces warm artifacts that get written to disk; in 2.12 the write count is 0, and at least one test_moe_startup test gets KeyError: None from somewhere in the cache lookup (probably a missing key returning None).

  • Is there a known change to torch._inductor.standalone_compile in 2.12 around how warm/aot-autograd artifacts get saved (naming, cache directory layout, or bypass conditions)?
  • Was the None key an intentional sentinel for "no cache entry", and something is now blindly indexing it?
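To make the second question concrete, here is a minimal sketch in plain Python (not PyTorch's actual cache code) of how a None sentinel that is later used as a dictionary key surfaces as KeyError: None, next to a guarded lookup that treats None as "no entry". The names artifact_index, lookup_artifact, and unguarded_lookup are hypothetical and exist only for illustration.

```python
from typing import Optional

# Hypothetical in-memory index mapping cache keys to artifact paths.
artifact_index: dict[str, str] = {"abc123": "/tmp/cache/abc123.bin"}


def lookup_artifact(key: Optional[str]) -> Optional[str]:
    """Return the artifact path for `key`, or None when there is no entry."""
    if key is None:
        # Treat None as "no cache key was produced" instead of indexing with it.
        return None
    return artifact_index.get(key)


def unguarded_lookup(key: Optional[str]) -> str:
    """Index blindly; a None key raises KeyError(None)."""
    return artifact_index[key]  # type: ignore[index]


try:
    unguarded_lookup(None)
except KeyError as exc:
    # str(KeyError(None)) is "None", which matches the test output above.
    print(f"unguarded: KeyError: {exc}")

print(lookup_artifact(None))      # None
print(lookup_artifact("abc123"))  # /tmp/cache/abc123.bin
```

If the real failure follows this pattern, the traceback frame that raises should show a mapping being indexed with a value that was never checked for None.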

vLLM call site: vllm/compilation/compiler_interface.py:351 → torch._inductor.standalone_compile(...).

Links

cc @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @aakhundov @coconutruben @jataylo

extent analysis

TL;DR

The most likely fix is to identify what changed in torch._inductor.standalone_compile in PyTorch 2.12 and adjust the vLLM code to match the new behavior for saving warm/aot-autograd artifacts.

Guidance

  • Review the PyTorch 2.12 release notes and documentation for changes to torch._inductor.standalone_compile that could affect warm artifact saving.
  • Inspect vllm/compilation/compiler_interface.py at line 351 to see how torch._inductor.standalone_compile is called and whether the call needs adjusting for the new behavior.
  • Compare the cache directory layout and file naming produced under torch 2.11 and 2.12 to confirm whether artifacts are missing or only written somewhere vLLM does not look.
  • Find the lookup that raises KeyError: None and guard it so the failure reports that no cache key was produced, rather than suppressing the error to make the tests pass.
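One way to act on the cache-layout point is to snapshot the cache directory before and after a compile and diff the two. The following is a minimal sketch using only the standard library. The helper names snapshot and new_files are hypothetical, and the directory to inspect (whatever cache root vLLM or Inductor is configured to use in your setup) is an assumption you need to fill in; a temporary directory stands in for it here.

```python
import tempfile
from pathlib import Path


def snapshot(cache_dir: Path) -> set[str]:
    """Return the relative paths of all regular files under cache_dir."""
    if not cache_dir.exists():
        return set()
    return {
        p.relative_to(cache_dir).as_posix()
        for p in cache_dir.rglob("*")
        if p.is_file()
    }


def new_files(before: set[str], after: set[str]) -> list[str]:
    """Return files present in `after` but not in `before`, sorted."""
    return sorted(after - before)


# Demo: a temporary directory stands in for the real cache root.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    before = snapshot(root)

    # Stand-in for the compile step that is expected to write artifacts.
    (root / "sub").mkdir()
    (root / "sub" / "artifact.bin").write_bytes(b"\x00")

    after = snapshot(root)
    written = new_files(before, after)
    print(written)  # ['sub/artifact.bin']
```

Running the same diff under torch 2.11 and 2.12 would show whether the 2.12 run writes nothing at all or writes files under different names or directories.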

Example

No specific code example can be provided without more information about the changes in torch._inductor.standalone_compile. However, the vLLM code at vllm/compilation/compiler_interface.py:351 should be reviewed for potential adjustments.

Notes

The issue seems to be related to changes in PyTorch 2.12, and the vLLM code needs to be updated to match the new behavior. The KeyError: None exception suggests that the cache lookup is failing, and the code should be adjusted to handle this scenario.

Recommendation

Since the issue does not mention a torch release that fixes the regression, the practical option is a workaround: adjust the vLLM code to match the new behavior of torch._inductor.standalone_compile in PyTorch 2.12.
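If the workaround turns out to be version-specific, it can be gated on the installed torch version. The sketch below assumes the version string has the form shown in the Environment section (2.12.0+cu130). The names parse_release and needs_workaround are hypothetical, and the 2.12 threshold comes from this issue's report, not from a confirmed upstream change.

```python
import re


def parse_release(version: str) -> tuple[int, ...]:
    """Extract the leading numeric release segments, ignoring any suffix."""
    match = re.match(r"(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def needs_workaround(version: str) -> bool:
    """True for torch 2.12 and later (threshold assumed from this issue)."""
    return parse_release(version)[:2] >= (2, 12)


print(parse_release("2.12.0+cu130"))     # (2, 12, 0)
print(needs_workaround("2.11.0"))        # False
print(needs_workaround("2.12.0+cu130"))  # True
```

In real code the argument would be torch.__version__, which is a string subclass and can be passed as-is.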
