pytorch - 💡(How to fix) Fix [vllm] [2.12 regression][Inductor] warm_artifacts_saved == 0 and KeyError(None) in standalone_compile startup tests

StepCodex · 2026-04-20T20:41:01Z

## Summary

Under torch 2.12.0, vLLM's H100 compile-startup tests fail because the inductor standalone-compile warm cache no longer saves artifacts:

- `compile/h100/test_startup.py::test_model_startup[deepseek_v3.2]` → `AssertionError: warm_artifacts_saved: got 0`
- `compile/h100/test_startup.py::test_model_startup[kimi_k2.5]` → `AssertionError: warm_artifacts_saved: got 0`
- `compile/h100/test_startup.py::test_moe_startup[0]` → `KeyError: None`

Same tests pass on torch 2.11.0. Blocking the torch 2.12 upgrade for vLLM (vllm-project/vllm#40077).

## Environment

- `torch`: 2.12.0+cu130
- `triton`: 3.7.0
- CUDA: 13.0
- Python: 3.12
- GPU: H100

## Question / diagnosis

In 2.11, vLLM's `compiler_interface.standalone_compile(...)` produces warm artifacts that get written to disk; in 2.12 the write count is 0, and at least one `test_moe_startup` test gets `KeyError: None` from somewhere in the cache lookup (probably a missing key returning `None`).

- Is there a known change to `torch._inductor.standalone_compile` in 2.12 around how warm/aot-autograd artifacts get saved (naming, cache directory layout, or bypass conditions)?
- Was the `None` key an intentional sentinel for "no cache entry", and something is now blindly indexing it?

vLLM call site: `vllm/compilation/compiler_interface.py:351` → `torch._inductor.standalone_compile(...)`.

## Links

- vLLM PR: https://github.com/vllm-project/vllm/pull/40077
- Failing build: https://buildkite.com/vllm/ci/builds/62138
- Failed job (PyTorch Compilation Unit Tests H100): https://buildkite.com/vllm/ci/builds/62138#019dab36-87f7-449c-b761-dcc9c2cac612
- Umbrella: #180899

cc @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @aakhundov @coconutruben @jataylo





extent analysis

TL;DR

The most likely path to a fix is to identify what changed in torch._inductor.standalone_compile in PyTorch 2.12 and adjust the vLLM code to match the new behavior for saving warm/aot-autograd artifacts.

Guidance

Review the PyTorch 2.12 documentation and release notes for any changes to torch._inductor.standalone_compile that could affect warm artifact saving.
Investigate the compiler_interface.py file at line 351 to see how torch._inductor.standalone_compile is being called and if any adjustments can be made to accommodate the new behavior.
Check the cache directory layout and naming conventions used by torch._inductor.standalone_compile in PyTorch 2.12 to ensure compatibility with the vLLM code.
Consider adding error handling for the KeyError: None exception to prevent test failures and provide more informative error messages.
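
The last guidance item can be illustrated with a minimal sketch. It assumes the failing lookup is a plain mapping access; `lookup_artifact`, `MissingCacheKeyError`, and the `context` argument are hypothetical names for illustration and are not part of vLLM or torch. A guard like this only makes the failure easier to read. It does not restore artifact saving.

```python
from typing import Any, Mapping, Optional


class MissingCacheKeyError(LookupError):
    """Raised when a cache lookup is attempted without a usable key."""


def lookup_artifact(cache: Mapping[str, Any], key: Optional[str], context: str) -> Any:
    """Look up `key` in `cache`, failing with a descriptive message.

    Hypothetical helper: the names here are illustrative only.
    """
    if key is None:
        # A None key suggests an earlier step produced no cache entry at all.
        raise MissingCacheKeyError(
            f"{context}: cache key is None; the compile step may have saved no artifact"
        )
    try:
        return cache[key]
    except KeyError:
        # Replace the bare KeyError with one that lists the keys that do exist.
        raise MissingCacheKeyError(
            f"{context}: no cache entry for key {key!r}; known keys: {sorted(cache)}"
        ) from None
```

With this in place, a `None` key is reported as a named error that points at the missing artifact instead of surfacing as a bare `KeyError: None`.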

Example

No specific code example can be provided without more information about the changes in torch._inductor.standalone_compile. However, the vLLM code at vllm/compilation/compiler_interface.py:351 should be reviewed for potential adjustments.
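
A diagnostic sketch is still possible without knowing the fix. The snippet below compares the contents of a cache directory before and after a compile step, which is one way to confirm whether any artifact files are written under each torch version. The helper names `snapshot` and `new_files` are hypothetical, and which directory to inspect is an assumption that depends on how the test configures its cache.

```python
from pathlib import Path
from typing import Dict, List


def snapshot(cache_dir: str) -> Dict[str, int]:
    """Map every file under cache_dir (as a relative POSIX path) to its size in bytes."""
    root = Path(cache_dir)
    if not root.is_dir():
        # A directory that does not exist yet is treated as an empty cache.
        return {}
    return {
        p.relative_to(root).as_posix(): p.stat().st_size
        for p in root.rglob("*")
        if p.is_file()
    }


def new_files(before: Dict[str, int], after: Dict[str, int]) -> List[str]:
    """Return the sorted relative paths present in `after` but not in `before`."""
    return sorted(set(after) - set(before))
```

Taking one snapshot before the call that reaches `torch._inductor.standalone_compile(...)` and one after, then running the same script under torch 2.11.0 and 2.12.0, would show whether the 2.12 run writes nothing at all or writes files under a different layout.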

Notes

The failures appear only under PyTorch 2.12, so either the vLLM code needs to be updated to match the new behavior or the regression needs to be fixed in PyTorch. The KeyError: None exception suggests that a cache lookup is being made with a missing key, and the code should handle that case explicitly.

Recommendation

Apply a workaround by adjusting the vLLM code to match the new behavior of torch._inductor.standalone_compile in PyTorch 2.12, as upgrading to a fixed version is not mentioned as an option.
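
If a workaround is applied on the vLLM side, it would probably need to be gated on the torch version so the 2.11 path stays unchanged. The following is a minimal sketch of such a gate using only the standard library; `parse_release` and `needs_212_workaround` are hypothetical names, and the workaround itself is not shown because the source does not identify one.

```python
import re
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Extract (major, minor, patch) from a version string such as '2.12.0+cu130'.

    Local suffixes (e.g. '+cu130') and pre-release tags are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component defaults to 0.
    return tuple(int(part) for part in match.groups(default="0"))


def needs_212_workaround(version: str) -> bool:
    """Return True for torch 2.12 and later, where the regression was reported."""
    return parse_release(version) >= (2, 12)
```

In practice the argument would be `torch.__version__`, for example `needs_212_workaround(torch.__version__)`; the upper bound of the gate would need revisiting once a fixed torch release is known.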
