[vllm] [2.12 regression][compile] test_standalone_compile_correctness: VLLM_USE_MEGA_AOT_ARTIFACT=1 without VLLM_USE_STANDALONE_COMPILE=1 + Inductor standalone_compile artifact save failures

Summary

tests/compile/test_aot_compile.py::test_standalone_compile_correctness (a compare_two_settings test that boots two RemoteOpenAIServer subprocesses with OPT-125m) fails on torch 2.12 with the engine subprocess crashing immediately:

File "/usr/local/lib/python3.12/dist-packages/vllm/compilation/backends.py", line 97, in make_compiler
    assert not envs.VLLM_USE_MEGA_AOT_ARTIFACT or envs.VLLM_USE_STANDALONE_COMPILE, (
AssertionError: VLLM_USE_MEGA_AOT_ARTIFACT=1 requires VLLM_USE_STANDALONE_COMPILE=1

The assertion requires VLLM_USE_STANDALONE_COMPILE to be enabled whenever VLLM_USE_MEGA_AOT_ARTIFACT=1, which is the setting the test uses for its second server. On torch 2.11 the precondition holds and the test passes on every recent main Full CI run (#66975, #66835, #66759). On torch 2.12 the same test, with the same env permutation, fails the assertion, so VLLM_USE_STANDALONE_COMPILE must be resolving to a falsy value there. That suggests one side of the env-var pair changed its torch-version-dependent default.
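To make the failing condition concrete, here is a minimal standalone sketch of the boolean shape of that assertion. It does not import vLLM; `precondition_holds` is a hypothetical helper that only mirrors the `not A or B` form visible in the traceback.

```python
from itertools import product


def precondition_holds(use_mega_aot_artifact: bool, use_standalone_compile: bool) -> bool:
    """Mirror of `not MEGA_AOT or STANDALONE_COMPILE` from the traceback.

    Hypothetical helper for illustration only, not vLLM code.
    """
    return (not use_mega_aot_artifact) or use_standalone_compile


# Enumerate all four combinations of the two flags.
TRUTH_TABLE = {
    (mega, standalone): precondition_holds(mega, standalone)
    for mega, standalone in product([False, True], repeat=2)
}

for (mega, standalone), ok in TRUTH_TABLE.items():
    status = "ok" if ok else "AssertionError"
    print(f"MEGA_AOT={int(mega)} STANDALONE={int(standalone)} -> {status}")
```

Only one of the four combinations fails: the mega AOT artifact flag on with standalone compile off. That is why the failure implies the standalone-compile flag is off in the second server's process on torch 2.12.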

In the same job, several other compile tests print Inductor warnings while still passing:

W0520 03:23:31 torch/_inductor/standalone_compile.py:469] standalone_compile artifact generation failed,
cannot save. Run with TORCH_LOGS=+torch._inductor.codecache to identify the issue

(test_sym_size_whole_shape_boundary, test_shape_boundary_standalone_compile, test_sym_size_metadata_propagated, … all PASS but log the warning.) This points at a torch 2.12 Inductor standalone_compile regression that is silently swallowed in the warmup tests but fatal in the AOT-correctness test, which depends on the cached artifact being saved and reloaded.
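The warning itself suggests rerunning with TORCH_LOGS=+torch._inductor.codecache. A small sketch of how that rerun could be assembled from Python follows; `build_repro` is a hypothetical helper, and invoking pytest as `python -m pytest` rather than the bare `pytest` command is an assumption made here for portability.

```python
import os
import sys

TEST_ID = "tests/compile/test_aot_compile.py::test_standalone_compile_correctness"
CODECACHE_LOGS = "+torch._inductor.codecache"


def build_repro(base_env=None):
    """Return (cmd, env) for rerunning the failing test with codecache logging.

    Hypothetical helper: it only builds the command and environment,
    it does not launch anything.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("TORCH_LOGS", "")
    # TORCH_LOGS takes a comma-separated list, so keep any existing entries.
    env["TORCH_LOGS"] = f"{existing},{CODECACHE_LOGS}" if existing else CODECACHE_LOGS
    cmd = [sys.executable, "-m", "pytest", "-x", TEST_ID]
    return cmd, env


# Usage (not executed here):
#   import subprocess
#   cmd, env = build_repro()
#   subprocess.run(cmd, env=env, check=False)
```

Because the two servers run as subprocesses, the variable has to be present in the environment they inherit, which is why the sketch sets it on the pytest process rather than inside the test.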

Environment

  • torch==2.12.0+cu130
  • triton==3.7.0
  • torchvision==0.27.0
  • CUDA 13.0
  • GPU: H200 (Buildkite h200-ci-3-19)
  • Python 3.12
  • vLLM commit d2792bf2088c (PR vllm-project/vllm#42848)
  • Model: facebook/opt-125m (OPT-125m)

Reproduction

pytest -x tests/compile/test_aot_compile.py::test_standalone_compile_correctness

The test calls compare_two_settings(...) which spawns two RemoteOpenAIServer subprocesses; the second subprocess sets VLLM_USE_MEGA_AOT_ARTIFACT=1 and the engine asserts in vllm/compilation/backends.py:97.

Failing test

tests/compile/test_aot_compile.py::test_standalone_compile_correctness

Diagnosis question

Two related questions:

  1. Did the semantics of VLLM_USE_STANDALONE_COMPILE or the default behavior of torch._inductor.standalone_compile.compile_fx_with_dynamo_aot change in torch 2.12 such that vLLM's env-var precondition stopped holding under what was previously a valid combination? This would help decide whether the fix is vLLM-side (set both env vars in the test) or torch-side (preserve 2.11 behavior).
  2. The standalone_compile artifact generation failed, cannot save warnings appearing in the same job point at a torch-side codecache regression — is the failure to save the standalone artifact intentional in 2.12 (e.g. cache schema change), or a bug we should fix?

The warnings appear only on torch 2.12. On torch 2.11 (same vLLM commits, all five recent main builds), this job passes with no such warnings and test_standalone_compile_correctness passes.
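One quick check that may help with question 1 is whether each flag is set explicitly in the failing process or left to its default. The sketch below reads only the raw environment; `describe_flags` is a hypothetical helper and it makes no claim about how vLLM computes the defaults.

```python
import os

FLAGS = ("VLLM_USE_MEGA_AOT_ARTIFACT", "VLLM_USE_STANDALONE_COMPILE")
UNSET = "<unset: default applies>"


def describe_flags(environ=None):
    """Report the raw value of each flag, or a marker when it is not set.

    Hypothetical helper: it inspects the environment mapping only and
    does not import vLLM or torch.
    """
    environ = os.environ if environ is None else environ
    return {name: environ.get(name, UNSET) for name in FLAGS}


for name, value in describe_flags().items():
    print(f"{name}={value}")
```

If VLLM_USE_STANDALONE_COMPILE shows as unset in the second server's environment, the outcome depends entirely on its default, which would be consistent with a torch-version-dependent default having changed.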

Links

cc @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @aakhundov @coconutruben @jataylo @avikchaudhuri @zhxchen17 @tugsbayasgalan @angelayi @ydwu4 @desertfire @yushangdi @iupaikov-amd
