pytorch - 💡(How to fix) Fix [vllm] [2.12 regression][Dynamo] test_dynamic_shapes_compilation: assert 'no' == 'yes' across backed/unbacked modes

pytorch 2026-04-20 20:41:03

Under torch 2.12.0, vLLM's test_dynamic_shapes_compilation fails with assert 'no' == 'yes' across many parametrizations (gpt2, Qwen2-7B, Qwen3-4B) and symbolic-shape modes (backed, unbacked, backed_size_oblivious). Of the 22 failures, 12 end with this assertion; the other 10 end in EngineCore init failure. Some of those 10 were confounded by GPU OOM in the same CI job, and a clean rerun is planned. This issue is filed now to track the real-assertion subset.

Root Cause

Some of this job's failures were caused by pre-existing GPU memory pressure on the shared runner (repeated ValueError: Free memory on device cuda:0 (1.3/22 GiB) on startup is less than desired GPU memory utilization). Those engine-core-init failures will be filtered out on rerun. The 12 assertion-style failures listed below progressed past engine init and are almost certainly real.
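The filtering described above can be sketched as a small log classifier. This is a minimal sketch, assuming each test's captured log is available as a string; the function names `classify_failure` and `free_fraction` are hypothetical and are not part of vLLM or PyTorch. The regular expression is built only from the error text quoted above.

```python
import re

# Matches the startup error quoted in the issue, e.g.
# "Free memory on device cuda:0 (1.3/22 GiB) on startup is less than ..."
_OOM_RE = re.compile(
    r"Free memory on device (?P<device>\S+) "
    r"\((?P<free>[\d.]+)/(?P<total>[\d.]+) GiB\)"
)


def classify_failure(log_text: str) -> str:
    """Return 'gpu_pressure', 'assertion', or 'other' for one test's log.

    GPU pressure is checked first: a test that never got past engine
    init cannot tell us anything about compilation behavior.
    """
    if _OOM_RE.search(log_text):
        return "gpu_pressure"
    if "assert 'no' == 'yes'" in log_text:
        return "assertion"
    return "other"


def free_fraction(log_text: str):
    """Return free/total GPU memory from the error line, or None if absent."""
    match = _OOM_RE.search(log_text)
    if match is None:
        return None
    return float(match.group("free")) / float(match.group("total"))


if __name__ == "__main__":
    oom_line = (
        "ValueError: Free memory on device cuda:0 (1.3/22 GiB) on startup "
        "is less than desired GPU memory utilization"
    )
    print(classify_failure(oom_line), free_fraction(oom_line))
    print(classify_failure("AssertionError: assert 'no' == 'yes'"))
```

Only failures classified as 'assertion' would stay in scope for this issue; the 'gpu_pressure' ones would wait for the clean rerun.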

Summary

Environment

torch: 2.12.0+cu130
triton: 3.7.0
CUDA: 13.0
Python: 3.12
GPU: 1× H100 (shared, partially OOM at run time — rerun pending)

Failed test parametrizations (assert 'no' == 'yes')

12 of 22:

test_dynamic_shapes_compilation[False-True-0-backed-gpt2]
test_dynamic_shapes_compilation[False-True-0-backed_size_oblivious-gpt2]
test_dynamic_shapes_compilation[False-True-1-backed-gpt2]
test_dynamic_shapes_compilation[False-True-1-backed-Qwen/Qwen3-4B-Instruct-2507]
test_dynamic_shapes_compilation[False-True-1-backed_size_oblivious-gpt2]
test_dynamic_shapes_compilation[False-True-1-backed_size_oblivious-Qwen/Qwen3-4B-Instruct-2507]
test_dynamic_shapes_compilation[False-False-0-unbacked-gpt2]
test_dynamic_shapes_compilation[False-False-0-backed_size_oblivious-gpt2]
test_dynamic_shapes_compilation[False-False-1-backed-gpt2]
test_dynamic_shapes_compilation[False-False-1-unbacked-gpt2]
test_dynamic_shapes_compilation[False-False-1-backed_size_oblivious-gpt2]
test_dynamic_shapes_compilation[False-False-1-backed_size_oblivious-Qwen/Qwen3-4B-Instruct-2507]
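To see how the failures are distributed, the IDs above can be tallied by symbolic-shape mode and by model. This is a minimal sketch: it assumes the last two fields of each parametrization ID are the mode and the model, which matches the IDs as printed. The meaning of the first three fields is not stated in the issue, so they are left uninterpreted.

```python
from collections import Counter

# The 12 failing parametrization IDs, copied from the list above.
FAILED_IDS = [
    "test_dynamic_shapes_compilation[False-True-0-backed-gpt2]",
    "test_dynamic_shapes_compilation[False-True-0-backed_size_oblivious-gpt2]",
    "test_dynamic_shapes_compilation[False-True-1-backed-gpt2]",
    "test_dynamic_shapes_compilation[False-True-1-backed-Qwen/Qwen3-4B-Instruct-2507]",
    "test_dynamic_shapes_compilation[False-True-1-backed_size_oblivious-gpt2]",
    "test_dynamic_shapes_compilation[False-True-1-backed_size_oblivious-Qwen/Qwen3-4B-Instruct-2507]",
    "test_dynamic_shapes_compilation[False-False-0-unbacked-gpt2]",
    "test_dynamic_shapes_compilation[False-False-0-backed_size_oblivious-gpt2]",
    "test_dynamic_shapes_compilation[False-False-1-backed-gpt2]",
    "test_dynamic_shapes_compilation[False-False-1-unbacked-gpt2]",
    "test_dynamic_shapes_compilation[False-False-1-backed_size_oblivious-gpt2]",
    "test_dynamic_shapes_compilation[False-False-1-backed_size_oblivious-Qwen/Qwen3-4B-Instruct-2507]",
]


def parse_id(test_id: str):
    """Split an ID into (uninterpreted leading fields, mode, model)."""
    params = test_id[test_id.index("[") + 1 : test_id.rindex("]")]
    # maxsplit=4 keeps hyphens inside the model name intact.
    *leading, mode, model = params.split("-", 4)
    return tuple(leading), mode, model


by_mode = Counter(parse_id(t)[1] for t in FAILED_IDS)
by_model = Counter(parse_id(t)[2] for t in FAILED_IDS)

if __name__ == "__main__":
    print(dict(by_mode))
    print(dict(by_model))
```

On these 12 IDs the tally is 6 for backed_size_oblivious, 4 for backed, and 2 for unbacked; by model it is 9 for gpt2 and 3 for Qwen/Qwen3-4B-Instruct-2507. No Qwen2-7B entry appears among the 12 assertion failures listed here.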

The assertion compares a string compilation status (is_compiled or similar): vLLM expects 'yes', but Dynamo/Inductor on 2.12 returns 'no', suggesting that a graph that previously compiled now falls back to eager.

Question / diagnosis

Under torch 2.12, is there a new guard / cond / data-dependent branch that now causes test_dynamic_shapes_compilation graphs to fall back to eager (hence 'no'), particularly in backed_size_oblivious mode, which affects all three models?
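One way to look for the suspected fallback is to rerun a single failing parametrization with PyTorch's compile logging enabled through the TORCH_LOGS environment variable. The sketch below only builds the command and environment; it does not run anything. The test file path is a hypothetical placeholder (the issue does not give it), and `build_rerun` is an illustrative helper, not a vLLM or PyTorch API.

```python
import os
import sys

# Hypothetical path: adjust to where the test lives in your vLLM checkout.
TEST_PATH = "tests/compile/test_dynamic_shapes_compilation.py"

# Log artifacts that report graph breaks, recompilations, installed guards,
# and symbolic-shape reasoning.
TORCH_LOGS_VALUE = "graph_breaks,recompiles,guards,dynamic"


def build_rerun(keywords, test_path=TEST_PATH):
    """Return (cmd, env) for a pytest rerun filtered by simple keywords.

    Keywords are joined into a pytest -k expression. Keep them simple
    (e.g. 'gpt2', 'backed_size_oblivious'): -k does substring matching,
    so 'backed' alone would also select 'unbacked' and
    'backed_size_oblivious' cases.
    """
    env = dict(os.environ)
    env["TORCH_LOGS"] = TORCH_LOGS_VALUE
    selector = " and ".join(["test_dynamic_shapes_compilation", *keywords])
    cmd = [sys.executable, "-m", "pytest", test_path, "-x", "-k", selector]
    return cmd, env


if __name__ == "__main__":
    cmd, env = build_rerun(["backed_size_oblivious", "gpt2"])
    print("TORCH_LOGS=" + env["TORCH_LOGS"], " ".join(cmd))
```

Passing the result to `subprocess.run(cmd, env=env)` on an idle GPU should show whether a graph break or a new guard failure precedes the 'no' status; whether the relevant messages surface may depend on how vLLM forwards worker-process output.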

Caveat

Links

vLLM PR: https://github.com/vllm-project/vllm/pull/40077
Failing build: https://buildkite.com/vllm/ci/builds/62138
Failed job (PyTorch Compilation Unit Tests): https://buildkite.com/vllm/ci/builds/62138#019dab36-87f6-4257-b680-4c60073b3d82
Umbrella: #180899

cc @chauhang @penguinwu @ezyang @bobrenjc93 @aditvenk @laithsakka @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @kadeng @amjames @Lucaskabela @jataylo @azahed98

extent analysis

TL;DR

No fix is identified yet. The next step is to find which change in PyTorch 2.12 makes these graphs fall back to eager execution, starting with backed_size_oblivious mode.

Guidance

Review PyTorch 2.12 release notes: Check for any new guards, conditions, or data-dependent branches that might cause graphs to fall back to eager execution.
Investigate backed_size_oblivious mode: Focus on why this mode is affected across all three models (gpt2, Qwen2-7B, Qwen3-4B) and how it interacts with the changes in PyTorch 2.12.
Compare graph compilation: Analyze the differences in graph compilation between PyTorch 2.11 and 2.12 to identify what causes the fallback to eager execution.
Test with reduced GPU pressure: Ensure that the issue persists when GPU pressure is minimized to rule out any environmental factors.
Check vLLM's expectations: Verify that vLLM's expectations for is_compiled status are correct and align with the behavior changes in PyTorch 2.12.
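The comparison between PyTorch 2.11 and 2.12 can start at the level of per-test outcomes before digging into graphs. The following is a minimal sketch, assuming you have collected a mapping from test ID to outcome string for each torch version, for example from two CI runs. The helper name `diff_outcomes` and the sample data are illustrative placeholders, not real results.

```python
def diff_outcomes(baseline, candidate):
    """Compare two {test_id: outcome} mappings.

    'regressions' passed on the baseline and do not pass on the candidate;
    'still_failing' fail on both; 'fixed' fail only on the baseline.
    Tests missing from either run are ignored.
    """
    result = {"regressions": [], "still_failing": [], "fixed": []}
    for test_id in sorted(set(baseline) & set(candidate)):
        passed_before = baseline[test_id] == "passed"
        passed_after = candidate[test_id] == "passed"
        if passed_before and not passed_after:
            result["regressions"].append(test_id)
        elif not passed_before and not passed_after:
            result["still_failing"].append(test_id)
        elif not passed_before and passed_after:
            result["fixed"].append(test_id)
    return result


if __name__ == "__main__":
    # Placeholder data for illustration only.
    torch_211 = {"case[backed-gpt2]": "passed", "case[unbacked-gpt2]": "failed"}
    torch_212 = {"case[backed-gpt2]": "failed", "case[unbacked-gpt2]": "failed"}
    print(diff_outcomes(torch_211, torch_212))
```

Only the 'regressions' bucket points at a 2.12 change; anything in 'still_failing' was already broken on the baseline and belongs to a separate investigation.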

Example

No specific code snippet can be provided without further details on the implementation, but reviewing the test_dynamic_shapes_compilation test cases and the backed_size_oblivious mode implementation would be a good starting point.

Notes

The investigation should consider the potential impact of CUDA 13.0 and PyTorch 2.12 interactions, as well as any other environmental factors that might influence the behavior.

Recommendation

Identify the specific change in PyTorch 2.12 that causes the graphs to fall back to eager execution, particularly in backed_size_oblivious mode, and then apply a workaround or fix for it. This fallback is the suspected cause of the failures, not yet a confirmed one.
