[vllm] [2.12 regression] Qwen2-VL vision-tower-only LoRA generation diverges from golden output

pytorch, 2026-04-24 17:47:40

pytorch/pytorch#181409 (fetched 2026-04-25 06:02:36)

Author: atalman

Under torch 2.12.0 + triton 3.7.0, vLLM's test_qwen2vl_multiple_lora_types starts failing on the third parametrized path (vision-tower-only LoRA, no connector, lora_id=5/6). The first two LoRA paths (language-only, tower+connector) in the same test pass; only the vision-tower-only adapter diverges:

AssertionError: Generated text "A view of the Tokyo" doesn't match expected pattern "A closeup shot of the Tokyo Skytree with pink flowers in the foreground."

Passes on torch 2.11 and on the same torch-2.12 branch through 2026-04-22 (builds 62138/62232/62495/62583); newly failing on 2026-04-24 (build 62848). Blocking the torch 2.12 upgrade for vLLM (vllm-project/vllm#40077).

Summary

AssertionError: Generated text "A view of the Tokyo" doesn't match expected pattern "A closeup shot of the Tokyo Skytree with pink flowers in the foreground."

Environment

torch: 2.12.0+cu130 (test channel)
triton: 3.7.0
CUDA: 13.0 / Driver: 570.133.20
Python: 3.12.13
Base model: Qwen2-VL (QWEN2VL_MODEL_PATH)
LoRA adapter: prashanth058/qwen2vl-flickr-lora-tower (vision tower only, no connector)
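
When comparing a passing and a failing environment, it helps to record the installed package versions on both sides. The following is a minimal sketch using only the standard library; the default package list and the helper name installed_versions are illustrative choices, not something the issue prescribes.

```python
from importlib import metadata


def installed_versions(packages=("torch", "triton", "vllm")):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version}")
```

Running this in both the torch 2.11 and torch 2.12 environments gives a quick diff of what actually changed besides torch itself.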

Reproduction

Failing test:

tests/lora/test_qwenvl.py::test_qwen2vl_multiple_lora_types
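
A hedged sketch of invoking just this test from Python. It assumes the script is run from the root of a vLLM checkout, that pytest is installed, and that the GPU and model prerequisites of the test are available; the helper names are hypothetical.

```python
import subprocess
import sys

# Test node ID taken from the issue.
TEST_ID = "tests/lora/test_qwenvl.py::test_qwen2vl_multiple_lora_types"


def pytest_command(test_id: str = TEST_ID) -> list:
    """Build the pytest invocation for a single test node."""
    # -x stops at the first failure, -v prints one line per test.
    return [sys.executable, "-m", "pytest", "-x", "-v", test_id]


def run_failing_test() -> int:
    """Run the test in a subprocess and return pytest's exit code."""
    return subprocess.run(pytest_command(), check=False).returncode
```

Calling run_failing_test() returns 0 when the test passes and a non-zero code otherwise, which makes it usable as a pass/fail probe while bisecting.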

The test runs three LoRA configurations on the same LLM instance; the failure hits on the third:

# Test 3: Vision tower only LoRA adapter (no connector)
tester.config.lora_path = qwen2vl_vision_tower_lora_files
for lora_id in [5, 6]:
    tester.run_test(
        TEST_IMAGES,
        expected_outputs=EXPECTED_OUTPUTS_VISION_NO_CONNECTOR,
        lora_id=lora_id,
        lora_name="vision_tower_only",
    )

Assertion:

for generated, expected in zip(generated_texts, expected_outputs):
    assert expected.startswith(generated), (
        f"Generated text {generated} doesn't match expected pattern {expected}"
    )

Observed:

Generated: "A view of the Tokyo"
Expected: "A closeup shot of the Tokyo Skytree with pink flowers in the foreground."

(The assertion uses expected.startswith(generated), so vLLM output must be a prefix of the golden; here it diverges at token ~3.)
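
To make the prefix check concrete, the sketch below applies the same expected.startswith(generated) rule to the two strings from the issue and reports where they first differ. The helper first_divergence is hypothetical, and it works on characters, whereas the issue's "token ~3" estimate refers to model tokens.

```python
from typing import Optional


def first_divergence(generated: str, expected: str) -> Optional[int]:
    """Return the index of the first differing character.

    Returns None when `generated` is a prefix of `expected`, which is
    exactly the condition the test asserts.
    """
    if expected.startswith(generated):
        return None
    for i, (g, e) in enumerate(zip(generated, expected)):
        if g != e:
            return i
    # No mismatch in the overlap: `generated` is longer than `expected`.
    return len(expected)


generated = "A view of the Tokyo"
expected = (
    "A closeup shot of the Tokyo Skytree with pink flowers in the foreground."
)

idx = first_divergence(generated, expected)
print(idx, repr(generated[idx:]), repr(expected[idx:idx + 12]))
```

A truncated but otherwise correct generation such as "A closeup shot" would pass the check, so the failure reflects a genuinely different continuation rather than a shorter output.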

Reproducibility on torch 2.12 branch

Build   Date         LoRA 2   Link
62138   2026-04-20   passed
62232   2026-04-21   passed
62495   2026-04-22   passed
62583   2026-04-22   passed
62848   2026-04-24   failed   https://buildkite.com/vllm/ci/builds/62848#019dbf56-e7ea-4cd5-bab6-dcbb4fb4da0e

Passes on same-day main build (torch 2.11):

2026-04-24 nightly: https://buildkite.com/vllm/ci/builds/62812 (LoRA 2 passed)

Relationship to other umbrella issues

Qwen2-VL-family regressions are also tracked as:

pytorch/pytorch#181168 — base Qwen2-VL multi-image output divergence (different test, different assertion; small mid-sentence word swap)
pytorch/pytorch#181249 — Qwen2.5-Math-PRM-7B reward-logits divergence

This LoRA case is distinct: it only triggers on the vision-tower LoRA adapter code path, not the base inference path, and the divergence is much larger (different sentence start, not a one-word swap).

Diagnosis request

Only the vision-tower-only LoRA fails; language-only and tower+connector LoRA paths pass in the same test on the same run. That narrows the suspect region to the vision-tower LoRA forward path under torch 2.12 / triton 3.7 (possibly a matmul or attention kernel interacting with LoRA's low-rank update). The regression is new between 2026-04-22 and 2026-04-24 — either a torch/triton test-wheel rebuild or a vLLM rebase could be the trigger; bisecting against vLLM commits on release_212_tests should be quick since the base PR is unchanged.
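
The issue does not establish the mechanism. One common way a small numeric difference in a kernel can produce an entirely different sentence is a near-tie between the top logits at an early decoding step. The sketch below illustrates that idea only; it assumes greedy (argmax) decoding, and the two-word vocabulary and all logit values are made up for illustration.

```python
def argmax(values):
    """Index of the largest element (first one wins on exact ties)."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical two-token vocabulary and logits; NOT measured values.
vocab = ["closeup", "view"]
logits_reference = [5.0010, 5.0000]  # reference run: "closeup" barely wins
logits_perturbed = [4.9995, 5.0000]  # tiny numeric shift on the first logit

token_reference = vocab[argmax(logits_reference)]
token_perturbed = vocab[argmax(logits_perturbed)]
print(token_reference, token_perturbed)
```

Once a single early token flips, every later token is conditioned on a different prefix, so the rest of the sentence can diverge even if the underlying numeric change is tiny. Whether this is what happens here would need to be checked by comparing logits between the passing and failing builds.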

extent analysis

TL;DR

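No fix is known yet. The most promising next step is to bisect the changes that landed between the last passing CI build (62583, 2026-04-22) and the first failing one (62848, 2026-04-24), focusing on the vision-tower LoRA adapter code path. These are vLLM Buildkite CI builds, not torch builds, so the trigger could be either a vLLM rebase or a rebuilt torch/triton test wheel.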

Guidance

Bisect commits: Binary-search the vLLM commits on release_212_tests between the last passing build (62583) and the first failing build (62848) to identify the commit that introduces the regression. If no vLLM commit explains it, compare the torch/triton test wheels used by the two builds.
Investigate matmul or attention kernel changes: Examine recent changes to matrix multiplication or attention kernels in torch or triton, since these may interact with LoRA's low-rank update.
Verify LoRA adapter code: Review the vision-tower LoRA forward path in vLLM, the only one of the three LoRA configurations in the test that fails.
Compare against torch 2.11: The same test passes on the 2026-04-24 nightly main build with torch 2.11, so use that environment as the known-good reference when comparing outputs.
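
The bisection step can be sketched as a plain binary search. This is an illustration only: it assumes the commits are ordered oldest to newest, that the first is known good and the last is known bad, and that the failure is deterministic. The commit names and the is_bad predicate are stand-ins; in practice the predicate would check out the commit and run the failing test.

```python
from typing import Callable, Sequence


def first_bad_commit(
    commits: Sequence[str], is_bad: Callable[[str], bool]
) -> str:
    """Return the earliest commit for which is_bad() is True.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Stand-in history: eight placeholder commits, regression introduced at "c5".
history = [f"c{i}" for i in range(8)]
culprit = first_bad_commit(history, lambda c: int(c[1:]) >= 5)
print(culprit)
```

The search needs about log2(n) test runs, which is why the issue expects bisecting to be quick. If the test turns out to be flaky, each probe should be repeated before a commit is marked good.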

Example

No code fix is available yet: the issue calls for investigation and bisection rather than a specific patch.

Notes

The exact cause of the issue is still unknown, and further investigation is required to determine the root cause. The provided information suggests that the issue is specific to the vision-tower LoRA adapter code path and torch 2.12.

Recommendation

Until the root cause is found, stay on torch 2.11, where the test passes. The regression appears only on the torch 2.12 branch and currently blocks the vLLM upgrade tracked in vllm-project/vllm#40077.
