vllm - ✅(Solved) Fix [Bug]: Gemma4 LoRA can be zeroed out by duplicate alias-path registration in `LoRAModelManager` [1 pull requests, 1 participants]

ShubyM · 2026-04-14T18:03:31Z



vllm-project/vllm#39815 • Fetched 2026-04-16 06:36:28


Root Cause

LoRAModelManager._create_lora_modules() was traversing the model with:

self.model.named_modules(remove_duplicate=False)

That is unsafe for LoRA registration because this manager mutates live module state during adapter activation.

With aliased paths, the same physical module can be discovered under both a canonical path and an alias path. A Gemma4-style example is:

- layers.0.proj
- self_decoder.decoder_layers.0.proj

If the adapter only contains weights for the canonical path, activation can do:

1. set_lora(...) for layers.0.proj
2. later reset_lora(...) for self_decoder.decoder_layers.0.proj

Because both names refer to the same live wrapper, the final LoRA buffers end up zero.
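The set-then-reset sequence can be reproduced in a few lines of plain Python. This is a minimal sketch, not vLLM code: `FakeLoRAWrapper` and `activate` are hypothetical stand-ins, and the activation loop assumes the simplest possible rule (set when the adapter has weights for a name, reset otherwise).

```python
# Simplified stand-ins for illustration; these are NOT vLLM classes.
class FakeLoRAWrapper:
    """Holds one LoRA buffer that set_lora/reset_lora mutate in place."""

    def __init__(self):
        self.lora_buffer = [0.0, 0.0]

    def set_lora(self, weights):
        self.lora_buffer = list(weights)

    def reset_lora(self):
        self.lora_buffer = [0.0] * len(self.lora_buffer)


def activate(registered, adapter_weights):
    """Hypothetical activation loop: set when weights exist, reset otherwise."""
    for name, wrapper in registered.items():
        if name in adapter_weights:
            wrapper.set_lora(adapter_weights[name])
        else:
            wrapper.reset_lora()


shared = FakeLoRAWrapper()
# The adapter only carries weights for the canonical path.
adapter = {"layers.0.proj": [0.5, -0.25]}

# Duplicate registration: one live wrapper reachable under two names.
duplicated = {
    "layers.0.proj": shared,
    "self_decoder.decoder_layers.0.proj": shared,
}
activate(duplicated, adapter)
buffer_after_duplicate = list(shared.lora_buffer)

# Deduplicated registration: one name per physical wrapper.
deduplicated = {"layers.0.proj": shared}
activate(deduplicated, adapter)
buffer_after_dedup = list(shared.lora_buffer)

print(buffer_after_duplicate)  # [0.0, 0.0]
print(buffer_after_dedup)      # [0.5, -0.25]
```

In the duplicated case the alias entry is processed after the canonical one, so the reset wins and the buffer ends at zero even though the adapter weights were valid.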

Fix Action

Fixed

Fixed by PR: [Bugfix]: deduplicate aliased LoRA modules (https://github.com/vllm-project/vllm/pull/39816)

PR fix notes

PR #39816: [Bugfix]: deduplicate aliased LoRA modules

Repository: vllm-project/vllm
Author: ShubyM
State: open | merged: False
Link: https://github.com/vllm-project/vllm/pull/39816

Description (problem / solution / changelog)

Fixes: #39815

Purpose

This PR deduplicates module traversal in _create_lora_modules() so LoRA registration happens once per physical module instead of once per alias path.

The previous implementation used:

self.model.named_modules(remove_duplicate=False)

That is not safe for a side-effectful manager like LoRAModelManager, because multiple names can refer to the same live module object.

This change switches registration to the default deduplicated traversal and adds a regression test covering aliased child-module paths.
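To illustrate what "deduplicated traversal" means here, the sketch below walks a small aliased tree with and without duplicate removal. `Node` is a hypothetical stand-in written for this example; its `named_modules` method is only modelled on the `memo` / `remove_duplicate` behaviour of `torch.nn.Module.named_modules` and is not the PyTorch implementation.

```python
class Node:
    """Tiny stand-in for a module with named children (not torch.nn.Module)."""

    def __init__(self, children=None):
        self.children = dict(children or {})

    def named_modules(self, memo=None, prefix="", remove_duplicate=True):
        # memo records objects already yielded; it is only filled when
        # remove_duplicate is True, so aliases are skipped only in that mode.
        if memo is None:
            memo = set()
        if self in memo:
            return
        if remove_duplicate:
            memo.add(self)
        yield prefix, self
        for name, child in self.children.items():
            child_prefix = f"{prefix}.{name}" if prefix else name
            yield from child.named_modules(memo, child_prefix, remove_duplicate)


# One physical `proj` object, reachable through a canonical and an alias path.
proj = Node()
decoder_layers = Node({"0": Node({"proj": proj})})
model = Node({
    "layers": decoder_layers,
    "self_decoder": Node({"decoder_layers": decoder_layers}),
})

paths_with_aliases = [
    name for name, mod in model.named_modules(remove_duplicate=False) if mod is proj
]
paths_deduplicated = [name for name, mod in model.named_modules() if mod is proj]

print(paths_with_aliases)  # ['layers.0.proj', 'self_decoder.decoder_layers.0.proj']
print(paths_deduplicated)  # ['layers.0.proj']
```

Which path survives deduplication depends on child registration order; in this sketch `layers` is registered first, so the canonical path is the one kept.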

For more background and the full root-cause analysis, see the linked issue.

Test Plan

Run the focused regression test:

python -m pytest tests/lora/test_model_manager_aliasing.py -q

Test Result

Regression test result:

PASSED

The regression test verifies:

- aliased paths can refer to the same child module
- the fixed manager registers only one LoRA wrapper for that shared module
- under simulated duplicate traversal, the same live wrapper can be set and then reset, leaving final LoRA buffers at zero

Note that this bug surfaced after aliasing was introduced in the Gemma4 implementation in PR #38879.


Changed files

- tests/lora/test_model_manager_aliasing.py (added, +204/-0)
- vllm/lora/model_manager.py (modified, +1/-1)

Code Example

self.model.named_modules(remove_duplicate=False)

---

for module_name, module in self.model.named_modules():


🐛 Describe the bug

Gemma4-style aliased module paths can cause LoRA activation to zero itself out in LoRAModelManager.

The issue is not that aliasing exists. The issue is that LoRA module creation was registering the same physical module multiple times under different names.

Why This Shows Up With Gemma4

The issue is exposed by Gemma4-style aliasing, where the same decoder child module is reachable through more than one registered path.

This is a valid model structure. The bug is that LoRAModelManager treated those alias paths as separate LoRA-managed modules even though they point to the same live object.

Proposed Fix

Deduplicate traversal in _create_lora_modules():

for module_name, module in self.model.named_modules():

This preserves one LoRA registration per physical module while leaving aliasing itself untouched.


extent analysis

TL;DR

To fix the issue where LoRA activation zeros itself out due to aliased module paths, deduplicate the module traversal in LoRAModelManager._create_lora_modules().

Guidance

The root cause is that LoRAModelManager treats aliased paths as separate LoRA-managed modules even though they point to the same live object.
To verify the issue, check whether the same physical module is registered more than once under different names; if it is, a later reset_lora(...) through the alias path can leave the LoRA buffers at zero.
The proposed fix deduplicates the traversal in _create_lora_modules() by calling self.model.named_modules() without remove_duplicate=False, so each physical module is visited once.
Restricting the adapter to canonical-path weights does not mitigate the issue: that is exactly the case in which the alias path triggers the reset. The duplicate registration itself has to be removed.
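The verification step can be sketched as a small helper that groups names by object identity and reports any object reachable under more than one name. `find_aliased_paths` is a hypothetical helper written for this example, not part of vLLM or PyTorch.

```python
from collections import defaultdict


def find_aliased_paths(named_objects):
    """Return groups of names that refer to the same live object."""
    by_identity = defaultdict(list)
    for name, obj in named_objects:
        by_identity[id(obj)].append(name)
    return [names for names in by_identity.values() if len(names) > 1]


# Plain objects stand in for modules; only identity matters here.
shared = object()
other = object()
pairs = [
    ("layers.0.proj", shared),
    ("layers.1.proj", other),
    ("self_decoder.decoder_layers.0.proj", shared),
]

aliases = find_aliased_paths(pairs)
print(aliases)  # [['layers.0.proj', 'self_decoder.decoder_layers.0.proj']]
```

On a real PyTorch model, the same helper could presumably be fed `model.named_modules(remove_duplicate=False)`; an empty result would suggest that aliasing is not the cause of zeroed buffers.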

Example

# Before
self.model.named_modules(remove_duplicate=False)

# After
for module_name, module in self.model.named_modules():
    # Register LoRA module
    pass

Notes

This fix assumes that the issue is solely caused by the duplicate registration of aliased modules. If other factors contribute to the problem, additional debugging may be necessary.

Recommendation

Apply the proposed fix by deduplicating the module traversal in LoRAModelManager._create_lora_modules(), as it directly addresses the root cause of the issue and preserves the validity of aliasing itself.
