vllm - ✅(Solved) Fix [Feature]: Add support for token_adapter.trainable_tokens_delta LoRA weight [1 pull requests, 1 participants]

vllm • 2026-03-09 13:21:38

GitHub stats

vllm-project/vllm#36501 • Fetched 2026-04-08 00:36:31

Author

akowalsk

Participants

akowalsk

Timeline (top)

cross-referenced ×1, labeled ×1

Error Message

ValueError: Call to add_lora method failed: base_model.model.language_model.model.embed_tokens.token_adapter.trainable_tokens_delta is unsupported LoRA weight

Fix Action

Fix proposed (the PR was open and not yet merged when this page was fetched)

Proposed fix: [LoRA] Add support for PEFT trainable_tokens_delta weights (https://github.com/vllm-project/vllm/pull/36648)

PR fix notes

PR #36648: [LoRA] Add support for PEFT trainable_tokens_delta weights

Repository: vllm-project/vllm
Author: haosdent
State: open | merged: False
Link: https://github.com/vllm-project/vllm/pull/36648

Description (problem / solution / changelog)

Purpose

Adds support for PEFT's trainable_token_indices feature (see PEFT docs), which allows efficient training of embedding deltas for specific token indices alongside standard LoRA adapters.

Previously, loading an adapter with trainable_tokens_delta weights would fail with:

ValueError: Call to add_lora method failed: base_model.model.language_model.model.embed_tokens.token_adapter.trainable_tokens_delta is unsupported LoRA weight

The fix converts the dense per-token delta tensor into equivalent sparse LoRA embedding weights (lora_embedding_A / lora_embedding_B) at load time, so the existing forward path handles them without any kernel or layer changes.

Fixes #36501
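
The conversion described above can be illustrated with a small, dependency-free sketch. It is not the PR's code: the function names `delta_to_sparse_lora` and `embedding_update` are hypothetical, plain Python lists stand in for tensors, and the shapes (lora_a as rank x vocab, lora_b as hidden x rank) are an assumption based on the usual LoRA embedding layout. The point is only that a one-hot A and a transposed delta as B, applied with scaling=1.0, give back exactly the per-token delta.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def delta_to_sparse_lora(
    delta: Matrix, token_indices: List[int], vocab_size: int
) -> Tuple[Matrix, Matrix]:
    """Turn a dense (num_tokens x hidden) delta into LoRA embedding factors.

    Returns (lora_a, lora_b) shaped (rank x vocab_size) and (hidden x rank),
    where rank == len(token_indices). Hypothetical helper, for illustration.
    """
    if len(delta) != len(token_indices):
        raise ValueError("one delta row is required per trainable token index")
    rank = len(token_indices)
    hidden = len(delta[0]) if delta else 0
    # lora_a: row k is a one-hot vector that selects token_indices[k].
    lora_a = [[0.0] * vocab_size for _ in range(rank)]
    for k, tok in enumerate(token_indices):
        lora_a[k][tok] = 1.0
    # lora_b: the delta transposed, so column k holds the delta of token k.
    lora_b = [[delta[k][h] for k in range(rank)] for h in range(hidden)]
    return lora_a, lora_b


def embedding_update(
    lora_a: Matrix, lora_b: Matrix, token_id: int, scaling: float = 1.0
) -> List[float]:
    """Compute scaling * (lora_b @ lora_a[:, token_id]) for a single token."""
    column = [row[token_id] for row in lora_a]
    return [scaling * sum(b * a for b, a in zip(b_row, column)) for b_row in lora_b]


if __name__ == "__main__":
    delta = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # two trained tokens, hidden=3
    a, b = delta_to_sparse_lora(delta, token_indices=[5, 7], vocab_size=8)
    print(embedding_update(a, b, 5))  # delta row for token 5
    print(embedding_update(a, b, 0))  # untouched token: all zeros
```

Scaling is kept at 1.0 because the delta is already the final embedding offset; applying the usual lora_alpha / r factor would rescale it.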

Test Plan

Added 5 new unit tests in tests/lora/test_utils.py:

test_is_trainable_tokens_delta — verifies weight name detection
test_parse_trainable_tokens_delta_name — verifies module name extraction (with/without base_model.model. prefix, VLM-style paths)
test_parse_trainable_tokens_delta_name_with_weights_mapper — verifies WeightsMapper integration
test_from_lora_tensors_trainable_tokens_delta — end-to-end test that the delta tensor is correctly converted to sparse one-hot lora_a + transposed lora_b with scaling=1.0
test_from_lora_tensors_trainable_tokens_delta_missing_config — verifies error when trainable_token_indices is missing from adapter config
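
As a rough illustration of what the first two tests exercise, the sketch below detects a trainable_tokens_delta weight by its suffix and recovers the module name, with or without the base_model.model. prefix. The helper names and their exact behavior are assumptions made for this example; the PR's real functions and signatures are not shown on this page.

```python
# Suffix PEFT uses for the per-token delta weight (taken from the error message).
DELTA_SUFFIX = ".token_adapter.trainable_tokens_delta"
# Prefix PEFT adds to weight names in saved adapters.
PEFT_PREFIX = "base_model.model."


def looks_like_trainable_tokens_delta(weight_name: str) -> bool:
    """Return True if the weight name ends with the trainable-tokens suffix."""
    return weight_name.endswith(DELTA_SUFFIX)


def module_name_from_weight(weight_name: str) -> str:
    """Strip the delta suffix and the optional PEFT prefix (hypothetical helper)."""
    if not looks_like_trainable_tokens_delta(weight_name):
        raise ValueError(f"not a trainable_tokens_delta weight: {weight_name}")
    module = weight_name[: -len(DELTA_SUFFIX)]
    if module.startswith(PEFT_PREFIX):
        module = module[len(PEFT_PREFIX):]
    return module


if __name__ == "__main__":
    name = (
        "base_model.model.language_model.model.embed_tokens"
        ".token_adapter.trainable_tokens_delta"
    )
    print(module_name_from_weight(name))
```

The VLM-style path from the error message keeps its language_model. segment here; mapping it onto vLLM's own module names is the job of the WeightsMapper integration covered by the third test, which this sketch does not attempt.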

Test Result

tests/lora/test_utils.py::test_is_trainable_tokens_delta PASSED
tests/lora/test_utils.py::test_parse_trainable_tokens_delta_name PASSED
tests/lora/test_utils.py::test_parse_trainable_tokens_delta_name_with_weights_mapper PASSED
tests/lora/test_utils.py::test_from_lora_tensors_trainable_tokens_delta PASSED
tests/lora/test_utils.py::test_from_lora_tensors_trainable_tokens_delta_missing_config PASSED

======================= 13 passed in 2.11s =======================

Changed files

tests/lora/test_utils.py (modified, +137/-0)
vllm/lora/lora_model.py (modified, +73/-0)
vllm/lora/peft_helper.py (modified, +3/-0)
vllm/lora/utils.py (modified, +34/-0)


🚀 The feature, motivation and pitch

I have gotten good results in transformers training an adapter on custom tokens using the https://huggingface.co/docs/peft/developer_guides/lora#efficiently-train-tokens-alongside-lora method. However, when I go to use those adapters in vLLM, I get the message:

ValueError: Call to add_lora method failed: base_model.model.language_model.model.embed_tokens.token_adapter.trainable_tokens_delta is unsupported LoRA weight

My LoraConfig is below and I'm training on Qwen3-VL-4B-Instruct.

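from peft import LoraConfig

# new_token_indices holds the token ids of the newly added tokens;
# it is defined elsewhere and not shown in the issue.
peft_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
    ],
    ensure_weight_tying=True,
    trainable_token_indices={"embed_tokens": new_token_indices},
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)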

I'm adding fewer than 10 new tokens, so not only does this give me much better results than training the entire embedding module, the resulting adapter is less than a quarter of the size.

Is this something that can be added as a supported layer type?

Alternatives

No response

Additional context

No response

Before submitting a new issue...

Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

extent analysis

Fix Plan

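The ValueError is raised because vLLM builds without PR #36648 do not recognize the token_adapter.trainable_tokens_delta weight that PEFT saves when trainable_token_indices is set. Nothing in the LoraConfig is misconfigured; the change is needed on the vLLM side.

Step-by-Step Solution

Preferred: use a vLLM build that includes PR #36648 once it is available, and load the adapter unchanged.

Workaround: if that is not possible, retrain the adapter without trainable_token_indices so that no trainable_tokens_delta weight is saved. This gives up the per-token embedding training that the issue author reports works much better than training the entire embedding module. Changing target_modules is unlikely to help, because the token_adapter weight comes from trainable_token_indices, not from target_modules.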
