transformers: when LoRA fine-tuning Qwen3.5-27B with ms-swift, each layer emits the warning "You should update the config with `tie_word_embeddings=False` to silence this warning" [4 comments, 4 participants]

Q: Expected behavior

A warning is emitted for every layer: "The tied weights mapping and config for this model specifies to tie model.visual.patch_embed.proj.weight to model.language_model.norm.weight, but both are present in the checkpoints, so we will NOT tie them. You should update the config with `tie_word_embeddings=False` to silence this warning"

transformers · 2026-03-01 07:25:46

huggingface/transformers#44368 • Fetched 2026-04-08 00:28:57

Fix Action

Fix / Workaround

The tied weights mapping and config for this model specifies to tie model.visual.patch_embed.proj.weight to model.language_model.norm.weight, but both are present in the checkpoints, so we will NOT tie them. You should update the config with tie_word_embeddings=False to silence this warning

System Info

transformers==5.2.0
torch==2.8.0
deepspeed==0.18.6
python==3.10
ms-swift==4.0.0.dev0

Who can help?

No response

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

# 4 * 30GiB
PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' \
NPROC_PER_NODE=2 \
MAX_PIXELS=1003520 \
VIDEO_MAX_PIXELS=50176 \
FPS_MAX_FRAMES=12 \
CUDA_VISIBLE_DEVICES=0,1 \
swift sft \
    --model Qwen3.5-27B \
    --tuner_type lora \
    --dataset alpaca-gpt4-data-zh \
    --load_from_cache_file true \
    --add_non_thinking_prefix true \
    --split_dataset_ratio 0.01 \
    --torch_dtype bfloat16 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --learning_rate 1e-4 \
    --lora_rank 8 \
    --lora_alpha 16 \
    --target_modules all-linear \
    --gradient_accumulation_steps 1 \
    --output_dir output \
    --report_to tensorboard \
    --eval_steps 50 \
    --save_steps 50 \
    --save_total_limit 2 \
    --logging_steps 1 \
    --max_length 2048 \
    --warmup_ratio 0.05 \
    --dataloader_num_workers 4 \
    --deepspeed zero3 \
    > output_lora.log 2>&1

Expected behavior

A warning is emitted for every layer (the "tied weights mapping" message quoted in the question above).

extent analysis

Fix Plan

Update Model Configuration

To silence the warning, you need to update the model configuration to set tie_word_embeddings=False.

Code Changes

import json

# Load the model configuration from the checkpoint directory
with open('config.json') as f:
    config = json.load(f)

# tie_word_embeddings is a top-level key in a Hugging Face config.json;
# there is no enclosing 'model' key.
config['tie_word_embeddings'] = False

# Save the updated configuration
with open('config.json', 'w') as f:
    json.dump(config, f, indent=4)
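A slightly more defensive variant of the edit above, as a sketch only: it keeps a backup of config.json and touches only the top-level key. The helper name `disable_tied_embeddings` and the `.bak` suffix are illustrative assumptions, not part of transformers or ms-swift. Whether a nested sub-config in this checkpoint also carries the flag is not covered here.

```python
import json
import shutil
from pathlib import Path


def disable_tied_embeddings(config_path):
    """Set top-level tie_word_embeddings to False; return True if the file changed.

    Hypothetical helper for illustration only.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    if config.get("tie_word_embeddings") is False:
        return False  # already set, nothing to do

    # Keep a copy of the original so the change is easy to revert.
    shutil.copy2(path, path.with_name(path.name + ".bak"))

    config["tie_word_embeddings"] = False
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return True
```

Calling `disable_tied_embeddings("config.json")` from inside the model directory would apply the change once and leave `config.json.bak` next to it; a second call returns False and writes nothing.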

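Alternatively, you can override the setting in code when loading the model, without editing the file on disk. Note that from_pretrained expects a configuration object for its config argument, not a plain dict.

from transformers import AutoConfig, AutoModelForCausalLM

# Load the configuration and override the flag before instantiating the model
config = AutoConfig.from_pretrained('Qwen3.5-27B')
config.tie_word_embeddings = False

model = AutoModelForCausalLM.from_pretrained(
    'Qwen3.5-27B',
    config=config
)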

Verification

Run the swift sft command again with the updated configuration.
Check if the warning is still emitted.
If the warning is silenced, verify that the model is running correctly by checking the output logs.
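The check in the steps above could be automated roughly as follows. This is a minimal sketch that assumes the log was redirected to output_lora.log as in the reproduction command; the function names and the marker substring (taken from the warning text quoted in the issue) are illustrative choices.

```python
from pathlib import Path

# Substring taken from the warning text quoted in the issue.
WARNING_MARKER = "so we will NOT tie them"


def count_tie_warnings(log_text, marker=WARNING_MARKER):
    """Return how many lines of log_text contain the marker."""
    return sum(1 for line in log_text.splitlines() if marker in line)


def count_tie_warnings_in_file(log_path):
    """Read a log file (e.g. output_lora.log) and count the warning lines."""
    text = Path(log_path).read_text(encoding="utf-8", errors="replace")
    return count_tie_warnings(text)
```

A count of zero from `count_tie_warnings_in_file("output_lora.log")` after rerunning would suggest the warning is gone; comparing counts before and after the config change shows whether the edit had any effect.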

Extra Tips

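Edit the config.json that the training run actually loads (the model directory or cache path that --model resolves to); changing a different copy has no effect. Keep the file valid JSON.
If you are using a custom model, you may need to update the configuration file manually.
You can also update the configuration programmatically: load it with the transformers library's AutoConfig class, set tie_word_embeddings to False, and write it back with save_pretrained.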
