transformers - (How to fix) When LoRA fine-tuning Qwen3.5-27B with ms-swift, each layer emits the warning: You should update the config with `tie_word_embeddings=False` to silence this warning [4 comments, 4 participants]

GitHub stats
huggingface/transformers#44368 · Fetched 2026-04-08 00:28:57
Comments: 4 · Participants: 4 · Timeline: 6 · Reactions: 0
Timeline (top): commented ×4, labeled ×1, renamed ×1

Fix Action

Fix / Workaround

The tied weights mapping and config for this model specifies to tie model.visual.patch_embed.proj.weight to model.language_model.norm.weight, but both are present in the checkpoints, so we will NOT tie them. You should update the config with tie_word_embeddings=False to silence this warning


System Info

  • transformers==5.2.0
  • torch==2.8.0
  • deepspeed==0.18.6
  • python==3.10
  • ms-swift==4.0.0.dev0

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

4 * 30GiB

PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' \
NPROC_PER_NODE=2 \
MAX_PIXELS=1003520 \
VIDEO_MAX_PIXELS=50176 \
FPS_MAX_FRAMES=12 \
CUDA_VISIBLE_DEVICES=0,1 \
swift sft \
    --model Qwen3.5-27B \
    --tuner_type lora \
    --dataset alpaca-gpt4-data-zh \
    --load_from_cache_file true \
    --add_non_thinking_prefix true \
    --split_dataset_ratio 0.01 \
    --torch_dtype bfloat16 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --learning_rate 1e-4 \
    --lora_rank 8 \
    --lora_alpha 16 \
    --target_modules all-linear \
    --gradient_accumulation_steps 1 \
    --output_dir output \
    --report_to tensorboard \
    --eval_steps 50 \
    --save_steps 50 \
    --save_total_limit 2 \
    --logging_steps 1 \
    --max_length 2048 \
    --warmup_ratio 0.05 \
    --dataloader_num_workers 4 \
    --deepspeed zero3 \
    > output_lora.log 2>&1

Expected behavior

Observed: a warning is emitted for every layer:

The tied weights mapping and config for this model specifies to tie model.visual.patch_embed.proj.weight to model.language_model.norm.weight, but both are present in the checkpoints, so we will NOT tie them. You should update the config with tie_word_embeddings=False to silence this warning

Extent analysis

Fix Plan

Update Model Configuration

To silence the warning, set tie_word_embeddings=False in the model configuration, as the warning message itself suggests.
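Before changing the configuration, it may help to confirm the warning's claim that both tensors are stored in the checkpoint. The sketch below assumes a sharded safetensors checkpoint with a model.safetensors.index.json file in the model directory; the directory name and the helper name find_checkpoint_keys are hypothetical, and the on-disk tensor names may differ from the runtime names quoted in the warning.

```python
import json
from pathlib import Path

# Tensor names copied verbatim from the warning message.
TIED_PAIR = (
    "model.visual.patch_embed.proj.weight",
    "model.language_model.norm.weight",
)


def find_checkpoint_keys(index_path, names=TIED_PAIR):
    """Return {tensor name: shard file or None} using the index's weight_map."""
    index = json.loads(Path(index_path).read_text(encoding="utf-8"))
    weight_map = index.get("weight_map", {})
    return {name: weight_map.get(name) for name in names}


if __name__ == "__main__":
    # Hypothetical location; point this at your local checkpoint directory.
    path = Path("Qwen3.5-27B") / "model.safetensors.index.json"
    if path.exists():
        for name, shard in find_checkpoint_keys(path).items():
            print(f"{name}: {shard or 'not found under this exact name'}")
```

If both names map to a shard file, the checkpoint stores separate copies of the two tensors, which is consistent with the loader declining to tie them.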

Code Changes

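import json

# Path to the config.json inside the local model (checkpoint) directory
config_path = 'config.json'

# Load the model configuration
with open(config_path) as f:
    config = json.load(f)

# tie_word_embeddings is a top-level key in a Hugging Face config.json;
# there is no enclosing 'model' key
config['tie_word_embeddings'] = False

# Save the updated configuration
with open(config_path, 'w') as f:
    json.dump(config, f, indent=4)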

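Alternatively, if you load the model in your own script, you can override tie_word_embeddings on the configuration and pass that configuration when creating the model. The config argument expects a configuration object, not a plain dict.

from transformers import AutoConfig, AutoModelForCausalLM

# Use a local checkpoint directory or a full Hub repo id here
model_path = 'Qwen3.5-27B'

# Keyword arguments that match config attributes override the loaded values
config = AutoConfig.from_pretrained(model_path, tie_word_embeddings=False)

model = AutoModelForCausalLM.from_pretrained(model_path, config=config)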

Verification

  1. Run the swift sft command again with the updated configuration.
  2. Check if the warning is still emitted.
  3. If the warning is silenced, verify that the model is running correctly by checking the output logs.
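The check in step 2 can be automated with a short script. This is a minimal sketch, assuming the training output was redirected to output_lora.log as in the reproduction command; the helper name count_tie_warnings is hypothetical.

```python
from pathlib import Path

# Fragment copied from the warning text in the issue.
WARNING_FRAGMENT = "You should update the config with tie_word_embeddings=False"


def count_tie_warnings(log_text):
    """Count occurrences of the warning, ignoring backticks around the setting."""
    return log_text.replace("`", "").count(WARNING_FRAGMENT)


if __name__ == "__main__":
    log_path = Path("output_lora.log")  # log file named in the reproduction command
    if log_path.exists():
        text = log_path.read_text(encoding="utf-8", errors="replace")
        print(f"tie_word_embeddings warnings found: {count_tie_warnings(text)}")
```

A count of zero after the rerun suggests the configuration change took effect; a nonzero count suggests the edited config.json is not the one being loaded.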

Extra Tips

  • Back up config.json before editing it, and confirm the file is still valid JSON afterwards.
  • If you are using a custom model, you may need to update its configuration file manually.
  • You can also update the configuration with the transformers library's AutoConfig class: load it, set tie_word_embeddings, and write it back with save_pretrained.
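One cautious way to follow the first tip is to keep a backup and let the JSON parser validate the file before anything is written. The sketch below is illustrative only; the helper name set_tie_word_embeddings and the .bak suffix are assumptions, not part of transformers or ms-swift.

```python
import json
import shutil
from pathlib import Path


def set_tie_word_embeddings(config_path, value=False):
    """Back up config.json, then set the top-level tie_word_embeddings key."""
    config_path = Path(config_path)
    backup_path = config_path.with_name(config_path.name + ".bak")
    # json.loads raises an error here if the existing file is not valid JSON.
    config = json.loads(config_path.read_text(encoding="utf-8"))
    if not backup_path.exists():
        shutil.copy2(config_path, backup_path)  # keep only the first original
    config["tie_word_embeddings"] = value
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return backup_path
```

Calling set_tie_word_embeddings('config.json') leaves the original at config.json.bak, so the change can be reverted by copying the backup over the edited file.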


