transformers - ✅ (Solved) Fix [Bug] Model collapse after merging LoRA with extended vocabulary on models with tie_word_embeddings=True (e.g., Qwen2.5 0.5B) [4 pull requests, 1 comment, 2 participants]

GitHub stats
huggingface/transformers#45127 · Fetched 2026-04-08 01:52:41
Comments: 1 · Participants: 2 · Timeline: 6 · Reactions: 1
Timeline (top): cross-referenced ×4, commented ×1, labeled ×1

Root Cause

Description: Start from a base model with tied embeddings (config.tie_word_embeddings = True, such as Qwen), extend its vocabulary (e.g., with audio or special tokens), train a LoRA adapter, and merge it with peft_model.merge_and_unload(). After saving and reloading, the merged model produces completely degraded output, repeating the same token indefinitely.

The cause is a mismatch between the weights and the config. peft correctly merges the now-separate embed_tokens and lm_head weights in memory, but the saved config still has tie_word_embeddings = True. When the merged model is reloaded via AutoModelForCausalLM.from_pretrained(), the tied config causes the lm_head weights to be overwritten by the embed_tokens weights (or vice versa), destroying the newly trained weights for the extended vocabulary.
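The failure mode can be mimicked in a few lines of plain Python. This is a toy model of the behaviour the issue describes, not the transformers loader: load_checkpoint and the dictionary layout are hypothetical stand-ins, and nested lists stand in for weight tensors.

```python
import copy


def load_checkpoint(state, tie_word_embeddings):
    """Toy loader: when the flag is set, lm_head is re-pointed at embed_tokens."""
    loaded = copy.deepcopy(state)
    if tie_word_embeddings:
        # Re-tying discards whatever was stored under lm_head.
        loaded["lm_head"] = loaded["embed_tokens"]
    return loaded


# After merging, the two matrices hold different values for the new token row.
saved = {
    "embed_tokens": [[0.1, 0.2], [0.3, 0.4]],
    "lm_head": [[0.1, 0.2], [0.9, 0.8]],
}

tied = load_checkpoint(saved, tie_word_embeddings=True)
untied = load_checkpoint(saved, tie_word_embeddings=False)

print(tied["lm_head"] == saved["lm_head"])    # False: the trained head is lost
print(untied["lm_head"] == saved["lm_head"])  # True: the trained head is kept
```

With the flag left at True, the trained output row is silently replaced by the input embedding row, which matches the collapse reported after reload.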

Fix Action

Fix / Workaround

Workaround / Proposed Fix: Manually setting tie_word_embeddings = False before saving the merged model completely resolves the issue, as it forces transformers to save and load both layers independently:

# WORKAROUND: Untie embeddings before saving

merged_model.config.tie_word_embeddings = False
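Since the problem only shows up after a reload, it can help to confirm what was actually written to disk. The sketch below reads the flag back from config.json in the save directory; saved_tie_flag is a hypothetical helper name, and the demo uses a temporary stand-in directory so it runs without a real checkpoint.

```python
import json
import tempfile
from pathlib import Path


def saved_tie_flag(save_dir):
    """Return the tie_word_embeddings value stored in <save_dir>/config.json."""
    config = json.loads((Path(save_dir) / "config.json").read_text())
    # A missing key is reported as None rather than guessing a default.
    return config.get("tie_word_embeddings")


# Demo on a stand-in directory (not a real model checkpoint).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text(json.dumps({"tie_word_embeddings": False}))
    flag = saved_tie_flag(tmp)

print(flag)  # False
```

On a real run, calling saved_tie_flag("./merged_model_fixed") after save_pretrained() should return False if the workaround took effect.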

PR fix notes

PR #45135: Fix model saving corruption for dynamically untied embeddings

Description (problem / solution / changelog)

What does this PR do?

Fixes an issue where PEFT adapters applied independently to tied embeddings (embed_tokens and lm_head) cause silent model corruption upon reloading via AutoModelForCausalLM.from_pretrained().

Root Cause: When embeddings are untied dynamically during runtime (e.g., vocabulary resizing and independent PEFT merging), their tensor memory storage diverges. PreTrainedModel.save_pretrained() correctly saves both parameter tensors because remove_tied_weights_from_state_dict() sees they don't share identical storage. However, the model configuration saves tie_word_embeddings = True. Upon reloading, from_pretrained() sees tie_word_embeddings=True and aggressively re-ties the two embeddings by overwriting one parameter with the other, effectively destroying the independent delta weights.

Fix: Included a check in save_pretrained(): if config.tie_word_embeddings is True but input_embeddings.weight.data_ptr() != output_embeddings.weight.data_ptr(), it automatically flips model.config.tie_word_embeddings to False before saving the configuration mapping, preventing silent destruction on loading.
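The check described above can be sketched as a standalone helper. This is an illustration of the idea, not the PR's actual diff: untie_config_if_diverged is a hypothetical name, and the fake_layer / fake_model stand-ins mimic only the attributes the helper touches so that the snippet runs without torch or a downloaded model.

```python
from types import SimpleNamespace


def untie_config_if_diverged(model):
    """Set config.tie_word_embeddings to False when the two embedding weights
    no longer share storage. Returns True if the flag was changed."""
    if not getattr(model.config, "tie_word_embeddings", False):
        return False
    inp, out = model.get_input_embeddings(), model.get_output_embeddings()
    if inp is None or out is None:
        return False
    if inp.weight.data_ptr() == out.weight.data_ptr():
        return False  # still genuinely tied
    model.config.tie_word_embeddings = False
    return True


# Stand-ins that mimic only the attributes used above (not real torch modules).
def fake_layer(ptr):
    return SimpleNamespace(weight=SimpleNamespace(data_ptr=lambda: ptr))


def fake_model(in_ptr, out_ptr):
    return SimpleNamespace(
        config=SimpleNamespace(tie_word_embeddings=True),
        get_input_embeddings=lambda: fake_layer(in_ptr),
        get_output_embeddings=lambda: fake_layer(out_ptr),
    )


tied, diverged = fake_model(100, 100), fake_model(100, 200)
print(untie_config_if_diverged(tied), tied.config.tie_word_embeddings)          # False True
print(untie_config_if_diverged(diverged), diverged.config.tie_word_embeddings)  # True False
```

On a real model the helper would be called just before save_pretrained(), so the config written to disk matches the in-memory weights.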

Fixes #45127

Code Agent Policy

The Transformers repo is currently being overwhelmed by a large number of PRs and issue comments written by code agents. We are currently bottlenecked by our ability to review and respond to them. As a result, we ask that new users do not submit pure code agent PRs at this time. You may use code agents in drafting or to help you diagnose issues. We'd also ask autonomous "OpenClaw"-like agents not to open any PRs or issues for the moment.

PRs that appear to be fully agent-written will probably be closed without review, and we may block users who do this repeatedly or maliciously.

This is a rapidly-evolving situation that's causing significant shockwaves in the open-source community. As a result, this policy is likely to be updated regularly in the near future. For more information, please read CONTRIBUTING.md.

  • I confirm that this is not a pure code agent PR.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests? (Note: A minimal reproduction script is provided below to easily test and validate the fix, as it requires mocking PEFT adaptation)

Reproduction Script

<details><summary>Click to view minimal repro (repro.py)</summary>
import torch
from transformers import AutoModelForCausalLM, AutoConfig

def main():
    # Make a tiny dummy model with tied embeddings
    config = AutoConfig.from_pretrained("Qwen/Qwen1.5-0.5B", trust_remote_code=True)
    config.hidden_size = 32
    config.intermediate_size = 64
    config.num_hidden_layers = 2
    config.num_attention_heads = 4
    config.num_key_value_heads = 4
    config.vocab_size = 1000
    config.tie_word_embeddings = True
    
    # Needs to be small to fit in memory
    model = AutoModelForCausalLM.from_config(config)
    print("Initial tie config:", model.config.tie_word_embeddings)
    print("Are weights tied initially?", id(model.get_input_embeddings().weight) == id(model.get_output_embeddings().weight))
    
    # Suppose PEFT does something and they are no longer tied
    model.get_output_embeddings().weight = torch.nn.Parameter(model.get_output_embeddings().weight.clone())
    model.get_output_embeddings().weight.data += 1.0 # Modify to make them different
    
    print("Are weights tied after fake PEFT?", id(model.get_input_embeddings().weight) == id(model.get_output_embeddings().weight))
    
    model.save_pretrained("./test_tied_model")
    
    # Reload
    model_reloaded = AutoModelForCausalLM.from_pretrained("./test_tied_model")
    print("Reloaded tie config:", model_reloaded.config.tie_word_embeddings)
    print("Are weights tied after reload?", id(model_reloaded.get_input_embeddings().weight) == id(model_reloaded.get_output_embeddings().weight))
    
    # Check if the output embeddings weight is equal to input embeddings
    print("Is the modified output weight preserved?", not torch.allclose(model_reloaded.get_output_embeddings().weight, model_reloaded.get_input_embeddings().weight))


if __name__ == "__main__":
    main()
</details>

Who can review?

@BenjaminBossan @githubnemo


Changed files

  • src/transformers/modeling_utils.py (modified, +44/-0)

PR #45136: Fix #45127: Auto-fix diverged tie_word_embeddings config on save to prevent silent weight corruption

Description (problem / solution / changelog)

What does this PR do?

This PR fixes a bug in PreTrainedModel.save_pretrained() where config.tie_word_embeddings can be inconsistent with the actual weight state, leading to silent model corruption for downstream consumers.

Problem

After PEFT merge_and_unload() (typical scenario: Qwen, Llama, Mistral, etc.), embed_tokens and lm_head weights are separated in memory with different values, but config.tie_word_embeddings remains True. Currently, save_pretrained() performs no validation of tie_word_embeddings against the actual weight state — the incorrect config is written to config.json as-is.

This causes two issues:

  1. Reloading via from_pretrained: The load-side safety check (modeling_utils.py:2535-2547) detects the inconsistency and refuses to tie, emitting a warning each time — but the config is still semantically wrong.
  2. Downstream tool consumption (GGUF converters, quantization scripts, etc.): These tools trust tie_word_embeddings: true in config.json directly, potentially causing silent weight corruption — one tensor overwrites the other, producing completely degraded outputs.

Fix

In save_pretrained(), before writing the config to disk, we detect whether the input/output embeddings have diverged. If so, we automatically set config.tie_word_embeddings = False and emit a warning.

Key safety considerations:

  • Only triggers when the output embedding key (e.g., lm_head.weight) is explicitly declared in the model's _tied_weights_keys mapping as tied to the input embedding. This prevents false positives on models like Pop2Piano, which uses tie_word_embeddings=True for decoder output scaling but does not declare lm_head.weight in its _tied_weights_keys (it only ties encoder.embed_tokens and decoder.embed_tokens to shared).
  • Cross-device scenarios (model parallelism / offloading) are skipped entirely to avoid false positives and potential OOM from implicit device copies.
  • T5 family safety analysis:
    • T5: Scaling is decoupled to an independent scale_decoder_outputs field (configuration_t5.py:82-83) and tie_word_embeddings is forced to True. Not affected.
    • UMT5: Config init forcibly overrides tie_word_embeddings = True — even if saved as False, it's restored on load. Not affected.
    • LongT5, Pop2Piano, SwitchTransformers: Still read tie_word_embeddings for scaling in forward, but these are guarded by the _tied_weights_keys check described above.
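The safety considerations above amount to a short decision procedure. The sketch below restates them as a pure function; should_auto_untie and its parameters are hypothetical, and the key names in the two example mappings are illustrative, chosen only to mirror the Llama-style and Pop2Piano-style cases the PR describes.

```python
def should_auto_untie(tie_flag, tied_weights_keys, output_key, input_key,
                      input_device, output_device, same_storage):
    """Decide whether a save-time auto-fix should flip tie_word_embeddings."""
    if not tie_flag:
        return False  # config already says untied: nothing to fix
    if (tied_weights_keys or {}).get(output_key) != input_key:
        return False  # output head is not declared as tied to the input embedding
    if input_device != output_device:
        return False  # cross-device: skipped entirely
    return not same_storage  # flip only when the weights have diverged


# Illustrative mappings (assumed key names, not copied from the library).
llama_like = {"lm_head.weight": "model.embed_tokens.weight"}
pop2piano_like = {
    "encoder.embed_tokens.weight": "shared.weight",
    "decoder.embed_tokens.weight": "shared.weight",
}

diverged_llama = should_auto_untie(
    True, llama_like, "lm_head.weight", "model.embed_tokens.weight",
    "cpu", "cpu", same_storage=False)
undeclared_head = should_auto_untie(
    True, pop2piano_like, "lm_head.weight", "shared.weight",
    "cpu", "cpu", same_storage=False)
cross_device = should_auto_untie(
    True, llama_like, "lm_head.weight", "model.embed_tokens.weight",
    "cuda:0", "cpu", same_storage=False)

print(diverged_llama, undeclared_head, cross_device)  # True False False
```

Only the first case triggers the fix: the head is declared as tied, both weights sit on the same device, and their storage has diverged.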

Changes

modeling_utils.py:

  • Added weight divergence detection + auto-fix logic before config saving in save_pretrained()
  • _tied_weights_keys guard to only auto-fix when the output embedding is declared as tied to the input embedding
  • Cross-device: skip check (avoid false positives and potential OOM)
  • NotImplementedError: silently ignored (expected for vision/speech backbones)
  • Other exceptions: logged via logger.debug

test_modeling_utils.py:

  • Added test_save_pretrained_auto_fixes_diverged_tied_embeddings: constructs a tied Llama model → simulates weight divergence (PEFT merge) → verifies saved config is corrected + warning is emitted + reloaded weights are preserved correctly

Fixes #45127


Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.


Changed files

  • src/transformers/modeling_utils.py (modified, +48/-0)
  • tests/utils/test_modeling_utils.py (modified, +35/-0)

PR #45156: Fix save_pretrained writing incorrect tie_word_embeddings=True config after PEFT merge

Description (problem / solution / changelog)

What does this PR do?

After PEFT's merge_and_unload(), embed_tokens and lm_head become independent tensors with different values, but config.tie_word_embeddings remains True. The load side already detects this with torch.equal in tie_weights() and skips tying, but save_pretrained() writes the incorrect config as-is. (In memory, tie_word_embeddings=True is already semantically wrong at that point; changing it to False reflects the actual state.)

Issue #45127: PEFT merge_and_unload() creates an inconsistent state (the weights have been separated, but the configuration has not been updated). Impact: downstream tools (GGUF converters, quantization scripts) trust this config directly, leading to silent weight corruption.

Tests

I wrote a simple script to reproduce the problem and tested it locally.

I ran make fix-repo and the following related tests:

  • test_save_pretrained_auto_fixes_diverged_tied_embeddings (new test)
  • test_tied_weights_are_not_tied_if_both_present_but_different (load side)
  • test_tied_weights_are_tied_if_both_present_and_similar
  • test_tied_weights_are_always_tied_from_config

The CI tests also pass.


Fixes https://github.com/huggingface/transformers/issues/45127

Code Agent Policy


  • I confirm that this is not a pure code agent PR.

I used multiple AI models (Gemini, Claude, Kimi) to cross-validate edge cases and boundary conditions — different models behave differently around tied embeddings, which made CI failures harder to predict than expected. AI helped me locate these edge cases and I verified they weren't hallucinations.

I have read CONTRIBUTING.md, and tried my best to follow the instructions therein.

Who can review?

@CyrilVallez @BenjaminBossan


Changed files

  • src/transformers/modeling_utils.py (modified, +28/-0)
  • tests/utils/test_modeling_utils.py (modified, +22/-0)

PR #3126: Fix save_pretrained writing incorrect tie_word_embeddings=True config after PEFT merge

Description (problem / solution / changelog)

After PEFT's merge_and_unload(), embed_tokens and lm_head become independent tensors with different values, but config.tie_word_embeddings remains True. The load side already detects this with torch.equal in tie_weights() and skips tying, but save_pretrained() writes the incorrect config as-is. (In memory, tie_word_embeddings=True is already semantically wrong at that point; changing it to False reflects the actual state.)

Fixes https://github.com/huggingface/transformers/issues/45127

Although this issue was raised in the Transformers community, perhaps it would be better resolved in peft? It should have no negative impact on downstream transformers.

In transformers, the load-side check (tie_weights) uses torch.equal to compare values. Here, I'm using the identity check "is not" to confirm that the two weights are structurally separate objects. (I'm not sure this is the best approach.) A test has also been added; it can be removed if unnecessary.
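The difference between the two checks can be seen without tensors. In this sketch nested lists stand in for weights, "is not" plays the role of the identity check used in this PR, and list inequality stands in for a value comparison such as not torch.equal(a, b); the variable names are illustrative only.

```python
import copy

embed = [[0.1, 0.2], [0.3, 0.4]]

same_object = embed                  # still tied: one object, two names
equal_copy = copy.deepcopy(embed)    # separated, values unchanged
changed_copy = copy.deepcopy(embed)
changed_copy[1][0] += 0.5            # separated, values diverged

results = {}
for name, head in [("same_object", same_object),
                   ("equal_copy", equal_copy),
                   ("changed_copy", changed_copy)]:
    separated = head is not embed    # identity check
    differs = head != embed          # value check
    results[name] = (separated, differs)
    print(name, separated, differs)
```

The two checks disagree only for equal_copy: an identity check treats a separated but numerically identical copy as untied, while a value check would still consider it safe to tie.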

Changed files

  • src/peft/tuners/tuners_utils.py (modified, +19/-0)
  • tests/test_decoder_models.py (modified, +16/-0)

Code Example

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "Qwen/Qwen1.5-0.5B" # Or any model with tied embeddings
new_vocab_size = 180500 # Extended for audio tokens

# 1. Load base model and resize embeddings
base_model = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="cpu")
base_model.resize_token_embeddings(new_vocab_size)

# 2. Load trained LoRA adapter
# (Assuming a LoRA was trained on the extended vocab targeting embed_tokens and lm_head)
peft_model = PeftModel.from_pretrained(base_model, "./my_lora_adapter")

# 3. Merge and Save
merged_model = peft_model.merge_and_unload()
merged_model.save_pretrained("./merged_model", safe_serialization=True)

# 4. Reload and Generate -> BUG OCCURS HERE
# The model will now output repeating garbage tokens (e.g., [135528, 135528, 135528...])
reloaded_model = AutoModelForCausalLM.from_pretrained("./merged_model", device_map="cuda")

---

merged_model = peft_model.merge_and_unload()

# WORKAROUND: Untie embeddings before saving
merged_model.config.tie_word_embeddings = False 

merged_model.save_pretrained("./merged_model_fixed", safe_serialization=True)

System Info

  • transformers: 4.56.2
  • Python: 3.11.15
  • torch: 2.11.0+cu126

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)


Expected behavior

Expected behavior: The merged model should retain the learned weights for both embed_tokens and lm_head independently for the newly added tokens, and generate text/tokens correctly as it did before saving.

Workaround / Proposed Fix: Manually setting tie_word_embeddings = False before saving the merged model completely resolves the issue, as it forces transformers to save and load both layers independently:

merged_model = peft_model.merge_and_unload()

# WORKAROUND: Untie embeddings before saving
merged_model.config.tie_word_embeddings = False 

merged_model.save_pretrained("./merged_model_fixed", safe_serialization=True)

It would be great if peft's merge_and_unload() or transformers's save_pretrained() could automatically detect if vocabulary has been resized/untied during PEFT training and automatically handle the tie_word_embeddings flag to prevent silent model corruption upon saving.

extent analysis

Fix Plan

To resolve the issue, follow these steps:

  • Before saving the merged model, manually set tie_word_embeddings to False in the model's configuration:
merged_model = peft_model.merge_and_unload()
merged_model.config.tie_word_embeddings = False
merged_model.save_pretrained("./merged_model_fixed", safe_serialization=True)
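  • Alternatively, wrap merge_and_unload() in a helper that sets tie_word_embeddings to False whenever the merged input and output embeddings no longer share storage. This is a sketch that assumes diverged storage pointers are a reliable sign the weights were untied (the check PR #45135 describes); merge_and_untie is a hypothetical helper name:
def merge_and_untie(peft_model):
    merged = peft_model.merge_and_unload()
    inp = merged.get_input_embeddings()
    out = merged.get_output_embeddings()
    # Only untie when both layers exist and their weights have separate storage
    if inp is not None and out is not None and inp.weight.data_ptr() != out.weight.data_ptr():
        merged.config.tie_word_embeddings = False
    return merged
  • For a more robust safeguard, the same check can run at save time. The sketch below is a standalone helper around PreTrainedModel.save_pretrained() (save_untied_if_diverged is a hypothetical name), not a method override:
def save_untied_if_diverged(model, save_directory, **kwargs):
    inp = model.get_input_embeddings()
    out = model.get_output_embeddings()
    diverged = inp is not None and out is not None and inp.weight.data_ptr() != out.weight.data_ptr()
    # Correct the flag before the config is written to disk
    if model.config.tie_word_embeddings and diverged:
        model.config.tie_word_embeddings = False
    model.save_pretrained(save_directory, **kwargs)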

Verification

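To verify that the fix worked, reload the saved model and generate text using the generate method (the tokenizer path and prompt are placeholders; use the tokenizer that matches your extended vocabulary):

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./my_extended_tokenizer")  # placeholder path
reloaded_model = AutoModelForCausalLM.from_pretrained("./merged_model_fixed", device_map="cuda")
input_ids = tokenizer("Hello", return_tensors="pt").input_ids.to(reloaded_model.device)
output = reloaded_model.generate(input_ids, max_length=100)
print(output)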

The output should be coherent and not repeat the same token infinitely.
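Judging "coherent" by eye does not scale to automated checks, so a crude numeric signal can help. The sketch below measures how much of a generated sequence is one repeated token id; looks_collapsed and the 0.5 threshold are arbitrary assumptions for illustration, not a standard metric.

```python
def longest_run_fraction(token_ids):
    """Fraction of the sequence taken up by its longest run of one repeated id."""
    if not token_ids:
        return 0.0
    longest = current = 1
    for prev, cur in zip(token_ids, token_ids[1:]):
        current = current + 1 if cur == prev else 1
        longest = max(longest, current)
    return longest / len(token_ids)


def looks_collapsed(token_ids, threshold=0.5):
    """Flag sequences dominated by a single repeated token (threshold is a guess)."""
    return longest_run_fraction(token_ids) >= threshold


collapsed = [135528] * 20                 # the pattern reported in the issue
healthy = [11, 42, 7, 7, 93, 5, 18, 64]   # made-up ids with little repetition

print(looks_collapsed(collapsed))  # True
print(looks_collapsed(healthy))    # False
```

With a real model, the generated ids from output[0].tolist() could be passed to looks_collapsed, ideally with the prompt tokens sliced off first.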

Extra Tips

  • When working with models that have tied embeddings, be cautious when resizing the vocabulary or training adapters, as this can lead to silent model corruption.
  • Always verify the model's behavior after saving and reloading to ensure that the changes have been applied correctly.
  • Consider adding automated tests to detect issues like this in the future.


