transformers - 💡(How to fix) Fix [v5] Issues with tied weights on translation models in v5 [2 comments, 2 participants]

transformers · 2026-03-25 20:28:38

GitHub stats

huggingface/transformers#45005 • Fetched 2026-04-08 01:30:52

Author: orthorhombic
Participants: Cyrilvallez, orthorhombic
Timeline (top): subscribed ×3, commented ×2, mentioned ×2, renamed ×2

System Info

Not working:

transformers version: 5.3.0.dev0
Platform: Linux-6.8.0-101-generic-x86_64-with-glibc2.39
Python version: 3.14.2
Huggingface_hub version: 1.8.0
Safetensors version: 0.7.0
Accelerate version: not installed
Accelerate config: not found
DeepSpeed version: not installed
PyTorch version (accelerator?): 2.11.0+cu130 (CUDA)
Using distributed or parallel set-up in script?: no
Using GPU in script?: yes
GPU type: NVIDIA RTX 2000 Ada Generation Laptop GPU

Working:

transformers version: 4.57.6
Platform: Linux-6.8.0-101-generic-x86_64-with-glibc2.39
Python version: 3.14.2
Huggingface_hub version: 0.36.2
Safetensors version: 0.7.0
Accelerate version: not installed
Accelerate config: not found
DeepSpeed version: not installed
PyTorch version (accelerator?): 2.11.0+cu130 (CUDA)
Tensorflow version (GPU?): not installed (NA)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Using distributed or parallel set-up in script?: no
Using GPU in script?: yes
GPU type: NVIDIA RTX 2000 Ada Generation Laptop GPU

Note: also reproduced on different systems and GPUs. The result is the same when running on CPU.

Who can help?

@ArthurZucker @Cyrilvallez

I suspect this could be related to some of what was going on in #44466.

I'm seeing some models misbehave when loaded with v5, although they work fine with 4.57.6. My gut feeling is that something is going on with the tied weights or with how the weights are being loaded, but I'm not familiar enough with the internals to track it down. For some models I get nonsensical output when translating with transformers v5, on either GPU or CPU, while everything works fine in 4.57.6. Setting tie_word_embeddings=False breaks the working models and doesn't change the output of the already broken model.

In the output below, you can see 5.3.0 raises many warnings, but most of those seem to be resolved in the latest commit.
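One way to test the tied-weights suspicion is to inspect the loaded model directly and see which of the pairs named in the warnings actually share storage. The following is a minimal sketch, not part of the original report: `report_tied_pairs` is a hypothetical helper, and the pair list is copied from the parameter names in the v5 warnings quoted further down.

```python
import torch

# Parameter pairs named in the v5 "tied weights mapping" warnings.
TIED_PAIRS = [
    ("model.shared.weight", "lm_head.weight"),
    ("model.shared.weight", "model.encoder.embed_tokens.weight"),
    ("model.shared.weight", "model.decoder.embed_tokens.weight"),
]


def report_tied_pairs(model: torch.nn.Module, pairs=TIED_PAIRS):
    """Return (name_a, name_b, status) for each pair of parameter names."""
    # remove_duplicate=False keeps every name, including aliases of one shared tensor.
    params = dict(model.named_parameters(remove_duplicate=False))
    rows = []
    for a, b in pairs:
        if a not in params or b not in params:
            rows.append((a, b, "missing"))
        elif params[a].data_ptr() == params[b].data_ptr():
            rows.append((a, b, "tied"))  # same underlying storage
        elif torch.equal(params[a], params[b]):
            rows.append((a, b, "equal-but-separate"))  # same values, two copies
        else:
            rows.append((a, b, "different"))
    return rows
```

With a loaded model, `for row in report_tied_pairs(model): print(row)` lists one status per pair. Comparing the rows between 4.57.6 and v5 for the affected checkpoint should show whether the two versions end up with different tying.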

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

Code to reproduce:


# %%
import torch
from transformers import MarianMTModel, MarianTokenizer

translations = [
    {
        "model": "Helsinki-NLP/opus-mt-fr-en",
        "input": "Bonjour",
        "expected": "Hello",
    },
    {
        "model": "Helsinki-NLP/opus-mt-es-en",
        "input": "Hola",
        "expected": "Hello",
    },
    {
        "model": "Helsinki-NLP/opus-mt-tc-big-cat_oci_spa-en",
        "input": "Hola",
        "expected": "Hello",
    },
]

def translate():

    print(f"Torch version: {torch.__version__}")
    print(f"CUDA Detected: {torch.version.cuda}")
    print(f"CUDA Available: {torch.cuda.is_available()}")

    for item in translations:
        model_name = item["model"]
        language_input = item["input"]
        expected_output = item["expected"]

        device = "cuda:0" if torch.cuda.is_available() else "cpu"
        tokenizer = MarianTokenizer.from_pretrained(model_name)
        model = MarianMTModel.from_pretrained(model_name)

        inputs = tokenizer(language_input, return_tensors="pt", padding=True).to(device)
        model = model.to(device)
        outputs = model.generate(**inputs)
        output = tokenizer.decode(outputs[0], skip_special_tokens=True)

        print(f"Using model {model_name}")
        print(f"Input '{language_input}'")
        print(f"Expected '{expected_output}'")
        print(f"Translated to '{output}'\n")

translate()

Results with 4.57.6:

Torch version: 2.11.0+cu130
CUDA Detected: 13.0
CUDA Available: True
Using model Helsinki-NLP/opus-mt-fr-en
Input 'Bonjour'
Expected 'Hello'
Translated to 'Hello.'

Using model Helsinki-NLP/opus-mt-es-en
Input 'Hola'
Expected 'Hello'
Translated to 'Hello.'

Using model Helsinki-NLP/opus-mt-tc-big-cat_oci_spa-en
Input 'Hola'
Expected 'Hello'
Translated to 'Hello there.'

Results with 5.3.0:

Torch version: 2.11.0+cu130
CUDA Detected: 13.0
CUDA Available: True
Loading weights: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 256/256 [00:00<00:00, 10362.90it/s]
Using model Helsinki-NLP/opus-mt-fr-en
Input 'Bonjour'
Expected 'Hello'
Translated to 'Hello.'

tokenizer_config.json: 100%|██████████| 44.0/44.0 [00:00<00:00, 319kB/s]
Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
source.spm: 100%|██████████| 826k/826k [00:00<00:00, 7.20MB/s]
target.spm: 100%|██████████| 802k/802k [00:00<00:00, 46.0MB/s]
vocab.json: 1.59MB [00:00, 134MB/s]
config.json: 1.44kB [00:00, 7.95MB/s]
pytorch_model.bin: 100%|██████████| 312M/312M [00:10<00:00, 30.0MB/s]
Loading weights: 100%|██████████| 258/258 [00:00<00:00, 33026.02it/s]
model.safetensors:   0%|          | 0.00/312M [00:00<?, ?B/s]
The tied weights mapping and config for this model specifies to tie model.shared.weight to model.decoder.embed_tokens.weight, but both are present in the checkpoints, so we will NOT tie them. You should update the config with `tie_word_embeddings=False` to silence this warning
The tied weights mapping and config for this model specifies to tie model.shared.weight to model.encoder.embed_tokens.weight, but both are present in the checkpoints, so we will NOT tie them. You should update the config with `tie_word_embeddings=False` to silence this warning
generation_config.json: 100%|██████████| 293/293 [00:00<00:00, 2.68MB/s]
Using model Helsinki-NLP/opus-mt-es-en
Input 'Hola'
Expected 'Hello'
Translated to 'Hello.'

Loading weights: 100%|██████████| 257/257 [00:00<00:00, 20577.98it/s]
model.safetensors:   0%|          | 0.00/312M [00:02<?, ?B/s]
The tied weights mapping and config for this model specifies to tie model.shared.weight to lm_head.weight, but both are present in the checkpoints, so we will NOT tie them. You should update the config with `tie_word_embeddings=False` to silence this warning
The tied weights mapping and config for this model specifies to tie model.shared.weight to model.decoder.embed_tokens.weight, but both are present in the checkpoints, so we will NOT tie them. You should update the config with `tie_word_embeddings=False` to silence this warning
The tied weights mapping and config for this model specifies to tie model.shared.weight to model.encoder.embed_tokens.weight, but both are present in the checkpoints, so we will NOT tie them. You should update the config with `tie_word_embeddings=False` to silence this warning
model.safetensors:   0%|          | 0.00/312M [00:04<?, ?B/s]
Using model Helsinki-NLP/opus-mt-tc-big-cat_oci_spa-en
Input 'Hola'
Expected 'Hello'
Translated to '[                                          Saul Peruvian Woo- Nu Buy- Lac- fate  (  (--- (- (- (  --  - ( -- ( --- (  (  (  ( --------- formulations ( formulations ( formulations ( ( ( (  (  (  (  (  (  (  (  (  (  (  ( Johannesburg  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  ( (  ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( (  ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( (  ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( (  ( ( ( ( ( ( ( ( ( ( ( (  ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( (  ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( (  ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ('

Results with latest git+https://github.com/huggingface/transformers.git@35b005bba4de1d4b3c3789451adb5cf7469b1522 :

Using model Helsinki-NLP/opus-mt-fr-en
Input 'Bonjour'
Expected 'Hello'
Translated to 'Hello.'

Loading weights: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 258/258 [00:00<00:00, 30717.04it/s]
Using model Helsinki-NLP/opus-mt-es-en
Input 'Hola'
Expected 'Hello'
Translated to 'Hello.'

Loading weights: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 257/257 [00:00<00:00, 30016.04it/s]
The tied weights mapping and config for this model specifies to tie model.shared.weight to lm_head.weight, but both are present in the checkpoints with different values, so we will NOT tie them. You should update the config with `tie_word_embeddings=False` to silence this warning.
Using model Helsinki-NLP/opus-mt-tc-big-cat_oci_spa-en
Input 'Hola'
Expected 'Hello'
Translated to '[                                          Saul Peruvian Woo- Nu Buy- Lac- fate  (  (--- (- (- (  --  - ( -- ( --- (  (  (  ( --------- formulations ( formulations ( formulations ( ( ( (  (  (  (  (  (  (  (  (  (  (  ( Johannesburg  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  (  ( (  ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( (  ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( (  ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( (  ( ( ( ( ( ( ( ( ( ( ( (  ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( (  ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( (  ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ('

Expected behavior

It's expected that all three models load and output a meaningful translation without errors. See the results when run with 4.57.6 (reproduced from above)

Results with 4.57.6:

Using model Helsinki-NLP/opus-mt-fr-en
Input 'Bonjour'
Expected 'Hello'
Translated to 'Hello.'

Using model Helsinki-NLP/opus-mt-es-en
Input 'Hola'
Expected 'Hello'
Translated to 'Hello.'

Using model Helsinki-NLP/opus-mt-tc-big-cat_oci_spa-en
Input 'Hola'
Expected 'Hello'
Translated to 'Hello there.'

extent analysis

Fix Plan

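The v5 warnings recommend setting tie_word_embeddings=False in the model configuration. That silences the warning, but the report states that it is not a fix: with the override, the models that were working break, and the output of Helsinki-NLP/opus-mt-tc-big-cat_oci_spa-en does not change. Treat the steps below as a way to reproduce that observation, not as a resolution.

Pass tie_word_embeddings=False when loading the model so that the weights are not tied.
Compare the translations with and without the override to see how the weights were loaded.

The only configuration the report shows working for all three models is transformers 4.57.6.

Here's an example of how to pass the override:

import torch
from transformers import MarianMTModel, MarianTokenizer

# Same models and inputs as the reproduction script.
translations = [
    {"model": "Helsinki-NLP/opus-mt-fr-en", "input": "Bonjour", "expected": "Hello"},
    {"model": "Helsinki-NLP/opus-mt-es-en", "input": "Hola", "expected": "Hello"},
    {"model": "Helsinki-NLP/opus-mt-tc-big-cat_oci_spa-en", "input": "Hola", "expected": "Hello"},
]

device = "cuda:0" if torch.cuda.is_available() else "cpu"

for item in translations:
    model_name = item["model"]
    language_input = item["input"]
    expected_output = item["expected"]

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    # Config override suggested by the warning; per the report it does not repair the output.
    model = MarianMTModel.from_pretrained(model_name, tie_word_embeddings=False)

    inputs = tokenizer(language_input, return_tensors="pt", padding=True).to(device)
    model = model.to(device)
    outputs = model.generate(**inputs)
    output = tokenizer.decode(outputs[0], skip_special_tokens=True)

    print(f"{model_name}: expected '{expected_output}', got '{output}'")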

Verification

To verify a change, run the code and compare each translation with the 4.57.6 results: 'Hello.' for the fr-en and es-en models and 'Hello there.' for opus-mt-tc-big-cat_oci_spa-en. A long run of repeated '(' tokens means the problem is still present.
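The script's expected strings ('Hello') differ slightly from the good outputs ('Hello.', 'Hello there.'), so an exact comparison would flag working models as failures. A minimal sketch of a more tolerant check follows. It assumes that a case-insensitive prefix match is close enough and that heavy token repetition signals broken output. The helper names and the 0.5 threshold are illustrative choices, not part of the original script.

```python
from collections import Counter


def matches_expected(output: str, expected: str) -> bool:
    """True if the output starts with the expected text, ignoring case and outer whitespace."""
    return output.strip().lower().startswith(expected.strip().lower())


def looks_degenerate(output: str, max_share: float = 0.5, min_tokens: int = 10) -> bool:
    """True if one whitespace-separated token makes up more than max_share of a long output."""
    tokens = output.split()
    if len(tokens) < min_tokens:
        return False
    most_common_count = Counter(tokens).most_common(1)[0][1]
    return most_common_count / len(tokens) > max_share


def check(output: str, expected: str) -> str:
    """Classify one translation as 'ok', 'mismatch', or 'degenerate'."""
    if looks_degenerate(output):
        return "degenerate"
    return "ok" if matches_expected(output, expected) else "mismatch"
```

Calling `check(output, expected_output)` inside the translation loop gives one label per model, which makes it easier to compare runs across library versions.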

Extra Tips

Check whether the checkpoint stores the tied tensors separately: on the latest commit, the warning for the affected model says model.shared.weight and lm_head.weight are both present in the checkpoint with different values.
Updating the library may not help yet: the report reproduces the broken output on commit 35b005bba4de1d4b3c3789451adb5cf7469b1522, while transformers 4.57.6 translates all three models correctly.
Be cautious when changing model configurations, as this may affect the model's behavior. Always test and verify the results after making changes.
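Since the report shows 4.57.6 working and both 5.3.0.dev0 and a later commit failing for one model, a script can flag the installed version before translating. This is a minimal sketch using only the standard library. It assumes that any 5.x install may be affected, which is an extrapolation from the two builds tested in the report, and `describe_install` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

KNOWN_GOOD = "4.57.6"  # version the report lists as working


def installed_transformers_version() -> Optional[str]:
    """Return the installed transformers version string, or None if it is not installed."""
    try:
        return version("transformers")
    except PackageNotFoundError:
        return None


def describe_install(installed: Optional[str]) -> str:
    """Classify a version string against the versions named in the report."""
    if installed is None:
        return "transformers is not installed"
    major = int(installed.split(".")[0])
    if major >= 5:
        # Assumption: all 5.x builds may be affected; the report only tested two.
        return f"{installed}: possibly affected, {KNOWN_GOOD} is reported working"
    return f"{installed}: pre-5 series, {KNOWN_GOOD} is reported working"


print(describe_install(installed_transformers_version()))
```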
