transformers - 💡(How to fix) Fix [Bug] FP8 save_pretrained moe [3 comments, 3 participants]

transformers · 2026-02-23 08:40:59

GitHub stats

huggingface/transformers#44222 • Fetched 2026-04-08 00:29:44

Timeline (top): commented ×3, mentioned ×2, subscribed ×2, cross-referenced ×1

Code Example

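# Reproduction script from the issue: load a Qwen3 MoE checkpoint with
# fine-grained FP8 quantization, then save the quantized model.
from transformers import AutoModelForCausalLM, FineGrainedFP8Config

model_name = "Qwen/Qwen3-30B-A3B"

# Quantize to fine-grained FP8 on load, leaving lm_head unconverted.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    quantization_config=FineGrainedFP8Config(modules_to_not_convert=['lm_head']),
)

# The issue title points at this save step.
model.save_pretrained('/root/test', safe_serialization=True, max_shard_size='5GB')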


Fix Plan

1. Update `transformers` library to the latest version

Upgrade to the latest `transformers` release; a newer version may already include a fix for this issue.

pip install --upgrade transformers
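Before and after upgrading, it can help to record which version is actually installed in the active environment, since a bug report is hard to act on without it. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not part of `transformers`.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Packages that are plausibly relevant to this issue; adjust the list as needed.
    for name in ("transformers", "torch", "safetensors"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Running this in the same interpreter that runs the reproduction script avoids upgrading one environment while testing in another.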

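2. Treat `use_cache=False` as a configuration setting, not a `save_pretrained` argument

`use_cache` is a model configuration flag that controls the key/value cache during generation. It is not a documented argument of `save_pretrained`, so passing it there is unlikely to change how the weights are serialized. If you want to try it anyway, set it on the config before saving.

model.config.use_cache = False
model.save_pretrained('/root/test', safe_serialization=True, max_shard_size='5GB')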
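Whether a keyword such as `use_cache` is a named parameter of a method can be checked by inspecting its signature. The sketch below assumes nothing about the real `save_pretrained` signature: `declares_parameter` is a hypothetical helper and `fake_save` is a stand-in used only to make the example runnable.

```python
import inspect
from typing import Callable


def declares_parameter(func: Callable, name: str) -> bool:
    """Return True only if `func` names `name` explicitly in its signature.

    A catch-all *args or **kwargs does not count: it lets the call go through
    without telling you whether the argument is used for anything.
    """
    param = inspect.signature(func).parameters.get(name)
    return param is not None and param.kind not in (
        inspect.Parameter.VAR_POSITIONAL,
        inspect.Parameter.VAR_KEYWORD,
    )


# Stand-in with a shape similar to a save method; NOT the real transformers signature.
def fake_save(save_directory, safe_serialization=True, max_shard_size="5GB", **kwargs):
    return save_directory


print(declares_parameter(fake_save, "max_shard_size"))  # True
print(declares_parameter(fake_save, "use_cache"))       # False
```

With `transformers` installed, calling `declares_parameter(model.save_pretrained, "use_cache")` applies the same check to the real method in your version.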

3. Check for Out-of-Memory (OOM) errors

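If the save fails with an out-of-memory (OOM) error, try a smaller `max_shard_size`. This splits the checkpoint into more, smaller files; whether it lowers peak memory during saving depends on the library version, so treat it as an experiment rather than a guaranteed fix.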

model.save_pretrained('/root/test', safe_serialization=True, max_shard_size='2GB')
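To get a feel for what changing `max_shard_size` does to the output, the number of shard files can be estimated from the total checkpoint size. This is a rough sketch: `parse_size` and `min_shard_count` are hypothetical helpers, the unit table (with "GB" as 10**9 bytes) is an assumption of this example, and the 30 GB total is an illustrative figure rather than the measured size of the model in the issue.

```python
import math

# Decimal and binary unit multipliers. Treating "GB" as 10**9 bytes is an
# assumption of this sketch; check how your library version parses size strings.
_UNITS = {
    "KB": 10**3, "MB": 10**6, "GB": 10**9,
    "KIB": 2**10, "MIB": 2**20, "GIB": 2**30,
}


def parse_size(size: str) -> int:
    """Convert a string such as '5GB' or '512MiB' to a byte count."""
    text = size.strip().upper()
    for suffix, factor in _UNITS.items():
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * factor)
    raise ValueError(f"unrecognized size string: {size!r}")


def min_shard_count(total_bytes: int, max_shard_size: str) -> int:
    """Lower bound on the number of shards.

    The real count can be higher, because a tensor is not split across files.
    """
    return max(1, math.ceil(total_bytes / parse_size(max_shard_size)))


total = 30 * 10**9  # illustrative total size in bytes
print(min_shard_count(total, "5GB"))  # 6
print(min_shard_count(total, "2GB"))  # 15
```

The estimate only describes the on-disk layout. It says nothing about how much memory the save step itself needs.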

4. Verify the issue is resolved

Rerun the script and check whether the error still occurs. Here script.py stands for your own reproduction script.

# Run the script with the updated parameters
python script.py

Verification

Rerun the script with the updated parameters.
Confirm that `save_pretrained` completes without errors and that the output directory contains the saved model files.
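A clean exit does not by itself show that the checkpoint on disk is complete, so a quick look at the output directory is worthwhile. The sketch below checks only the file layout. The file names `config.json` and `model.safetensors.index.json` are assumed to match the layout commonly written for sharded safetensors checkpoints, and `check_saved_checkpoint` is a hypothetical helper. It does not validate the FP8 weights or their scales.

```python
import json
from pathlib import Path
from typing import List


def check_saved_checkpoint(save_dir: str) -> List[str]:
    """Return a list of layout problems in a saved model directory (empty if none found)."""
    root = Path(save_dir)
    problems = []

    if not (root / "config.json").is_file():
        problems.append("missing config.json")

    if not any(root.glob("*.safetensors")):
        problems.append("no .safetensors files found")

    index_path = root / "model.safetensors.index.json"
    if index_path.is_file():
        # In a sharded checkpoint the index maps each tensor name to its shard file.
        weight_map = json.loads(index_path.read_text()).get("weight_map", {})
        for shard in sorted(set(weight_map.values())):
            if not (root / shard).is_file():
                problems.append(f"index references missing shard {shard}")

    return problems


if __name__ == "__main__":
    # '/root/test' is the output path used in the reproduction script.
    for problem in check_saved_checkpoint("/root/test"):
        print(problem)
```

An empty result means the expected files are present. Loading the directory again with `from_pretrained` remains the stronger test.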

Extra Tips

Keep `transformers` up to date to pick up bug fixes and new features.
Large models can exhaust memory while loading or saving, so watch for out-of-memory (OOM) errors.
torch.save and joblib are sometimes suggested as alternatives, but both write pickle-based files rather than safetensors, and neither by itself produces a directory that `from_pretrained` can load. Treat them as a last-resort fallback, not as a more efficient format.
