transformers - Why is max_shard_size in PreTrainedModel.save_pretrained() 50GB? [1 comment, 2 participants]

transformers · 2026-03-06 02:51:48

GitHub stats

huggingface/transformers#44484 • Fetched 2026-04-08 00:28:11

Author

Silencezong

Participants

Rocketknight1

Silencezong

Timeline (top)

closed ×1, commented ×1, labeled ×1

System Info

In older versions such as 4.57.1, the default max_shard_size in PreTrainedModel.save_pretrained() is "5GB", but in the new version it is "50GB". Is this expected?

Who can help?

No response

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

New version:

    def save_pretrained(
        self,
        save_directory: str | os.PathLike,
        is_main_process: bool = True,
        state_dict: dict | None = None,
        push_to_hub: bool = False,
        max_shard_size: int | str = "50GB",
        variant: str | None = None,
        token: str | bool | None = None,
        save_peft_format: bool = True,
        save_original_format: bool = True,
        **kwargs,
    ):

Old version:

    def save_pretrained(
        self,
        save_directory: Union[str, os.PathLike],
        is_main_process: bool = True,
        state_dict: Optional[dict] = None,
        save_function: Callable = torch.save,
        push_to_hub: bool = False,
        max_shard_size: Union[int, str] = "5GB",
        safe_serialization: bool = True,
        variant: Optional[str] = None,
        token: Optional[Union[str, bool]] = None,
        save_peft_format: bool = True,
        **kwargs,
    ):

Expected behavior

Fix this bug?

extent analysis

Fix Plan

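Pass max_shard_size Explicitly

The type of max_shard_size did not change: the old signature declares Union[int, str] and the new one declares int | str, which are equivalent. What changed is the default value, from "5GB" to "50GB", so a model saved without an explicit max_shard_size is now written as fewer, larger shard files.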
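To see which default an installed version actually declares, the signature can be inspected at runtime. The following is a minimal sketch: `parameter_default` is a hypothetical helper name, and the commented usage assumes `transformers` is importable in your environment.

```python
import inspect


def parameter_default(func, name):
    """Return the declared default value of parameter `name` on `func`."""
    param = inspect.signature(func).parameters[name]
    if param.default is inspect.Parameter.empty:
        raise ValueError(f"parameter {name!r} has no default value")
    return param.default


# Usage (assumes transformers is installed):
#
#   from transformers import PreTrainedModel
#   print(parameter_default(PreTrainedModel.save_pretrained, "max_shard_size"))
#
# Per the signatures quoted in the issue, this should report "5GB" on
# 4.57.1 and "50GB" on the newer version.
```

Reading the default from the signature avoids guessing from release notes and works the same way for any other keyword argument.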

Step-by-Step Solution

No change to the library signature is needed. To get the old shard size back, set it at the call site:

1. **Pass the old limit explicitly** when calling the `save_pretrained` method:
   ```python
   model.save_pretrained(save_directory, max_shard_size="5GB")
   ```

2. **Keep string values quoted.** Strings such as `"5GB"` are accepted by both the old and the new signature, so there is no reason to convert them to integers. An integer is also accepted and is interpreted as a number of bytes; note that `50 * 1024 * 1024 * 1024` is 50 GiB (about 53.7 GB), not 50 GB.

3. **Replace any call that relies on the default** with one that passes `max_shard_size`, so the shard size no longer depends on the installed version.
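One way to keep the shard size independent of the installed version is to route every save through a small wrapper that always passes `max_shard_size`. This is a sketch only: `save_with_shard_limit` is a hypothetical helper name, and the `"5GB"` default is chosen here to match the old behavior described in the issue.

```python
def save_with_shard_limit(model, save_directory, max_shard_size="5GB", **kwargs):
    """Call model.save_pretrained with an explicit shard limit.

    `model` is any object exposing save_pretrained(save_directory, ...),
    such as a transformers PreTrainedModel. The limit is always forwarded,
    so the library's own default is never used.
    """
    return model.save_pretrained(
        save_directory,
        max_shard_size=max_shard_size,
        **kwargs,
    )
```

Callers that want the new behavior can still pass `max_shard_size="50GB"`; the point of the wrapper is only that the value is stated in your code rather than inherited from the library.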


#### Verification

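To verify that the setting took effect, save a model with an explicit `max_shard_size` and check the sizes of the weight files in the output directory. For a model larger than the limit, the directory should contain several shard files, each no larger than roughly the requested limit. A single tensor that is itself bigger than the limit is still written to its own, larger shard.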
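The size check can be scripted with the standard library alone. The sketch below lists weight files that exceed a byte limit; `oversized_files` is a hypothetical helper, the `*.safetensors` pattern assumes the weights were saved in safetensors format, and the limit is given in bytes on the assumption that "5GB" means 5 * 10**9 bytes.

```python
from pathlib import Path


def oversized_files(save_directory, limit_bytes, pattern="*.safetensors"):
    """Return sorted (file name, size in bytes) pairs for files above limit_bytes."""
    results = []
    for path in Path(save_directory).glob(pattern):
        if not path.is_file():
            continue
        size = path.stat().st_size
        if size > limit_bytes:
            results.append((path.name, size))
    return sorted(results)


# Usage, after model.save_pretrained("out", max_shard_size="5GB"):
#
#   for name, size in oversized_files("out", 5 * 10**9):
#       print(f"{name}: {size} bytes")
```

File sizes include a small amount of format overhead, so a shard may sit slightly above the limit; treat small overshoots as expected and look for files that are far larger.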
