transformers - ✅(Solved) Fix Support argumentless loading from Trainer checkpoints [1 pull requests, 1 comments, 2 participants]

GitHub stats

huggingface/transformers#44450 (fetched 2026-04-08 00:28:27)

  • Comments: 1
  • Participants: 2
  • Timeline events: 6
  • Reactions: 0
  • Timeline (top): commented ×1, cross-referenced ×1, labeled ×1, mentioned ×1

Fix Action

Fixed

PR fix notes

PR #45055: Save model config in Trainer checkpoints for non-PreTrainedModel models

Description (problem / solution / changelog)

What does this PR do?

When Trainer saves a checkpoint for a model that is not a PreTrainedModel (e.g. a custom nn.Module), it only saves the state dict but not the model config. This means Model.from_pretrained(ckpt_path) requires the caller to pass all the original init arguments again, which is inconvenient.

This PR saves the model's config.json in the checkpoint directory when the model has a config attribute, even when it falls through to the state-dict-only saving path. This enables argumentless loading from Trainer checkpoints:

# Before: needed init args
model = MyModel.from_pretrained(ckpt_path, **init_args)

# After: just works
model = MyModel.from_pretrained(ckpt_path)

The change is a 3-line addition in Trainer._save() that calls config.save_pretrained(output_dir) on the unwrapped model's config when available. This does not affect the existing PreTrainedModel path, which already saves config via save_pretrained().
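
The logic described above can be sketched without the library itself. This is an illustration only, not the actual diff from PR #45055: `ToyConfig`, `ToyModel`, and `save_config_if_present` are hypothetical names, and the stand-in config only mimics the `save_pretrained(output_dir)` call mentioned in the description.

```python
import json
import os
import tempfile


class ToyConfig:
    """Hypothetical stand-in for a config object exposing save_pretrained()."""

    def __init__(self, **kwargs):
        self.values = dict(kwargs)

    def save_pretrained(self, output_dir):
        os.makedirs(output_dir, exist_ok=True)
        with open(os.path.join(output_dir, "config.json"), "w") as f:
            json.dump(self.values, f, indent=2)


class ToyModel:
    """Hypothetical custom model that may or may not carry a config attribute."""

    def __init__(self, config=None):
        if config is not None:
            self.config = config


def save_config_if_present(model, output_dir):
    """Write config.json only when the model exposes a saveable config."""
    config = getattr(model, "config", None)
    if config is not None and hasattr(config, "save_pretrained"):
        config.save_pretrained(output_dir)
        return True
    return False


with tempfile.TemporaryDirectory() as ckpt_dir:
    # A model with a config gets a config.json next to its weights.
    saved = save_config_if_present(ToyModel(ToyConfig(hidden_size=64)), ckpt_dir)
    # A model without a config is left alone, matching the "when available" wording.
    skipped = save_config_if_present(ToyModel(), ckpt_dir + "-no-config")
    config_written = os.path.exists(os.path.join(ckpt_dir, "config.json"))
```

The guard on `getattr` reflects the "when the model has a config attribute" condition in the description; models without a config keep the existing state-dict-only behaviour.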

Fixes #44450

Changed files

  • src/transformers/trainer.py (modified, +3/-0)


Feature request

Trainer checkpoints don't include the config.json necessary to instantiate the model.

This means that to use a specific checkpoint (e.g. for further fine-tuning or for evaluation on a test set), we need to know the original init_args and pass them when calling .from_pretrained(ckpt_path, **init_args).

If config.json were saved in the checkpoint directory, .from_pretrained(ckpt_path) alone would suffice.

├── model.safetensors
├── optimizer.pt
├── rng_state.pth
├── scheduler.pt
├── special_tokens_map.json
├── tokenizer.json
├── tokenizer_config.json
├── trainer_state.json
├── training_args.bin
└── config.json  # Store this also
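
Why a saved config.json removes the need for init arguments can be shown with a toy model. This is a minimal sketch under stated assumptions: `ToyModel` and its `from_pretrained` are hypothetical and use only the standard library, weight loading is left out, and the real loading code in transformers is more involved.

```python
import json
import os
import tempfile


class ToyModel:
    """Hypothetical model whose constructor needs explicit init arguments."""

    def __init__(self, hidden_size, num_layers):
        self.hidden_size = hidden_size
        self.num_layers = num_layers

    @classmethod
    def from_pretrained(cls, ckpt_path, **init_args):
        config_file = os.path.join(ckpt_path, "config.json")
        if os.path.exists(config_file):
            with open(config_file) as f:
                saved_args = json.load(f)
            # Explicitly passed arguments still override the saved ones.
            init_args = {**saved_args, **init_args}
        # Weight loading is omitted; only the init-argument handling is shown.
        return cls(**init_args)


with tempfile.TemporaryDirectory() as ckpt_path:
    # Without config.json the caller has to repeat the init arguments.
    try:
        ToyModel.from_pretrained(ckpt_path)
        needs_args = False
    except TypeError:
        needs_args = True

    # With config.json in the checkpoint, argumentless loading works.
    with open(os.path.join(ckpt_path, "config.json"), "w") as f:
        json.dump({"hidden_size": 64, "num_layers": 2}, f)
    model = ToyModel.from_pretrained(ckpt_path)
```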

Motivation

It comes in handy when a model needs to be loaded back for further fine-tuning, testing, debugging, etc.

Your contribution

Not familiar with the HuggingFace code base, but I would be happy to do it as my first PR.

extent analysis

Fix Plan

Save config.json with Trainer Checkpoints

If your installed version of transformers does not yet write config.json for your model, one workaround is to subclass Trainer so that it saves the configuration file alongside the other checkpoint files.

Step-by-Step Solution

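  1. Import the necessary libraries:

```python
from transformers import Trainer
```

  2. Subclass `Trainer` so that it also saves `config.json`. The override targets `_save()`, the method named in the PR description; it is a private method, so its signature may differ between versions:

```python
class MyTrainer(Trainer):
    def _save(self, output_dir=None, state_dict=None):
        # Let Trainer write the weights and its other files first.
        super()._save(output_dir, state_dict)
        output_dir = output_dir if output_dir is not None else self.args.output_dir
        # Not every custom model has a config, so look it up defensively.
        config = getattr(self.model, "config", None)
        if config is not None and hasattr(config, "save_pretrained"):
            config.save_pretrained(output_dir)
```

  3. Create the trainer from the subclass instead of `Trainer`:

```python
trainer = MyTrainer(model=model, args=args)
```

  4. Train the model as usual:

```python
trainer.train()
```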

Verification

To verify that the fix worked, check that the config.json file is saved along with the other checkpoint files after training the model.
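
The check can be scripted. The sketch below assumes the usual `checkpoint-<step>` subdirectory layout under the training output directory; `checkpoints_missing_config` is a hypothetical helper, and the demonstration builds a throwaway directory tree rather than using real Trainer output.

```python
import os
import tempfile


def checkpoints_missing_config(output_dir):
    """Return the checkpoint-* subdirectories that lack a config.json."""
    missing = []
    for name in sorted(os.listdir(output_dir)):
        ckpt_dir = os.path.join(output_dir, name)
        if name.startswith("checkpoint-") and os.path.isdir(ckpt_dir):
            if not os.path.isfile(os.path.join(ckpt_dir, "config.json")):
                missing.append(name)
    return missing


# Demonstration on a throwaway directory layout (not real Trainer output).
with tempfile.TemporaryDirectory() as output_dir:
    for name, has_config in [("checkpoint-100", True), ("checkpoint-200", False)]:
        os.makedirs(os.path.join(output_dir, name))
        if has_config:
            open(os.path.join(output_dir, name, "config.json"), "w").close()
    missing = checkpoints_missing_config(output_dir)
```

An empty list means every checkpoint directory contains a config.json; in the demonstration only the second directory is reported.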

Extra Tips

  • Make sure to update the Trainer instance to use the modified MyTrainer class.
  • If you're using a custom Trainer class, you may need to modify it to include the config.json file.
  • Consider adding a test case to ensure that the config.json file is saved correctly.
