transformers - ✅ (Solved) Support argumentless loading from Trainer checkpoints [1 pull request, 1 comment, 2 participants]

transformers · 2026-03-05 02:08:23

GitHub stats

huggingface/transformers#44450 • Fetched 2026-04-08 00:28:27

Author

adosar

Participants

adosar

Rocketknight1

Timeline (top)

commented ×1 · cross-referenced ×1 · labeled ×1 · mentioned ×1

Fix Action

Fix proposed (PR open, not merged at fetch time)

Proposed in PR: Save model config in Trainer checkpoints for non-PreTrainedModel models (https://github.com/huggingface/transformers/pull/45055)

PR fix notes

PR #45055: Save model config in Trainer checkpoints for non-PreTrainedModel models

Repository: huggingface/transformers
Author: vasanthrpjan1-boop
State: open | merged: False
Link: https://github.com/huggingface/transformers/pull/45055

Description (problem / solution / changelog)

What does this PR do?

When Trainer saves a checkpoint for a model that is not a PreTrainedModel (e.g. a custom nn.Module), it saves only the state dict and not the model config. As a result, Model.from_pretrained(ckpt_path) requires the caller to pass all the original init arguments again, which is inconvenient.

This PR saves the model's config.json in the checkpoint directory when the model has a config attribute, even when it falls through to the state-dict-only saving path. This enables argumentless loading from Trainer checkpoints:

# Before: needed init args
model = MyModel.from_pretrained(ckpt_path, **init_args)

# After: just works
model = MyModel.from_pretrained(ckpt_path)

The change is a 3-line addition in Trainer._save() that calls config.save_pretrained(output_dir) on the unwrapped model's config when available. This does not affect the existing PreTrainedModel path, which already saves config via save_pretrained().
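
The behaviour described above can be sketched independently of the library. This is a minimal illustration, not the PR's actual diff: `save_config_if_present`, `DummyConfig`, and the two model classes are hypothetical names, and `DummyConfig.save_pretrained` only imitates the visible effect of the real config method by writing `config.json`.

```python
import json
import os
import tempfile


class DummyConfig:
    """Hypothetical stand-in for a model config (not a transformers class)."""

    def __init__(self, **kwargs):
        self.values = dict(kwargs)

    def save_pretrained(self, output_dir):
        # Imitate the visible effect of the real method: write config.json.
        os.makedirs(output_dir, exist_ok=True)
        with open(os.path.join(output_dir, "config.json"), "w") as f:
            json.dump(self.values, f, indent=2)


def save_config_if_present(model, output_dir):
    """Save the model's config when it has one; return True if a file was written."""
    config = getattr(model, "config", None)
    if config is None or not hasattr(config, "save_pretrained"):
        return False  # model without a usable config: nothing to save
    config.save_pretrained(output_dir)
    return True


class ModelWithConfig:
    def __init__(self):
        self.config = DummyConfig(hidden_size=16, num_layers=2)


class ModelWithoutConfig:
    pass


with tempfile.TemporaryDirectory() as ckpt_dir:
    wrote = save_config_if_present(ModelWithConfig(), ckpt_dir)
    skipped = save_config_if_present(ModelWithoutConfig(), ckpt_dir)
    config_exists = os.path.exists(os.path.join(ckpt_dir, "config.json"))
```

The guard on `config` keeps the helper a no-op for models that carry no config, which mirrors the "when available" condition in the description.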

Fixes #44450

Changed files

src/transformers/trainer.py (modified, +3/-0)

Feature request

Trainer checkpoints don't include the config.json necessary to instantiate the model.

This means that to use a specific checkpoint (e.g. for fine-tuning or for evaluation on a test set), we need to know the init_args and call .from_pretrained(ckpt_path, **init_args).

By saving the config.json under the checkpoint, .from_pretrained(ckpt_path) suffices.

├── model.safetensors
├── optimizer.pt
├── rng_state.pth
├── scheduler.pt
├── special_tokens_map.json
├── tokenizer.json
├── tokenizer_config.json
├── trainer_state.json
├── training_args.json
└── config.json  # Store this also
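
To see why a saved `config.json` makes the extra arguments unnecessary, here is a minimal sketch of a loader that rebuilds its own init arguments from that file. `TinyModel`, `save_config`, and this `from_pretrained` are hypothetical and only mimic the idea; the real `from_pretrained` in transformers also loads weights and does considerably more.

```python
import json
import tempfile
from pathlib import Path


class TinyModel:
    """Hypothetical model whose init arguments are stored in config.json."""

    def __init__(self, hidden_size, num_layers):
        self.hidden_size = hidden_size
        self.num_layers = num_layers

    def save_config(self, ckpt_path):
        # Persist the init arguments next to the other checkpoint files.
        path = Path(ckpt_path)
        path.mkdir(parents=True, exist_ok=True)
        payload = {"hidden_size": self.hidden_size, "num_layers": self.num_layers}
        (path / "config.json").write_text(json.dumps(payload))

    @classmethod
    def from_pretrained(cls, ckpt_path, **init_args):
        config_file = Path(ckpt_path) / "config.json"
        if config_file.exists():
            # Stored values act as defaults; explicitly passed arguments win.
            stored = json.loads(config_file.read_text())
            init_args = {**stored, **init_args}
        return cls(**init_args)


with tempfile.TemporaryDirectory() as ckpt_path:
    TinyModel(hidden_size=16, num_layers=2).save_config(ckpt_path)
    reloaded = TinyModel.from_pretrained(ckpt_path)  # no init arguments needed
```

Without the file, the same call would fail with a missing-argument error, which is the situation the feature request describes.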

Motivation

It comes in handy when a model needs to be loaded back for further fine-tuning, testing, debugging, etc.

Your contribution

Not familiar with the HuggingFace code base, but I would be happy to do it as my first PR.

extent analysis

Fix Plan

Save `config.json` with Trainer Checkpoints

To include the config.json file in Trainer checkpoints on a version without the fix, we can subclass Trainer so that it saves the configuration file along with the other checkpoint files.

Step-by-Step Solution

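1. **Import the necessary libraries**:
   ```python
   from transformers import Trainer
   ```

2. **Subclass `Trainer` to save `config.json`**:
   ```python
   class MyTrainer(Trainer):
       # Override the private _save hook (the method the PR above touches).
       # Its signature may differ between transformers versions.
       def _save(self, output_dir=None, state_dict=None):
           # Write the weights and tokenizer files as usual
           super()._save(output_dir, state_dict)
           output_dir = output_dir if output_dir is not None else self.args.output_dir
           # Save config.json only when the model actually exposes a config
           config = getattr(self.model, "config", None)
           if config is not None and hasattr(config, "save_pretrained"):
               config.save_pretrained(output_dir)
   ```

3. **Create the trainer from the modified class**:
   ```python
   trainer = MyTrainer(model=model, args=args)
   ```

4. **Train the model as usual**:
   ```python
   trainer.train()
   ```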

Verification

To verify that the fix worked, check that the config.json file is saved along with the other checkpoint files after training the model.
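
One way to automate this check is a small script that lists checkpoint directories lacking a `config.json`. This is a sketch that assumes checkpoints live in subdirectories named `checkpoint-<step>` under the training output directory; `find_checkpoints_missing_config` is a hypothetical helper, and the demonstration builds a throwaway directory tree rather than running any training.

```python
import tempfile
from pathlib import Path


def find_checkpoints_missing_config(output_dir):
    """Return the names of checkpoint-* subdirectories that lack config.json."""
    missing = []
    for ckpt in sorted(Path(output_dir).glob("checkpoint-*")):
        if ckpt.is_dir() and not (ckpt / "config.json").is_file():
            missing.append(ckpt.name)
    return missing


# Demonstration on a temporary directory tree (no training involved).
with tempfile.TemporaryDirectory() as output_dir:
    good = Path(output_dir) / "checkpoint-500"
    bad = Path(output_dir) / "checkpoint-1000"
    good.mkdir()
    bad.mkdir()
    (good / "config.json").write_text("{}")
    missing = find_checkpoints_missing_config(output_dir)
```

An empty result for a real output directory would indicate that every checkpoint can be reloaded without extra init arguments.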

Extra Tips

Make sure the trainer is constructed from the MyTrainer subclass rather than the base Trainer; otherwise the override never runs.
If you already use a custom Trainer subclass, add the config-saving override to that class instead of creating a second one.
Consider adding a test case that asserts config.json exists in each checkpoint directory.
