transformers - ✅(Solved) Fix trainer.evaluate() fails after trainer.train() [1 pull requests, 1 comments, 2 participants]

GitHub stats
huggingface/transformers#44936 (fetched 2026-04-08 01:16:53)
Comments: 1
Participants: 2
Timeline events: 5
Reactions: 0
Timeline (top): commented ×1, cross-referenced ×1, labeled ×1, mentioned ×1

Fix Action

Fixed

PR fix notes

PR #44949: Fix: NotebookProgressCallback crash when evaluating with the Trainer

Description (problem / solution / changelog)

What does this PR do?

Fixes #44936

This PR fixes an issue with NotebookProgressCallback in the Trainer where calling evaluate() before or after training would crash due to the training tracker being None. The callback now properly handles evaluation even if training has not yet started or if it has already finished, ensuring metrics can be computed and displayed.

Previously, the on_evaluate method assumed that self.training_tracker was always initialized, but:

  • Before training: self.training_tracker has not yet been initialized by on_train_begin.
  • After training: on_train_end sets self.training_tracker to None, so calling on_evaluate afterwards would fail.

Fix: on_evaluate now checks whether self.training_tracker exists before using it, and safely handles cases where it is None. This prevents crashes and ensures evaluation can run regardless of training state.
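
The guard described above can be illustrated with a minimal sketch. This is not the actual diff from PR #44949: ToyNotebookCallback, its standalone_reports list, and the simplified hook signatures are hypothetical stand-ins that only mirror the lifecycle in the PR description.

```python
from typing import Optional


class ToyNotebookCallback:
    """Hypothetical stand-in for a notebook progress callback (not the transformers class)."""

    def __init__(self) -> None:
        # None both before on_train_begin and after on_train_end.
        self.training_tracker: Optional[list] = None
        # Evaluations reported while no training run is active.
        self.standalone_reports: list = []

    def on_train_begin(self) -> None:
        self.training_tracker = []

    def on_train_end(self) -> None:
        self.training_tracker = None

    def on_evaluate(self, metrics: dict) -> None:
        if self.training_tracker is not None:
            # Training is in progress: attach the metrics to the live tracker.
            self.training_tracker.append(dict(metrics))
        else:
            # No tracker available: report the metrics on their own instead of raising.
            self.standalone_reports.append(dict(metrics))


callback = ToyNotebookCallback()
callback.on_evaluate({"eval_loss": 0.9})  # before training
callback.on_train_begin()
callback.on_evaluate({"eval_loss": 0.5})  # during training
callback.on_train_end()
callback.on_evaluate({"eval_loss": 0.4})  # after training
```

In this sketch, none of the three on_evaluate calls raises; the two calls made outside a training run land in standalone_reports.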

Additionally, new unit tests were added to ensure that evaluation works in this scenario, and existing notebook callback tests were updated to cover this case. This improves robustness of notebook-based workflows, especially in Jupyter or Colab environments.

Code Agent Policy

  • I confirm that this is not a pure code agent PR.

Before submitting

Who can review?

@SunMarc

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

Changed files

  • src/transformers/utils/notebook.py (modified, +4/-1)
  • tests/trainer/test_trainer_callback.py (modified, +76/-0)

System Info

  • transformers version: 5.3.0
  • Platform: Windows-11-10.0.26200-SP0
  • Python version: 3.13.0
  • Huggingface_hub version: 1.7.2
  • Safetensors version: 0.7.0
  • Accelerate version: 1.13.0
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (accelerator?): 2.10.0+cpu (NA)
  • Using distributed or parallel set-up in script?: no

Who can help?

@SunMarc

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction


Steps to Reproduce

  1. Download the notebook: https://github.com/huggingface/notebooks/blob/main/transformers_doc/training.ipynb

  2. Create a virtual environment and install packages from the first cell, plus scikit-learn (required later for evaluation).

  3. Add trainer.evaluate() after trainer.train().

  4. Use the following code:

from transformers import Trainer

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=small_train,
    eval_dataset=small_eval,
    compute_metrics=compute_metrics,
)

trainer.train()
trainer.evaluate()
  5. Run the notebook.

Error / Traceback

RuntimeError                              Traceback (most recent call last)
Cell In[8], line 11
      3 trainer = Trainer(
      4     model=model,
      5     args=training_args,
   (...)      8     compute_metrics=compute_metrics,
      9 )
     10 trainer.train()
---> 11 trainer.evaluate()

File /path/to/venv/lib/site-packages/transformers/trainer.py:2602, in Trainer.evaluate(self, eval_dataset, ignore_keys, metric_key_prefix)
   2599 if DebugOption.TPU_METRICS_DEBUG in self.args.debug:
   2600     xm.master_print(met.metrics_report())
-> 2602 self.control = self.callback_handler.on_evaluate(self.args, self.state, self.control, output.metrics)

File /path/to/venv/lib/site-packages/transformers/trainer_callback.py:524, in CallbackHandler.on_evaluate(self, args, state, control, metrics)
    522 def on_evaluate(self, args: TrainingArguments, state: TrainerState, control: TrainerControl, metrics):
    523     control.should_evaluate = False
--> 524     return self.call_event("on_evaluate", args, state, control, metrics=metrics)

File /path/to/venv/lib/site-packages/transformers/trainer_callback.py:545, in CallbackHandler.call_event(self, event, args, state, control, **kwargs)
    543 def call_event(self, event, args, state, control, **kwargs):
...
     30     if x is None:
---> 31         raise RuntimeError(msg)
     32     return x

RuntimeError: on_train_begin must be called before on_evaluate
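
The last visible frame (if x is None: raise RuntimeError(msg)) looks like a small None-check helper. The sketch below shows how such a helper could produce this exact message once a tracker has been reset. The names require_not_none and ToyCallback are hypothetical; the real helper and its caller are elided in the traceback.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def require_not_none(x: Optional[T], msg: str) -> T:
    """Hypothetical helper mirroring the three lines visible in the traceback."""
    if x is None:
        raise RuntimeError(msg)
    return x


class ToyCallback:
    """Hypothetical callback whose tracker only exists during training."""

    def __init__(self) -> None:
        self.training_tracker: Optional[list] = None

    def on_train_begin(self) -> None:
        self.training_tracker = []

    def on_train_end(self) -> None:
        # Resetting the tracker here is what makes a later on_evaluate fail.
        self.training_tracker = None

    def on_evaluate(self, metrics: dict) -> None:
        tracker = require_not_none(
            self.training_tracker,
            "on_train_begin must be called before on_evaluate",
        )
        tracker.append(metrics)


def train_then_evaluate() -> str:
    """Run the train -> evaluate sequence and return the error message, if any."""
    callback = ToyCallback()
    callback.on_train_begin()
    callback.on_train_end()
    try:
        callback.on_evaluate({"eval_loss": 0.4})
    except RuntimeError as err:
        return str(err)
    return "no error"


message = train_then_evaluate()
```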

Expected behavior

I would expect trainer.evaluate() to return the evaluation results instead of raising an error.

extent analysis

Fix Plan

The error does not come from the Trainer's own state. As the PR notes describe, it comes from NotebookProgressCallback: on_train_end resets self.training_tracker to None, and on_evaluate then requires the tracker to be set. A standalone trainer.evaluate() after trainer.train() therefore raises the RuntimeError, even though on_train_begin was called during training.

Here are the steps to resolve the issue:

  • Preferred: upgrade to a transformers release that includes PR #44949, which makes on_evaluate tolerate a missing training tracker. The reproduction code then works unchanged.
  • Possible workaround on an affected version (an assumption derived from the cause above, not something confirmed in the issue thread): remove the notebook progress callback before the standalone evaluation, so that its on_evaluate hook is never invoked.

Here's an example code snippet for the workaround:

from transformers import Trainer
from transformers.utils.notebook import NotebookProgressCallback

# Initialize the trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=small_train,
    eval_dataset=small_eval,
    compute_metrics=compute_metrics,
)

# Train the model
trainer.train()

# Detach the notebook progress callback so its on_evaluate hook is skipped
trainer.remove_callback(NotebookProgressCallback)

# Evaluate the model
trainer.evaluate()

The trade-off is that the notebook progress table is not rendered for this evaluation; the metrics are still returned by trainer.evaluate().

Verification

To verify that the fix worked, run trainer.evaluate() after trainer.train(). It should return a dictionary of metrics (with the default metric_key_prefix, keys such as eval_loss) instead of raising the RuntimeError.
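
The check above can be sketched as a small helper. It assumes the default metric_key_prefix of "eval" and only tests that at least one key carries that prefix, since the returned dictionary may also hold unprefixed entries. The name has_eval_metrics is hypothetical.

```python
from collections.abc import Mapping


def has_eval_metrics(metrics: object, prefix: str = "eval") -> bool:
    """Return True if `metrics` is a mapping with at least one '<prefix>_' key."""
    if not isinstance(metrics, Mapping):
        return False
    return any(
        isinstance(key, str) and key.startswith(f"{prefix}_") for key in metrics
    )


# Illustrative values only; real numbers depend on the model and dataset.
sample_output = {"eval_loss": 0.42, "eval_accuracy": 0.87, "epoch": 3.0}
sample_ok = has_eval_metrics(sample_output)
```

In the notebook, this would be used as has_eval_metrics(trainer.evaluate()).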

Extra Tips

If the error persists, check which callbacks are attached to the trainer: printing trainer.callback_handler.callbacks shows whether NotebookProgressCallback is among them. The failure originates in that callback rather than in trainer.state, so inspecting the callback list is the more direct diagnostic in Jupyter or Colab environments.
