transformers - ✅(Solved) Fix AutoTokenizer.from_pretrained calls model_info() unconditionally in _patch_mistral_regex, breaks HF_HUB_OFFLINE mode [1 pull request, 5 comments, 5 participants]

huggingface/transformers#44843 (fetched 2026-04-08 01:01:51)
Comments: 5 · Participants: 5 · Timeline events: 13 · Reactions: 1
Timeline (top): commented ×5, mentioned ×3, subscribed ×3, cross-referenced ×1

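Error Message

huggingface_hub.errors.OfflineModeIsEnabled: Cannot reach
https://huggingface.co/api/models/Qwen/Qwen3-0.6B: offline mode is enabled.

Full traceback:

transformers/models/auto/tokenization_auto.py:1156, in from_pretrained
transformers/tokenization_utils_base.py:2113, in from_pretrained
transformers/tokenization_utils_base.py:2395, in _from_pretrained
transformers/tokenization_utils_base.py:2438, in _patch_mistral_regex
transformers/tokenization_utils_base.py:2432, in is_base_mistral
huggingface_hub/hf_api.py:2660, in model_info
→ OfflineModeIsEnabled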

Root Cause

In tokenization_utils_base.py, _patch_mistral_regex defines is_base_mistral(), which calls huggingface_hub.model_info() (a network API call) for every non-local model, not just Mistral models. The call happens before any cache check or local_files_only guard:

# tokenization_utils_base.py, _patch_mistral_regex
def is_base_mistral(model_id: str) -> bool:
    model = model_info(model_id)  # <-- unconditional API call
    ...

if _is_local or is_base_mistral(pretrained_model_name_or_path):  # <-- called for ALL non-local models

The local_files_only parameter is passed to _patch_mistral_regex but is never used to guard the is_base_mistral() call.

Fix Action

PR fix notes

PR #44923: fix: avoid unconditional model_info call in _patch_mistral_regex

Description (problem / solution / changelog)

Addresses issue #44843. Verified with isolated repro logic.

Changes made: the logic now identifies local and offline scenarios up front. is_local is set to True if any of the following holds:

  1. is_offline_mode() is active.
  2. The local_files_only flag is True.
  3. The provided path is a local directory (os.path.isdir).
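
The three conditions can be pictured as one boolean computed before any Hub lookup. The sketch below is not the PR's code: resolve_is_local and offline_flag_set are hypothetical names, the environment check is only a stand-in for is_offline_mode(), and the set of accepted truthy spellings is an assumption.

```python
import os
import tempfile


def offline_flag_set() -> bool:
    # Stand-in for is_offline_mode(); the accepted spellings are an assumption.
    return os.environ.get("HF_HUB_OFFLINE", "").upper() in {"1", "ON", "YES", "TRUE"}


def resolve_is_local(name_or_path: str, local_files_only: bool = False) -> bool:
    # True when no Hub request should be attempted for this tokenizer:
    # offline mode is active, the caller asked for local files only,
    # or the argument is a directory on disk.
    return offline_flag_set() or local_files_only or os.path.isdir(name_or_path)
```

With the flag unset, a Hub-style ID such as "Qwen/Qwen3-0.6B" would normally resolve to False (it is not a directory), while the same ID with local_files_only=True, or any existing directory such as tempfile.gettempdir(), resolves to True.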

Changed files

  • src/transformers/tokenization_utils_tokenizers.py (modified, +3/-1)


System Info

  • transformers version: 4.57.3
  • huggingface_hub version: 0.36.2
  • Python: 3.12
  • OS: Linux (Ubuntu 24.04, inside NVIDIA container)

Who can help?

@ArthurZucker @itazap

Regression introduced in

PR #42389 ([Mistral Tokenizers] Fix tokenizer detection), included in v4.57.2 → v4.57.3.

Reproduction

import os
from huggingface_hub import snapshot_download

# Step 1: Pre-download a non-Mistral model
snapshot_download("Qwen/Qwen3-0.6B")

# Step 2: Enable offline mode
os.environ["HF_HUB_OFFLINE"] = "1"

# Step 3: Load tokenizer — crashes even though model is fully cached
from transformers import AutoTokenizer
tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")

This raises:

huggingface_hub.errors.OfflineModeIsEnabled: Cannot reach
https://huggingface.co/api/models/Qwen/Qwen3-0.6B: offline mode is enabled.

Full traceback:

transformers/models/auto/tokenization_auto.py:1156, in from_pretrained
transformers/tokenization_utils_base.py:2113, in from_pretrained
transformers/tokenization_utils_base.py:2395, in _from_pretrained
transformers/tokenization_utils_base.py:2438, in _patch_mistral_regex
transformers/tokenization_utils_base.py:2432, in is_base_mistral
huggingface_hub/hf_api.py:2660, in model_info
→ OfflineModeIsEnabled

Expected behavior

AutoTokenizer.from_pretrained() should work in offline mode (HF_HUB_OFFLINE=1) when the model is already cached locally. This worked in transformers 4.57.1.

Root cause

In tokenization_utils_base.py, _patch_mistral_regex defines is_base_mistral(), which calls huggingface_hub.model_info() (a network API call) for every non-local model, not just Mistral models. The call happens before any cache check or local_files_only guard:

# tokenization_utils_base.py, _patch_mistral_regex
def is_base_mistral(model_id: str) -> bool:
    model = model_info(model_id)  # <-- unconditional API call
    ...

if _is_local or is_base_mistral(pretrained_model_name_or_path):  # <-- called for ALL non-local models

The local_files_only parameter is passed to _patch_mistral_regex but is never used to guard the is_base_mistral() call.

Suggested fix

The is_base_mistral() call should either:

  1. Be wrapped in a try/except that catches OfflineModeIsEnabled and returns False (safe default — if we can't reach the API, assume it's not a Mistral model), or
  2. Be skipped when local_files_only=True or HF_HUB_OFFLINE=1 is set

Impact

This breaks any CI/CD pipeline or air-gapped environment that:

  1. Pre-downloads models with snapshot_download()
  2. Sets HF_HUB_OFFLINE=1 to prevent network access during test execution
  3. Loads tokenizers via AutoTokenizer.from_pretrained()

This pattern is common in ML CI pipelines (we hit this in NVIDIA Dynamo's CI with TensorRT-LLM).
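
A minimal sketch of step 2 of this pattern, assuming the offline flag needs to be in the environment before the Hub client is imported. For that reason the check runs in a fresh interpreter. offline_env and run_offline are hypothetical helper names, not part of any library.

```python
import os
import subprocess
import sys


def offline_env() -> dict:
    # Copy the parent environment and switch the Hub client to offline mode.
    env = dict(os.environ)
    env["HF_HUB_OFFLINE"] = "1"
    return env


def run_offline(code: str) -> subprocess.CompletedProcess:
    # Start a fresh interpreter so the flag is present before any import runs.
    return subprocess.run(
        [sys.executable, "-c", code],
        env=offline_env(),
        capture_output=True,
        text=True,
    )
```

Passing the tokenizer-loading line from the reproduction as the code string turns the report into a regression check. On an affected version the child process is expected to exit with a non-zero return code and OfflineModeIsEnabled in its stderr.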

extent analysis

Fix Plan

To fix the issue, we need to modify the is_base_mistral() function to handle offline mode. We can achieve this by wrapping the model_info() call in a try-except block to catch the OfflineModeIsEnabled exception.

Step-by-Step Solution

  • Modify the tokenization_utils_base.py file to include a try-except block around the model_info() call in the is_base_mistral() function.
  • Use the local_files_only parameter to guard the is_base_mistral() call when offline mode is enabled.

Example code:

# tokenization_utils_base.py, _patch_mistral_regex
from huggingface_hub.errors import OfflineModeIsEnabled

def is_base_mistral(model_id: str) -> bool:
    try:
        model = model_info(model_id)
    except OfflineModeIsEnabled:
        # The Hub is unreachable in offline mode: assume it's not a Mistral model
        return False
    # ... rest of the function remains the same

Alternatively, you can skip the is_base_mistral() call when local_files_only=True or HF_HUB_OFFLINE=1 is set:

# tokenization_utils_base.py, _patch_mistral_regex
import os  # if not already imported in the module

offline = local_files_only or os.environ.get("HF_HUB_OFFLINE") == "1"
if _is_local or (not offline and is_base_mistral(pretrained_model_name_or_path)):
    ...  # rest of the function remains the same

Verification

To verify that the fix worked, you can run the reproduction script again with the modified tokenization_utils_base.py file. The script should no longer raise the OfflineModeIsEnabled exception when loading the tokenizer in offline mode.
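
The guard logic can also be checked without a network or a patched install. The sketch below models the second option (skip the Hub-backed check when offline) with stand-ins. should_patch and its parameters are hypothetical, and the is_base_mistral argument is whatever callable would normally hit the API.

```python
from typing import Callable


def should_patch(
    name: str,
    *,
    is_local: bool,
    local_files_only: bool,
    offline: bool,
    is_base_mistral: Callable[[str], bool],
) -> bool:
    # Local paths keep the original behaviour: the patch branch is taken.
    if is_local:
        return True
    # Offline or local-only loads must never reach the Hub-backed check.
    if local_files_only or offline:
        return False
    # Only online, non-local loads are allowed to query the Hub.
    return is_base_mistral(name)
```

Passing a callable that raises as is_base_mistral makes any accidental Hub lookup fail loudly, which is the property the real fix needs to preserve.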

Extra Tips

  • Once a transformers release containing the fix is available, upgrade to it rather than carrying a local patch; until then, transformers 4.57.1 is reported above as unaffected.
  • If you're using a CI/CD pipeline, set HF_HUB_OFFLINE=1 in the job environment before the Python process starts, so that network access is blocked during test execution.
