transformers - ✅(Solved) Fix AutoTokenizer.from_pretrained calls model_info() unconditionally in _patch_mistral_regex, breaks HF_HUB_OFFLINE mode [1 pull requests, 5 comments, 5 participants]

Q: Expected behavior

`AutoTokenizer.from_pretrained()` should work in offline mode (`HF_HUB_OFFLINE=1`) when the model is already cached locally. This worked in transformers 4.57.1.

transformers · 2026-03-19 05:36:56

huggingface/transformers#44843•Fetched 2026-04-08 01:01:51

Root Cause

In tokenization_utils_base.py, _patch_mistral_regex defines is_base_mistral() which calls huggingface_hub.model_info() — an API call — for every model, not just Mistral models. The call happens before any cache check or local_files_only guard:

# tokenization_utils_base.py, _patch_mistral_regex
def is_base_mistral(model_id: str) -> bool:
    model = model_info(model_id)  # <-- unconditional API call
    ...

if _is_local or is_base_mistral(pretrained_model_name_or_path):  # <-- called for ALL non-local models

The local_files_only parameter is passed to _patch_mistral_regex but is never used to guard the is_base_mistral() call.

Fix Action

PR fix notes

PR #44923: fix: avoid unconditional model_info call in _patch_mistral_regex

Repository: huggingface/transformers
Author: prakhar-agarwal
State: open | merged: False
Link: https://github.com/huggingface/transformers/pull/44923

Description (problem / solution / changelog)

Addresses issue #44843. Verified with isolated repro logic.

Changes made: Updated the logic to properly identify local and offline scenarios upfront. Now, is_local is correctly set to True if:

is_offline_mode() is active.
The local_files_only flag is True.
The provided path is a local directory (os.path.isdir).
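
The three conditions listed above can be sketched as a single predicate. This is a minimal illustration, not the code from the PR: `resolve_is_local` and `offline_env_enabled` are hypothetical names, and the environment check is a simplified stand-in for the library's `is_offline_mode()` (the set of accepted truthy values is an assumption).

```python
import os

# Assumed set of truthy values; a simplified stand-in for is_offline_mode().
_TRUE_VALUES = {"1", "ON", "YES", "TRUE"}


def offline_env_enabled() -> bool:
    """Return True when HF_HUB_OFFLINE is set to a truthy value."""
    return os.environ.get("HF_HUB_OFFLINE", "").strip().upper() in _TRUE_VALUES


def resolve_is_local(path_or_id: str, local_files_only: bool = False) -> bool:
    """Hypothetical helper: treat the load as local if any condition holds."""
    return (
        offline_env_enabled()
        or local_files_only
        or os.path.isdir(path_or_id)
    )
```

With a predicate like this evaluated up front, a check of the form `is_local or is_base_mistral(...)` short-circuits before any Hub request is attempted in offline scenarios.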

Changed files

src/transformers/tokenization_utils_tokenizers.py (modified, +3/-1)

System Info

transformers version: 4.57.3
huggingface_hub version: 0.36.2
Python: 3.12
OS: Linux (Ubuntu 24.04, inside NVIDIA container)

Who can help?

@ArthurZucker @itazap

Regression introduced in

PR #42389 ([Mistral Tokenizers] Fix tokenizer detection), included in v4.57.2 → v4.57.3.

Reproduction

import os
from huggingface_hub import snapshot_download

# Step 1: Pre-download a non-Mistral model
snapshot_download("Qwen/Qwen3-0.6B")

# Step 2: Enable offline mode
os.environ["HF_HUB_OFFLINE"] = "1"

# Step 3: Load tokenizer — crashes even though model is fully cached
from transformers import AutoTokenizer
tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")

This raises:

huggingface_hub.errors.OfflineModeIsEnabled: Cannot reach
https://huggingface.co/api/models/Qwen/Qwen3-0.6B: offline mode is enabled.

Full traceback:

transformers/models/auto/tokenization_auto.py:1156, in from_pretrained
transformers/tokenization_utils_base.py:2113, in from_pretrained
transformers/tokenization_utils_base.py:2395, in _from_pretrained
transformers/tokenization_utils_base.py:2438, in _patch_mistral_regex
transformers/tokenization_utils_base.py:2432, in is_base_mistral
huggingface_hub/hf_api.py:2660, in model_info
→ OfflineModeIsEnabled
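
One caveat with the script above: it sets HF_HUB_OFFLINE after huggingface_hub has already been imported. If the library reads that variable only once at import time (an assumption worth checking against your installed version), the setting may not take effect inside the same process. A more robust reproduction sets the variable before the Python process starts. The sketch below builds such an environment and runs the tokenizer load in a child interpreter; `offline_env` and `load_tokenizer_offline` are hypothetical helper names, not part of either library.

```python
import os
import subprocess
import sys

# One-liner executed in the child interpreter.
LOAD_SNIPPET = (
    "from transformers import AutoTokenizer; "
    "AutoTokenizer.from_pretrained({model_id!r})"
)


def offline_env(base=None):
    """Return a copy of the environment with offline mode switched on."""
    env = dict(os.environ if base is None else base)
    env["HF_HUB_OFFLINE"] = "1"
    return env


def load_tokenizer_offline(model_id: str) -> subprocess.CompletedProcess:
    """Run the tokenizer load in a fresh interpreter that starts offline."""
    code = LOAD_SNIPPET.format(model_id=model_id)
    return subprocess.run(
        [sys.executable, "-c", code],
        env=offline_env(),
        capture_output=True,
        text=True,
    )
```

After the model has been cached, calling `load_tokenizer_offline("Qwen/Qwen3-0.6B")` on an affected version would be expected to return a non-zero `returncode` with `OfflineModeIsEnabled` in `stderr`.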

Expected behavior

AutoTokenizer.from_pretrained() should work in offline mode (HF_HUB_OFFLINE=1) when the model is already cached locally. This worked in transformers 4.57.1.

Root cause

# tokenization_utils_base.py, _patch_mistral_regex
def is_base_mistral(model_id: str) -> bool:
    model = model_info(model_id)  # <-- unconditional API call
    ...

if _is_local or is_base_mistral(pretrained_model_name_or_path):  # <-- called for ALL non-local models

The local_files_only parameter is passed to _patch_mistral_regex but is never used to guard the is_base_mistral() call.
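
To see why the flag has no effect, the condition can be modelled with stand-ins. This is a toy sketch rather than library code: `needs_patch` is a hypothetical name, and the stub `is_base_mistral` only records that it was called instead of contacting the Hub.

```python
calls = []


def is_base_mistral(model_id: str) -> bool:
    """Stub: records the call that would be a network request in the library."""
    calls.append(model_id)
    return False


def needs_patch(model_id: str, is_local: bool, local_files_only: bool = False) -> bool:
    # local_files_only is accepted but never consulted, mirroring the report.
    return is_local or is_base_mistral(model_id)


needs_patch("Qwen/Qwen3-0.6B", is_local=False, local_files_only=True)
print(calls)  # ['Qwen/Qwen3-0.6B']: the stub was reached despite local_files_only=True
```

Because `or` evaluates its right-hand side whenever the left-hand side is false, every non-local model ID reaches the lookup regardless of the flag.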

Suggested fix

The is_base_mistral() call should either:

Be wrapped in a try/except that catches OfflineModeIsEnabled and returns False (safe default — if we can't reach the API, assume it's not a Mistral model), or
Be skipped when local_files_only=True or HF_HUB_OFFLINE=1 is set

Impact

This breaks any CI/CD pipeline or air-gapped environment that:

Pre-downloads models with snapshot_download()
Sets HF_HUB_OFFLINE=1 to prevent network access during test execution
Loads tokenizers via AutoTokenizer.from_pretrained()

This pattern is common in ML CI pipelines (we hit this in NVIDIA Dynamo's CI with TensorRT-LLM).

extent analysis

Fix Plan

To fix the issue, we need to modify the is_base_mistral() function to handle offline mode. We can achieve this by wrapping the model_info() call in a try-except block to catch the OfflineModeIsEnabled exception.

Step-by-Step Solution

Modify the tokenization_utils_base.py file to include a try-except block around the model_info() call in the is_base_mistral() function.
Alternatively, use the local_files_only parameter (together with an HF_HUB_OFFLINE check) to skip the is_base_mistral() call entirely when offline.

Example code:

# tokenization_utils_base.py, _patch_mistral_regex
from huggingface_hub import model_info
from huggingface_hub.errors import OfflineModeIsEnabled

def is_base_mistral(model_id: str) -> bool:
    try:
        model = model_info(model_id)
    except OfflineModeIsEnabled:
        # The Hub is unreachable in offline mode, so assume it's not a Mistral model
        return False
    ...  # rest of the function remains the same

Alternatively, you can skip the is_base_mistral() call when local_files_only=True or HF_HUB_OFFLINE=1 is set:

# tokenization_utils_base.py, _patch_mistral_regex
import os  # needed for the environment check below
offline = local_files_only or os.environ.get("HF_HUB_OFFLINE") == "1"
if _is_local or (not offline and is_base_mistral(pretrained_model_name_or_path)):
    ...  # rest of the function remains the same

Verification

To verify that the fix worked, you can run the reproduction script again with the modified tokenization_utils_base.py file. The script should no longer raise the OfflineModeIsEnabled exception when loading the tokenizer in offline mode.
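
The try/except variant can also be checked as an automated test without network access. The sketch below uses stand-ins throughout: `OfflineError` substitutes for `OfflineModeIsEnabled`, `fake_model_info` for `model_info`, and `is_base_mistral_guarded` is a hypothetical wrapper illustrating the guard, not the library's function.

```python
class OfflineError(Exception):
    """Stand-in for huggingface_hub.errors.OfflineModeIsEnabled."""


def fake_model_info(model_id: str):
    """Stand-in for model_info() when offline mode is active."""
    raise OfflineError(f"Cannot reach the Hub for {model_id}: offline mode is enabled.")


def is_base_mistral_guarded(model_id: str, fetch=fake_model_info) -> bool:
    """Hypothetical guarded lookup: fall back to False if the Hub is unreachable."""
    try:
        info = fetch(model_id)
    except OfflineError:
        return False
    # Placeholder for the real detection logic, which the issue elides.
    return bool(info)


def test_offline_lookup_does_not_raise():
    # The guarded lookup should degrade to False instead of propagating the error.
    assert is_base_mistral_guarded("Qwen/Qwen3-0.6B") is False
```

A test against the real library would follow the same shape, with the actual exception and a patched `model_info` in place of the stand-ins.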

Extra Tips

Once a transformers release containing the fix is available, upgrade to it instead of keeping a local patch; upgrading the package replaces any hand-edited files.
If you're using a CI/CD pipeline, ensure that the HF_HUB_OFFLINE=1 environment variable is set correctly to prevent network access during test execution.
