transformers - ✅(Solved) Fix TypeError: couldn't find storage object Float8_e4m3fnStorage [2 pull requests, 6 comments, 6 participants]

huggingface/transformers#44589 (fetched 2026-04-08 00:27:31)

Error Message

  File "/home/xinhe/auto-round/.venv/lib/python3.12/site-packages/transformers/cli/serve.py", line 474, in __init__
    self.load_model_and_processor(model_id_and_revision)
  File "/home/xinhe/auto-round/.venv/lib/python3.12/site-packages/transformers/cli/serve.py", line 1876, in load_model_and_processor
    model, processor = self._load_model_and_data_processor(model_id_and_revision)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xinhe/auto-round/.venv/lib/python3.12/site-packages/transformers/cli/serve.py", line 1850, in _load_model_and_data_processor
    model = architecture.from_pretrained(model_id, **model_kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xinhe/auto-round/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4096, in from_pretrained
    with ContextManagers(model_init_context):
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xinhe/auto-round/.venv/lib/python3.12/site-packages/transformers/utils/generic.py", line 532, in __enter__
    self.stack.enter_context(context_manager)
  File "/home/xinhe/.local/share/uv/python/cpython-3.12.13-linux-x86_64-gnu/lib/python3.12/contextlib.py", line 526, in enter_context
    result = _enter(cm)
             ^^^^^^^^^^
  File "/home/xinhe/.local/share/uv/python/cpython-3.12.13-linux-x86_64-gnu/lib/python3.12/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/home/xinhe/auto-round/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 240, in local_torch_dtype
    torch.set_default_dtype(dtype)
  File "/home/xinhe/auto-round/.venv/lib/python3.12/site-packages/torch/__init__.py", line 1358, in set_default_dtype
    _C._set_default_dtype(d)
TypeError: couldn't find storage object Float8_e4m3fnStorage

Fix Action

Fixed

PR fix notes

PR #44596: Fix TypeError when loading float8 models by falling back to bfloat16 in local_torch_dtype

Description (problem / solution / changelog)

What does this PR do?

When loading FP8 models (e.g. Qwen/Qwen3.5-35B-A3B-FP8) with dtype="auto", the dtype auto-detected from the checkpoint weights can be torch.float8_e4m3fn. That dtype is passed to local_torch_dtype(), which calls torch.set_default_dtype(). PyTorch does not accept float8 types as the default dtype, so the call fails with: TypeError: couldn't find storage object Float8_e4m3fnStorage

This happens when:

  • The top-level config has no dtype set (common with composite models where dtype is only in a sub-config)
  • _get_dtype() auto-detects torch.float8_e4m3fn from the checkpoint weights
  • FineGrainedFP8HfQuantizer doesn't override update_dtype(), so it can't intercept this

This PR adds a check in local_torch_dtype() to fall back to torch.bfloat16 when a float8 dtype is encountered. This only affects model skeleton initialization (set_default_dtype); actual float8 weights are still loaded correctly downstream via _load_pretrained_model.

Also adds a unit test to verify the fallback behavior for both float8_e4m3fn and float8_e5m2.

Fixes #44589

Who can review?

@CyrilVallez (model loading / from_pretrained) @SunMarc (quantization)

Changed files

  • src/transformers/modeling_utils.py (modified, +5/-0)
  • tests/utils/test_modeling_utils.py (modified, +21/-0)

PR #44616: fix: add Float8 dtype fallback in modeling_utils.py

Description (problem / solution / changelog)

Summary

Add a fallback to bfloat16 when setting a Float8 default dtype fails, preventing the TypeError when loading FP8 models on PyTorch builds without Float8_e4m3fnStorage support.

Root Cause

torch.set_default_dtype(dtype) raises TypeError: couldn't find storage object Float8_e4m3fnStorage when Float8 is not available.

Fix

Wrap the torch.set_default_dtype(dtype) call in try/except and fall back to bfloat16 when the error message indicates Float8.

Fixes #44589

Changed files

  • src/transformers/modeling_utils.py (modified, +7/-1)

System Info

  • transformers version: 5.3.0.dev0
  • Platform: Linux-6.8.0-101-generic-x86_64-with-glibc2.35
  • Python version: 3.12.13
  • Huggingface_hub version: 1.6.0
  • Safetensors version: 0.7.0
  • Accelerate version: 1.13.0
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (accelerator?): 2.10.0+cu128 (CUDA)
  • Using distributed or parallel set-up in script?: <fill in>
  • Using GPU in script?: <fill in>
  • GPU type: NVIDIA A100-SXM4-80GB

Who can help?

@CyrilVallez

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

transformers serve --force-model Qwen/Qwen3.5-35B-A3B-FP8 --port 8000 --continuous-batching

  File "/home/xinhe/auto-round/.venv/lib/python3.12/site-packages/transformers/cli/serve.py", line 474, in __init__
    self.load_model_and_processor(model_id_and_revision)
  File "/home/xinhe/auto-round/.venv/lib/python3.12/site-packages/transformers/cli/serve.py", line 1876, in load_model_and_processor
    model, processor = self._load_model_and_data_processor(model_id_and_revision)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xinhe/auto-round/.venv/lib/python3.12/site-packages/transformers/cli/serve.py", line 1850, in _load_model_and_data_processor
    model = architecture.from_pretrained(model_id, **model_kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xinhe/auto-round/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4096, in from_pretrained
    with ContextManagers(model_init_context):
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xinhe/auto-round/.venv/lib/python3.12/site-packages/transformers/utils/generic.py", line 532, in __enter__
    self.stack.enter_context(context_manager)
  File "/home/xinhe/.local/share/uv/python/cpython-3.12.13-linux-x86_64-gnu/lib/python3.12/contextlib.py", line 526, in enter_context
    result = _enter(cm)
             ^^^^^^^^^^
  File "/home/xinhe/.local/share/uv/python/cpython-3.12.13-linux-x86_64-gnu/lib/python3.12/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/home/xinhe/auto-round/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 240, in local_torch_dtype
    torch.set_default_dtype(dtype)
  File "/home/xinhe/auto-round/.venv/lib/python3.12/site-packages/torch/__init__.py", line 1358, in set_default_dtype
    _C._set_default_dtype(d)
TypeError: couldn't find storage object Float8_e4m3fnStorage

Expected behavior

The FP8 model should be loaded correctly.

extent analysis

Fix Plan

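Step 1: Install a transformers build that includes the fix

The Float8_e4m3fnStorage error is not an Accelerate configuration problem. It is raised by torch.set_default_dtype() inside local_torch_dtype() when the auto-detected dtype is torch.float8_e4m3fn, so no Accelerate or storage setting needs to change. The pull requests above address this inside transformers; install a build that contains the fallback. The command below assumes the fix has landed on the main branch.

# Install transformers from the main branch (assumes the fix is merged there)
pip install --upgrade git+https://github.com/huggingface/transformers.git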

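Step 2: Confirm the PyTorch limitation

No PyTorch configuration change is needed. Setting a global default of torch.float16 does not help, because from_pretrained sets the default dtype itself through local_torch_dtype(). To confirm that the failure comes from PyTorch rejecting float8 as a default dtype, reproduce it in isolation:

import torch
# On the PyTorch build from the report, this raises the TypeError shown above
torch.set_default_dtype(torch.float8_e4m3fn)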

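Step 3: Optionally pass an explicit dtype when loading from Python

If upgrading is not possible, one possible workaround when loading from Python is to pass an explicit dtype instead of relying on dtype="auto", so that float8 never reaches torch.set_default_dtype(). This is an untested sketch: AutoModelForCausalLM is an assumption and should be replaced with the class that matches the checkpoint.

import torch
from transformers import AutoModelForCausalLM

# bfloat16 is the same fallback the pull requests use for skeleton initialization
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-35B-A3B-FP8",
    dtype=torch.bfloat16,
)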

Step 4: Restart the Serve Command

After applying the fix, rerun the serve command from the report.

transformers serve --force-model Qwen/Qwen3.5-35B-A3B-FP8 --port 8000 --continuous-batching

Verification

To verify the fix, check that the serve command starts without the TypeError. From Python, you can also load the model with dtype="auto" and inspect the parameter dtypes. AutoModelForCausalLM is an assumption here; use the class that matches the checkpoint.

from transformers import AutoModelForCausalLM

# dtype="auto" exercises the same auto-detection path that failed in the report
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-35B-A3B-FP8",
    dtype="auto",
)

print(model.config)
# Print the set of parameter dtypes rather than the full state dict of a 35B model
print({p.dtype for p in model.parameters()})
