`from_pretrained` knows to unwrap (`config = config.get_text_config()`) before dispatching, see [`auto_factory.py:L383-L396`](https://github.com/huggingface/transformers/blob/v5.6.2/src/transformers/models/auto/auto_factory.py#L383-L396).

However, `from_config` does not have this unwrap, see [`auto_factory.py:L239-L241`](https://github.com/huggingface/transformers/blob/v5.6.2/src/transformers/models/auto/auto_factory.py#L239-L241).

So the request is symmetry inside `auto_factory.py`: the same unwrap that exists for `from_pretrained` should exist for `from_config`.

Related issues:

- Analogous case for Gemma 3: https://github.com/huggingface/transformers/issues/36683
- Related vLLM issue: https://github.com/vllm-project/vllm/issues/36236

transformers - ✅(Solved) Fix `AutoModelForCausalLM.from_config` does not unwrap `text_config` for composite Qwen 3.5 and 3.6 multimodal configs [1 pull requests, 1 participants]

transformers · 2026-05-03 18:54:41


GitHub stats

huggingface/transformers#45759 • Fetched 2026-05-04 04:58:11

Author

jamesbraza

Participants

jamesbraza

Timeline (top)

mentioned ×3, subscribed ×3, cross-referenced ×1, labeled ×1

Error Message

Traceback (most recent call last):
  File "repro.py", ...
    model = AutoModelForCausalLM.from_config(config)
  File ".../transformers/models/auto/auto_factory.py", line 241, in from_config
    return model_class._from_config(config, **kwargs)
  File ".../transformers/modeling_utils.py", line 1542, in _from_config
    model = cls(config, **kwargs)
  File ".../transformers/models/qwen3_5_moe/modeling_qwen3_5_moe.py", line 1906, in __init__
    self.model = Qwen3_5MoeTextModel(config)
  File ".../transformers/models/qwen3_5_moe/modeling_qwen3_5_moe.py", line 1339, in __init__
    self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, config.pad_token_id)
  File ".../transformers/configuration_utils.py", line 425, in __getattribute__
    return super().__getattribute__(key)
AttributeError: 'Qwen3_5MoeConfig' object has no attribute 'vocab_size'

Fix Action

Fix / Workaround

from_pretrained knows to unwrap (config = config.get_text_config()) before dispatching, see auto_factory.py:L383-L396.

PR fix notes

PR #45770: Unwrap `text_config` in `AutoModelFor*.from_config`

Repository: huggingface/transformers
Author: jamesbraza
State: open | merged: False
Link: https://github.com/huggingface/transformers/pull/45770

Description (problem / solution / changelog)

Fixes https://github.com/huggingface/transformers/issues/45759

@ArthurZucker @Cyrilvallez @zucchini-nlp

Changed files

src/transformers/models/auto/auto_factory.py (modified, +9/-0)
tests/models/gemma3/test_modeling_gemma3.py (modified, +14/-4)
tests/models/qwen3_5/test_modeling_qwen3_5.py (modified, +22/-0)
tests/models/qwen3_5_moe/test_modeling_qwen3_5_moe.py (modified, +22/-0)

Code Example

import torch
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("Qwen/Qwen3.6-35B-A3B")
# How SkyRL meta-initializes its FSDP workers
# SEE: https://github.com/NovaSky-AI/SkyRL/blob/skyrl-v0.2.0/skyrl/backends/skyrl_train/workers/model_wrapper.py#L137-L139
with torch.device("meta"):
    model = AutoModelForCausalLM.from_config(config)



System Info

transformers version: 5.6.2
Platform: Linux-6.8.0-1043-nvidia-x86_64-with-glibc2.35
Python version: 3.12.13
Huggingface_hub version: 1.13.0
Safetensors version: 0.7.0
Accelerate version: 1.13.0
Accelerate config: not found
DeepSpeed version: not installed
PyTorch version (accelerator?): 2.10.0+cu129 (CUDA)
Using distributed or parallel set-up in script?: No
Using GPU in script?: Yes
GPU type: NVIDIA H100 80GB HBM3

Who can help?

@Cyrilvallez @zucchini-nlp @ArthurZucker

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

import torch
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("Qwen/Qwen3.6-35B-A3B")
# How SkyRL meta-initializes its FSDP workers
# SEE: https://github.com/NovaSky-AI/SkyRL/blob/skyrl-v0.2.0/skyrl/backends/skyrl_train/workers/model_wrapper.py#L137-L139
with torch.device("meta"):
    model = AutoModelForCausalLM.from_config(config)

This throws:

Traceback (most recent call last):
  File "repro.py", ...
    model = AutoModelForCausalLM.from_config(config)
  File ".../transformers/models/auto/auto_factory.py", line 241, in from_config
    return model_class._from_config(config, **kwargs)
  File ".../transformers/modeling_utils.py", line 1542, in _from_config
    model = cls(config, **kwargs)
  File ".../transformers/models/qwen3_5_moe/modeling_qwen3_5_moe.py", line 1906, in __init__
    self.model = Qwen3_5MoeTextModel(config)
  File ".../transformers/models/qwen3_5_moe/modeling_qwen3_5_moe.py", line 1339, in __init__
    self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, config.pad_token_id)
  File ".../transformers/configuration_utils.py", line 425, in __getattribute__
    return super().__getattribute__(key)
AttributeError: 'Qwen3_5MoeConfig' object has no attribute 'vocab_size'

For reference, from_pretrained("Qwen/Qwen3.6-35B-A3B") works fine; only from_config is broken.

This affects the Qwen3.5-MoE and Qwen3.6 family of models.
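
The failure mode in the traceback can be reproduced without downloading any model. The sketch below is a dependency-free illustration, not transformers code: every `Toy*` class is a hypothetical stand-in, and it only assumes what the traceback shows, namely that the text model reads `vocab_size` from the top level of whatever config it receives while the composite config keeps that field on its nested text config.

```python
class ToyTextConfig:
    """Hypothetical stand-in for a text-only sub-config."""

    def __init__(self, vocab_size=32, hidden_size=8):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size


class ToyCompositeConfig:
    """Hypothetical stand-in for a multimodal config that nests its text fields."""

    def __init__(self):
        self.text_config = ToyTextConfig()
        self.vision_config = object()  # placeholder, unused in this sketch

    def get_text_config(self):
        return self.text_config


class ToyTextModel:
    """Reads text fields from the top level of the config it is given."""

    def __init__(self, config):
        self.embedding_shape = (config.vocab_size, config.hidden_size)


composite = ToyCompositeConfig()

# Passing the composite config directly fails, as in the report.
try:
    ToyTextModel(composite)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Unwrapping first gives the model the fields it expects.
model = ToyTextModel(composite.get_text_config())

print(error_message)
print(model.embedding_shape)
```

The first print shows an `AttributeError` message naming `vocab_size`; the second shows the embedding shape built from the unwrapped config.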

Expected behavior

from_pretrained knows to unwrap (config = config.get_text_config()) before dispatching, see auto_factory.py:L383-L396.

However, from_config does not have this unwrap, see auto_factory.py:L239-L241.

So the request is symmetry inside auto_factory.py: the same unwrap that exists for from_pretrained should exist for from_config.

Related issues:

Analogous case for Gemma 3: https://github.com/huggingface/transformers/issues/36683
Related vLLM issue: https://github.com/vllm-project/vllm/issues/36236
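
To make the requested symmetry concrete, here is a toy dispatch sketch. It is not the diff from PR #45770 and does not reproduce `auto_factory.py`. The mapping, the `sub_configs` attribute, the `config_class` check and all `Toy*` names are assumptions chosen for illustration. The idea shown is only this: look up the model class for the config type, and if that class is built for the text sub-config, unwrap before constructing.

```python
class ToyTextConfig:
    def __init__(self, vocab_size=32):
        self.vocab_size = vocab_size


class ToyCompositeConfig:
    # Assumed attribute: maps sub-config names to their classes.
    sub_configs = {"text_config": ToyTextConfig}

    def __init__(self):
        self.text_config = ToyTextConfig()

    def get_text_config(self):
        return self.text_config


class ToyCausalLM:
    config_class = ToyTextConfig  # this model expects the text sub-config

    def __init__(self, config):
        self.vocab_size = config.vocab_size

    @classmethod
    def _from_config(cls, config):
        return cls(config)


# Hypothetical mapping: the composite config type dispatches to a text-only model.
MODEL_MAPPING = {ToyCompositeConfig: ToyCausalLM, ToyTextConfig: ToyCausalLM}


def from_config_without_unwrap(config):
    """Dispatch on config type and pass the config through unchanged."""
    model_class = MODEL_MAPPING[type(config)]
    return model_class._from_config(config)


def from_config_with_unwrap(config):
    """Same dispatch, but unwrap when the model class targets the text sub-config."""
    model_class = MODEL_MAPPING[type(config)]
    sub_configs = getattr(config, "sub_configs", {})
    if model_class.config_class is sub_configs.get("text_config"):
        config = config.get_text_config()
    return model_class._from_config(config)


model = from_config_with_unwrap(ToyCompositeConfig())
print(model.vocab_size)
```

Guarding the unwrap on the model class, rather than unwrapping unconditionally, keeps the composite config intact for model classes that do expect it. Whether the real patch uses the same guard is not stated in the source.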

extent analysis

TL;DR

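The issue can be worked around by unwrapping the config with get_text_config() before passing it to from_config, mirroring what from_pretrained already does. PR #45770 (open at the time of this fetch) proposes adding the same unwrap to from_config itself.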

Guidance

The error occurs because from_config passes the composite config straight to the text model, which then looks for vocab_size at the top level; from_pretrained unwraps the text config first.
Until the fix lands, unwrap the config yourself by calling get_text_config() before from_config.
Verify the workaround by confirming that model construction no longer raises the AttributeError from the traceback.
To see the asymmetry, compare from_config (auto_factory.py:L239-L241) with from_pretrained (auto_factory.py:L383-L396).

Example

from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("Qwen/Qwen3.6-35B-A3B")
config = config.get_text_config()  # unwrap the composite config to its text sub-config
model = AutoModelForCausalLM.from_config(config)

Notes

This workaround assumes that the get_text_config method is available on the config for the Qwen3.5-MoE and Qwen3.6 family of models.
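
If code has to handle configs where that assumption may not hold, a small defensive helper is one option. The sketch below is an assumption-labelled illustration: `unwrap_text_config` and `has_text_fields` are hypothetical names, not part of transformers, and the `SimpleNamespace` objects are stand-ins so the snippet runs without the library.

```python
from types import SimpleNamespace


def unwrap_text_config(config):
    """Return the text sub-config if `config` offers `get_text_config`, else `config` itself."""
    getter = getattr(config, "get_text_config", None)
    return getter() if callable(getter) else config


def has_text_fields(config, fields=("vocab_size", "hidden_size")):
    """Cheap sanity check before building a text model from `config`."""
    return all(hasattr(config, name) for name in fields)


# Stand-in objects (hypothetical): a plain text config and a composite wrapping it.
text = SimpleNamespace(vocab_size=32, hidden_size=8)
composite = SimpleNamespace(text_config=text, get_text_config=lambda: text)

print(has_text_fields(composite))                      # composite lacks top-level text fields
print(has_text_fields(unwrap_text_config(composite)))  # unwrapped config has them
print(unwrap_text_config(text) is text)                # configs without the method pass through
```

With a real config, the result of `unwrap_text_config(config)` would be what gets passed to `from_config`, and `has_text_fields` offers a quick check before constructing the model.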

Recommendation

Apply the workaround by unwrapping the config before passing it to from_config, as shown in the example above. This mirrors what from_pretrained does and lets the text model find the fields it expects, such as vocab_size.


