transformers - ✅(Solved) Fix `AutoModelForCausalLM.from_config` does not unwrap `text_config` for composite Qwen 3.5 and 3.6 multimodal configs [1 pull requests, 1 participants]

GitHub stats
huggingface/transformers#45759 · Fetched 2026-05-04 04:58:11
Comments: 0 · Participants: 1 · Timeline: 8 · Reactions: 0
Timeline (top): mentioned ×3, subscribed ×3, cross-referenced ×1, labeled ×1

Error Message

Traceback (most recent call last):
  File "repro.py", ...
    model = AutoModelForCausalLM.from_config(config)
  File ".../transformers/models/auto/auto_factory.py", line 241, in from_config
    return model_class._from_config(config, **kwargs)
  File ".../transformers/modeling_utils.py", line 1542, in _from_config
    model = cls(config, **kwargs)
  File ".../transformers/models/qwen3_5_moe/modeling_qwen3_5_moe.py", line 1906, in __init__
    self.model = Qwen3_5MoeTextModel(config)
  File ".../transformers/models/qwen3_5_moe/modeling_qwen3_5_moe.py", line 1339, in __init__
    self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, config.pad_token_id)
  File ".../transformers/configuration_utils.py", line 425, in __getattribute__
    return super().__getattribute__(key)
AttributeError: 'Qwen3_5MoeConfig' object has no attribute 'vocab_size'

Fix Action

Fix / Workaround

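from_pretrained unwraps a composite config (config = config.get_text_config()) before dispatching, see auto_factory.py:L383-L396, but from_config has no such unwrap, see auto_factory.py:L239-L241. As a workaround, call config.get_text_config() yourself and pass the result to from_config. PR #45770 adds the unwrap to AutoModelFor*.from_config.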

PR fix notes

PR #45770: Unwrap text_config in AutoModelFor*.from_config

Description (problem / solution / changelog)

Fixes https://github.com/huggingface/transformers/issues/45759

@ArthurZucker @Cyrilvallez @zucchini-nlp

Changed files

  • src/transformers/models/auto/auto_factory.py (modified, +9/-0)
  • tests/models/gemma3/test_modeling_gemma3.py (modified, +14/-4)
  • tests/models/qwen3_5/test_modeling_qwen3_5.py (modified, +22/-0)
  • tests/models/qwen3_5_moe/test_modeling_qwen3_5_moe.py (modified, +22/-0)


System Info

  • transformers version: 5.6.2
  • Platform: Linux-6.8.0-1043-nvidia-x86_64-with-glibc2.35
  • Python version: 3.12.13
  • Huggingface_hub version: 1.13.0
  • Safetensors version: 0.7.0
  • Accelerate version: 1.13.0
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (accelerator?): 2.10.0+cu129 (CUDA)
  • Using distributed or parallel set-up in script?: No
  • Using GPU in script?: Yes
  • GPU type: NVIDIA H100 80GB HBM3

Who can help?

@Cyrilvallez @zucchini-nlp @ArthurZucker

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

import torch
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("Qwen/Qwen3.6-35B-A3B")
# How SkyRL meta-initializes its FSDP workers
# SEE: https://github.com/NovaSky-AI/SkyRL/blob/skyrl-v0.2.0/skyrl/backends/skyrl_train/workers/model_wrapper.py#L137-L139
with torch.device("meta"):
    model = AutoModelForCausalLM.from_config(config)

This throws:

Traceback (most recent call last):
  File "repro.py", ...
    model = AutoModelForCausalLM.from_config(config)
  File ".../transformers/models/auto/auto_factory.py", line 241, in from_config
    return model_class._from_config(config, **kwargs)
  File ".../transformers/modeling_utils.py", line 1542, in _from_config
    model = cls(config, **kwargs)
  File ".../transformers/models/qwen3_5_moe/modeling_qwen3_5_moe.py", line 1906, in __init__
    self.model = Qwen3_5MoeTextModel(config)
  File ".../transformers/models/qwen3_5_moe/modeling_qwen3_5_moe.py", line 1339, in __init__
    self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, config.pad_token_id)
  File ".../transformers/configuration_utils.py", line 425, in __getattribute__
    return super().__getattribute__(key)
AttributeError: 'Qwen3_5MoeConfig' object has no attribute 'vocab_size'

For reference, from_pretrained("Qwen/Qwen3.6-35B-A3B") works fine; only from_config is broken.

This affects the Qwen3.5-MoE and Qwen3.6 family of models.

Expected behavior

from_pretrained knows to unwrap (config = config.get_text_config()) before dispatching, see auto_factory.py:L383-L396.

However, from_config does not have this unwrap, see auto_factory.py:L239-L241.

So the request is symmetry inside auto_factory.py: the same unwrap that exists for from_pretrained should exist for from_config.
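The asymmetry described above can be illustrated with a small, self-contained sketch. Every name in it (ToyTextConfig, ToyCompositeConfig, build_text_model, and the two from_config_* functions) is a hypothetical stand-in and not code from auto_factory.py; only the get_text_config() call mirrors the method named in the issue.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTextConfig:
    """Hypothetical stand-in for a text-only config (not a transformers class)."""
    vocab_size: int = 32
    hidden_size: int = 8

    def get_text_config(self):
        # A text-only config is already its own text config.
        return self


@dataclass
class ToyCompositeConfig:
    """Hypothetical stand-in for a composite multimodal config."""
    text_config: ToyTextConfig = field(default_factory=ToyTextConfig)

    def get_text_config(self):
        # The text fields live one level down, on the nested config.
        return self.text_config


def build_text_model(config):
    """Mimics a text model constructor that reads text fields directly."""
    return {"vocab_size": config.vocab_size, "hidden_size": config.hidden_size}


def from_config_no_unwrap(config):
    # Reported behavior: the composite config is passed through unchanged.
    return build_text_model(config)


def from_config_with_unwrap(config):
    # Requested behavior: unwrap first, then dispatch.
    return build_text_model(config.get_text_config())


if __name__ == "__main__":
    composite = ToyCompositeConfig()
    try:
        from_config_no_unwrap(composite)
    except AttributeError as err:
        print("without unwrap:", err)
    print("with unwrap:", from_config_with_unwrap(composite))
```

The first call fails for the same reason as the traceback above: the composite object has no top-level vocab_size. The second call succeeds because the constructor receives the nested text config.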


Extent analysis

TL;DR

The issue can be worked around by calling config.get_text_config() on the composite config and passing the result to from_config, which is the same unwrap that from_pretrained already performs.

Guidance

  • The error occurs because from_config passes the composite config straight to the text model, whereas from_pretrained first unwraps it to its text config.
  • To work around this, unwrap the config yourself with get_text_config() before passing it to from_config.
  • After unwrapping, confirm that from_config returns a model without raising the AttributeError shown in the traceback.
  • Compare from_config (auto_factory.py:L239-L241) with from_pretrained (auto_factory.py:L383-L396) to see where the unwrap is missing.
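One way to check a config before calling from_config is a small pre-flight helper, sketched below. The helper name missing_text_fields and the choice of required fields are assumptions: the two field names are simply those read by the nn.Embedding call in the traceback, and a real model may read more.

```python
from types import SimpleNamespace

# Assumption: these are the fields read by the nn.Embedding call in the traceback.
REQUIRED_TEXT_FIELDS = ("vocab_size", "hidden_size")


def missing_text_fields(config, required=REQUIRED_TEXT_FIELDS):
    """Return the names in `required` that are absent from `config` (hypothetical helper)."""
    return [name for name in required if not hasattr(config, name)]


if __name__ == "__main__":
    # Stand-ins for a text config and a composite config that nests it.
    text = SimpleNamespace(vocab_size=32, hidden_size=8)
    composite = SimpleNamespace(text_config=text)

    print(missing_text_fields(composite))  # ['vocab_size', 'hidden_size']
    print(missing_text_fields(text))       # []
```

An empty list suggests the config is already text-level for these two fields. A non-empty list suggests it still needs unwrapping.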

Example

from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("Qwen/Qwen3.6-35B-A3B")
config = config.get_text_config()  # unwrap the composite config to its text config
model = AutoModelForCausalLM.from_config(config)

Notes

This workaround assumes that the get_text_config method is available on the configs of the Qwen3.5-MoE and Qwen3.6 model families.
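If that assumption is in doubt, the unwrap can be written defensively. The sketch below is a hypothetical helper, not part of transformers: it calls get_text_config() when the object provides it, falls back to a text_config attribute if one exists, and otherwise returns the config unchanged.

```python
from types import SimpleNamespace


def unwrap_text_config(config):
    """Best-effort unwrap of a possibly composite config (hypothetical helper)."""
    getter = getattr(config, "get_text_config", None)
    if callable(getter):
        # Preferred path: the config knows how to return its text config.
        return getter()
    # Fallback: use a nested text_config if present, else the config itself.
    return getattr(config, "text_config", config)


if __name__ == "__main__":
    text = SimpleNamespace(vocab_size=32)
    with_method = SimpleNamespace(get_text_config=lambda: text)
    with_attr = SimpleNamespace(text_config=text)

    print(unwrap_text_config(with_method) is text)  # True
    print(unwrap_text_config(with_attr) is text)    # True
    print(unwrap_text_config(text) is text)         # True
```

The result of unwrap_text_config(config) would then be passed to from_config in place of the original config.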

Recommendation

Apply the workaround by unwrapping the config before passing it to from_config, as in the example above. This mirrors what from_pretrained already does and lets the text model read vocab_size and the other text fields it expects.


