vllm - ✅(Solved) Fix [Bug]: Pixtral model (Magistral-Small-2509) raises AttributeError: 'NoneType' object has no attribute 'size' in graph mode; eager mode is OK [1 pull request, 1 participant]

vllm • 2026-05-19 03:31:59


GitHub stats

vllm-project/vllm#43042 • Fetched 2026-05-20 03:40:09


Author

ZhuQi-seu

Participants

ZhuQi-seu

Timeline (top)

cross-referenced ×3, labeled ×1, referenced ×1

Error Message

Fix Action

Fixed

Fixed by PR: [Bugfix] Fix AttributeError in pixtral/mistral3 graph mode (issue #43042) (https://github.com/vllm-project/vllm/pull/43052)

PR fix notes

PR #43052: [Bugfix] Fix AttributeError in pixtral/mistral3 graph mode (issue #43042)

Repository: vllm-project/vllm
Author: VinayJogani14
State: open | merged: False
Link: https://github.com/vllm-project/vllm/pull/43052

Description (problem / solution / changelog)

Summary

Fixes #43042.

PixtralForConditionalGeneration and Mistral3ForConditionalGeneration call self.language_model.model(...) in their forward methods. This invokes the @support_torch_compile custom __call__ on the inner MistralModel.

In graph-capture backends that do not set torch.compiler.is_compiling()=True (such as Ascend ACL with FULL_DECODE_ONLY mode), MistralModel.__call__ sees is_compiling()=False and attempts a nested torch.compile on itself. During dynamo tracing of MistralModel.forward, call_size(x, i) is invoked with x=None, raising:

AttributeError: 'NoneType' object has no attribute 'size'

Fix: Call .forward() directly, bypassing the @support_torch_compile wrapper. This is safe because:

Eager mode (do_not_compile=True): the custom __call__ already calls .forward() directly — no change in behaviour.
torch.compile mode (is_compiling()=True): the custom __call__ already calls .forward() directly — no change in behaviour.
Non-torch.compile graph capture (ACL, etc.): the nested compilation is avoided; the outer graph capture records the entire MistralModel.forward as part of PixtralForConditionalGeneration.forward.
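The three cases above can be illustrated with a toy model of the dispatch logic. This is only a sketch under stated assumptions, not vLLM code: `ToyCompiledModel`, `ToyOuterModel`, the `nested_compiles` counter, and the `is_compiling` keyword are hypothetical stand-ins (per the PR description, the real wrapper consults `torch.compiler.is_compiling()`).

```python
class ToyCompiledModel:
    """Hypothetical stand-in for a module whose __call__ is wrapped by a compile decorator."""

    def __init__(self, do_not_compile: bool = False):
        self.do_not_compile = do_not_compile
        self.nested_compiles = 0  # how many times the wrapper chose to compile

    def forward(self, x):
        return x * 2

    def __call__(self, x, *, is_compiling: bool = False):
        # Eager mode, or already inside an outer compilation: plain forward.
        if self.do_not_compile or is_compiling:
            return self.forward(x)
        # Otherwise the wrapper starts its own compilation. In the issue, this
        # nested compilation is where the AttributeError surfaced.
        self.nested_compiles += 1
        return self.forward(x)


class ToyOuterModel:
    """Hypothetical stand-in for the outer multimodal model."""

    def __init__(self, inner: ToyCompiledModel, call_forward_directly: bool):
        self.inner = inner
        self.call_forward_directly = call_forward_directly

    def forward(self, x):
        if self.call_forward_directly:
            # Behaviour after the fix: bypass the wrapper entirely.
            return self.inner.forward(x)
        # Behaviour before the fix: go through the custom __call__.
        return self.inner(x)


# Graph capture that does not report is_compiling=True (the default here).
before = ToyOuterModel(ToyCompiledModel(), call_forward_directly=False)
after = ToyOuterModel(ToyCompiledModel(), call_forward_directly=True)
before_result = before.forward(3)
after_result = after.forward(3)
```

In this toy, both variants return the same value, but only the pre-fix variant takes the nested-compile branch; in eager mode (`do_not_compile=True`) neither does, which mirrors the claim that the change is a no-op outside the non-torch.compile graph-capture path.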

Why this is not a duplicate

No open PR addresses issue #43042 (checked via gh pr list --repo vllm-project/vllm --state open --search "43042").

Test commands run

The Ascend ACL hardware required to reproduce the exact crash is not available in this environment. However:

Syntax check passed on both changed files.
In all non-ACL execution paths (eager and standard torch.compile), MistralModel.__call__ already reduces to a direct .forward() call, so this change is a no-op for those paths.
Existing pixtral tests (tests/models/multimodal/generation/test_pixtral.py) cover the eager and standard compilation paths and would catch regressions.

AI assistance disclosure

This fix was developed with AI assistance (Claude). The human submitter reviewed every changed line, understands the root cause (nested compilation triggered by missing is_compiling()=True in ACL graph capture context), and confirms this is not a duplicate of any existing open PR.

Changed files

vllm/model_executor/models/mistral3.py (modified, +1/-1)
vllm/model_executor/models/pixtral.py (modified, +1/-1)

Your current environment

<details> <summary>The output of <code>python collect_env.py</code></summary>

Your output of `python collect_env.py` here

</details>

vLLM service startup config:

vllm serve {model_path} \
--served-model-name "Magistral-Small-2509" \
--load_format mistral --tool-call-parser mistral \
--tokenizer_mode mistral --config_format mistral \
--enable-auto-tool-choice \
--trust-remote-code \
--reasoning-parser mistral \
--limit-mm-per-prompt '{"image":0}' \
--host 0.0.0.0 \
--port 8010 \
--data-parallel-size 1 \
--tensor-parallel-size 4 \
--max-model-len 32768 \
--max-num-batched-tokens 8192 \
--max-num-seqs 16 \
--gpu-memory-utilization 0.94 \
--async-scheduling \
--mm-processor-cache-gb 0 \
--compilation-config '{"cudagraph_mode":"FULL_DECODE_ONLY","cudagraph_capture_sizes":[1,2,4,8,16]}' \
--additional-config '{"enable_cpu_binding":true, "multistream_overlap_shared_expert": true}' \
#--enforce-eager

With `--enforce-eager`, `vllm serve` starts successfully and inference succeeds. In `FULL_DECODE_ONLY` mode, it fails.
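To compare the failing and working configurations side by side, it can help to generate both command lines from one place. The following is a minimal sketch, assuming a hypothetical helper `build_serve_args`; it uses only flags that appear in the report and omits the remaining options for brevity.

```python
import json


def build_serve_args(model_path: str, eager: bool) -> list[str]:
    """Build a reduced `vllm serve` argument list for the two modes in the report."""
    args = ["vllm", "serve", model_path, "--tensor-parallel-size", "4"]
    if eager:
        # Configuration reported to work.
        args.append("--enforce-eager")
    else:
        # Configuration reported to fail with the AttributeError.
        compilation_config = {
            "cudagraph_mode": "FULL_DECODE_ONLY",
            "cudagraph_capture_sizes": [1, 2, 4, 8, 16],
        }
        args += ["--compilation-config", json.dumps(compilation_config)]
    return args


graph_args = build_serve_args("{model_path}", eager=False)
eager_args = build_serve_args("{model_path}", eager=True)
```

Serializing the compilation config with `json.dumps` avoids hand-quoting the JSON when the list is passed to something like `subprocess.run` without a shell.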

🐛 Describe the bug

WorkerProc hit an exception.
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] Traceback (most recent call last):
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/v1/executor/multiproc_executor.py", line 927, in worker_busy_loop
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] output = func(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/v1/worker/worker_base.py", line 332, in execute_model
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return self.worker.execute_model(scheduler_output)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm_ascend/worker/worker.py", line 398, in execute_model
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] output = self.model_runner.execute_model(scheduler_output, intermediate_tensors)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return func(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm_ascend/worker/model_runner_v1.py", line 1389, in execute_model
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] hidden_states = self._model_forward(
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm_ascend/worker/model_runner_v1.py", line 1839, in _model_forward
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] hidden_states = self.model(
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm_ascend/compilation/acl_graph.py", line 117, in call
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return self.runnable(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return self._call_impl(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return forward_call(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/model_executor/models/pixtral.py", line 381, in forward
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] hidden_states = self.language_model.model(
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/compilation/decorators.py", line 503, in call
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return TorchCompileWithNoGuardsWrapper.call(self, *args, **kwargs) # type: ignore[arg-type]
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/compilation/wrapper.py", line 187, in call
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return self._call_with_optional_nvtx_range(
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/compilation/wrapper.py", line 76, in _call_with_optional_nvtx_range
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return callable_fn(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/model_executor/models/mistral.py", line 217, in forward
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] def forward(
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/_compile.py", line 53, in inner
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return disable_fn(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 1044, in _fn
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return fn(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/_dynamo/utils.py", line 4656, in call_size
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return x.size(i)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] AttributeError: 'NoneType' object has no attribute 'size'

(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] WorkerProc hit an exception.
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] Traceback (most recent call last):
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/v1/executor/multiproc_executor.py", line 927, in worker_busy_loop
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] output = func(*args, **kwargs)
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/v1/worker/worker_base.py", line 332, in execute_model
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return self.worker.execute_model(scheduler_output)
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm_ascend/worker/worker.py", line 398, in execute_model
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] output = self.model_runner.execute_model(scheduler_output, intermediate_tensors)
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return func(*args, **kwargs)
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm_ascend/worker/model_runner_v1.py", line 1389, in execute_model
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] hidden_states = self._model_forward(
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm_ascend/worker/model_runner_v1.py", line 1839, in _model_forward
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] hidden_states = self.model(
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm_ascend/compilation/acl_graph.py", line 117, in call
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return self.runnable(*args, **kwargs)
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return self._call_impl(*args, **kwargs)
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return forward_call(*args, **kwargs)
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/model_executor/models/pixtral.py", line 381, in forward
(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] hidden_states = self.language_model.model(

(Worker_TP3 pid=2022) ERROR 05-12 09:17:24 [multiproc_executor.py:932] WorkerProc hit an exception.
(Worker_TP3 pid=2022) ERROR 05-12 09:17:24 [multiproc_executor.py:932] Traceback (most recent call last):
(Worker_TP3 pid=2022) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/v1/executor/multiproc_executor.py", line 927, in worker_busy_loop
(Worker_TP3 pid=2022) ERROR 05-12 09:17:24 [multiproc_executor.py:932] output = func(*args, **kwargs)
(Worker_TP3 pid=2022) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/v1/worker/worker_base.py", line 332, in execute_model
(Worker_TP3 pid=2022) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return self.worker.execute_model(scheduler_output)
(Worker_TP3 pid=2022) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm_ascend/worker/worker.py", line 398, in execute_model
(Worker_TP3 pid=2022) ERROR 05-12 09:17:24 [multiproc_executor.py:932] output = self.model_runner.execute_model(scheduler_output, intermediate_tensors)
(Worker_TP3 pid=2022) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(Worker_TP3 pid=2022) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return func(*args, **kwargs)
(Worker_TP3 pid=2022) ERROR 05-12 09:17:24 [multiproc_executor.py:932] File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm_ascend/worker/model_runner_v1.py", line 1389, in execute_model
(Worker_TP3 pid=2022) ERROR 05-12 09:17:24 [multiproc_executor.py:932] hidden_states = self._model_forward(
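Several tensor-parallel workers write to the same log stream, so their records can interleave and obscure each individual traceback. A minimal sketch for grouping such records by worker follows, assuming the prefix layout seen in this log; the function name `split_by_worker` and the regular expression are illustrative, not part of vLLM.

```python
import re
from collections import defaultdict

# Matches a record prefix such as:
# (Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]
PREFIX = re.compile(r"\((Worker_TP\d+)\s+pid=\d+\)\s+ERROR\s+\S+\s+\S+\s+\[[^\]]+\]")


def split_by_worker(raw: str) -> dict[str, list[str]]:
    """Group log records by worker name, preserving per-worker order."""
    per_worker: dict[str, list[str]] = defaultdict(list)
    matches = list(PREFIX.finditer(raw))
    for index, match in enumerate(matches):
        # A record's text runs until the next prefix (or the end of input).
        end = matches[index + 1].start() if index + 1 < len(matches) else len(raw)
        text = " ".join(raw[match.end():end].split())
        if text:
            per_worker[match.group(1)].append(text)
    return dict(per_worker)


sample = (
    "(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] return x.size(i) "
    "(Worker_TP3 pid=2022) ERROR 05-12 09:17:24 [multiproc_executor.py:932] hidden_states = self._model_forward( "
    "(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] "
    "AttributeError: 'NoneType' object has no attribute 'size'"
)
grouped = split_by_worker(sample)
```

Whitespace inside each record is normalized, so a prefix or record that was wrapped across physical lines is still handled. Text that appears before the first prefix is ignored by this sketch.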

Before submitting a new issue...

Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
