vllm - ✅ (Solved) Fix [Bug]: Pixtral model (Magistral-Small-2509) raises AttributeError ('NoneType' object has no attribute 'size') in graph mode; eager mode is OK [1 pull request, 1 participant]

GitHub stats

vllm-project/vllm#43042 · Fetched 2026-05-20 03:40:09
Comments: 0 · Participants: 1 · Timeline events: 5 · Reactions: 0
Timeline (top): cross-referenced ×3, labeled ×1, referenced ×1

Fix Action

Fixed

PR fix notes

PR #43052: [Bugfix] Fix AttributeError in pixtral/mistral3 graph mode (issue #43042)

Description (problem / solution / changelog)

Summary

Fixes #43042.

PixtralForConditionalGeneration and Mistral3ForConditionalGeneration call self.language_model.model(...) in their forward methods. This invokes the @support_torch_compile custom __call__ on the inner MistralModel.

In graph-capture backends that do not make torch.compiler.is_compiling() return True (such as Ascend ACL in FULL_DECODE_ONLY mode), MistralModel.__call__ sees is_compiling() == False and attempts a nested torch.compile on itself. During Dynamo tracing of MistralModel.forward, call_size(x, i) is then invoked with x=None, raising:

AttributeError: 'NoneType' object has no attribute 'size'

Fix: Call .forward() directly, bypassing the @support_torch_compile wrapper. This is safe because:

  • Eager mode (do_not_compile=True): the custom __call__ already calls .forward() directly — no change in behaviour.
  • torch.compile mode (is_compiling()=True): the custom __call__ already calls .forward() directly — no change in behaviour.
  • Non-torch.compile graph capture (ACL, etc.): the nested compilation is avoided; the outer graph capture records the entire MistralModel.forward as part of PixtralForConditionalGeneration.forward.
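
The three cases above can be illustrated with a toy model of the dispatch. This is a minimal sketch, not vLLM's implementation: `CompileWrapperSketch`, `OuterModelSketch`, the `compile_attempts` counter, and the `is_compiling` keyword argument are all hypothetical stand-ins (the real code consults `torch.compiler.is_compiling()`, as the PR text describes). It only shows why calling `.forward()` directly never enters the wrapper's compile branch.

```python
class CompileWrapperSketch:
    """Toy stand-in for an inner model with a custom __call__ (hypothetical)."""

    def __init__(self, do_not_compile: bool):
        self.do_not_compile = do_not_compile
        self.compile_attempts = 0  # counts entries into the compile branch

    def forward(self, x):
        return x + 1

    def __call__(self, x, *, is_compiling: bool = False):
        # Eager mode, or already inside an outer compile: plain forward.
        if self.do_not_compile or is_compiling:
            return self.forward(x)
        # Otherwise the wrapper starts its own compilation. Under a graph
        # capture that does not report is_compiling, this is the nested
        # compile described in the PR.
        self.compile_attempts += 1
        return self.forward(x)


class OuterModelSketch:
    """Toy stand-in for the outer multimodal model (hypothetical)."""

    def __init__(self, inner: CompileWrapperSketch):
        self.inner = inner

    def forward_before(self, x):
        return self.inner(x)  # goes through the custom __call__

    def forward_after(self, x):
        return self.inner.forward(x)  # bypasses the wrapper entirely


inner = CompileWrapperSketch(do_not_compile=False)
outer = OuterModelSketch(inner)

before = outer.forward_before(1)
attempts_before = inner.compile_attempts

after = outer.forward_after(1)
attempts_after = inner.compile_attempts

print(before, after, attempts_before, attempts_after)
```

Both calls return the same value; only the first one touches the compile branch, which matches the PR's claim that the change is a no-op wherever `__call__` already reduces to `.forward()`.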

Why this is not a duplicate

No open PR addresses issue #43042 (checked via gh pr list --repo vllm-project/vllm --state open --search "43042").

Test commands run

The Ascend ACL hardware required to reproduce the exact crash is not available in this environment. However:

  • Syntax check passed on both changed files.
  • In all non-ACL execution paths (eager and standard torch.compile), MistralModel.__call__ already reduces to a direct .forward() call, so this change is a no-op for those paths.
  • Existing pixtral tests (tests/models/multimodal/generation/test_pixtral.py) cover the eager and standard compilation paths and would catch regressions.

AI assistance disclosure

This fix was developed with AI assistance (Claude). The human submitter reviewed every changed line, understands the root cause (nested compilation triggered by missing is_compiling()=True in ACL graph capture context), and confirms this is not a duplicate of any existing open PR.

Changed files

  • vllm/model_executor/models/mistral3.py (modified, +1/-1)
  • vllm/model_executor/models/pixtral.py (modified, +1/-1)

Your current environment

The output of `python collect_env.py` was not provided (the template placeholder was left unfilled).

vLLM service startup config:

    vllm serve {model_path} \
      --served-model-name "Magistral-Small-2509" \
      --load_format mistral --tool-call-parser mistral \
      --tokenizer_mode mistral --config_format mistral \
      --enable-auto-tool-choice \
      --trust-remote-code \
      --reasoning-parser mistral \
      --limit-mm-per-prompt '{"image":0}' \
      --host 0.0.0.0 \
      --port 8010 \
      --data-parallel-size 1 \
      --tensor-parallel-size 4 \
      --max-model-len 32768 \
      --max-num-batched-tokens 8192 \
      --max-num-seqs 16 \
      --gpu-memory-utilization 0.94 \
      --async-scheduling \
      --mm-processor-cache-gb 0 \
      --compilation-config '{"cudagraph_mode":"FULL_DECODE_ONLY","cudagraph_capture_sizes":[1,2,4,8,16]}' \
      --additional-config '{"enable_cpu_binding":true, "multistream_overlap_shared_expert": true}' \
      #--enforce-eager

With `--enforce-eager`, `vllm serve` starts and inference succeeds. In `FULL_DECODE_ONLY` mode, it fails.
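
To compare the two modes side by side, the eager toggle in the startup command can be scripted. The helper below is only a sketch: `build_serve_command` and the `/path/to/model` placeholder are hypothetical, and it carries over just a subset of the flags from the report. Like the reporter's command, it keeps `--compilation-config` in place and simply appends `--enforce-eager` when requested.

```python
import json
import shlex


def build_serve_command(model_path: str, enforce_eager: bool) -> list[str]:
    """Build a `vllm serve` argv from a subset of the reported flags."""
    compilation_config = {
        "cudagraph_mode": "FULL_DECODE_ONLY",
        "cudagraph_capture_sizes": [1, 2, 4, 8, 16],
    }
    cmd = [
        "vllm", "serve", model_path,
        "--served-model-name", "Magistral-Small-2509",
        "--tensor-parallel-size", "4",
        "--max-model-len", "32768",
        "--compilation-config", json.dumps(compilation_config),
    ]
    if enforce_eager:
        # Mirrors uncommenting the last line of the reported command.
        cmd.append("--enforce-eager")
    return cmd


graph_cmd = build_serve_command("/path/to/model", enforce_eager=False)
eager_cmd = build_serve_command("/path/to/model", enforce_eager=True)

# shlex.join quotes the JSON argument so the line can be pasted into a shell.
print(shlex.join(graph_cmd))
print(shlex.join(eager_cmd))
```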

🐛 Describe the bug

WorkerProc hit an exception.
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] Traceback (most recent call last):
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/v1/executor/multiproc_executor.py", line 927, in worker_busy_loop
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     output = func(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/v1/worker/worker_base.py", line 332, in execute_model
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     return self.worker.execute_model(scheduler_output)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm_ascend/worker/worker.py", line 398, in execute_model
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     output = self.model_runner.execute_model(scheduler_output, intermediate_tensors)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     return func(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm_ascend/worker/model_runner_v1.py", line 1389, in execute_model
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     hidden_states = self._model_forward(
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm_ascend/worker/model_runner_v1.py", line 1839, in _model_forward
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     hidden_states = self.model(
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm_ascend/compilation/acl_graph.py", line 117, in __call__
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     return self.runnable(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     return self._call_impl(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     return forward_call(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/model_executor/models/pixtral.py", line 381, in forward
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     hidden_states = self.language_model.model(
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/compilation/decorators.py", line 503, in __call__
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     return TorchCompileWithNoGuardsWrapper.__call__(self, *args, **kwargs) # type: ignore[arg-type]
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/compilation/wrapper.py", line 187, in __call__
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     return self._call_with_optional_nvtx_range(
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/compilation/wrapper.py", line 76, in _call_with_optional_nvtx_range
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     return callable_fn(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/vllm/model_executor/models/mistral.py", line 217, in forward
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     def forward(
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/_compile.py", line 53, in inner
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     return disable_fn(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 1044, in _fn
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     return fn(*args, **kwargs)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]   File "/usr/local/python3.11.10/lib/python3.11/site-packages/torch/_dynamo/utils.py", line 4656, in call_size
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932]     return x.size(i)
(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] AttributeError: 'NoneType' object has no attribute 'size'

(Worker_TP1 pid=2020) and (Worker_TP3 pid=2022) log `WorkerProc hit an exception.` at the same timestamp; their frames, as far as the captured log shows them, repeat the Worker_TP0 stack above.
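
When several tensor-parallel workers fail at once, their log lines arrive interleaved, which makes any single traceback hard to follow. The sketch below groups such output by its `(Worker_TPn pid=...)` prefix. The `PREFIX` pattern is an assumption derived only from the prefix format visible in this report, and `group_by_worker` is a hypothetical helper, not part of vLLM.

```python
import re
from collections import defaultdict

# Assumed prefix shape, taken from this report's log, e.g.
# "(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] "
PREFIX = re.compile(
    r"\((?P<worker>Worker_TP\d+) pid=\d+\) "
    r"[A-Z]+ \d{2}-\d{2} \d{2}:\d{2}:\d{2} \[[^\]]+\] ?"
)


def group_by_worker(blob: str) -> dict[str, list[str]]:
    """Split a log blob at each worker prefix and group messages per worker.

    Any text before the first prefix is ignored.
    """
    grouped: dict[str, list[str]] = defaultdict(list)
    matches = list(PREFIX.finditer(blob))
    for current, following in zip(matches, matches[1:] + [None]):
        end = following.start() if following else len(blob)
        message = blob[current.end():end].strip()
        grouped[current["worker"]].append(message)
    return dict(grouped)


sample = (
    "(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] "
    "return x.size(i) "
    "(Worker_TP1 pid=2020) ERROR 05-12 09:17:24 [multiproc_executor.py:932] "
    "hidden_states = self.language_model.model( "
    "(Worker_TP0 pid=2019) ERROR 05-12 09:17:24 [multiproc_executor.py:932] "
    "AttributeError: 'NoneType' object has no attribute 'size'"
)

grouped = group_by_worker(sample)
for line in grouped["Worker_TP0"]:
    print(line)
```

Because the splitter keys on the prefix rather than on newlines, it also works on logs that were flattened into a single line when copied.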
