vllm - ✅(Solved) Fix [Bug]: Error when running Devstral 2 [3 pull requests, 1 comment, 1 participant]

GitHub stats
vllm-project/vllm#38818 · Fetched 2026-04-08 02:34:42
Comments: 1 · Participants: 1 · Timeline: 8 · Reactions: 0
Timeline (top): renamed ×2, commented ×1, cross-referenced ×1, labeled ×1

Error Message

Key lines from the startup log (the full traceback is reproduced under Root Cause):

(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108] EngineCore failed to start.
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/mistral3.py", line 437, in __init__
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.language_model = init_vllm_registered_model(
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 218, in get_model_architecture
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     tuple(getattr(model_config.hf_config, "architectures", [])),
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108] TypeError: 'NoneType' object is not iterable
(APIServer pid=71) RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
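
The failing frame in the log is `tuple(getattr(model_config.hf_config, "architectures", []))`, which raises `TypeError: 'NoneType' object is not iterable`. That message is consistent with `architectures` being present on the config but set to `None`: the `[]` default passed to `getattr` is used only when the attribute is missing, not when it exists with a `None` value. The sketch below reproduces that behavior in isolation. The `hf_config` object is a hypothetical stand-in built with `SimpleNamespace`, not vLLM's real config class, and the `or []` guard is one possible defensive pattern, not necessarily what the linked pull requests changed.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a config whose `architectures` attribute
# exists but holds None (an assumption that matches the error message).
hf_config = SimpleNamespace(architectures=None)

# The default `[]` is ignored because the attribute is present,
# so getattr returns None here.
value = getattr(hf_config, "architectures", [])

# Converting None to a tuple fails the same way the log shows.
try:
    tuple(value)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

# One possible guard (illustrative only): fall back to an empty
# list when the attribute is missing or None.
safe_architectures = tuple(getattr(hf_config, "architectures", None) or [])

print(repr(value))
print(error_message)
print(safe_architectures)
```

With the guard, a missing or `None` value yields an empty tuple, so the failure would move to wherever the empty architecture list is rejected instead of surfacing as a bare `TypeError`.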

Root Cause

(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108] EngineCore failed to start.
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108] Traceback (most recent call last):
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 1082, in run_engine_core
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     engine_core = EngineCoreProc(*args, engine_index=dp_rank, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 848, in __init__
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     super().__init__(
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 114, in __init__
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.model_executor = executor_class(vllm_config)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 109, in __init__
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self._init_executor()
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/uniproc_executor.py", line 52, in _init_executor
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.driver_worker.load_model()
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 323, in load_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.model_runner.load_model(load_dummy_weights=load_dummy_weights)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4749, in load_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.model = model_loader.load_model(
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]                  ^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/base_loader.py", line 55, in load_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     model = initialize_model(
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]             ^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 57, in initialize_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     model = model_class(vllm_config=vllm_config, prefix=prefix)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/mistral3.py", line 437, in __init__
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.language_model = init_vllm_registered_model(
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/utils.py", line 372, in init_vllm_registered_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return initialize_model(vllm_config=vllm_config, prefix=prefix)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 47, in initialize_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     model_class, _ = get_model_architecture(model_config)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 218, in get_model_architecture
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     tuple(getattr(model_config.hf_config, "architectures", [])),
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108] TypeError: 'NoneType' object is not iterable
(EngineCore pid=149) Process EngineCore:
(EngineCore pid=149) Traceback (most recent call last):
(EngineCore pid=149)   File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
(EngineCore pid=149)     self.run()
(EngineCore pid=149)   File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
(EngineCore pid=149)     self._target(*self._args, **self._kwargs)
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 1112, in run_engine_core
(EngineCore pid=149)     raise e
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 1082, in run_engine_core
(EngineCore pid=149)     engine_core = EngineCoreProc(*args, engine_index=dp_rank, **kwargs)
(EngineCore pid=149)                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 848, in __init__
(EngineCore pid=149)     super().__init__(
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 114, in __init__
(EngineCore pid=149)     self.model_executor = executor_class(vllm_config)
(EngineCore pid=149)                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 109, in __init__
(EngineCore pid=149)     self._init_executor()
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/uniproc_executor.py", line 52, in _init_executor
(EngineCore pid=149)     self.driver_worker.load_model()
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 323, in load_model
(EngineCore pid=149)     self.model_runner.load_model(load_dummy_weights=load_dummy_weights)
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4749, in load_model
(EngineCore pid=149)     self.model = model_loader.load_model(
(EngineCore pid=149)                  ^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/base_loader.py", line 55, in load_model
(EngineCore pid=149)     model = initialize_model(
(EngineCore pid=149)             ^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 57, in initialize_model
(EngineCore pid=149)     model = model_class(vllm_config=vllm_config, prefix=prefix)
(EngineCore pid=149)             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/mistral3.py", line 437, in __init__
(EngineCore pid=149)     self.language_model = init_vllm_registered_model(
(EngineCore pid=149)                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/utils.py", line 372, in init_vllm_registered_model
(EngineCore pid=149)     return initialize_model(vllm_config=vllm_config, prefix=prefix)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 47, in initialize_model
(EngineCore pid=149)     model_class, _ = get_model_architecture(model_config)
(EngineCore pid=149)                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 218, in get_model_architecture
(EngineCore pid=149)     tuple(getattr(model_config.hf_config, "architectures", [])),
(EngineCore pid=149)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) TypeError: 'NoneType' object is not iterable
[rank0]:[W402 12:18:36.184187104 ProcessGroupNCCL.cpp:1648] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
(APIServer pid=71) Traceback (most recent call last):
(APIServer pid=71)   File "/usr/local/bin/vllm", line 10, in <module>
(APIServer pid=71)     sys.exit(main())
(APIServer pid=71)              ^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/main.py", line 75, in main
(APIServer pid=71)     args.dispatch_function(args)
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/serve.py", line 122, in cmd
(APIServer pid=71)     uvloop.run(run_server(args))
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 96, in run
(APIServer pid=71)     return __asyncio.run(
(APIServer pid=71)            ^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/lib/python3.12/asyncio/runners.py", line 194, in run
(APIServer pid=71)     return runner.run(main)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=71)     return self._loop.run_until_complete(task)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 48, in wrapper
(APIServer pid=71)     return await main
(APIServer pid=71)            ^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 684, in run_server
(APIServer pid=71)     await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 698, in run_server_worker
(APIServer pid=71)     async with build_async_engine_client(
(APIServer pid=71)   File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=71)     return await anext(self.gen)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 100, in build_async_engine_client
(APIServer pid=71)     async with build_async_engine_client_from_engine_args(
(APIServer pid=71)   File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=71)     return await anext(self.gen)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 136, in build_async_engine_client_from_engine_args
(APIServer pid=71)     async_llm = AsyncLLM.from_vllm_config(
(APIServer pid=71)                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 225, in from_vllm_config
(APIServer pid=71)     return cls(
(APIServer pid=71)            ^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 154, in __init__
(APIServer pid=71)     self.engine_core = EngineCoreClient.make_async_mp_client(
(APIServer pid=71)                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(APIServer pid=71)     return func(*args, **kwargs)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 129, in make_async_mp_client
(APIServer pid=71)     return AsyncMPClient(*client_args)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(APIServer pid=71)     return func(*args, **kwargs)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 872, in __init__
(APIServer pid=71)     super().__init__(
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 534, in __init__
(APIServer pid=71)     with launch_core_engines(
(APIServer pid=71)   File "/usr/lib/python3.12/contextlib.py", line 144, in __exit__
(APIServer pid=71)     next(self.gen)
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 1073, in launch_core_engines
(APIServer pid=71)     wait_for_engine_startup(
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 1132, in wait_for_engine_startup
(APIServer pid=71)     raise RuntimeError(
(APIServer pid=71) RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}

Fix Action

Fix / Workaround


I was wondering if anyone could help me with this; it would be greatly appreciated. I am trying to run the model in HF format because I fine-tuned Devstral Small 2 (using Unsloth) and can only save it in HF format.

<details><summary>If necessary, here is the <code>config.json</code> file of the model:</summary>
<code>{
  "architectures": [
    "Mistral3ForConditionalGeneration"
  ],
  "dtype": "bfloat16",
  "image_token_index": 10,
  "model_type": "mistral3",
  "multimodal_projector_bias": false,
  "projector_hidden_act": "gelu",
  "tie_word_embeddings": false,
  "quantization_config": {
    "activation_scheme": "static",
    "dequantize": false,
    "modules_to_not_convert": [
      "model.vision_tower",
      "model.multi_modal_projector",
      "lm_head"
    ],
    "quant_method": "fp8",
    "weight_block_size": null
  },
  "spatial_merge_size": 2,
  "text_config": {
    "attention_dropout": 0.0,
    "head_dim": 128,
    "hidden_act": "silu",
    "hidden_size": 5120,
    "initializer_range": 0.02,
    "intermediate_size": 32768,
    "max_position_embeddings": 393216,
    "model_type": "ministral3",
    "num_attention_heads": 32,
    "num_hidden_layers": 40,
    "num_key_value_heads": 8,
    "rms_norm_eps": 1e-05,
    "rope_parameters": {
      "beta_fast": 32.0,
      "beta_slow": 1.0,
      "factor": 48.0,
      "llama_4_scaling_beta": 0.1,
      "mscale": 1.0,
      "mscale_all_dim": 1.0,
      "original_max_position_embeddings": 8192,
      "rope_theta": 100000000.0,
      "rope_type": "yarn",
      "type": "yarn"
    },
    "sliding_window": null,
    "use_cache": true,
    "vocab_size": 131072
  },
  "transformers_version": "5.0.0.dev0",
  "vision_config": {
    "attention_dropout": 0.0,
    "head_dim": 64,
    "hidden_act": "silu",
    "hidden_size": 1024,
    "image_size": 1540,
    "initializer_range": 0.02,
    "intermediate_size": 4096,
    "model_type": "pixtral",
    "num_attention_heads": 16,
    "num_channels": 3,
    "num_hidden_layers": 24,
    "patch_size": 14,
    "rope_parameters": {
      "rope_theta": 10000.0,
      "rope_type": "default"
    }
  },
  "vision_feature_layer": -1
}</code></details>
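
One detail in the config above may be relevant: the top-level object lists `architectures`, but the nested `text_config` and `vision_config` do not, and the traceback fails while initialising the inner language model. A minimal sketch for spotting this, assuming the config is available as a plain dict. The helper name `missing_architectures` and the `SUB_CONFIG_KEYS` tuple are hypothetical, and the dict below is an abbreviated copy of the config, not the full file:

```python
import json

# Assumption: only these nested sections describe sub-models.
SUB_CONFIG_KEYS = ("text_config", "vision_config")

def missing_architectures(config: dict, path: str = "config") -> list[str]:
    """Return the paths of the config sections with no usable `architectures` list."""
    missing = []
    # `not ...` treats a missing key, None, and an empty list the same way.
    if not config.get("architectures"):
        missing.append(path)
    for key in SUB_CONFIG_KEYS:
        sub = config.get(key)
        if isinstance(sub, dict):
            missing.extend(missing_architectures(sub, f"{path}.{key}"))
    return missing

# Abbreviated copy of the config.json shown above.
abbreviated = json.loads("""
{
  "architectures": ["Mistral3ForConditionalGeneration"],
  "model_type": "mistral3",
  "text_config": {"model_type": "ministral3"},
  "vision_config": {"model_type": "pixtral"}
}
""")

print(missing_architectures(abbreviated))
```

Whether a missing `architectures` entry in a sub-config is a problem depends on how the loader resolves the inner model, so treat the result as a hint rather than a diagnosis.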

PR fix notes

# PR #38849: [Bug] Fix TypeError when hf_config.architectures is None during model loading

## Description (problem / solution / changelog)

## Purpose

Fixes #38818

`PretrainedConfig` in Transformers defines `architectures: list[str] | None = None` as a class-level attribute. Fine-tuned models saved without `"architectures"` in `config.json` (or configs loaded programmatically) will have `hf_config.architectures = None`.

The existing code used `getattr(hf_config, "architectures", [])`, which only falls back to `[]` when the attribute is absent. Since `architectures` is always present on the class (as `None`), the default never fires. `tuple(None)` then raises `TypeError: 'NoneType' object is not iterable`.

The fix uses `getattr(..., None) or []`, which correctly normalises both the absent and the explicitly-`None` cases to an empty list.
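
The class-level default described above can be illustrated with a stand-in class. This is a minimal sketch: `FakeConfig` and `get_architectures` are hypothetical names used for illustration, not part of Transformers or vLLM.

```python
class FakeConfig:
    # Hypothetical stand-in: the attribute always exists and defaults to None,
    # so getattr's third argument is never used.
    architectures = None

def get_architectures(hf_config) -> tuple:
    # `or []` covers both a missing attribute and an explicit None.
    return tuple(getattr(hf_config, "architectures", None) or [])

cfg = FakeConfig()

# Old pattern: the default [] is ignored because the attribute is present.
try:
    tuple(getattr(cfg, "architectures", []))
    old_pattern_failed = False
except TypeError:
    old_pattern_failed = True

print(old_pattern_failed)           # True
print(get_architectures(cfg))       # ()
print(get_architectures(object()))  # () for an object without the attribute
```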

## Test Plan

Ran:

    # Reproduces the bug with the old pattern
    from types import SimpleNamespace
    hf = SimpleNamespace(architectures=None)
    tuple(getattr(hf, 'architectures', []))   # TypeError: 'NoneType' object is not iterable

## Test Result

    # Fixed
    tuple(getattr(hf, 'architectures', None) or [])  # ()

## Changed files

- `tests/test_config.py` (modified, +77/-0)
- `vllm/config/vllm.py` (modified, +10/-0)
- `vllm/model_executor/model_loader/utils.py` (modified, +2/-2)


---

# PR #39293: [Bugfix][Model] Fix Devstral Small 2 HF format weight loading

- Repository: vllm-project/vllm
- Author: thomasmaindron
- State: closed | merged: True
- Link: https://github.com/vllm-project/vllm/pull/39293

## Description (problem / solution / changelog)

## Summary

Fix issues preventing Mistral3 models (e.g. Devstral Small 2) from loading in HF format (`--config-format hf --load-format hf --tokenizer-mode hf`):

- **FP8 scale name mismatch**: HF checkpoints use `activation_scale` and `weight_scale_inv` but vLLM's FP8 linear layers register them as `input_scale` and `weight_scale`. Added suffix remapping in `hf_to_vllm_mapper`.
- **Register `Ministral3ForCausalLM`** in the model registry, mapping it to the existing `MistralForCausalLM` implementation.
- **Remove redundant Pixtral-12B special case** in `mistral3.py` — now handled globally by `with_hf_config` (#38849).
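
The suffix remapping in the first bullet can be pictured as a simple rename of checkpoint tensor names. The sketch below is a hypothetical illustration only: it does not use vLLM's `hf_to_vllm_mapper` API, and the tensor names in the usage lines are made up.

```python
# HF checkpoint suffix -> name the FP8 linear layers are described as registering.
SUFFIX_MAP = {
    ".activation_scale": ".input_scale",
    ".weight_scale_inv": ".weight_scale",
}

def remap_scale_name(name: str) -> str:
    """Rename an FP8 scale tensor; leave every other tensor name untouched."""
    for hf_suffix, vllm_suffix in SUFFIX_MAP.items():
        if name.endswith(hf_suffix):
            return name[: -len(hf_suffix)] + vllm_suffix
    return name

# Made-up tensor names, for illustration only.
print(remap_scale_name("layers.0.mlp.down_proj.activation_scale"))
print(remap_scale_name("layers.0.mlp.down_proj.weight_scale_inv"))
print(remap_scale_name("layers.0.mlp.down_proj.weight"))
```

Matching on the full dotted suffix avoids accidentally rewriting a name that merely contains one of the strings in the middle.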

Fixes #38818

## Test plan

- [x] Verified FP8 scale values are identical between native (`qscale_weight`/`qscale_act`) and HF (`weight_scale_inv`/`activation_scale`) formats by comparing tensors in safetensors files
- [x] Model loads successfully with `vllm serve devstral-small-2 --config-format hf --load-format hf --tokenizer-mode hf`
- [x] Inference works correctly on Open-WebUI

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 (1M context)

## Changed files

- `tests/models/registry.py` (modified, +1/-0)
- `vllm/model_executor/models/mistral3.py` (modified, +9/-7)
- `vllm/model_executor/models/registry.py` (modified, +1/-0)


---

# PR #39294: [Bugfix][Parser] Fix Mistral tool parser for HF tokenizers

- Repository: vllm-project/vllm
- Author: thomasmaindron
- State: open | merged: False
- Link: https://github.com/vllm-project/vllm/pull/39294

## Description (problem / solution / changelog)

## Summary

Fix Mistral tool parser failing with `IncompleteJSONError` when using `--tokenizer-mode hf` with `--tool-call-parser mistral`.

When using an HF tokenizer (e.g., with `--tokenizer-mode hf`), `_is_pre_v11_tokeniser` always returned `True` because it only checked for `MistralTokenizer` instances. This routed tool call parsing through the pre-v11 JSON path, but v11+ models (like Devstral Small 2) output tool calls in the format `[TOOL_CALLS]name[ARGS]{json_args}`, which is not valid JSON, causing `ijson.common.IncompleteJSONError`.
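
To see why the JSON path fails on this format, here is a minimal sketch that handles a single call of the shape `[TOOL_CALLS]name[ARGS]{json_args}`. The function `parse_v11_tool_call` is hypothetical and far simpler than the real parser (no streaming, no multiple calls).

```python
import json

TOOL_CALLS = "[TOOL_CALLS]"
ARGS = "[ARGS]"

def parse_v11_tool_call(text: str) -> tuple[str, dict]:
    """Split one '[TOOL_CALLS]name[ARGS]{json}' string into (name, arguments)."""
    if not text.startswith(TOOL_CALLS):
        raise ValueError("not a tool call")
    body = text[len(TOOL_CALLS):]
    name, sep, raw_args = body.partition(ARGS)
    if not sep:
        raise ValueError("missing [ARGS] marker")
    # Only the part after [ARGS] is JSON.
    return name.strip(), json.loads(raw_args)

sample = '[TOOL_CALLS]get_current_weather[ARGS]{"city": "Dallas", "state": "TX"}'

# Feeding the whole string to a JSON parser fails, as the pre-v11 path would.
try:
    json.loads(sample)
    whole_string_is_json = True
except json.JSONDecodeError:
    whole_string_is_json = False

print(whole_string_is_json)
print(parse_v11_tool_call(sample))
```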

### Changes

- **`_is_pre_v11_tokeniser`**: For non-Mistral tokenizers, check if `[ARGS]` token exists in the vocabulary to detect v11+ equivalent tokenizers and route to the correct parsing path.
- **`extract_tool_calls`** (non-streaming): Strip `[ARGS]` from tool names, since HF tokenizers render it as visible text.
- **`_generate_delta_tool_call`** (streaming): Same `[ARGS]` stripping for the streaming path.
- **`extract_tool_calls_streaming`**: Use cached `self._is_pre_v11` instead of re-calling `_is_pre_v11_tokeniser` per streaming token.
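
The vocabulary check and the name clean-up from the list above can be sketched as follows. Both functions are hypothetical simplifications that take a plain `dict` vocabulary; the token IDs are invented and the real `_is_pre_v11_tokeniser` also handles `MistralTokenizer` instances.

```python
ARGS_TOKEN = "[ARGS]"

def is_pre_v11(vocab: dict[str, int]) -> bool:
    # Assumption taken from the description: a vocabulary that contains
    # "[ARGS]" is treated as v11+ equivalent.
    return ARGS_TOKEN not in vocab

def clean_tool_name(name: str) -> str:
    # An HF tokenizer may render the marker as visible text in the name.
    return name.replace(ARGS_TOKEN, "").strip()

old_vocab = {"[TOOL_CALLS]": 9}                # invented token IDs
new_vocab = {"[TOOL_CALLS]": 9, "[ARGS]": 32}

print(is_pre_v11(old_vocab))
print(is_pre_v11(new_vocab))
print(clean_tool_name("get_current_weather[ARGS]"))
```

Computing the flag once and caching it, as the last bullet describes, avoids repeating the vocabulary lookup for every streamed token.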

Depends on #39293
Fixes #38818

## Test plan

- [x] Tool calling works correctly with Mistral Vibe on Devstral Small 2 in HF format
- [x] Tested with `examples/online_serving/openai_chat_completion_client_with_tools.py` against a vLLM server running Devstral Small 2 in HF format

### Test output

<details>
<summary>Output of <code>openai_chat_completion_client_with_tools.py</code></summary>

Chat completion results: ChatCompletion(id='chatcmpl-a091e080fca25f54', choices=[Choice(finish_reason='tool_calls', index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=[ChatCompletionMessageFunctionToolCall(id='7NjJJ9G9x', function=Function(arguments='{"city": "Dallas", "state": "TX", "unit": "fahrenheit"}', name='get_current_weather'), type='function')], reasoning=None), stop_reason=None, token_ids=None)], created=1776413381, model='devstral', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=27, prompt_tokens=203, total_tokens=230, completion_tokens_details=None, prompt_tokens_details=None), prompt_logprobs=None, prompt_token_ids=None, kv_transfer_params=None)

chunks:
ChoiceDelta(content='', function_call=None, refusal=None, role='assistant', tool_calls=None)
ChoiceDeltaToolCall(index=0, id='IbAbEOSqY', function=ChoiceDeltaToolCallFunction(arguments='{"', name='get_current_weather'), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='city', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='":', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments=' "', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='D', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='allas', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='",', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments=' "', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='state', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='":', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments=' "', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='TX', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='",', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments=' "', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='unit', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='":', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments=' "', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='fahren', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='heit', name=None), type='function')
ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='"}', name=None), type='function')
ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=None)

arguments:
streamed tool call id: IbAbEOSqY
streamed tool call name: get_current_weather
streamed tool call arguments: {"city": "Dallas", "state": "TX", "unit": "fahrenheit"}

tool_to_call result: The weather in Dallas, Texas is 85 degrees fahrenheit. It is partly cloudly, with highs in the 90's.
Chat completion2 results: ChatCompletion(id='chatcmpl-93bd1197c037251b', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="The temperature in Dallas, Texas is currently 85 degrees Fahrenheit. It is partly cloudy, with highs expected in the 90's.", refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=[], reasoning=None), stop_reason=None, token_ids=None)], created=1776413405, model='devstral', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=34, prompt_tokens=274, total_tokens=308, completion_tokens_details=None, prompt_tokens_details=None), prompt_logprobs=None, prompt_token_ids=None, kv_transfer_params=None)


</details>

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 (1M context)

## Changed files

- `vllm/tool_parsers/mistral_tool_parser.py` (modified, +13/-4)

Code Example

vllm serve /root/.cache/huggingface/devstral-small-2 --port 8080 --host 0.0.0.0 --gpu-memory-utilization 0.7 --served-model-name devstral --max-model-len 262144 --enable-auto-tool-choice --tool-call-parser mistral

---

(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108] EngineCore failed to start.
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108] Traceback (most recent call last):
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 1082, in run_engine_core
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     engine_core = EngineCoreProc(*args, engine_index=dp_rank, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 848, in __init__
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     super().__init__(
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 114, in __init__
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.model_executor = executor_class(vllm_config)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 109, in __init__
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self._init_executor()
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/uniproc_executor.py", line 52, in _init_executor
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.driver_worker.load_model()
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 323, in load_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.model_runner.load_model(load_dummy_weights=load_dummy_weights)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4749, in load_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.model = model_loader.load_model(
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]                  ^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/base_loader.py", line 55, in load_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     model = initialize_model(
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]             ^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 57, in initialize_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     model = model_class(vllm_config=vllm_config, prefix=prefix)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/mistral3.py", line 437, in __init__
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.language_model = init_vllm_registered_model(
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/utils.py", line 372, in init_vllm_registered_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return initialize_model(vllm_config=vllm_config, prefix=prefix)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 47, in initialize_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     model_class, _ = get_model_architecture(model_config)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 218, in get_model_architecture
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     tuple(getattr(model_config.hf_config, "architectures", [])),
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108] TypeError: 'NoneType' object is not iterable
(EngineCore pid=149) Process EngineCore:
(EngineCore pid=149) Traceback (most recent call last):
(EngineCore pid=149)   File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
(EngineCore pid=149)     self.run()
(EngineCore pid=149)   File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
(EngineCore pid=149)     self._target(*self._args, **self._kwargs)
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 1112, in run_engine_core
(EngineCore pid=149)     raise e
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 1082, in run_engine_core
(EngineCore pid=149)     engine_core = EngineCoreProc(*args, engine_index=dp_rank, **kwargs)
(EngineCore pid=149)                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 848, in __init__
(EngineCore pid=149)     super().__init__(
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 114, in __init__
(EngineCore pid=149)     self.model_executor = executor_class(vllm_config)
(EngineCore pid=149)                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 109, in __init__
(EngineCore pid=149)     self._init_executor()
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/uniproc_executor.py", line 52, in _init_executor
(EngineCore pid=149)     self.driver_worker.load_model()
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 323, in load_model
(EngineCore pid=149)     self.model_runner.load_model(load_dummy_weights=load_dummy_weights)
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4749, in load_model
(EngineCore pid=149)     self.model = model_loader.load_model(
(EngineCore pid=149)                  ^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/base_loader.py", line 55, in load_model
(EngineCore pid=149)     model = initialize_model(
(EngineCore pid=149)             ^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 57, in initialize_model
(EngineCore pid=149)     model = model_class(vllm_config=vllm_config, prefix=prefix)
(EngineCore pid=149)             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/mistral3.py", line 437, in __init__
(EngineCore pid=149)     self.language_model = init_vllm_registered_model(
(EngineCore pid=149)                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/utils.py", line 372, in init_vllm_registered_model
(EngineCore pid=149)     return initialize_model(vllm_config=vllm_config, prefix=prefix)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 47, in initialize_model
(EngineCore pid=149)     model_class, _ = get_model_architecture(model_config)
(EngineCore pid=149)                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 218, in get_model_architecture
(EngineCore pid=149)     tuple(getattr(model_config.hf_config, "architectures", [])),
(EngineCore pid=149)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) TypeError: 'NoneType' object is not iterable
[rank0]:[W402 12:18:36.184187104 ProcessGroupNCCL.cpp:1648] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
(APIServer pid=71) Traceback (most recent call last):
(APIServer pid=71)   File "/usr/local/bin/vllm", line 10, in <module>
(APIServer pid=71)     sys.exit(main())
(APIServer pid=71)              ^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/main.py", line 75, in main
(APIServer pid=71)     args.dispatch_function(args)
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/serve.py", line 122, in cmd
(APIServer pid=71)     uvloop.run(run_server(args))
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 96, in run
(APIServer pid=71)     return __asyncio.run(
(APIServer pid=71)            ^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/lib/python3.12/asyncio/runners.py", line 194, in run
(APIServer pid=71)     return runner.run(main)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=71)     return self._loop.run_until_complete(task)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 48, in wrapper
(APIServer pid=71)     return await main
(APIServer pid=71)            ^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 684, in run_server
(APIServer pid=71)     await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 698, in run_server_worker
(APIServer pid=71)     async with build_async_engine_client(
(APIServer pid=71)   File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=71)     return await anext(self.gen)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 100, in build_async_engine_client
(APIServer pid=71)     async with build_async_engine_client_from_engine_args(
(APIServer pid=71)   File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=71)     return await anext(self.gen)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 136, in build_async_engine_client_from_engine_args
(APIServer pid=71)     async_llm = AsyncLLM.from_vllm_config(
(APIServer pid=71)                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 225, in from_vllm_config
(APIServer pid=71)     return cls(
(APIServer pid=71)            ^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 154, in __init__
(APIServer pid=71)     self.engine_core = EngineCoreClient.make_async_mp_client(
(APIServer pid=71)                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(APIServer pid=71)     return func(*args, **kwargs)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 129, in make_async_mp_client
(APIServer pid=71)     return AsyncMPClient(*client_args)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(APIServer pid=71)     return func(*args, **kwargs)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 872, in __init__
(APIServer pid=71)     super().__init__(
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 534, in __init__
(APIServer pid=71)     with launch_core_engines(
(APIServer pid=71)   File "/usr/lib/python3.12/contextlib.py", line 144, in __exit__
(APIServer pid=71)     next(self.gen)
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 1073, in launch_core_engines
(APIServer pid=71)     wait_for_engine_startup(
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 1132, in wait_for_engine_startup
(APIServer pid=71)     raise RuntimeError(
(APIServer pid=71) RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
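
The last EngineCore frame in the traceback above evaluates `tuple(getattr(model_config.hf_config, "architectures", []))` while the nested language model is being initialized through `init_vllm_registered_model`. One plausible reading, which this report does not confirm, is that the config object carries an `architectures` attribute that is explicitly `None`. `getattr` only falls back to its default when the attribute is missing, so the call returns `None` and `tuple(None)` raises this `TypeError`. A minimal sketch of that pattern follows; the stand-in config objects and both helper names are hypothetical and are not vLLM code.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for a config object (not vLLM or transformers classes).
cfg_none = SimpleNamespace(architectures=None)  # attribute present, value None
cfg_missing = SimpleNamespace()                 # attribute absent


def unsafe_architectures(cfg):
    # Same expression shape as the failing frame: the default [] is used
    # only when the attribute is absent, not when it is present and None.
    return tuple(getattr(cfg, "architectures", []))


def safe_architectures(cfg):
    # Illustrative guard (an assumption, not the upstream fix):
    # treat None the same way as a missing attribute.
    return tuple(getattr(cfg, "architectures", None) or [])


print(unsafe_architectures(cfg_missing))  # the default applies here
try:
    unsafe_architectures(cfg_none)
except TypeError as exc:
    print(f"TypeError: {exc}")
print(safe_architectures(cfg_none))
```

Whether the right fix is a guard like this or populating the attribute earlier in config loading is a design question for the maintainers; the sketch only shows why the default argument does not help here.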

Your current environment

- vllm version: 0.18.2rc1.dev57+g551b3fb39.d20260402.cu132
- transformers version: 5.4.0
- torch version: 2.12.0.dev20260324+cu130

🐛 Describe the bug

I ran into an issue after downloading Devstral Small 2 (https://huggingface.co/mistralai/Devstral-Small-2-24B-Instruct-2512). The model runs without problems from the Mistral-format files (params.json, consolidated.safetensors and tekken.json) with the following command:

vllm serve /root/.cache/huggingface/devstral-small-2 --port 8080 --host 0.0.0.0 --gpu-memory-utilization 0.7 --served-model-name devstral --max-model-len 262144 --enable-auto-tool-choice --tool-call-parser mistral
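
Because the behavior depends on which files sit in the model folder, it can help to confirm the folder contents before each run. The sketch below is a hypothetical helper, not part of vLLM, and it does not reproduce vLLM's own format detection. It assumes the Mistral layout is identified by the three files named above and the HF layout by a `config.json` file, the conventional Hugging Face config name.

```python
from pathlib import Path

# File names from the report (Mistral format); config.json is assumed to be
# the marker for the Hugging Face layout.
MISTRAL_FILES = ("params.json", "consolidated.safetensors", "tekken.json")
HF_CONFIG_FILE = "config.json"


def detect_formats(model_dir: str) -> set[str]:
    """Return which checkpoint layouts appear to be present in model_dir."""
    root = Path(model_dir)
    found = set()
    # Count the Mistral layout only when all three files are present.
    if all((root / name).is_file() for name in MISTRAL_FILES):
        found.add("mistral")
    if (root / HF_CONFIG_FILE).is_file():
        found.add("hf")
    return found
```

Calling `detect_formats("/root/.cache/huggingface/devstral-small-2")` before starting the server shows whether one or both layouts are visible in the folder.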

However, it crashes when I use the HF-format files with the same command (I simply moved the Mistral-format files out of the model folder), with the following error:

(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108] EngineCore failed to start.
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108] Traceback (most recent call last):
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 1082, in run_engine_core
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     engine_core = EngineCoreProc(*args, engine_index=dp_rank, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 848, in __init__
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     super().__init__(
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 114, in __init__
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.model_executor = executor_class(vllm_config)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 109, in __init__
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self._init_executor()
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/uniproc_executor.py", line 52, in _init_executor
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.driver_worker.load_model()
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 323, in load_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.model_runner.load_model(load_dummy_weights=load_dummy_weights)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4749, in load_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.model = model_loader.load_model(
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]                  ^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/base_loader.py", line 55, in load_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     model = initialize_model(
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]             ^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 57, in initialize_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     model = model_class(vllm_config=vllm_config, prefix=prefix)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/mistral3.py", line 437, in __init__
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     self.language_model = init_vllm_registered_model(
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/utils.py", line 372, in init_vllm_registered_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return initialize_model(vllm_config=vllm_config, prefix=prefix)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     return func(*args, **kwargs)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 47, in initialize_model
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     model_class, _ = get_model_architecture(model_config)
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 218, in get_model_architecture
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     tuple(getattr(model_config.hf_config, "architectures", [])),
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) ERROR 04-02 12:18:35 [core.py:1108] TypeError: 'NoneType' object is not iterable
(EngineCore pid=149) Process EngineCore:
(EngineCore pid=149) Traceback (most recent call last):
(EngineCore pid=149)   File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
(EngineCore pid=149)     self.run()
(EngineCore pid=149)   File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
(EngineCore pid=149)     self._target(*self._args, **self._kwargs)
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 1112, in run_engine_core
(EngineCore pid=149)     raise e
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 1082, in run_engine_core
(EngineCore pid=149)     engine_core = EngineCoreProc(*args, engine_index=dp_rank, **kwargs)
(EngineCore pid=149)                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 848, in __init__
(EngineCore pid=149)     super().__init__(
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 114, in __init__
(EngineCore pid=149)     self.model_executor = executor_class(vllm_config)
(EngineCore pid=149)                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 109, in __init__
(EngineCore pid=149)     self._init_executor()
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/uniproc_executor.py", line 52, in _init_executor
(EngineCore pid=149)     self.driver_worker.load_model()
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 323, in load_model
(EngineCore pid=149)     self.model_runner.load_model(load_dummy_weights=load_dummy_weights)
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4749, in load_model
(EngineCore pid=149)     self.model = model_loader.load_model(
(EngineCore pid=149)                  ^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/base_loader.py", line 55, in load_model
(EngineCore pid=149)     model = initialize_model(
(EngineCore pid=149)             ^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 57, in initialize_model
(EngineCore pid=149)     model = model_class(vllm_config=vllm_config, prefix=prefix)
(EngineCore pid=149)             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/mistral3.py", line 437, in __init__
(EngineCore pid=149)     self.language_model = init_vllm_registered_model(
(EngineCore pid=149)                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/utils.py", line 372, in init_vllm_registered_model
(EngineCore pid=149)     return initialize_model(vllm_config=vllm_config, prefix=prefix)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(EngineCore pid=149)     return func(*args, **kwargs)
(EngineCore pid=149)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 47, in initialize_model
(EngineCore pid=149)     model_class, _ = get_model_architecture(model_config)
(EngineCore pid=149)                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149)   File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 218, in get_model_architecture
(EngineCore pid=149)     tuple(getattr(model_config.hf_config, "architectures", [])),
(EngineCore pid=149)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=149) TypeError: 'NoneType' object is not iterable
[rank0]:[W402 12:18:36.184187104 ProcessGroupNCCL.cpp:1648] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
(APIServer pid=71) Traceback (most recent call last):
(APIServer pid=71)   File "/usr/local/bin/vllm", line 10, in <module>
(APIServer pid=71)     sys.exit(main())
(APIServer pid=71)              ^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/main.py", line 75, in main
(APIServer pid=71)     args.dispatch_function(args)
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/serve.py", line 122, in cmd
(APIServer pid=71)     uvloop.run(run_server(args))
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 96, in run
(APIServer pid=71)     return __asyncio.run(
(APIServer pid=71)            ^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/lib/python3.12/asyncio/runners.py", line 194, in run
(APIServer pid=71)     return runner.run(main)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=71)     return self._loop.run_until_complete(task)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 48, in wrapper
(APIServer pid=71)     return await main
(APIServer pid=71)            ^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 684, in run_server
(APIServer pid=71)     await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 698, in run_server_worker
(APIServer pid=71)     async with build_async_engine_client(
(APIServer pid=71)   File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=71)     return await anext(self.gen)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 100, in build_async_engine_client
(APIServer pid=71)     async with build_async_engine_client_from_engine_args(
(APIServer pid=71)   File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=71)     return await anext(self.gen)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 136, in build_async_engine_client_from_engine_args
(APIServer pid=71)     async_llm = AsyncLLM.from_vllm_config(
(APIServer pid=71)                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 225, in from_vllm_config
(APIServer pid=71)     return cls(
(APIServer pid=71)            ^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 154, in __init__
(APIServer pid=71)     self.engine_core = EngineCoreClient.make_async_mp_client(
(APIServer pid=71)                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(APIServer pid=71)     return func(*args, **kwargs)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 129, in make_async_mp_client
(APIServer pid=71)     return AsyncMPClient(*client_args)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(APIServer pid=71)     return func(*args, **kwargs)
(APIServer pid=71)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 872, in __init__
(APIServer pid=71)     super().__init__(
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 534, in __init__
(APIServer pid=71)     with launch_core_engines(
(APIServer pid=71)   File "/usr/lib/python3.12/contextlib.py", line 144, in __exit__
(APIServer pid=71)     next(self.gen)
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 1073, in launch_core_engines
(APIServer pid=71)     wait_for_engine_startup(
(APIServer pid=71)   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 1132, in wait_for_engine_startup
(APIServer pid=71)     raise RuntimeError(
(APIServer pid=71) RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}

I would greatly appreciate any help with this. I am trying to run the model in HF format because I fine-tuned Devstral Small 2 (using Unsloth) and can only save it in HF format.

<details><summary>If necessary, here is the <code>config.json</code> file of the model:</summary> <code>{
  "architectures": ["Mistral3ForConditionalGeneration"],
  "dtype": "bfloat16",
  "image_token_index": 10,
  "model_type": "mistral3",
  "multimodal_projector_bias": false,
  "projector_hidden_act": "gelu",
  "tie_word_embeddings": false,
  "quantization_config": {
    "activation_scheme": "static",
    "dequantize": false,
    "modules_to_not_convert": ["model.vision_tower", "model.multi_modal_projector", "lm_head"],
    "quant_method": "fp8",
    "weight_block_size": null
  },
  "spatial_merge_size": 2,
  "text_config": {
    "attention_dropout": 0.0,
    "head_dim": 128,
    "hidden_act": "silu",
    "hidden_size": 5120,
    "initializer_range": 0.02,
    "intermediate_size": 32768,
    "max_position_embeddings": 393216,
    "model_type": "ministral3",
    "num_attention_heads": 32,
    "num_hidden_layers": 40,
    "num_key_value_heads": 8,
    "rms_norm_eps": 1e-05,
    "rope_parameters": {
      "beta_fast": 32.0,
      "beta_slow": 1.0,
      "factor": 48.0,
      "llama_4_scaling_beta": 0.1,
      "mscale": 1.0,
      "mscale_all_dim": 1.0,
      "original_max_position_embeddings": 8192,
      "rope_theta": 100000000.0,
      "rope_type": "yarn",
      "type": "yarn"
    },
    "sliding_window": null,
    "use_cache": true,
    "vocab_size": 131072
  },
  "transformers_version": "5.0.0.dev0",
  "vision_config": {
    "attention_dropout": 0.0,
    "head_dim": 64,
    "hidden_act": "silu",
    "hidden_size": 1024,
    "image_size": 1540,
    "initializer_range": 0.02,
    "intermediate_size": 4096,
    "model_type": "pixtral",
    "num_attention_heads": 16,
    "num_channels": 3,
    "num_hidden_layers": 24,
    "patch_size": 14,
    "rope_parameters": {
      "rope_theta": 10000.0,
      "rope_type": "default"
    }
  },
  "vision_feature_layer": -1
}</code></details>

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

extent analysis

TL;DR

The failure likely stems from an incompatibility between the model's HF-format configuration and the way vLLM loads it, which surfaces as a TypeError: 'NoneType' object is not iterable error during model loading.

Guidance

  1. Check model configuration: Verify that config.json is valid JSON and that every field the loader reads is present and not null.
  2. Verify model architecture: Ensure that the architecture named in config.json matches one that vLLM supports when loading HF-format checkpoints.
  3. Consult documentation: Refer to the vLLM documentation and the Hugging Face model documentation to confirm that the model is being loaded correctly.
  4. Test with a different model: Try loading a different model in HF format to see whether the issue is specific to the Devstral Small 2 model.
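As a starting point for step 1, the sketch below lists every field in a config.json whose value is null. It is only a diagnostic aid: a null field is a candidate for a 'NoneType' object is not iterable error, not a confirmed cause. The helper name find_null_fields is hypothetical, and the embedded JSON is a shortened excerpt of the config posted in the issue.

```python
import json

# Shortened excerpt of the config.json posted in the issue (assumption:
# only a few fields are kept here to keep the example small).
CONFIG_TEXT = """
{
  "architectures": ["Mistral3ForConditionalGeneration"],
  "quantization_config": {
    "activation_scheme": "static",
    "quant_method": "fp8",
    "weight_block_size": null
  },
  "text_config": {
    "model_type": "ministral3",
    "sliding_window": null
  }
}
"""


def find_null_fields(node, prefix=""):
    """Return the dotted path of every field whose value is JSON null.

    Hypothetical helper written for this example; not part of vLLM
    or Transformers.
    """
    paths = []
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            if value is None:
                paths.append(path)
            else:
                paths.extend(find_null_fields(value, path))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            path = f"{prefix}[{index}]"
            if value is None:
                paths.append(path)
            else:
                paths.extend(find_null_fields(value, path))
    return paths


config = json.loads(CONFIG_TEXT)
null_fields = find_null_fields(config)
print(null_fields)
# ['quantization_config.weight_block_size', 'text_config.sliding_window']
```

To run it against a real checkpoint, replace the embedded excerpt with the contents of the model's own config.json. Whether any reported field is actually the one being iterated has to be confirmed against the root-cause traceback.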

Example

The issue concerns model configuration and compatibility rather than application code, so there is no code fix to show.

Notes

The error message points to a problem with the model configuration, possibly the architectures field. The posted config.json is well-formed, so the fault may instead lie in how the model is loaded or in how vLLM handles this HF-format checkpoint.
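Since the architectures field is singled out here, a quick sanity check of that field can rule it in or out. The sketch below is a minimal example under stated assumptions: check_architectures is a hypothetical helper, and it only checks the shape of the field (a non-empty list of non-empty strings), not whether vLLM supports the named architecture.

```python
def check_architectures(config):
    """Return a list of problems found in the "architectures" field.

    Hypothetical helper for illustration; it validates the field's shape
    only and does not query vLLM for supported architectures.
    """
    problems = []
    archs = config.get("architectures")
    if archs is None:
        problems.append("architectures is missing or null")
    elif not isinstance(archs, list) or not archs:
        problems.append("architectures must be a non-empty list")
    elif not all(isinstance(name, str) and name for name in archs):
        problems.append("architectures must contain only non-empty strings")
    return problems


# Value taken from the config.json posted in the issue.
posted = {"architectures": ["Mistral3ForConditionalGeneration"]}
print(check_architectures(posted))  # [] (no shape problems found)
```

For the posted config this check reports nothing, which suggests the field itself is well-formed and the cause should be sought elsewhere in the configuration or the loading path.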

Recommendation

Start by checking the model configuration against what vLLM expects for an HF-format checkpoint. If the issue persists, test with a different model or ask the vLLM community or Hugging Face support for help.
