vllm - ✅(Solved) Fix [Bug]: UX of Weight Prefetchers on Startup Failure [1 pull request, 1 comment, 2 participants]

GitHub stats
vllm-project/vllm#40564 (fetched 2026-04-22 07:43:48)
Comments: 1
Participants: 2
Timeline: 4
Reactions: 0
Timeline (top): commented ×1, labeled ×1, mentioned ×1, subscribed ×1

Error Message

(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00010-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00011-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00012-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00013-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00014-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00015-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00016-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:851] Prefetching checkpoint files: 10% (2/16)
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:851] Prefetching checkpoint files: 20% (4/16)
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:851] Prefetching checkpoint files: 30% (5/16)
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:851] Prefetching checkpoint files: 40% (7/16)
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:851] Prefetching checkpoint files: 50% (8/16)
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:869] Prefetching checkpoint files into page cache finished in 0.94s
[rank0]:[W421 21:28:15.880969765 ProcessGroupNCCL.cpp:1575] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
(APIServer pid=212858) Traceback (most recent call last):
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/.venv/bin/vllm", line 10, in <module>
(APIServer pid=212858)     sys.exit(main())
(APIServer pid=212858)              ^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/cli/main.py", line 75, in main
(APIServer pid=212858)     args.dispatch_function(args)
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/cli/serve.py", line 122, in cmd
(APIServer pid=212858)     uvloop.run(run_server(args))
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/.venv/lib64/python3.12/site-packages/uvloop/__init__.py", line 96, in run
(APIServer pid=212858)     return __asyncio.run(
(APIServer pid=212858)            ^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/usr/lib64/python3.12/asyncio/runners.py", line 195, in run
(APIServer pid=212858)     return runner.run(main)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/usr/lib64/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=212858)     return self._loop.run_until_complete(task)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/.venv/lib64/python3.12/site-packages/uvloop/__init__.py", line 48, in wrapper
(APIServer pid=212858)     return await main
(APIServer pid=212858)            ^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/openai/api_server.py", line 678, in run_server
(APIServer pid=212858)     await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/openai/api_server.py", line 692, in run_server_worker
(APIServer pid=212858)     async with build_async_engine_client(
(APIServer pid=212858)                ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/usr/lib64/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=212858)     return await anext(self.gen)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/openai/api_server.py", line 100, in build_async_engine_client
(APIServer pid=212858)     async with build_async_engine_client_from_engine_args(
(APIServer pid=212858)                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/usr/lib64/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=212858)     return await anext(self.gen)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/openai/api_server.py", line 136, in build_async_engine_client_from_engine_args
(APIServer pid=212858)     async_llm = AsyncLLM.from_vllm_config(
(APIServer pid=212858)                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/async_llm.py", line 217, in from_vllm_config
(APIServer pid=212858)     return cls(
(APIServer pid=212858)            ^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/async_llm.py", line 146, in __init__
(APIServer pid=212858)     self.engine_core = EngineCoreClient.make_async_mp_client(
(APIServer pid=212858)                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/tracing/otel.py", line 178, in sync_wrapper
(APIServer pid=212858)     return func(*args, **kwargs)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/core_client.py", line 130, in make_async_mp_client
(APIServer pid=212858)     return AsyncMPClient(*client_args)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/tracing/otel.py", line 178, in sync_wrapper
(APIServer pid=212858)     return func(*args, **kwargs)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/core_client.py", line 900, in __init__
(APIServer pid=212858)     super().__init__(
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/core_client.py", line 535, in __init__
(APIServer pid=212858)     with launch_core_engines(
(APIServer pid=212858)          ^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/usr/lib64/python3.12/contextlib.py", line 144, in __exit__
(APIServer pid=212858)     next(self.gen)
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/utils.py", line 1094, in launch_core_engines
(APIServer pid=212858)     wait_for_engine_startup(
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/utils.py", line 1153, in wait_for_engine_startup
(APIServer pid=212858)     raise RuntimeError(
(APIServer pid=212858) RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
(vllm) [robertgshaw2-redhat@h100-02 vllm]$
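
The RuntimeError repeated throughout the log above is raised by Python's concurrent.futures: once a ThreadPoolExecutor has been shut down, any later submit() call fails with that exact message. The sketch below reproduces only the message in isolation. It is an illustration, not vLLM code; the helper name submit_after_shutdown is hypothetical, and it makes no claim about how the executor behind asyncio.to_thread came to be shut down during the failed startup.

```python
from concurrent.futures import ThreadPoolExecutor


def submit_after_shutdown() -> str:
    """Return the error text raised when submitting to a shut-down executor.

    Hypothetical helper for illustration only; not part of vLLM.
    """
    executor = ThreadPoolExecutor(max_workers=1)
    # After shutdown() the executor refuses all new work.
    executor.shutdown(wait=True)
    try:
        executor.submit(print, "never runs")
    except RuntimeError as exc:
        return str(exc)
    return "no error"


if __name__ == "__main__":
    print(submit_after_shutdown())
```

In the log, the same exception surfaces once per remaining checkpoint file, so the many tracebacks appear to be one condition reported repeatedly rather than several distinct failures.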

Root Cause

(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00010-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00011-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00012-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00013-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00014-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00015-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00016-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:851] Prefetching checkpoint files: 10% (2/16)
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:851] Prefetching checkpoint files: 20% (4/16)
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:851] Prefetching checkpoint files: 30% (5/16)
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:851] Prefetching checkpoint files: 40% (7/16)
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:851] Prefetching checkpoint files: 50% (8/16)
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:869] Prefetching checkpoint files into page cache finished in 0.94s
[rank0]:[W421 21:28:15.880969765 ProcessGroupNCCL.cpp:1575] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
(APIServer pid=212858) Traceback (most recent call last):
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/.venv/bin/vllm", line 10, in <module>
(APIServer pid=212858)     sys.exit(main())
(APIServer pid=212858)              ^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/cli/main.py", line 75, in main
(APIServer pid=212858)     args.dispatch_function(args)
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/cli/serve.py", line 122, in cmd
(APIServer pid=212858)     uvloop.run(run_server(args))
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/.venv/lib64/python3.12/site-packages/uvloop/__init__.py", line 96, in run
(APIServer pid=212858)     return __asyncio.run(
(APIServer pid=212858)            ^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/usr/lib64/python3.12/asyncio/runners.py", line 195, in run
(APIServer pid=212858)     return runner.run(main)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/usr/lib64/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=212858)     return self._loop.run_until_complete(task)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/.venv/lib64/python3.12/site-packages/uvloop/__init__.py", line 48, in wrapper
(APIServer pid=212858)     return await main
(APIServer pid=212858)            ^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/openai/api_server.py", line 678, in run_server
(APIServer pid=212858)     await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/openai/api_server.py", line 692, in run_server_worker
(APIServer pid=212858)     async with build_async_engine_client(
(APIServer pid=212858)                ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/usr/lib64/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=212858)     return await anext(self.gen)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/openai/api_server.py", line 100, in build_async_engine_client
(APIServer pid=212858)     async with build_async_engine_client_from_engine_args(
(APIServer pid=212858)                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/usr/lib64/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=212858)     return await anext(self.gen)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/openai/api_server.py", line 136, in build_async_engine_client_from_engine_args
(APIServer pid=212858)     async_llm = AsyncLLM.from_vllm_config(
(APIServer pid=212858)                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/async_llm.py", line 217, in from_vllm_config
(APIServer pid=212858)     return cls(
(APIServer pid=212858)            ^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/async_llm.py", line 146, in __init__
(APIServer pid=212858)     self.engine_core = EngineCoreClient.make_async_mp_client(
(APIServer pid=212858)                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/tracing/otel.py", line 178, in sync_wrapper
(APIServer pid=212858)     return func(*args, **kwargs)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/core_client.py", line 130, in make_async_mp_client
(APIServer pid=212858)     return AsyncMPClient(*client_args)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/tracing/otel.py", line 178, in sync_wrapper
(APIServer pid=212858)     return func(*args, **kwargs)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/core_client.py", line 900, in __init__
(APIServer pid=212858)     super().__init__(
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/core_client.py", line 535, in __init__
(APIServer pid=212858)     with launch_core_engines(
(APIServer pid=212858)          ^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/usr/lib64/python3.12/contextlib.py", line 144, in __exit__
(APIServer pid=212858)     next(self.gen)
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/utils.py", line 1094, in launch_core_engines
(APIServer pid=212858)     wait_for_engine_startup(
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/utils.py", line 1153, in wait_for_engine_startup
(APIServer pid=212858)     raise RuntimeError(
(APIServer pid=212858) RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
(vllm) [robertgshaw2-redhat@h100-02 vllm]$

Fix Action

Fix / Workaround

PR fix notes

PR #40615: [Bugfix] Quiet weight prefetch logs when executor is shutting down

Description (problem / solution / changelog)

Summary

  • Fixes #40564.
  • On engine-startup failure, every remaining prefetch_one task in _prefetch_all_checkpoints raises RuntimeError: cannot schedule new futures after shutdown from asyncio.to_thread, and each one is logged with a full multi-line traceback. For a 16-shard model that's a wall of ~160 lines of identical tracebacks that buries the actual startup error above.
  • This PR catches that specific RuntimeError, sets an aborted flag that short-circuits the remaining tasks, logs a single debug line, and swaps the final "finished in Xs" message for an "interrupted by shutdown after Xs" line when the abort path was taken. Non-shutdown prefetch failures still produce a full warning/traceback — only the shutdown spam is suppressed.
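The error message in the log comes from the standard library's thread pool, not from vLLM. A minimal sketch that reproduces it in isolation, assuming only the Python standard library (the helper name `submit_after_shutdown` is made up for illustration):

```python
from concurrent.futures import ThreadPoolExecutor


def submit_after_shutdown() -> str:
    """Return the error text raised when submitting to a pool that was shut down."""
    executor = ThreadPoolExecutor(max_workers=1)
    executor.shutdown(wait=True)
    try:
        # submit() refuses new work once shutdown() has been called.
        executor.submit(print, "never runs")
    except RuntimeError as exc:
        return str(exc)
    return "no error"


if __name__ == "__main__":
    print(submit_after_shutdown())
```

asyncio.to_thread submits to the event loop's default executor, so the same RuntimeError surfaces from every prefetch task that is scheduled after that executor has been shut down.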

The function now returns the background threading.Thread so tests can .join() on it; the single existing caller ignored the return value and is unaffected (daemon thread, unchanged behavior).
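The approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: `prefetch_all`, the `prefetch_fn` parameter, the logger name, and the log wording are hypothetical stand-ins.

```python
import asyncio
import logging
import threading
import time

logger = logging.getLogger("prefetch_sketch")  # hypothetical logger name

# Text of the RuntimeError raised by ThreadPoolExecutor.submit() after shutdown.
SHUTDOWN_MSG = "cannot schedule new futures after shutdown"


def _read_file(path: str) -> None:
    """Hypothetical stand-in: read a file so its pages land in the page cache."""
    with open(path, "rb") as f:
        while f.read(1 << 20):
            pass


def prefetch_all(paths, prefetch_fn=_read_file) -> threading.Thread:
    """Prefetch on a daemon thread; return the thread so tests can join() it."""

    async def run() -> bool:
        aborted = False

        async def prefetch_one(path: str) -> None:
            nonlocal aborted
            if aborted:
                return  # short-circuit once shutdown has been detected
            try:
                await asyncio.to_thread(prefetch_fn, path)
            except Exception as exc:
                if isinstance(exc, RuntimeError) and SHUTDOWN_MSG in str(exc):
                    if not aborted:
                        aborted = True
                        logger.debug("Executor is shutting down; skipping prefetch.")
                else:
                    # Any other failure keeps its full warning and traceback.
                    logger.warning("Failed to prefetch %r", path, exc_info=True)

        await asyncio.gather(*(prefetch_one(p) for p in paths))
        return aborted

    def worker() -> None:
        start = time.perf_counter()
        aborted = asyncio.run(run())
        elapsed = time.perf_counter() - start
        if aborted:
            logger.info("Prefetching interrupted by shutdown after %.2fs", elapsed)
        else:
            logger.info("Prefetching finished in %.2fs", elapsed)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

Matching on the message text keeps unrelated RuntimeErrors visible, at the cost of depending on the standard library's wording.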

Not a duplicate

Running gh pr list --repo vllm-project/vllm --state all --search "40564", plus searches for "prefetch weight_utils" and "cannot schedule new futures", returned no PRs targeting this issue.

Test plan

  • New unit test tests/model_executor/test_weight_utils.py::test_prefetch_quiet_on_executor_shutdown — monkeypatches _prefetch_checkpoint to raise the shutdown RuntimeError, runs _prefetch_all_checkpoints on 16 fake paths, joins the background thread, and asserts zero WARNING-level records are emitted from the module logger.
  • Pre-existing tests in tests/model_executor/test_weight_utils.py are untouched.
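The shape of such a test can be sketched without vLLM or pytest. Everything here is an assumption for illustration: `prefetch_all` is a simplified synchronous stand-in for the function under test, the failing callable is injected as an argument rather than monkeypatched, and a small logging handler takes the place of pytest's log capture.

```python
import logging
import threading

logger = logging.getLogger("weight_utils_test_sketch")  # hypothetical logger name
SHUTDOWN_MSG = "cannot schedule new futures after shutdown"


def prefetch_all(paths, prefetch_fn) -> threading.Thread:
    """Hypothetical, simplified stand-in for the function under test."""

    def worker() -> None:
        for path in paths:
            try:
                prefetch_fn(path)
            except RuntimeError as exc:
                if SHUTDOWN_MSG in str(exc):
                    logger.debug("Prefetch interrupted by shutdown.")
                    return  # stop quietly; the remaining paths are skipped
                logger.warning("Failed to prefetch %r", path, exc_info=True)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


class RecordCollector(logging.Handler):
    """Collects every log record emitted while attached to a logger."""

    def __init__(self) -> None:
        super().__init__(level=logging.DEBUG)
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def test_prefetch_quiet_on_executor_shutdown() -> None:
    def raise_shutdown(path: str) -> None:
        raise RuntimeError(SHUTDOWN_MSG)

    collector = RecordCollector()
    logger.addHandler(collector)
    logger.setLevel(logging.DEBUG)
    try:
        fake_paths = [f"/fake/model-{i:05d}.safetensors" for i in range(16)]
        thread = prefetch_all(fake_paths, raise_shutdown)
        thread.join(timeout=5)  # joining is why the thread is returned
        assert not thread.is_alive()
    finally:
        logger.removeHandler(collector)

    warnings = [r for r in collector.records if r.levelno >= logging.WARNING]
    assert warnings == []
```

Joining the returned thread before inspecting the records is what makes the assertion deterministic; without it the test could pass simply because the background thread had not logged yet.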

Command to run:

.venv/bin/python -m pytest tests/model_executor/test_weight_utils.py -v

Note: I was not able to run the test locally (no uv/venv set up in this environment) — flagging this explicitly so a reviewer can verify.

AI assistance disclosure

AI assistance (Claude) was used to locate the issue, implement the fix, and draft this description. The change was reviewed line-by-line before submission.

Changed files

  • tests/model_executor/test_weight_utils.py (modified, +32/-0)
  • vllm/model_executor/model_loader/weight_utils.py (modified, +45/-10)


Your current environment

<details> <summary>The output of <code>python collect_env.py</code></summary>
Your output of `python collect_env.py` here
</details>

🐛 Describe the bug

For example:

(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00010-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00011-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00012-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00013-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00014-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00015-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Failed to prefetch checkpoint file '/mnt/nfs-preprod-1/engine/hub_cache/models--Qwen--Qwen3-30B-A3B-Instruct-2507/snapshots/0d7cf23991f47feeb3a57ecb4c9cee8ea4a17bfe/model-00016-of-00016.safetensors'.
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] Traceback (most recent call last):
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/home/robertgshaw2-redhat/vllm/vllm/model_executor/model_loader/weight_utils.py", line 846, in prefetch_one
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     await asyncio.to_thread(_prefetch_checkpoint, path)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     return await loop.run_in_executor(None, func_call)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/asyncio/base_events.py", line 867, in run_in_executor
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     executor.submit(func, *args), loop=self)
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]   File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 171, in submit
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859]     raise RuntimeError('cannot schedule new futures after shutdown')
(EngineCore pid=214323) WARNING 04-21 21:28:14 [weight_utils.py:859] RuntimeError: cannot schedule new futures after shutdown
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:851] Prefetching checkpoint files: 10% (2/16)
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:851] Prefetching checkpoint files: 20% (4/16)
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:851] Prefetching checkpoint files: 30% (5/16)
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:851] Prefetching checkpoint files: 40% (7/16)
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:851] Prefetching checkpoint files: 50% (8/16)
(EngineCore pid=214323) INFO 04-21 21:28:14 [weight_utils.py:869] Prefetching checkpoint files into page cache finished in 0.94s
[rank0]:[W421 21:28:15.880969765 ProcessGroupNCCL.cpp:1575] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
(APIServer pid=212858) Traceback (most recent call last):
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/.venv/bin/vllm", line 10, in <module>
(APIServer pid=212858)     sys.exit(main())
(APIServer pid=212858)              ^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/cli/main.py", line 75, in main
(APIServer pid=212858)     args.dispatch_function(args)
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/cli/serve.py", line 122, in cmd
(APIServer pid=212858)     uvloop.run(run_server(args))
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/.venv/lib64/python3.12/site-packages/uvloop/__init__.py", line 96, in run
(APIServer pid=212858)     return __asyncio.run(
(APIServer pid=212858)            ^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/usr/lib64/python3.12/asyncio/runners.py", line 195, in run
(APIServer pid=212858)     return runner.run(main)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/usr/lib64/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=212858)     return self._loop.run_until_complete(task)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/.venv/lib64/python3.12/site-packages/uvloop/__init__.py", line 48, in wrapper
(APIServer pid=212858)     return await main
(APIServer pid=212858)            ^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/openai/api_server.py", line 678, in run_server
(APIServer pid=212858)     await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/openai/api_server.py", line 692, in run_server_worker
(APIServer pid=212858)     async with build_async_engine_client(
(APIServer pid=212858)                ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/usr/lib64/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=212858)     return await anext(self.gen)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/openai/api_server.py", line 100, in build_async_engine_client
(APIServer pid=212858)     async with build_async_engine_client_from_engine_args(
(APIServer pid=212858)                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/usr/lib64/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=212858)     return await anext(self.gen)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/entrypoints/openai/api_server.py", line 136, in build_async_engine_client_from_engine_args
(APIServer pid=212858)     async_llm = AsyncLLM.from_vllm_config(
(APIServer pid=212858)                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/async_llm.py", line 217, in from_vllm_config
(APIServer pid=212858)     return cls(
(APIServer pid=212858)            ^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/async_llm.py", line 146, in __init__
(APIServer pid=212858)     self.engine_core = EngineCoreClient.make_async_mp_client(
(APIServer pid=212858)                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/tracing/otel.py", line 178, in sync_wrapper
(APIServer pid=212858)     return func(*args, **kwargs)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/core_client.py", line 130, in make_async_mp_client
(APIServer pid=212858)     return AsyncMPClient(*client_args)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/tracing/otel.py", line 178, in sync_wrapper
(APIServer pid=212858)     return func(*args, **kwargs)
(APIServer pid=212858)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/core_client.py", line 900, in __init__
(APIServer pid=212858)     super().__init__(
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/core_client.py", line 535, in __init__
(APIServer pid=212858)     with launch_core_engines(
(APIServer pid=212858)          ^^^^^^^^^^^^^^^^^^^^
(APIServer pid=212858)   File "/usr/lib64/python3.12/contextlib.py", line 144, in __exit__
(APIServer pid=212858)     next(self.gen)
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/utils.py", line 1094, in launch_core_engines
(APIServer pid=212858)     wait_for_engine_startup(
(APIServer pid=212858)   File "/home/robertgshaw2-redhat/vllm/vllm/v1/engine/utils.py", line 1153, in wait_for_engine_startup
(APIServer pid=212858)     raise RuntimeError(
(APIServer pid=212858) RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
(vllm) [robertgshaw2-redhat@h100-02 vllm]$

extent analysis

TL;DR

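The repeated `RuntimeError: cannot schedule new futures after shutdown` warnings are most likely a symptom, not the cause. They appear when prefetch tasks try to submit work to a thread pool that has already been shut down, which is consistent with the engine core exiting after an earlier startup failure. The per-file tracebacks then bury the root cause that the final error points to ("Engine core initialization failed. See root cause above."). A likely fix is to make the prefetcher stop quietly, or log a single summary line, once shutdown has begun.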

Guidance

  • In vllm/model_executor/model_loader/weight_utils.py, check how prefetch_one handles a failure of asyncio.to_thread(_prefetch_checkpoint, path). The log shows one full traceback per checkpoint file once the executor has been shut down.
  • In vllm/v1/engine/utils.py, check how wait_for_engine_startup reports failures. The final RuntimeError says "See root cause above" but lists the failed core processes as {}.
  • Verify that the launch_core_engines context manager in vllm/v1/engine/utils.py cleans up the engine core processes when startup fails.
  • The ProcessGroupNCCL.cpp warning says destroy_process_group() was not called before program exit, which can leak resources. Check whether that cleanup runs on the startup failure path.
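
The first guidance point can be illustrated with a minimal sketch. It is not vLLM's implementation: `prefetch_all` and the no-op `_prefetch_checkpoint` below are hypothetical stand-ins, the executor is passed in explicitly, and the files are processed one at a time for simplicity. It reproduces the same `RuntimeError` from a shut-down `ThreadPoolExecutor` and shows one way to replace per-file tracebacks with a single summary warning.

```python
import asyncio
import logging
from concurrent.futures import ThreadPoolExecutor

logger = logging.getLogger("prefetch_sketch")


def _prefetch_checkpoint(path: str) -> None:
    """Hypothetical stand-in for the real page-cache warm-up; does nothing."""


async def prefetch_all(
    paths: list[str], executor: ThreadPoolExecutor
) -> tuple[int, int]:
    """Prefetch files in order and return (prefetched, skipped)."""
    loop = asyncio.get_running_loop()
    done = 0
    for index, path in enumerate(paths):
        try:
            # submit() raises RuntimeError right away if the pool is shut down.
            await loop.run_in_executor(executor, _prefetch_checkpoint, path)
        except RuntimeError as exc:
            if "shutdown" not in str(exc):
                raise  # unrelated error: do not hide it
            skipped = len(paths) - index
            # One summary line instead of one traceback per file.
            logger.warning(
                "Prefetch aborted, executor is shut down: skipped %d of %d files",
                skipped,
                len(paths),
            )
            return done, skipped
        done += 1
    return done, 0
```

Calling `prefetch_all` with a live executor prefetches every path. Calling it after `executor.shutdown()` logs one warning and returns immediately, so any earlier root-cause error stays visible in the log.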

Example

The report includes no code example. Diagnosing the failure requires reading the startup log together with the prefetch and engine-startup code paths.

Notes

The failure involves two interacting pieces: the asynchronous prefetching of checkpoint files and the engine core initialization. Prefetch tasks that are still pending when startup fails should be cancelled or awaited before shutdown, so that they neither log spurious errors nor leak resources.
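
The cancel-before-shutdown idea in the note above can be sketched as follows. This is only an illustration under stated assumptions: `_prefetch_one` and `startup_with_cleanup` are hypothetical names, the slow I/O is simulated with `asyncio.sleep`, and the startup failure is raised by hand.

```python
import asyncio


async def _prefetch_one(path: str) -> None:
    """Hypothetical background prefetch; the sleep stands in for slow I/O."""
    await asyncio.sleep(3600)


async def startup_with_cleanup(paths: list[str]) -> list[bool]:
    """Start prefetch tasks, simulate a startup failure, then clean up.

    Returns one flag per task saying whether it ended as cancelled.
    """
    tasks = [asyncio.create_task(_prefetch_one(p)) for p in paths]
    try:
        await asyncio.sleep(0)  # give the tasks a chance to start
        raise RuntimeError("simulated startup failure")
    except RuntimeError:
        # Cancel outstanding work before tearing anything else down.
        for task in tasks:
            task.cancel()
        # return_exceptions=True collects CancelledError instead of raising.
        results = await asyncio.gather(*tasks, return_exceptions=True)
        return [isinstance(r, asyncio.CancelledError) for r in results]
```

Note that cancelling a task that is awaiting `asyncio.to_thread` does not interrupt a thread that is already running. Cancellation only prevents further waiting and new submissions, so a real fix may also need a stop flag that the worker function checks.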

Recommendation

Treat the prefetch warnings as noise and look further up the log for the first error, as the final message ("See root cause above") instructs. As a follow-up, review the startup failure path so that pending prefetch work is stopped cleanly, and investigate the possible resource leak from destroy_process_group() not being called before program exit.
