vllm - ✅(Solved) Fix [CI Failure]: LoRA TP (Distributed) _get_lora_aux_cuda_stream is not defined [1 pull requests, 1 participants]

GitHub stats
vllm-project/vllm#39804 (fetched 2026-04-16 06:36:34)
Comments: 0
Participants: 1
Timeline: 4
Reactions: 0
Timeline (top): cross-referenced ×2, added_to_project_v2 ×1, labeled ×1

Error Message

Excerpt showing the first error line and the final frames; the full traceback is listed under Code Example.

[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110] EngineCore failed to start.
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/layers/base_linear.py", line 87, in __init__
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     self._init_lora_stream_context()
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/layers/base_linear.py", line 96, in _init_lora_stream_context
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     self._lora_stream = _get_lora_aux_cuda_stream()
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110] NameError: name '_get_lora_aux_cuda_stream' is not defined

Root Cause

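Per PR #39834, _get_lora_aux_cuda_stream was defined only inside the if envs.VLLM_LORA_ENABLE_DUAL_STREAM: block, which is evaluated when base_linear.py is imported, while the layer re-reads the same environment variable when it is constructed. When the two reads disagree, the call in _init_lora_stream_context hits an undefined name.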

Fix Action

Fixed

PR fix notes

PR #39834: fix(lora): define _get_lora_aux_cuda_stream unconditionally to prevent NameError

Description (problem / solution / changelog)

Problem: _get_lora_aux_cuda_stream was defined inside if envs.VLLM_LORA_ENABLE_DUAL_STREAM: at module import time, but self._enable_aux_cuda_stream re-reads the same env var at object creation time. Since the EngineCore subprocess is launched via fork by default, the env var value at fork time can differ from its value when base_linear.py was first imported, causing a NameError when _init_lora_stream_context tries to call the undefined function.

Solution: Moving _get_lora_aux_cuda_stream outside the conditional block ensures it is always available; the lora_linear_async custom op registration stays conditional since it is a one-time side effect that should only happen when dual stream is enabled.

Fixes #39804
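The mismatch described in the PR notes can be reproduced in isolation. The following is a minimal sketch, not vLLM code: the environment variable DEMO_DUAL_STREAM, the helper _get_aux_stream, and the Layer class are hypothetical stand-ins chosen for illustration. It executes a small module body while the flag is off, flips the flag, and then constructs the object, which re-reads the flag and calls the helper that was never defined.

```python
import os
import types

# Hypothetical stand-in for a module that defines a helper conditionally.
# None of these names come from vLLM; they only mirror the pattern.
MODULE_SOURCE = '''
import os

def _flag():
    return os.environ.get("DEMO_DUAL_STREAM", "0") == "1"

if _flag():  # evaluated once, when the module body runs (import time)
    def _get_aux_stream():
        return "aux-stream"

class Layer:
    def __init__(self):
        # The flag is read again here, at object creation time.
        self.stream = _get_aux_stream() if _flag() else None
'''


def import_demo_module():
    """Run the module body in a fresh namespace, as an import would."""
    module = types.ModuleType("demo_base_linear")
    exec(MODULE_SOURCE, module.__dict__)
    return module


def reproduce():
    """Return the NameError message, or None if no error occurred."""
    os.environ["DEMO_DUAL_STREAM"] = "0"
    module = import_demo_module()         # helper is skipped
    os.environ["DEMO_DUAL_STREAM"] = "1"  # flag changes after import
    try:
        module.Layer()
    except NameError as exc:
        return str(exc)
    finally:
        os.environ.pop("DEMO_DUAL_STREAM", None)
    return None


if __name__ == "__main__":
    print(reproduce())
```

The sketch only shows that two reads of one flag at different times can disagree; how the value actually changes in the CI run is as described in the PR notes, not something this example establishes.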

Changed files

  • vllm/lora/layers/base_linear.py (modified, +42/-39)

Code Example

pytest -v -s -x lora/test_olmoe_tp.py

---

Loading safetensors checkpoint shards: 100% 3/3 [00:16<00:00,  5.49s/it]
--
[2026-04-14T06:24:27Z] (EngineCore pid=5105) INFO 04-14 06:24:27 [default_loader.py:384] Loading weights took 16.75 seconds
[2026-04-14T06:24:27Z] (EngineCore pid=5105) INFO 04-14 06:24:27 [unquantized.py:340] Using MoEPrepareAndFinalizeNoDPEPModular
[2026-04-14T06:24:27Z] (EngineCore pid=5105) INFO 04-14 06:24:27 [utils.py:99] MoE model detected. Using fused MoE LoRA implementation.
[2026-04-14T06:24:27Z] (EngineCore pid=5105) INFO 04-14 06:24:27 [punica_selector.py:20] Using PunicaWrapperGPU.
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110] EngineCore failed to start.
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110] Traceback (most recent call last):
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 1084, in run_engine_core
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     engine_core = EngineCoreProc(*args, engine_index=dp_rank, **kwargs)
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 850, in __init__
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     super().__init__(
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 116, in __init__
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     self.model_executor = executor_class(vllm_config)
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/abstract.py", line 109, in __init__
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     self._init_executor()
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/uniproc_executor.py", line 52, in _init_executor
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     self.driver_worker.load_model()
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 323, in load_model
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     self.model_runner.load_model(load_dummy_weights=load_dummy_weights)
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4770, in load_model
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     self.model = self.load_lora_model(
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]                  ^^^^^^^^^^^^^^^^^^^^^
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/lora_model_runner_mixin.py", line 46, in load_lora_model
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     return self.lora_manager.create_lora_manager(model, vllm_config)
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/worker_manager.py", line 265, in create_lora_manager
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     lora_manager = create_lora_manager(
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]                    ^^^^^^^^^^^^^^^^^^^^
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/model_manager.py", line 944, in create_lora_manager
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     lora_manager = lora_manager_cls(
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]                    ^^^^^^^^^^^^^^^^^
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/model_manager.py", line 856, in __init__
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     super().__init__(
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/model_manager.py", line 115, in __init__
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     self._create_lora_modules()
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/model_manager.py", line 406, in _create_lora_modules
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     from_layer(
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/utils.py", line 119, in from_layer
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     instance_layer = lora_cls(layer)
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]                      ^^^^^^^^^^^^^^^
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/layers/column_parallel_linear.py", line 368, in __init__
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     super().__init__(base_layer)
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/layers/column_parallel_linear.py", line 186, in __init__
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     super().__init__(base_layer)
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/layers/column_parallel_linear.py", line 85, in __init__
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     super().__init__(base_layer)
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/layers/base_linear.py", line 87, in __init__
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     self._init_lora_stream_context()
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/layers/base_linear.py", line 96, in _init_lora_stream_context
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]     self._lora_stream = _get_lora_aux_cuda_stream()
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110]                         ^^^^^^^^^^^^^^^^^^^^^^^^^
[2026-04-14T06:24:28Z] (EngineCore pid=5105) ERROR 04-14 06:24:28 [core.py:1110] NameError: name '_get_lora_aux_cuda_stream' is not defined
[2026-04-14T06:24:28Z] (EngineCore pid=5105) Process EngineCore:
 


Name of failing test

lora/test_olmoe_tp.py

Basic information

  • Flaky test
  • Can reproduce locally
  • Caused by external libraries (e.g. bug in transformers)

🧪 Describe the failing test

LoRA TP (Distributed) test failure

pytest -v -s -x lora/test_olmoe_tp.py

due to NameError: name '_get_lora_aux_cuda_stream' is not defined (the full traceback is listed under Code Example above).

📝 History of failing test

Not failed in previous nightly.

CC List.

No response

extent analysis

TL;DR

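The NameError arises because _get_lora_aux_cuda_stream is defined only when VLLM_LORA_ENABLE_DUAL_STREAM is set at import time, while the caller checks the variable again later. PR #39834 fixes this by defining the function unconditionally.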

Guidance

  • Check whether _get_lora_aux_cuda_stream in vllm/lora/layers/base_linear.py is defined at module level or only inside a conditional block such as if envs.VLLM_LORA_ENABLE_DUAL_STREAM:.
  • Compare when the environment variable is read: once at import time for the definition, and again at object creation time through self._enable_aux_cuda_stream.
  • If the EngineCore subprocess is started via fork, check whether the variable changed between the first import of base_linear.py and the fork.
  • Review recent changes to base_linear.py; the test did not fail in the previous nightly.
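The first check in the list above can be scripted. This is a small sketch using only importlib from the standard library; module_defines is a hypothetical helper name, and the demonstration uses the os module so that it runs without vLLM installed.

```python
import importlib


def module_defines(module_name: str, attr: str) -> bool:
    """Return True if importing module_name leaves attr defined at module level."""
    module = importlib.import_module(module_name)
    return hasattr(module, attr)


# Standard-library demonstration so the snippet runs anywhere.
print(module_defines("os", "getcwd"))           # True
print(module_defines("os", "_no_such_helper"))  # False
```

In an environment with vLLM installed, the same call with "vllm.lora.layers.base_linear" (the module path implied by the traceback) and "_get_lora_aux_cuda_stream" would report whether the helper exists in the current process. It says nothing about a worker process that imported the module under different environment settings.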

Example

The change described in the PR fix notes is structural: _get_lora_aux_cuda_stream moves out of the conditional block so that it is always defined, and only the lora_linear_async custom op registration stays conditional.
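A minimal sketch of that pattern, under stated assumptions: DEMO_DUAL_STREAM, _get_aux_stream, _REGISTERED_OPS, and Layer are hypothetical names used for illustration, and the list append merely stands in for a one-time registration side effect. It is not the code from PR #39834.

```python
import os

# Hypothetical registry standing in for a one-time registration side effect.
_REGISTERED_OPS = []


def _dual_stream_enabled() -> bool:
    return os.environ.get("DEMO_DUAL_STREAM", "0") == "1"


def _get_aux_stream() -> str:
    # Defined unconditionally, so any later caller can resolve the name.
    return "aux-stream"


if _dual_stream_enabled():
    # Only the side effect stays behind the import-time check.
    _REGISTERED_OPS.append("demo_linear_async")


class Layer:
    def __init__(self):
        # Re-reading the flag here is now safe: the helper always exists.
        self.stream = _get_aux_stream() if _dual_stream_enabled() else None
```

With this layout, a flag that changes between import and object creation can still alter behavior, but it can no longer produce a NameError.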

Notes

The traceback ends in _init_lora_stream_context (base_linear.py, line 96), which calls _get_lora_aux_cuda_stream. According to the PR description, the name is missing not because of a bad import but because its definition was skipped when the module was first imported.

Recommendation

Apply the change from PR #39834, which defines _get_lora_aux_cuda_stream unconditionally and keeps the lora_linear_async custom op registration conditional, then rerun pytest -v -s -x lora/test_olmoe_tp.py to confirm.
