vllm - ✅(Solved) Fix [Bug]: Kimi-K2.5 on version 0.18.0 results in a KeyError when the pipeline parallelism (PP) is greater than or equal to 2 [1 pull request, 3 comments, 4 participants]

vllm · 2026-03-24 07:53:21


GitHub stats

vllm-project/vllm#37974•Fetched 2026-04-08 01:22:19


Timeline (top): commented ×3, cross-referenced ×1, labeled ×1, subscribed ×1

Error Message

(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] Error executing method 'initialize_from_config'. This might cause deadlock in distributed execution. [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] Traceback (most recent call last): [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/ray_utils.py", line 65, in execute_method [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] return run_method(self, method, args, kwargs) [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/serial_utils.py", line 459, in run_method [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] return func(*args, **kwargs) [repeated 98x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] ^^^^^^^^^^^^^^^^^^^^^ [repeated 98x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] File "/usr/local/lib/python3.12/dist-packages/ray/util/tracing/tracing_helper.py", line 461, in _resume_span [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] return method(self, *_args, **_kwargs) [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 556, in initialize_from_config [repeated 98x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] self.worker.initialize_from_config(kv_cache_config) # type: ignore [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] self.model_runner.initialize_kv_cache(kv_cache_config) [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 6481, in initialize_kv_cache [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] self.initialize_attn_backend(kv_cache_config) [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 5904, in initialize_attn_backend [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] attn_backends = get_attn_backends_for_group(kv_cache_group_spec) [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 5863, in get_attn_backends_for_group [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] attn_backend = layers[layer_name].get_attn_backend() [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] ~~~~~~^^^^^^^^^^^^ [repeated 49x across cluster]
(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) ERROR 03-24 15:39:40 [ray_utils.py:74] KeyError: 'language_model.model.layers.30.self_attn.attn' [repeated 49x across cluster]

PR fix notes

PR #38087: [Bugfix] fix: normalize layer names for kv cache group to prevent KeyError in

Repository: vllm-project/vllm
Author: mahendrarathore1742
State: open | merged: False
Link: https://github.com/vllm-project/vllm/pull/38087

Description (problem / solution / changelog)

Bug: When running Kimi-K2.5 with vLLM (v0.18.0) in distributed or pipeline parallel mode (e.g., 64 H100 GPUs, TP=8, PP=8), vLLM fails to start with a repeated KeyError:

KeyError: 'language_model.model.layers.XX.self_attn.attn'

This occurs because some layer names in the kv cache group spec include an extra .model. prefix, while the actual model config omits it. This mismatch causes failures in kv cache initialization and distributed attention backend setup.

### Fix

- Added normalization logic in the kv cache group layer name resolution to handle .model. prefix mismatches.
- Now, both 'language_model.model.layers.XX.self_attn.attn' and 'language_model.layers.XX.self_attn.attn' are mapped correctly, preventing KeyErrors.
- Added a unit test to verify the normalization logic and ensure future regressions are caught.
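
The normalization described above might look roughly like the following sketch. It is not the code from PR #38087: `candidate_names`, `resolve_layer_name`, and the toy `layers` dictionary are hypothetical names chosen for illustration, and the sketch assumes the two spellings differ only by a single `model.` segment directly after `language_model.`.

```python
def candidate_names(name: str) -> list[str]:
    """Return the name itself plus the variant with 'model.' added or removed."""
    prefix = "language_model."
    extra = prefix + "model."
    candidates = [name]
    if name.startswith(extra):
        # language_model.model.layers.X -> language_model.layers.X
        candidates.append(prefix + name[len(extra):])
    elif name.startswith(prefix):
        # language_model.layers.X -> language_model.model.layers.X
        candidates.append(extra + name[len(prefix):])
    return candidates


def resolve_layer_name(name: str, layers: dict) -> str:
    """Return the key of `layers` that matches `name` under either spelling."""
    for candidate in candidate_names(name):
        if candidate in layers:
            return candidate
    raise KeyError(f"{name!r} not found (tried {candidate_names(name)})")


# Toy stand-in for the mapping of registered attention layers (hypothetical).
layers = {"language_model.layers.30.self_attn.attn": object()}
resolved = resolve_layer_name("language_model.model.layers.30.self_attn.attn", layers)
print(resolved)
```

Resolving to an existing key, rather than rewriting every name up front, keeps the lookup strict: a name that matches under neither spelling still raises a KeyError.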

### Test

- The new unit test for layer name normalization passes.
- All core and optional dependencies are installed.
- Remaining test errors are unrelated to this patch (environment/hardware/optional package issues).

fix #37974

Signed-off-by: mahendrarathore1742

Changed files

vllm/config/vllm.py (modified, +53/-8)
vllm/tests/unit/test_get_layers_normalization.py (added, +50/-0)


Your current environment

<details> <summary>The output of <code>python collect_env.py</code></summary>

Your output of `python collect_env.py` here

</details>

🐛 Describe the bug

I am using 64 H100 GPUs to run Kimi-K2.5 with vLLM 0.18.0. This is my startup command:

```
bash ./multi-node-serving.sh leader --ray_port=6379 --ray_cluster_size=8 &&
vllm serve /models/preset/moonshotai/Kimi-K2.5/v1.0 \
  --port 8087 \
  --distributed-executor-backend ray \
  --trust-remote-code \
  --tensor-parallel-size 8 \
  --pipeline-parallel-size 8 \
  --tool-call-parser kimi_k2 \
  --reasoning-parser kimi_k2
```
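
As a quick sanity check on the command above, the parallel sizes can be compared with the hardware in the report. The sketch below assumes that only tensor and pipeline parallelism are in use, so the number of GPU workers is their product; `required_gpus` is a hypothetical helper, not a vLLM function.

```python
def required_gpus(tensor_parallel_size: int, pipeline_parallel_size: int) -> int:
    """GPU workers needed when only tensor and pipeline parallelism are used."""
    return tensor_parallel_size * pipeline_parallel_size


tp_size = 8          # --tensor-parallel-size 8
pp_size = 8          # --pipeline-parallel-size 8
available_gpus = 64  # "64 H100" from the report

needed = required_gpus(tp_size, pp_size)
fits = needed == available_gpus
print(f"needed={needed}, available={available_gpus}, fits={fits}")
```

The sizes are consistent with the hardware, so the failure is not a resource mismatch.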

It fails with KeyError: 'language_model.model.layers.30.self_attn.attn' [repeated 49x across cluster]. The full traceback is identical to the one listed under Error Message above.


extent analysis

Fix Plan

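The traceback ends in a KeyError raised by layers[layer_name].get_attn_backend() inside get_attn_backends_for_group: the name 'language_model.model.layers.30.self_attn.attn' is not a key of the layers dictionary. According to PR #38087, some layer names in the kv cache group spec carry an extra .model. segment that the registered names omit, so the lookup fails. A real fix has to reconcile the two naming schemes; guarding the lookup alone only changes how the failure is reported.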

Here are the steps to fix the issue:

Compare the layer names in the kv cache group spec with the keys of the layers dictionary, and confirm whether they differ only by the .model. segment.
Verify that the model is loaded correctly and that its layers are properly initialized.
Reconcile the names, for example by normalizing the .model. segment as PR #38087 proposes. While debugging, you can also guard the lookup with the dictionary's get() method, which returns None if the key does not exist. This does not fix the mismatch, but it lets you report the missing name clearly.

Example code:

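# Assuming layers is a dictionary that maps layer names to attention layer objects
layer_name = 'language_model.model.layers.30.self_attn.attn'
layer = layers.get(layer_name)
if layer is None:
    # The key does not exist: report it and show a few registered names for comparison
    print(f"Key {layer_name!r} does not exist; registered names include {sorted(layers)[:3]}")
else:
    # The key exists: ask the layer for its attention backend, as the failing line does
    attn_backend = layer.get_attn_backend()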

Alternatively, you can use a try-except block to catch the KeyError exception and handle it:

layer_name = 'language_model.model.layers.30.self_attn.attn'
try:
    layer = layers[layer_name]
except KeyError:
    # Handle the case where the key does not exist
    print(f"Key {layer_name!r} does not exist")
else:
    attn_backend = layer.get_attn_backend()
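
To find out how the expected and registered names differ, it can help to list every missing name next to its closest registered neighbors. The following is a minimal diagnostic sketch that uses the standard-library `difflib`; `report_missing_layers`, `expected`, and the toy `layers` dictionary are hypothetical and stand in for the names taken from the kv cache group spec and the real layer mapping.

```python
import difflib


def report_missing_layers(expected_names, layers):
    """Map each expected name missing from `layers` to its closest registered names."""
    registered = list(layers)
    report = {}
    for name in expected_names:
        if name not in layers:
            report[name] = difflib.get_close_matches(name, registered, n=3, cutoff=0.6)
    return report


# Toy data (hypothetical): the registered names lack the '.model.' segment.
layers = {
    "language_model.layers.29.self_attn.attn": object(),
    "language_model.layers.30.self_attn.attn": object(),
}
expected = ["language_model.model.layers.30.self_attn.attn"]

report = report_missing_layers(expected, layers)
for missing, close in report.items():
    print(f"missing: {missing}")
    print(f"  closest registered: {close}")
```

If every missing name has a near-identical registered neighbor, the problem is a naming mismatch rather than a layer that was never created.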

Verification

To verify that the fix worked, start the server again with the same command, including --pipeline-parallel-size 8, and check whether the workers get past initialize_from_config. If the KeyError is no longer raised, the fix was successful.
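
With many workers, checking the logs by eye is error-prone. A small sketch like the one below can scan captured log text for the layer KeyError; the regular expression is modeled on the log line shown in the traceback, and `missing_layers_in_log` is a hypothetical helper.

```python
import re

# Matches KeyError lines that name an attention layer, as in the traceback.
KEYERROR_RE = re.compile(r"KeyError: '([^']+\.self_attn\.attn)'")


def missing_layers_in_log(log_text: str) -> set[str]:
    """Return the distinct layer names reported in KeyError lines."""
    return set(KEYERROR_RE.findall(log_text))


sample_log = (
    "(EngineCore pid=114613) (RayWorkerWrapper pid=29113, ip=10.45.7.181) "
    "ERROR 03-24 15:39:40 [ray_utils.py:74] "
    "KeyError: 'language_model.model.layers.30.self_attn.attn' "
    "[repeated 49x across cluster]\n"
)
found = missing_layers_in_log(sample_log)
print(found or "no layer KeyError found")
```

An empty result on the logs of a fresh start indicates that this particular KeyError no longer occurs; it does not rule out other startup failures.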

Extra Tips

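After applying the fix, test both a single-stage run and a run with pipeline parallelism of 2 or more, since the issue is reported for PP greater than or equal to 2.
Consider making the lookup failure report the missing layer name together with the registered names, so that similar mismatches are easier to diagnose in the future.
If the issue persists, print the layer names in the kv cache group spec and the keys of the layers dictionary on a failing worker, then compare them to identify the root cause.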
