vllm - ✅ (Solved) [Bug]: worker processes remain after SIGKILL of the serve process for the Qwen3-Omni model [1 pull request, 1 comment, 2 participants]

vllm · 2026-05-19 06:25:03


GitHub stats

vllm-project/vllm#43060 • Fetched 2026-05-20 03:40:06

Author: zhumingjue138
Participants: jshaofa-ui, zhumingjue138
Timeline (top): commented ×1, cross-referenced ×1, labeled ×1

Fix Action

Fixed

Fixed by PR: [Test] Add scenarios for L5 reliability test (https://github.com/vllm-project/vllm-omni/pull/3729)

PR fix notes

PR #3729: [Test] Add scenarios for L5 reliability test

Repository: vllm-project/vllm-omni
Author: zhumingjue138
State: open | merged: False
Link: https://github.com/vllm-project/vllm-omni/pull/3729

Description (problem / solution / changelog)


Purpose

RFC #2366

Test Plan

pytest -s -v tests/dfx/reliability/test_reliability_qwen3_omni.py

pytest -s -v tests/dfx/reliability/test_reliability_wan22.py

Test Result

qwen3omni =========== 11 passed, 9 skipped, 18 warnings in 3194.82s (0:53:14) ============

Skipped test cases:

| Case | Issue |
| --- | --- |
| `test_reliability_fault_gpu_oom_state_converges_after_fault_removed` | OOM injection causes the serving instance to exit |
| `test_reliability_fault_process_kill_tree_with_load_fast_fail_and_cleanup[omni_server_function0-sigterm]` | #3683 |
| `test_reliability_fault_process_kill_serve_root_with_load_fast_fail_and_cleanup[sigterm-sigint-sigkill]` | #3683 |
| `test_reliability_fault_process_kill_serve_root_no_load_fast_fail_and_cleanup[sigkill]` | vllm #43060 |
| `test_reliability_fault_process_kill_worker_with_load_request_failure` | #3683 |

wan22 =========== 15 passed, 3 skipped, 18 warnings in 1476.81s (0:24:36) ============

Skipped test cases:

| Case | Issue |
| --- | --- |
| `test_reliability_video_oom_recovers_after_fault_removed` | #2327 |
| `test_reliability_fault_process_kill_serve_root_with_load_fast_fail_and_cleanup[sigkill]` | #3725 |
| `test_reliability_fault_process_kill_serve_root_no_load_fast_fail_and_cleanup[sigkill]` | #3725 |


Changed files

tests/dfx/reliability/helpers.py (modified, +452/-25)
tests/dfx/reliability/test_reliability_qwen3_omni.py (modified, +320/-64)
tests/dfx/reliability/test_reliability_wan22.py (modified, +273/-51)


Your current environment

<details> <summary>The output of <code>python collect_env.py</code></summary>

Your output of `python collect_env.py` here

</details>

🐛 Describe the bug

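Start the server:

/workspace/.venv/bin/python3 -m vllm_omni.entrypoints.cli.main serve /nvme1n1p1/models/Qwen/Qwen3-Omni-30B-A3B-Instruct --host 127.0.0.1 --port 53211 --omni --async-chunk --stage-init-timeout 600 --init-timeout 900 --log-stats --stage-configs-path /nvme1n1p1/xxx/vllm-omni-zmj-reli/vllm-omni/vllm_omni/deploy/qwen3_omni_moe.yaml

Process list while the server is running (the resource tracker and the three VLLM::StageEngineCoreProc_DP0 workers are children of the serve process, PID 530020):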

root           1       0  0 May12 pts/0    00:00:00 /bin/bash
root          81       0  0 May12 pts/1    00:00:00 bash
root        8060       0  0 May12 pts/2    00:00:00 bash
root      530020      81 25 06:14 pts/1    00:02:01 /workspace/.venv/bin/python3 -m vllm_omni.entrypoints.cli.main serve /nvme1n1p1/models/Qwen/Qwen3-Omni-30B-A3B-Instruct --host 127.
root      530283  530020  0 06:15 pts/1    00:00:00 /workspace/.venv/bin/python3 -c from multiprocessing.resource_tracker import main;main(57)
root      530284  530020 23 06:15 pts/1    00:01:50 VLLM::StageEngineCoreProc_DP0
root      530287  530020 20 06:15 pts/1    00:01:33 VLLM::StageEngineCoreProc_DP0
root      531366  530020 19 06:16 pts/1    00:01:10 VLLM::StageEngineCoreProc_DP0
root      532096    8060  0 06:22 pts/2    00:00:00 ps -ef

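Kill the serve process with SIGKILL:

kill -9 530020

Process list after the kill (the serve process is gone, but the resource tracker and the three workers are still running, now reparented to PID 1):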

root           1       0  0 May12 pts/0    00:00:00 /bin/bash
root          81       0  0 May12 pts/1    00:00:00 bash
root        8060       0  0 May12 pts/2    00:00:00 bash
root      530283       1  0 06:15 pts/1    00:00:00 /workspace/.venv/bin/python3 -c from multiprocessing.resource_tracker import main;main(57)
root      530284       1 23 06:15 pts/1    00:01:51 VLLM::StageEngineCoreProc_DP0
root      530287       1 19 06:15 pts/1    00:01:34 VLLM::StageEngineCoreProc_DP0
root      531366       1 18 06:16 pts/1    00:01:11 VLLM::StageEngineCoreProc_DP0
root      532098    8060  0 06:23 pts/2    00:00:00 ps -ef
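The manual check in the two listings (same worker PIDs, but PPID changed from 530020 to 1) could be automated roughly as below. This is a minimal sketch, not code from the PR: `parse_ps_ef` and `find_orphaned_workers` are hypothetical helper names, the `VLLM::` marker is taken from the process titles in the listing, and treating PPID 1 as "orphaned" is an assumption that holds for this container but not for hosts where a subreaper adopts orphans.

```python
from typing import List, NamedTuple


class PsRow(NamedTuple):
    pid: int
    ppid: int
    cmd: str


def parse_ps_ef(text: str) -> List[PsRow]:
    """Parse `ps -ef` rows (UID PID PPID C STIME TTY TIME CMD)."""
    rows = []
    for line in text.splitlines():
        fields = line.split(None, 7)
        # Skip blank lines, truncated lines and the header row.
        if len(fields) < 8 or not fields[1].isdigit() or not fields[2].isdigit():
            continue
        rows.append(PsRow(int(fields[1]), int(fields[2]), fields[7]))
    return rows


def find_orphaned_workers(text: str, marker: str = "VLLM::") -> List[PsRow]:
    """Rows whose command contains `marker` and whose parent is PID 1."""
    return [r for r in parse_ps_ef(text) if r.ppid == 1 and marker in r.cmd]


# Rows copied from the listing taken after `kill -9 530020`.
AFTER_KILL = """\
root          81       0  0 May12 pts/1    00:00:00 bash
root      530283       1  0 06:15 pts/1    00:00:00 /workspace/.venv/bin/python3 -c from multiprocessing.resource_tracker import main;main(57)
root      530284       1 23 06:15 pts/1    00:01:51 VLLM::StageEngineCoreProc_DP0
root      530287       1 19 06:15 pts/1    00:01:34 VLLM::StageEngineCoreProc_DP0
root      531366       1 18 06:16 pts/1    00:01:11 VLLM::StageEngineCoreProc_DP0
root      532098    8060  0 06:23 pts/2    00:00:00 ps -ef
"""

if __name__ == "__main__":
    print([r.pid for r in find_orphaned_workers(AFTER_KILL)])
    # [530284, 530287, 531366]
```

The resource tracker (PID 530283) is also orphaned but is not reported here because its command line does not contain the marker; pass a different `marker` to include it.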

Console output of the serve process up to the kill:

(APIServer pid=530020) INFO 05-19 06:17:38 [launcher.py:46] Route: /v1/omni/sleep, Methods: POST
(APIServer pid=530020) INFO 05-19 06:17:38 [launcher.py:46] Route: /v1/omni/wakeup, Methods: POST
(APIServer pid=530020) INFO 05-19 06:17:38 [launcher.py:57] Route: /v1/audio/speech/stream, Endpoint: streaming_speech
(APIServer pid=530020) INFO 05-19 06:17:38 [launcher.py:57] Route: /v1/video/chat/stream, Endpoint: streaming_video_chat
(APIServer pid=530020) INFO 05-19 06:17:38 [launcher.py:57] Route: /v1/realtime, Endpoint: realtime_websocket
(APIServer pid=530020) INFO:     Started server process [530020]
(APIServer pid=530020) INFO:     Waiting for application startup.
(APIServer pid=530020) INFO:     Application startup complete.
Killed
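SIGKILL cannot be caught, so the serve process gets no chance to stop its workers before it dies. One common harness-side mitigation, sketched below, is to launch the server in its own session and signal the whole process group during cleanup. This is not taken from the PR or from vLLM: `start_in_new_group` and `kill_group` are hypothetical helpers, the sketch is POSIX-only, and it assumes the workers stay in the server's process group (a worker that calls `setsid()` itself would not be reached).

```python
import os
import signal
import subprocess
import sys


def start_in_new_group(argv):
    """Start `argv` as the leader of a new session and process group."""
    # With start_new_session=True the child calls setsid(), so its
    # process group ID equals its PID.
    return subprocess.Popen(argv, start_new_session=True)


def kill_group(proc, sig=signal.SIGKILL, timeout=10.0):
    """Send `sig` to every process in the group led by `proc`, then reap it."""
    try:
        os.killpg(proc.pid, sig)
    except ProcessLookupError:
        # The whole group has already exited.
        pass
    proc.wait(timeout=timeout)
    return proc.returncode


if __name__ == "__main__":
    # Stand-in for the real serve command: a process that just sleeps.
    server = start_in_new_group(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    print("exit status:", kill_group(server))
```

Because the signal goes to the group rather than to a single PID, children that inherited the group are killed together with the root process instead of being reparented.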

