vllm - ✅(Solved) Fix [Bug]: worker process remains after SIGKILL the serve process for Qwen3omni model [1 pull requests, 1 comments, 2 participants]

GitHub stats

vllm-project/vllm#43060 (fetched 2026-05-20 03:40:06)

  • Comments: 1
  • Participants: 2
  • Timeline events: 3 (commented ×1, cross-referenced ×1, labeled ×1)
  • Reactions: 0

Fix Action

Fixed

PR fix notes

PR #3729: [Test] Add scenarios for L5 reliability test

Description (problem / solution / changelog)


Purpose

RFC #2366

Test Plan

pytest -s -v tests/dfx/reliability/test_reliability_qwen3_omni.py

pytest -s -v tests/dfx/reliability/test_reliability_wan22.py

Test Result

qwen3omni =========== 11 passed, 9 skipped, 18 warnings in 3194.82s (0:53:14) ============

Skipped test cases:

| Case | Issue |
| --- | --- |
| `test_reliability_fault_gpu_oom_state_converges_after_fault_removed` | OOM injection causes the serving instance to exit |
| `test_reliability_fault_process_kill_tree_with_load_fast_fail_and_cleanup[omni_server_function0-sigterm]` | #3683 |
| `test_reliability_fault_process_kill_serve_root_with_load_fast_fail_and_cleanup[sigterm-sigint-sigkill]` | #3683 |
| `test_reliability_fault_process_kill_serve_root_no_load_fast_fail_and_cleanup[sigkill]` | vllm #43060 |
| `test_reliability_fault_process_kill_worker_with_load_request_failure` | #3683 |

wan22 =========== 15 passed, 3 skipped, 18 warnings in 1476.81s (0:24:36) ============

Skipped test cases:

| Case | Issue |
| --- | --- |
| `test_reliability_video_oom_recovers_after_fault_removed` | #2327 |
| `test_reliability_fault_process_kill_serve_root_with_load_fast_fail_and_cleanup[sigkill]` | #3725 |
| `test_reliability_fault_process_kill_serve_root_no_load_fast_fail_and_cleanup[sigkill]` | #3725 |

<details> <summary> Essential Elements of an Effective PR Description Checklist </summary>
  • [√] The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • [√] The test plan. Please provide the test scripts & test commands. Please state the reasons if your codes don't require additional test scripts. For test file guidelines, please check the test style doc
  • [√] The test results. Please paste the results comparison before and after, or the e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. Please run mkdocs serve to sync the documentation editions to ./docs.
  • (Optional) Release notes update. If your change is user-facing, please update the release notes draft.
</details>


Changed files

  • tests/dfx/reliability/helpers.py (modified, +452/-25)
  • tests/dfx/reliability/test_reliability_qwen3_omni.py (modified, +320/-64)
  • tests/dfx/reliability/test_reliability_wan22.py (modified, +273/-51)


Your current environment

<details> <summary>The output of <code>python collect_env.py</code></summary>
Your output of `python collect_env.py` here
</details>

🐛 Describe the bug

The serve process was started with:

/workspace/.venv/bin/python3 -m vllm_omni.entrypoints.cli.main serve /nvme1n1p1/models/Qwen/Qwen3-Omni-30B-A3B-Instruct --host 127.0.0.1 --port 53211 --omni --async-chunk --stage-init-timeout 600 --init-timeout 900 --log-stats --stage-configs-path /nvme1n1p1/xxx/vllm-omni-zmj-reli/vllm-omni/vllm_omni/deploy/qwen3_omni_moe.yaml

`ps -ef` before the kill. The serve process (PID 530020) is the parent of the resource tracker and of the three `VLLM::StageEngineCoreProc_DP0` workers:

root           1       0  0 May12 pts/0    00:00:00 /bin/bash
root          81       0  0 May12 pts/1    00:00:00 bash
root        8060       0  0 May12 pts/2    00:00:00 bash
root      530020      81 25 06:14 pts/1    00:02:01 /workspace/.venv/bin/python3 -m vllm_omni.entrypoints.cli.main serve /nvme1n1p1/models/Qwen/Qwen3-Omni-30B-A3B-Instruct --host 127.
root      530283  530020  0 06:15 pts/1    00:00:00 /workspace/.venv/bin/python3 -c from multiprocessing.resource_tracker import main;main(57)
root      530284  530020 23 06:15 pts/1    00:01:50 VLLM::StageEngineCoreProc_DP0
root      530287  530020 20 06:15 pts/1    00:01:33 VLLM::StageEngineCoreProc_DP0
root      531366  530020 19 06:16 pts/1    00:01:10 VLLM::StageEngineCoreProc_DP0
root      532096    8060  0 06:22 pts/2    00:00:00 ps -ef

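The serve process is then killed with SIGKILL:

kill -9 530020

`ps -ef` after the kill. PID 530020 is gone, but the resource tracker and the three workers are still running, now reparented to PID 1: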

root           1       0  0 May12 pts/0    00:00:00 /bin/bash
root          81       0  0 May12 pts/1    00:00:00 bash
root        8060       0  0 May12 pts/2    00:00:00 bash
root      530283       1  0 06:15 pts/1    00:00:00 /workspace/.venv/bin/python3 -c from multiprocessing.resource_tracker import main;main(57)
root      530284       1 23 06:15 pts/1    00:01:51 VLLM::StageEngineCoreProc_DP0
root      530287       1 19 06:15 pts/1    00:01:34 VLLM::StageEngineCoreProc_DP0
root      531366       1 18 06:16 pts/1    00:01:11 VLLM::StageEngineCoreProc_DP0
root      532098    8060  0 06:23 pts/2    00:00:00 ps -ef
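
The leftover workers in the listing above can also be found programmatically. The following is a minimal sketch, not part of vLLM or of the PR's test helpers: it parses `ps -ef` text (columns UID, PID, PPID, C, STIME, TTY, TIME, CMD) and reports processes whose command contains a marker string and whose parent is PID 1. The names `PsRow`, `parse_ps_ef` and `find_orphans` are hypothetical, and the "parent is PID 1" test is an assumption that matches this report; on hosts with a subreaper, orphans may be adopted by a different process.

```python
from dataclasses import dataclass


@dataclass
class PsRow:
    uid: str
    pid: int
    ppid: int
    cmd: str


def parse_ps_ef(text: str) -> list[PsRow]:
    """Parse `ps -ef` output into rows, skipping a header or blank lines."""
    rows = []
    for line in text.splitlines():
        # Columns: UID PID PPID C STIME TTY TIME CMD (CMD may contain spaces).
        parts = line.split(None, 7)
        if len(parts) < 8 or not parts[1].isdigit():
            continue
        rows.append(PsRow(parts[0], int(parts[1]), int(parts[2]), parts[7]))
    return rows


def find_orphans(rows: list[PsRow], marker: str = "VLLM::StageEngineCoreProc") -> list[PsRow]:
    """Return rows whose command contains `marker` and whose parent is PID 1."""
    return [row for row in rows if row.ppid == 1 and marker in row.cmd]


# The post-kill listing from this report, copied verbatim.
AFTER_KILL = """\
root           1       0  0 May12 pts/0    00:00:00 /bin/bash
root          81       0  0 May12 pts/1    00:00:00 bash
root        8060       0  0 May12 pts/2    00:00:00 bash
root      530283       1  0 06:15 pts/1    00:00:00 /workspace/.venv/bin/python3 -c from multiprocessing.resource_tracker import main;main(57)
root      530284       1 23 06:15 pts/1    00:01:51 VLLM::StageEngineCoreProc_DP0
root      530287       1 19 06:15 pts/1    00:01:34 VLLM::StageEngineCoreProc_DP0
root      531366       1 18 06:16 pts/1    00:01:11 VLLM::StageEngineCoreProc_DP0
root      532098    8060  0 06:23 pts/2    00:00:00 ps -ef
"""

if __name__ == "__main__":
    for row in find_orphans(parse_ps_ef(AFTER_KILL)):
        print(row.pid, row.cmd)
```

Passing a different `marker` (for example `resource_tracker`) would match the leftover resource tracker process as well.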

Serve process log, ending when the process is killed:

(APIServer pid=530020) INFO 05-19 06:17:38 [launcher.py:46] Route: /v1/omni/sleep, Methods: POST
(APIServer pid=530020) INFO 05-19 06:17:38 [launcher.py:46] Route: /v1/omni/wakeup, Methods: POST
(APIServer pid=530020) INFO 05-19 06:17:38 [launcher.py:57] Route: /v1/audio/speech/stream, Endpoint: streaming_speech
(APIServer pid=530020) INFO 05-19 06:17:38 [launcher.py:57] Route: /v1/video/chat/stream, Endpoint: streaming_video_chat
(APIServer pid=530020) INFO 05-19 06:17:38 [launcher.py:57] Route: /v1/realtime, Endpoint: realtime_websocket
(APIServer pid=530020) INFO:     Started server process [530020]
(APIServer pid=530020) INFO:     Waiting for application startup.
(APIServer pid=530020) INFO:     Application startup complete.
Killed
<img width="784" height="47" alt="Image" src="https://github.com/user-attachments/assets/cb0e77ee-33f5-4f7f-8756-153a4c7f6345" />
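
The underlying mechanism is not specific to vLLM. SIGKILL cannot be caught, so the killed process gets no chance to run cleanup code, and on POSIX systems its children are not terminated with it by default. The following stand-alone sketch reproduces that behaviour with plain `subprocess` calls. It is an illustration under stated assumptions (a POSIX host; `PARENT_SRC`, `pid_alive` and `worker_survives_parent_sigkill` are hypothetical names), not vllm-omni code and not the fix for this issue.

```python
import os
import signal
import subprocess
import sys

# Hypothetical stand-in for the serve process: it starts one long-lived child
# (the "worker"), reports the child's PID on stdout, then idles.
PARENT_SRC = """
import subprocess, sys, time
worker = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
print(worker.pid, flush=True)
time.sleep(60)
"""


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence and permissions
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def worker_survives_parent_sigkill() -> bool:
    """SIGKILL the stand-in parent and report whether its child is still alive."""
    parent = subprocess.Popen(
        [sys.executable, "-c", PARENT_SRC], stdout=subprocess.PIPE, text=True
    )
    worker_pid = int(parent.stdout.readline())
    parent.send_signal(signal.SIGKILL)  # same signal as `kill -9 <pid>`
    parent.wait()
    parent.stdout.close()
    survived = pid_alive(worker_pid)
    if survived:
        os.kill(worker_pid, signal.SIGKILL)  # remove the orphan ourselves
    return survived


if __name__ == "__main__":
    print("worker survived parent SIGKILL:", worker_survives_parent_sigkill())
```

Because the parent cannot react to SIGKILL, any cleanup has to be initiated by the workers themselves or by an external supervisor.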

