vllm - [Tracking Issue]: NIXL P/D Disaggregation for Hybrid Models

GitHub stats
vllm-project/vllm#40017 · Fetched 2026-04-17 08:27:35
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0

Hybrid Models

  • Nemotron/Mamba2
    • Homogeneous TP — #36687 (merged)
    • Heterogeneous TP (3-read conv state transfer) — #37635 (merged)
    • DS conv state layout — #37416 (merged)
  • Gemma4 (WIP)
  • Qwen3.5
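
To illustrate what heterogeneous TP implies for a transfer, here is a minimal sketch of mapping a decode rank to the prefill ranks it would read from. It assumes state is sharded evenly along one dimension and that one TP size divides the other. The names `ShardRead` and `remote_reads` are hypothetical and are not taken from vLLM or the PRs listed above, and the 3-read conv state transfer from #37635 is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardRead:
    """One read a decode rank issues against a prefill rank (hypothetical type)."""
    prefill_rank: int
    part_index: int  # which slice of the prefill shard to read
    num_parts: int   # how many equal slices the prefill shard is split into


def remote_reads(decode_rank: int, prefill_tp: int, decode_tp: int) -> list[ShardRead]:
    """Return the reads needed by one decode rank (assumes even sharding)."""
    if prefill_tp <= 0 or decode_tp <= 0:
        raise ValueError("TP sizes must be positive")
    if not 0 <= decode_rank < decode_tp:
        raise ValueError("decode_rank out of range")

    if prefill_tp >= decode_tp:
        # Each decode rank gathers whole shards from several prefill ranks.
        if prefill_tp % decode_tp:
            raise ValueError("prefill_tp must be a multiple of decode_tp")
        ratio = prefill_tp // decode_tp
        first = decode_rank * ratio
        return [ShardRead(r, 0, 1) for r in range(first, first + ratio)]

    # Each decode rank reads one slice of a single prefill shard.
    if decode_tp % prefill_tp:
        raise ValueError("decode_tp must be a multiple of prefill_tp")
    ratio = decode_tp // prefill_tp
    return [ShardRead(decode_rank // ratio, decode_rank % ratio, ratio)]
```

For example, with prefill TP 4 and decode TP 2, decode rank 1 reads the full shards of prefill ranks 2 and 3; with prefill TP 2 and decode TP 4, decode rank 3 reads the second half of prefill rank 1's shard.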

Refactor

  • Unify Transfer Topology (WIP)
  • Keep model-specific logics clean (WIP)

Features

  • Nemotron/Mamba2
    • Hetero TP
    • with prefix caching
    • with spec decode

Extent analysis

TL;DR

Completing the "Unify Transfer Topology" refactor (currently WIP) may reduce inconsistencies between per-model transfer paths and make it easier to support additional hybrid models.

Guidance

  • Review the "Unify Transfer Topology" refactor task to ensure it addresses the differences between Nemotron/Mamba2 and other models like Gemma4 and Qwen3.5.
  • Verify that the "Heterogeneous TP" feature is correctly implemented for Nemotron/Mamba2 and consider its applicability to other models.
  • Check the status of the "Keep model-specific logics clean" task to prevent logic duplication and ensure a unified approach across models.
  • Investigate how the "prefix caching" and "spec decode" features for Nemotron/Mamba2 could be generalized for other models.
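
One way to read the guidance on keeping model-specific logic clean is a small registry in which each layer kind declares the state it needs transferred, so the transfer loop itself stays generic. The sketch below is only an illustration under that assumption. Every name in it (`register_layer_kind`, `plan_transfer`, the region strings) is hypothetical and does not reflect the actual vLLM or NIXL connector code.

```python
from typing import Callable

# Maps a layer kind to a function that lists the state regions to transfer.
# Hypothetical registry; not part of vLLM.
_REGION_PLANNERS: dict[str, Callable[[int], list[str]]] = {}


def register_layer_kind(kind: str):
    """Decorator that registers a region planner for one layer kind."""
    def decorator(fn: Callable[[int], list[str]]):
        _REGION_PLANNERS[kind] = fn
        return fn
    return decorator


@register_layer_kind("attention")
def _attention_regions(layer_idx: int) -> list[str]:
    # Attention layers carry a KV cache.
    return [f"layer{layer_idx}.kv_cache"]


@register_layer_kind("mamba")
def _mamba_regions(layer_idx: int) -> list[str]:
    # Mamba2 layers carry a conv state and an SSM state.
    return [f"layer{layer_idx}.conv_state", f"layer{layer_idx}.ssm_state"]


def plan_transfer(layer_kinds: list[str]) -> list[str]:
    """Build the ordered list of regions to transfer for a hybrid model."""
    regions: list[str] = []
    for idx, kind in enumerate(layer_kinds):
        planner = _REGION_PLANNERS.get(kind)
        if planner is None:
            raise KeyError(f"no planner registered for layer kind {kind!r}")
        regions.extend(planner(idx))
    return regions
```

With this shape, supporting another hybrid model would mean registering a planner for any new layer kind rather than branching inside the transfer loop.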

Notes

The provided information lacks specific technical details about the implementation, so the guidance is focused on high-level suggestions for refactoring and feature implementation.

Recommendation

Prioritize completing the "Unify Transfer Topology" refactor so that transfer logic is consistent and adaptable across models.
