vllm - 💡(How to fix) Fix [Roadmap] 2026 Q2 vLLM × RL Roadmap [9 comments, 6 participants]

aoshen02 · 2026-05-05T13:35:19Z

# [Roadmap] 2026 Q2 vLLM × RL Roadmap

This tracks the Q2 2026 vLLM-side work needed to make RL workloads (training & rollout) first-class. Each item links its own RFC / issue / PR; please discuss there, and use this thread for cross-cutting prioritization.

## Training-Inference Consistency

- [ ] Support R3 routing replay: #39701
- [ ] Fix logprobs / logits surface consistency: #37737

## Runtime State Switching

- [ ] Standardize weight sync lifecycle: #31848
- [ ] Make pause / resume coordinator-safe: #32103
- [ ] NCCL context offload / resume

## Rollout Performance / Efficiency

- [ ] Improve KV cache / prefix reuse: #40244
- [ ] Stabilize P/D rollout throughput: [verl-project/verl#6243](https://github.com/verl-project/verl/pull/6243)
- [ ] Add phase-aware performance modes to dynamically switch between throughput-optimized and latency-optimized configs
- [ ] More mature FP8 W8A8 KV-cache rollout support
- [ ] RDMA-based cross-cluster transport for vLLM intermediate results: vLLM internally produces large per-request artifacts (expert routing indices are one example, but it generalizes to any large per-step or per-layer signal a downstream RL system might need). We need a generic plugin mechanism to export these artifacts. Doing it over RDMA in a distributed fashion (peer-to-peer between vLLM workers and downstream consumers) instead of going through a host-side aggregation buffer scales much better when multiple nodes pull at once.

## Framework & Workload Enablement

- [ ] Stabilize RL framework serving contract: verl-project/verl#5737
- [ ] Support multimodal RL: verl-project/verl#5916
- [ ] Support teacher / OPD server pluggability: verl-project/verl#5897

## Misc

- [ ] Add rollout liveness and debug signals: #38147
- [ ] Publish more up-to-date vLLM Docker images for RL frameworks

## Sync / Discussion

- Slack channel: `#sig-reinforcement-learning` on the vLLM Slack workspace
- Weekly call: every Friday 06:30 Beijing time (Thursday 15:30 PDT / Thursday 18:30 EDT), at https://meet.google.com/hpi-znch-gcx?hs=224
- New contributors / observers welcome: leave a comment on this thread or ping in Slack to be added to the agenda

TL;DR

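This is a planning issue, not a bug report: it lists the Q2 2026 vLLM-side tasks for RL workloads and asks for cross-cutting prioritization in the thread, with technical discussion on each linked RFC, issue, or PR.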

Guidance

Focus on addressing the training-inference consistency issues, such as supporting R3 routing replay and fixing logprobs/logits surface consistency.
Prioritize standardizing weight sync lifecycle and making pause/resume coordinator-safe to improve runtime state switching.
Explore improving rollout performance/efficiency by stabilizing P/D rollout throughput and adding phase-aware performance modes.
Engage with the community through the specified Slack channel and weekly calls to discuss progress and prioritize tasks.
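
The training-inference consistency item in the guidance above comes down to comparing the per-token log-probabilities reported by the rollout engine with those the trainer recomputes for the same tokens. The following is a minimal sketch of such a check, assuming both sides can already hand over plain lists of floats. The names `LogprobDrift` and `compare_logprobs` are hypothetical and are not part of vLLM or verl.

```python
import math
from dataclasses import dataclass


@dataclass
class LogprobDrift:
    """Summary of the mismatch between rollout and trainer log-probs."""
    max_abs_diff: float     # worst single-token disagreement, in nats
    mean_abs_diff: float    # average disagreement, in nats
    mean_prob_ratio: float  # mean of p_trainer / p_rollout over tokens


def compare_logprobs(rollout_logprobs, trainer_logprobs):
    """Compare per-token log-probs for the SAME sampled tokens.

    Hypothetical helper: both inputs are assumed to be equal-length
    sequences of natural-log probabilities, one entry per token.
    """
    if len(rollout_logprobs) != len(trainer_logprobs):
        raise ValueError("both sequences must cover the same tokens")
    if not rollout_logprobs:
        raise ValueError("need at least one token to compare")

    # Signed difference per token: log p_trainer - log p_rollout.
    diffs = [t - r for r, t in zip(rollout_logprobs, trainer_logprobs)]
    abs_diffs = [abs(d) for d in diffs]
    # exp of the log difference is the per-token probability ratio.
    ratios = [math.exp(d) for d in diffs]

    n = len(diffs)
    return LogprobDrift(
        max_abs_diff=max(abs_diffs),
        mean_abs_diff=sum(abs_diffs) / n,
        mean_prob_ratio=sum(ratios) / n,
    )


if __name__ == "__main__":
    # Illustrative values only, not measurements from any real run.
    drift = compare_logprobs([-1.0, -2.0], [-1.5, -2.0])
    print(drift)
```

A mean probability ratio far from 1.0, or a large maximum difference, suggests the two sides disagree on the policy that produced the tokens. Any threshold for "too far" is workload-specific and is not defined in the roadmap.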

Notes

This issue appears to be a high-level roadmap, and the provided information does not allow for a specific technical fix or workaround. The guidance provided is based on the assumption that the tasks listed are essential to the integration of vLLM with RL workloads.

Recommendation

No workaround applies, because this issue is a roadmap and not a defect. Prioritize the listed tasks and follow each linked RFC, issue, or PR for the concrete work.

