vllm - 💡(How to fix) Fix [Feature]: ROCm Kimi K2.5 EAGLE3 MTP heads [1 comments, 2 participants]

vllm · 2026-04-02 21:42:20

GitHub stats

vllm-project/vllm#38851 • Fetched 2026-04-08 02:34:32

Author: functionstackx

Participants: functionstackx, github-actions[bot]

Timeline (top): mentioned ×7, subscribed ×7, labeled ×2, added_to_project_v2 ×1

🚀 The feature, motivation and pitch

Hi @hongxiayang

+viz @powderluv @chunfangamd @andyluo7

Spec decode isn't an uncommon method; it is widely used in production, but unfortunately Kimi did not release their MTP heads. NVIDIA and production inference API endpoint providers like Baseten have trained their own MTP heads for Kimi K2.5.

NVIDIA has open-sourced theirs: https://huggingface.co/nvidia/Kimi-K2.5-Thinking-Eagle3

There is also this one, trained on torchspec by the community: https://huggingface.co/lightseekorg/kimi-k2.5-eagle3 . If this is the recommended MTP head architecture that AMD chooses to support, please let me know.

AMD does not have its own EAGLE3 MTP heads open-sourced? When should we expect AMD to have production features like MTP for Kimi K2.5?

https://huggingface.co/amd/models?search=kimi

Alternatives

not use spec decode and not bad perf per $

Additional context

No response

Before submitting a new issue...

Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

extent analysis

TL;DR

Consider using alternative MTP heads from NVIDIA or the community, such as those available on Hugging Face, as a workaround for the lack of AMD's open-sourced Eagle3 MTP heads for Kimi K2.5.

Guidance

Explore the NVIDIA open-sourced Kimi-K2.5-Thinking-Eagle3 model on Hugging Face (https://huggingface.co/nvidia/Kimi-K2.5-Thinking-Eagle3) as a potential alternative.
Investigate the community model trained on torchspec, available at https://huggingface.co/lightseekorg/kimi-k2.5-eagle3, for possible use.
Evaluate the performance of these alternative models to determine their suitability for production use.
Check the AMD models page on Hugging Face (https://huggingface.co/amd/models?search=kimi) for any updates on available Kimi K2.5 models.
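
To try one of the heads listed above, a minimal sketch of how the launch command might be assembled is shown here. It assumes vLLM's `--speculative-config` option, which takes a JSON object with `method`, `model`, and `num_speculative_tokens` keys. The target model ID `moonshotai/Kimi-K2.5` and the value of 3 draft tokens are assumptions that do not come from the issue, and the helper names are hypothetical. Whether this combination actually runs on ROCm is exactly the open question the issue raises, so treat it as a starting point rather than a confirmed recipe.

```python
import json
import shlex


def build_speculative_config(draft_model: str, num_speculative_tokens: int = 3) -> dict:
    """Build the JSON payload for vLLM's --speculative-config option."""
    if num_speculative_tokens < 1:
        raise ValueError("num_speculative_tokens must be at least 1")
    return {
        "method": "eagle3",  # EAGLE3-style draft head
        "model": draft_model,
        "num_speculative_tokens": num_speculative_tokens,
    }


def build_serve_command(target_model: str, draft_model: str,
                        num_speculative_tokens: int = 3) -> str:
    """Return a shell-safe `vllm serve` command line as a single string."""
    config = build_speculative_config(draft_model, num_speculative_tokens)
    args = ["vllm", "serve", target_model,
            "--speculative-config", json.dumps(config)]
    return shlex.join(args)


# Assumption: the target model ID is not stated in the issue.
TARGET_MODEL = "moonshotai/Kimi-K2.5"
# Draft head named in the issue.
DRAFT_MODEL = "nvidia/Kimi-K2.5-Thinking-Eagle3"

if __name__ == "__main__":
    print(build_serve_command(TARGET_MODEL, DRAFT_MODEL))
```

Swapping `DRAFT_MODEL` for `lightseekorg/kimi-k2.5-eagle3` gives the equivalent command for the community head. A model of this size will likely need further flags (for example parallelism settings) that are left out here.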

Notes

The availability and suitability of AMD's open-sourced Eagle3 MTP heads for Kimi K2.5 are uncertain, and using alternative models may be necessary.

Recommendation

Apply workaround: Use alternative MTP heads from NVIDIA or the community, as they are currently available and may provide a suitable solution for production use.
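
Before committing to the workaround, it may help to estimate whether a given head is worth its overhead. The sketch below uses the standard chain-drafting model of speculative decoding, under the simplifying assumption that each draft token is accepted independently with the same probability. The function names and the `draft_cost_ratio` parameter are hypothetical, and real EAGLE3 deployments (tree drafting, batching effects, ROCm kernel behavior) can deviate from this model, so measured throughput should take precedence.

```python
def expected_tokens_per_step(acceptance_rate: float, num_speculative_tokens: int) -> float:
    """Expected tokens emitted per target-model forward pass.

    Assumes each of the k draft tokens is accepted independently with
    probability a, giving (1 - a**(k + 1)) / (1 - a).
    """
    if not 0.0 <= acceptance_rate <= 1.0:
        raise ValueError("acceptance_rate must be in [0, 1]")
    if num_speculative_tokens < 0:
        raise ValueError("num_speculative_tokens must be non-negative")
    if acceptance_rate == 1.0:
        # Every draft token is accepted, plus the one token from the target pass.
        return float(num_speculative_tokens + 1)
    return (1.0 - acceptance_rate ** (num_speculative_tokens + 1)) / (1.0 - acceptance_rate)


def estimated_speedup(acceptance_rate: float, num_speculative_tokens: int,
                      draft_cost_ratio: float) -> float:
    """Rough speedup over plain decoding.

    draft_cost_ratio is the cost of one draft forward pass relative to one
    target forward pass (hypothetical parameter; measure it on your hardware).
    """
    step_cost = 1.0 + num_speculative_tokens * draft_cost_ratio
    return expected_tokens_per_step(acceptance_rate, num_speculative_tokens) / step_cost


if __name__ == "__main__":
    # Illustrative inputs only; these are not measurements of any real head.
    for rate in (0.5, 0.7, 0.9):
        print(rate, estimated_speedup(rate, num_speculative_tokens=3, draft_cost_ratio=0.05))
```

A value below 1.0 from `estimated_speedup` would suggest that, under these assumptions, the head costs more than it saves, which ties back to the perf-per-$ concern in the issue.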
