vllm - 💡 (How to fix) [CI Failure]: mi355_2: LM Eval Qwen3-5 Models (B200-MI355) [1 comment, 2 participants]

GitHub stats
vllm-project/vllm#40524 (fetched 2026-04-22 07:44:08)
Comments: 1 · Participants: 2 · Timeline events: 14 · Reactions: 0
Timeline (top): mentioned ×3, subscribed ×3, added_to_project_v2 ×2, labeled ×2

AMD nightly — failing test group

Group: mi355_2: LM Eval Qwen3-5 Models (B200-MI355)

Current streak start: 2026-04-21
First failure in 60d window: 2026-04-09
Last successful nightly: 2026-04-20
Break frequency (60d, pass↔fail flips): 5
Latest nightly date: 2026-04-21
Latest build(s): amd-ci #7854

Hardware status in latest nightly

hardware    status
mi355_2     fail

Auto-managed by sync_ready_tickets.py. Closed + moved to Done when this group passes on all AMD hardware.

Last sync: https://github.com/AndreasKaratzas/vllm-ci-dashboard/actions/runs/24731404470

extent analysis

TL;DR

Investigate the mi355_2 hardware status and recent changes to identify the cause of the failing test group.

Guidance

  • Review the build logs from amd-ci #7854 to identify any errors or warnings related to the mi355_2 hardware.
  • Treat sync_ready_tickets.py as ticket bookkeeping only: per the issue, it manages this ticket automatically, so check it only to confirm that the recorded status matches the nightly results.
  • Verify that the mi355_2 hardware is properly configured and functioning correctly.
  • Investigate the 5 pass-fail flips in the 60-day window to identify any patterns or correlations with other changes or events.
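The flip count and streak start reported in the issue can be recomputed from a per-nightly pass/fail history, which makes it easier to line the flips up against other changes. The following is a minimal sketch, not the dashboard's actual logic: the function names are hypothetical, and the history below is illustrative. Only 2026-04-09, 2026-04-20, and 2026-04-21 come from the issue; the other dates are assumptions chosen to give 5 flips.

```python
from datetime import date


def count_flips(history):
    """Count pass<->fail transitions between consecutive nightlies."""
    statuses = [status for _, status in sorted(history)]
    return sum(1 for prev, cur in zip(statuses, statuses[1:]) if prev != cur)


def current_streak_start(history):
    """Return the date on which the latest unbroken run of the final status began."""
    ordered = sorted(history)
    start_date, last_status = ordered[-1]
    for day, status in reversed(ordered):
        if status != last_status:
            break
        start_date = day
    return start_date


# Hypothetical nightly history (only status changes listed).
# Dates other than 04-09, 04-20, and 04-21 are invented for illustration.
history = [
    (date(2026, 4, 8), "pass"),
    (date(2026, 4, 9), "fail"),   # first failure in the 60d window (from the issue)
    (date(2026, 4, 10), "pass"),
    (date(2026, 4, 15), "fail"),
    (date(2026, 4, 20), "pass"),  # last successful nightly (from the issue)
    (date(2026, 4, 21), "fail"),  # current streak start (from the issue)
]

flips = count_flips(history)
streak_start = current_streak_start(history)
print(f"flips={flips}, current streak started {streak_start}")
```

With real data, each flip date gives a narrow window of commits or infrastructure changes to inspect.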

Notes

The issue lacks detailed information about the test group and the errors encountered, so a thorough investigation of the build logs and hardware status is necessary to determine the root cause.

Recommendation

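No workaround is identified yet. The mi355_2 hardware is one candidate cause, but the issue gives no evidence for it. Start from the amd-ci #7854 logs for the mi355_2 run; because the group passed on 2026-04-20 and failed on 2026-04-21, compare the changes that landed between those two nightlies, then re-run the test group to verify any fix.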
