vllm - ✅(Solved) Fix [CI Failure]: Kernels FusedMoE Layer Test (2 B200s) has been broken since it was added on 04/06/25 [2 pull requests, 1 comment, 2 participants]

GitHub stats
vllm-project/vllm#39525 (fetched 2026-04-11 06:13:00)
Comments: 1
Participants: 2
Timeline events: 4 (added_to_project_v2 ×1, assigned ×1, commented ×1, labeled ×1)
Reactions: 0

Error Message

1775802312805 /usr/local/lib/python3.12/dist-packages/flashinfer/data/include/flashinfer/trtllm/common/cudaUtils.h:25:10: fatal error: cublasLt.h: No such file or directory
1775802312805 25 | #include <cublasLt.h>
1775802312805  |          ^~~~~~~~~~~~

Root Cause

  • Flaky test
  • Can reproduce locally
  • Caused by external libraries (e.g. bug in transformers)

PR fix notes

PR #39824: [Bugfix] Attempt to fix b200 moe layer test

Description (problem / solution / changelog)

Purpose

Change libcublas to libcublas-dev in the Dockerfile in the hope that it will install cublasLt.h in a place where it is visible to flashinfer.

See https://github.com/vllm-project/vllm/issues/39525
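
The intent of this change can be illustrated with a small check that reports which cuBLAS packages a Dockerfile installs. This is only a sketch: the function names are hypothetical, and the package naming (libcublas-<version> for the runtime libraries, libcublas-dev-<version> for the headers) is an assumption about the NVIDIA apt packages, not something taken from the vLLM Dockerfile.

```python
def cublas_packages(dockerfile_text):
    """Group libcublas package tokens found in a Dockerfile.

    Assumption: runtime packages are named "libcublas" or "libcublas-<ver>",
    and header packages are named "libcublas-dev" or "libcublas-dev-<ver>".
    """
    found = {"runtime": [], "dev": []}
    for line in dockerfile_text.splitlines():
        if line.strip().startswith("#"):
            continue  # ignore Dockerfile comment lines
        # Treat line-continuation backslashes as plain separators.
        for token in line.replace("\\", " ").split():
            if token == "libcublas-dev" or token.startswith("libcublas-dev-"):
                found["dev"].append(token)
            elif token == "libcublas" or token.startswith("libcublas-"):
                found["runtime"].append(token)
    return found


def headers_expected(dockerfile_text):
    """True if at least one libcublas-dev package is installed."""
    return bool(cublas_packages(dockerfile_text)["dev"])
```

If headers_expected returns False for the image's Dockerfile, cublasLt.h is probably absent from the image, which would be consistent with the compile error in the issue.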

Test Plan

Try it in CI.

Test Result



Changed files

  • docker/Dockerfile (modified, +17/-16)
  • docker/versions.json (modified, +1/-1)
  • requirements/cuda.txt (modified, +2/-2)

PR #40057: [Bugfix] Temporarily disable B200 fp4 MoE layer tests

Description (problem / solution / changelog)

Purpose

Disable the modelopt_fp4 tests on B200 for now. To fix the underlying issue, the Dockerfile can be updated to install libcublas-dev instead of libcublas or we can wait for a newer version of flashinfer, e.g. 0.6.8rc1.

See https://github.com/vllm-project/vllm/issues/39525
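
The PR body does not show the eight added lines, so the following is a hypothetical sketch of what such a temporary skip could look like, not the actual change. The helper name, its parameters, and the reason string are all assumptions made for illustration.

```python
from typing import Optional


def fp4_moe_skip_reason(quantization: str, is_b200: bool) -> Optional[str]:
    """Return a skip reason for a test configuration, or None to run it.

    Hypothetical helper: only modelopt_fp4 on B200 is skipped, mirroring
    the PR description. All other combinations still run.
    """
    if quantization == "modelopt_fp4" and is_b200:
        return (
            "modelopt_fp4 MoE layer tests are temporarily disabled on B200: "
            "cublasLt.h is missing from the CI image (vllm issue #39525)"
        )
    return None
```

In a pytest test, the returned string could be passed to pytest.skip when it is not None. Keeping the condition narrow means the rest of the B200 MoE layer tests continue to provide signal while the header problem is open.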

Test Plan

Run B200 MoE layer test

Test Result



Changed files

  • tests/kernels/moe/test_moe_layer.py (modified, +8/-0)


Name of failing test

kernels/moe/test_moe_layer.py

Basic information

  • Flaky test
  • Can reproduce locally
  • Caused by external libraries (e.g. bug in transformers)

🧪 Describe the failing test

Looking at https://vllm-ci-dashboard.vercel.app/jobs, the Kernels FusedMoE Layer Test (2 B200s) appears to have been red since it was added on 04/06/25. The failure seems to be caused by a FlashInfer (FI) import, which fails to compile because cublasLt.h cannot be found:

1775802312805 /usr/local/lib/python3.12/dist-packages/flashinfer/data/include/flashinfer/trtllm/common/cudaUtils.h:25:10: fatal error: cublasLt.h: No such file or directory
1775802312805 25 | #include <cublasLt.h>
1775802312805  |          ^~~~~~~~~~~~
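
When triaging CI logs like the one above, it can help to pull out the missing header name automatically. The sketch below assumes the usual GCC/Clang diagnostic layout (file:line:column: fatal error: header: No such file or directory); the function name is hypothetical.

```python
import re

# Matches compiler diagnostics of the form
#   <file>:<line>:<col>: fatal error: <header>: No such file or directory
MISSING_HEADER = re.compile(
    r"(?P<file>\S+):(?P<line>\d+):(?P<col>\d+): fatal error: "
    r"(?P<header>\S+): No such file or directory"
)


def missing_headers(log_text):
    """Return (including_file, line_number, header) for each missing-header error."""
    return [
        (m.group("file"), int(m.group("line")), m.group("header"))
        for m in MISSING_HEADER.finditer(log_text)
    ]
```

Applied to the log excerpt in this issue, this yields one entry naming cublasLt.h as the missing header and cudaUtils.h, line 25, as the file that includes it.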

📝 History of failing test

AFAICT, this group has never passed.

CC List.

No response

extent analysis

TL;DR

The most likely fix is to make the cublasLt.h header available in the environment where the test runs, for example by installing libcublas-dev instead of libcublas in the Docker image, as PR #39824 attempts.

Guidance

  • Verify that the CUDA toolkit is correctly installed and configured, as cublasLt.h is a part of it.
  • Check the include path for the compiler to ensure it includes the directory where cublasLt.h is located.
  • Confirm that the version of the CUDA toolkit matches the requirements of the flashinfer library and other dependencies.
  • Review the environment variables and dependencies required for the flashinfer library to ensure they are correctly set up.
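
The include-path check in the guidance above can be sketched as a short script. The directories searched here (CUDA_HOME or CUDA_PATH plus /usr/local/cuda/include and /usr/include) are conventional locations and an assumption on my part; the include paths flashinfer actually passes to the compiler may differ. Both function names are hypothetical.

```python
import os
from pathlib import Path


def candidate_include_dirs(env=None):
    """List conventional CUDA include directories (an assumed search order)."""
    env = os.environ if env is None else env
    dirs = []
    for var in ("CUDA_HOME", "CUDA_PATH"):
        if env.get(var):
            dirs.append(Path(env[var]) / "include")
    dirs += [Path("/usr/local/cuda/include"), Path("/usr/include")]
    return dirs


def find_header(name, include_dirs):
    """Return the full path of every copy of `name` found in `include_dirs`."""
    return [d / name for d in include_dirs if (d / name).is_file()]


if __name__ == "__main__":
    hits = find_header("cublasLt.h", candidate_include_dirs())
    print(hits if hits else "cublasLt.h not found in the usual locations")
```

An empty result inside the CI container would suggest that only the cuBLAS runtime libraries are installed and the development headers are missing.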

Example

Checking the installation might involve commands such as nvcc --version to report the CUDA compiler version and dpkg -l | grep -E 'cuda|cublas' to list installed CUDA and cuBLAS packages on a Debian-based system.

Notes

The solution may vary depending on the specific environment (e.g., Docker, local machine, CI/CD pipeline) and the versions of the libraries and tools involved. The cublasLt.h file is part of the CUDA toolkit, so resolving this issue will likely involve ensuring that the CUDA toolkit is properly installed and configured.

Recommendation

Apply workaround: install the cuBLAS development headers in the CI image (libcublas-dev instead of libcublas, as in PR #39824), or keep the modelopt_fp4 tests disabled on B200 (PR #40057) until a newer flashinfer release, e.g. 0.6.8rc1, is available.
