vllm - 💡(How to fix) Fix [Installation]: Documented v0.18.0 cu128 release wheel URL returns 404 [1 comments, 2 participants]

vllm-project/vllm#37847, fetched 2026-04-08 01:17:41

Your current environment

I followed the official v0.18.0 GPU installation docs for pre-built wheels.

The docs say that vLLM provides binaries compiled with CUDA 12.8 / 12.9 / 13.0, and give the following URL pattern for release wheels:

vllm-${VLLM_VERSION}+cu${CUDA_VERSION}-cp38-abi3-manylinux_2_35_${CPU_ARCH}.whl

For v0.18.0 on x86_64 with CUDA 12.8, this resolves to:

https://github.com/vllm-project/vllm/releases/download/v0.18.0/vllm-0.18.0+cu128-cp38-abi3-manylinux_2_35_x86_64.whl

However, this URL returns HTTP 404.
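
One way to reproduce the 404 without starting an install is to request only the headers for the wheel URL. The sketch below rebuilds the URL from the documented pattern and, when asked, prints the final HTTP status after redirects. The helper names `wheel_url` and `http_status` and the `CHECK_NETWORK` switch are illustrative assumptions, not part of vLLM or its docs.

```shell
#!/bin/sh
# Build the release-wheel URL from the documented filename pattern.
# Arguments: version (e.g. 0.18.0), CUDA digits (e.g. 128), CPU arch (e.g. x86_64).
wheel_url() {
  printf 'https://github.com/vllm-project/vllm/releases/download/v%s/vllm-%s+cu%s-cp38-abi3-manylinux_2_35_%s.whl\n' \
    "$1" "$1" "$2" "$3"
}

# Print the HTTP status of a URL after following redirects (needs network access).
http_status() {
  curl -sIL -o /dev/null -w '%{http_code}\n' "$1"
}

URL=$(wheel_url 0.18.0 128 x86_64)
echo "$URL"

# Hypothetical opt-in switch so the script stays usable offline.
if [ "${CHECK_NETWORK:-0}" = 1 ]; then
  http_status "$URL"
fi
```

Run it as `CHECK_NETWORK=1 sh check_wheel.sh` (the script name is arbitrary) to query GitHub; without the switch it only prints the URL it would test.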

Command used

uv pip install \
  "https://github.com/vllm-project/vllm/releases/download/v${VLLM_VERSION}/vllm-${VLLM_VERSION}+cu128-cp38-abi3-manylinux_2_35_${CPU_ARCH}.whl" \
  --extra-index-url https://download.pytorch.org/whl/cu128

<img width="1367" height="212" alt="Image" src="https://github.com/user-attachments/assets/f3520447-bcc0-4ea6-a8fa-2f18aac84552" />

### How you are installing vllm

uv pip install \
  "https://github.com/vllm-project/vllm/releases/download/v${VLLM_VERSION}/vllm-${VLLM_VERSION}+cu128-cp38-abi3-manylinux_2_35_${CPU_ARCH}.whl" \
  --extra-index-url https://download.pytorch.org/whl/cu128

### Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.

extent analysis

Fix Plan

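The documented URL does not resolve, so the fix is to install from a wheel filename that actually exists for the release.

  • Open the v0.18.0 release on the vllm-project/vllm GitHub page and compare the wheel assets listed there with the documented filename.
  • Keep VLLM_VERSION set to the release you want; change it only if you intend to install a different release.
  • Choose the CUDA variant (12.8, 12.9, or 13.0 according to the docs) that matches your system and the PyTorch index you pass to --extra-index-url.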
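
The first step of the plan can be scripted against GitHub's REST API, which returns a release's assets as JSON with a `browser_download_url` field per asset. This is a minimal sketch: `list_wheels`, `fetch_release`, and the `CHECK_NETWORK` switch are hypothetical names, and the URLs are extracted with grep and sed rather than a JSON parser, which assumes the URL values contain no embedded quotes.

```shell
#!/bin/sh
# Read release JSON on stdin and print the download URL of every .whl asset.
list_wheels() {
  grep -o '"browser_download_url": *"[^"]*\.whl"' \
    | sed 's/.*"\(https[^"]*\)"$/\1/'
}

# Fetch the release metadata for a version tag (needs network access).
fetch_release() {
  curl -sL "https://api.github.com/repos/vllm-project/vllm/releases/tags/v$1"
}

# Hypothetical opt-in switch so the script stays usable offline.
if [ "${CHECK_NETWORK:-0}" = 1 ]; then
  fetch_release 0.18.0 | list_wheels
fi
```

If the list comes back empty, or contains no filename matching the documented pattern, the documented URL cannot work for that release as written.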

Example Code

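# Values from the issue: vLLM release, CUDA digits (12.8 is written 128), CPU architecture
VLLM_VERSION=0.18.0
CUDA_VERSION=128
CPU_ARCH=x86_64

# Install from the documented release-wheel URL pattern.
# The issue reports that this URL returns 404 for v0.18.0 with cu128,
# so confirm the wheel filename on the release page before running it.
pip install \
  "https://github.com/vllm-project/vllm/releases/download/v${VLLM_VERSION}/vllm-${VLLM_VERSION}+cu${CUDA_VERSION}-cp38-abi3-manylinux_2_35_${CPU_ARCH}.whl" \
  --extra-index-url "https://download.pytorch.org/whl/cu${CUDA_VERSION}"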

Note: Replace VLLM_VERSION, CUDA_VERSION, and CPU_ARCH with the correct values for your system.

Verification

After the installation command completes, verify that the vllm package is installed by querying its metadata:

pip show vllm

This prints the package name, the installed version, and the install location.
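
For scripts or CI, the same check can be done through Python's standard `importlib.metadata`, which returns just the version string. A minimal sketch, assuming `python3` on PATH is the interpreter the wheel was installed into; `installed_version` is an illustrative helper name.

```shell
#!/bin/sh
# Print the version of an installed distribution, or "not installed".
# Uses importlib.metadata from the Python standard library (Python 3.8+).
installed_version() {
  python3 - "$1" <<'EOF'
import sys
from importlib.metadata import PackageNotFoundError, version

try:
    print(version(sys.argv[1]))
except PackageNotFoundError:
    print("not installed")
EOF
}

installed_version vllm
```

Comparing the printed value with the release you asked for catches the case where an older vllm from a previous install is still the one in the environment.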

Extra Tips

  • Check that the CUDA variant you pick is supported by your NVIDIA driver; nvidia-smi reports the highest CUDA version the driver supports.
  • A 404 is a download problem rather than a packaging one. For other installation errors, try updating pip and setuptools to the latest versions:
pip install --upgrade pip setuptools
