vllm - ✅(Solved) Fix [Feature]: Publish CPU images compatible with GitHub Actions [2 pull requests, 1 comment, 1 participant]

nathan-weinberg · 2026-03-12T15:57:21Z

GitHub stats

vllm-project/vllm#36898•Fetched 2026-04-08 00:43:44

Timeline (top)

closed ×1, commented ×1, cross-referenced ×1, labeled ×1

Fix Action

Fixed

Fixed by PR: [CI] Add AVX2-only CPU image build for GitHub Actions compatibility (https://github.com/vllm-project/vllm/pull/36901)

PR fix notes

PR #36901: [CI] Add AVX2-only CPU image build for GitHub Actions compatibility

Repository: vllm-project/vllm
Author: nathan-weinberg
State: closed | merged: False
Link: https://github.com/vllm-project/vllm/pull/36901

Description (problem / solution / changelog)

Purpose

GitHub Actions runners use Azure VMs that may land on AMD EPYC Zen 1-3 CPUs without AVX-512 support, causing SIGILL crashes with the standard CPU image. This adds a second CPU image variant built with VLLM_CPU_DISABLE_AVX512=true and VLLM_CPU_AVX2=1, published alongside the existing CPU image without changing it.

Adds:

CI build script and step for AVX2-only CPU image ({commit}-cpu-avx2)
Release pipeline wheel build and image build steps (behind block approval)
Docker Hub publish commands for vllm/vllm-openai-cpu:*-avx2 tags
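
As a rough illustration of the variant described above, the sketch below assembles a docker build command that passes the two build arguments named in the PR description and tags the result as {commit}-cpu-avx2. The Dockerfile path docker/Dockerfile.cpu, the registry prefix, and the helper name avx2_build_command are assumptions made for illustration; the PR's actual build script (image_build_cpu_avx2.sh) is not reproduced on this page and may differ.

```python
from typing import List


def avx2_build_command(commit: str, registry: str,
                       dockerfile: str = "docker/Dockerfile.cpu") -> List[str]:
    """Return a docker build argv for an AVX2-only CPU image.

    Hypothetical helper: the Dockerfile path and registry are assumptions,
    only the two build arguments and the tag suffix come from the PR text.
    """
    tag = f"{registry}:{commit}-cpu-avx2"
    return [
        "docker", "build",
        "--file", dockerfile,
        # Build arguments quoted from the PR description.
        "--build-arg", "VLLM_CPU_DISABLE_AVX512=true",
        "--build-arg", "VLLM_CPU_AVX2=1",
        "--tag", tag,
        ".",
    ]


if __name__ == "__main__":
    # Placeholder commit and registry, not real values.
    print(" ".join(avx2_build_command("abc1234", "example.registry/vllm-ci")))
```

Returning an argument list rather than a shell string keeps the command safe to pass to subprocess.run without shell quoting concerns.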

Fixes #36898

Test Plan

I have done these builds in a different PR (https://github.com/opendatahub-io/llama-stack-distribution/pull/203), but via a different methodology outside of the vLLM upstream build system, so I am not sure this change has been fully validated. I am happy to test it with whatever is within my control.

<details> <summary> Essential Elements of an Effective PR Description Checklist </summary>

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

</details>

Changed files

.buildkite/image_build/image_build.yaml (modified, +14/-0)
.buildkite/image_build/image_build_cpu_avx2.sh (added, +41/-0)
.buildkite/release-pipeline.yaml (modified, +32/-1)
.buildkite/scripts/annotate-release.sh (modified, +9/-0)

🚀 The feature, motivation and pitch

The prebuilt vLLM CPU Docker images published on Docker Hub (vllm/vllm-openai-cpu) are compiled with AVX-512 instructions. GitHub Actions runners run on Azure VMs that are non-deterministically assigned to either Intel or AMD hardware. AMD EPYC Zen 1–3 processors (used in Azure's Dasv5/Dadsv5 VM series) do not support AVX-512, so when a runner lands on one of these hosts, vLLM crashes immediately with SIGILL (exit code 132) — an illegal instruction signal.
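
The exit code mentioned above follows the common shell convention of reporting 128 plus the signal number when a process is killed by a signal. The sketch below decodes such a status with Python's standard signal module; the function name signal_from_exit_code is hypothetical, and the result assumes a Linux host, where SIGILL is signal 4.

```python
import signal


def signal_from_exit_code(code: int) -> str:
    """Return the signal name implied by an exit status, or "" if none.

    Handles the shell convention (128 + signum) and the negative
    return codes that Python's subprocess module reports.
    """
    if code < 0:
        signum = -code
    elif code > 128:
        signum = code - 128
    else:
        return ""
    try:
        return signal.Signals(signum).name
    except ValueError:
        # Not a valid signal number on this platform.
        return ""


if __name__ == "__main__":
    print(signal_from_exit_code(132))
```

A CI step could use a check like this to tell an illegal-instruction crash apart from an ordinary test failure.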

There is no runs-on label or configuration option in GitHub Actions to pin a runner to a specific CPU microarchitecture. AVX-512 availability is effectively random per job run, making CI results non-deterministic for this use case.

By modifying the existing build or producing a second CPU image (vllm/vllm-openai-cpu-avx2 or something similar), users wanting to run vLLM CPU images in GitHub Actions could do so without having to build their own custom images.

Alternatives

You can build the container from scratch yourself, with vLLM compiled from source to run on CPU targeting AVX2, as I did here: https://github.com/opendatahub-io/llama-stack-distribution/pull/203

Additional context

No response

Before submitting a new issue...

Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

extent analysis

Fix Plan

To resolve the issue, we need a Docker image built without AVX-512 instructions and targeting AVX2, which AMD EPYC Zen 1-3 processors support.

Create a new Dockerfile that compiles vLLM for CPU with AVX-512 disabled.
Build the new Docker image using the Dockerfile.
Push the new image to Docker Hub.

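Example Dockerfile (an untested sketch; the upstream CPU Dockerfile may differ):

# Use an official Python image as the base (Debian bookworm ships GCC 12)
FROM python:3.12-slim-bookworm

# Install build dependencies
RUN apt-get update && \
    apt-get install -y build-essential git libnuma-dev && \
    rm -rf /var/lib/apt/lists/*

# Build flags named in PR #36901: disable AVX-512 and target AVX2
ENV VLLM_TARGET_DEVICE=cpu \
    VLLM_CPU_DISABLE_AVX512=true \
    VLLM_CPU_AVX2=1

# Compile vLLM from source for CPU
# (requirements path assumed from the upstream tree; verify before use)
RUN git clone https://github.com/vllm-project/vllm.git && \
    cd vllm && \
    pip install -r requirements/cpu-build.txt \
        --extra-index-url https://download.pytorch.org/whl/cpu && \
    pip install . --no-build-isolation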

Verification

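To verify that the fix worked, run the new Docker image on a GitHub Actions runner and check that it does not crash with SIGILL (exit code 132).

Example command (the image name follows the issue's proposal; the PR instead describes vllm/vllm-openai-cpu:*-avx2 tags, and the -it flags are omitted because CI jobs have no TTY):

docker run --rm vllm/vllm-openai-cpu-avx2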

Extra Tips

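Test the new image on hosts with and without AVX-512 (for example, Intel runners and AMD EPYC Zen 1-3 runners) to ensure compatibility.
Consider adding a startup check to the image's entrypoint that confirms the host CPU supports the instruction set the vLLM binary was compiled for. A Dockerfile cannot do this itself, because at build time it only sees the build machine's CPU.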
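
One way to approach such a host check is sketched below. It assumes an x86 Linux host, where /proc/cpuinfo lists instruction-set extensions on a "flags" line, and it uses avx512f as the marker for AVX-512 support. The function names and the returned variant labels are hypothetical and are not part of vLLM.

```python
from pathlib import Path
from typing import Set


def parse_cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the CPU feature flags from the first "flags" line, if any."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def pick_image_variant(flags: Set[str]) -> str:
    """Choose an image variant label based on the available vector extensions."""
    if "avx512f" in flags:
        return "avx512"
    if "avx2" in flags:
        return "avx2"
    return "unsupported"


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    print(pick_image_variant(parse_cpu_flags(text)))
```

Run as an early workflow step, a check like this could select the matching image before vLLM starts, instead of discovering the mismatch through a crash.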
