vllm - ✅(Solved) Fix [Feature]: Publish CPU images compatible with GitHub Actions [2 pull requests, 1 comments, 1 participants]

GitHub stats
vllm-project/vllm#36898 · Fetched 2026-04-08 00:43:44
Comments: 1
Participants: 1
Timeline: 4
Reactions: 1
Timeline (top): closed ×1, commented ×1, cross-referenced ×1, labeled ×1

Fix Action

Fixed

PR fix notes

PR #36901: [CI] Add AVX2-only CPU image build for GitHub Actions compatibility

Description (problem / solution / changelog)

Purpose

GitHub Actions runners use Azure VMs that may land on AMD EPYC Zen 1-3 CPUs without AVX-512 support, causing SIGILL crashes with the standard CPU image. This adds a second CPU image variant built with VLLM_CPU_DISABLE_AVX512=true and VLLM_CPU_AVX2=1, published alongside the existing CPU image without changing it.

Adds:

  • CI build script and step for AVX2-only CPU image ({commit}-cpu-avx2)
  • Release pipeline wheel build and image build steps (behind block approval)
  • Docker Hub publish commands for vllm/vllm-openai-cpu:*-avx2 tags

Fixes #36898

Test Plan

I've done these builds in a different PR (https://github.com/opendatahub-io/llama-stack-distribution/pull/203), but with a different methodology outside the vLLM upstream build system, so I'm not sure this has been fully validated. I am happy to test it with whatever is within my control.



Changed files

  • .buildkite/image_build/image_build.yaml (modified, +14/-0)
  • .buildkite/image_build/image_build_cpu_avx2.sh (added, +41/-0)
  • .buildkite/release-pipeline.yaml (modified, +32/-1)
  • .buildkite/scripts/annotate-release.sh (modified, +9/-0)

🚀 The feature, motivation and pitch

The prebuilt vLLM CPU Docker images published on Docker Hub (vllm/vllm-openai-cpu) are compiled with AVX-512 instructions. GitHub Actions runners run on Azure VMs that are non-deterministically assigned to either Intel or AMD hardware. AMD EPYC Zen 1–3 processors (used in Azure's Dasv5/Dadsv5 VM series) do not support AVX-512, so when a runner lands on one of these hosts, vLLM crashes immediately with SIGILL (exit code 132) — an illegal instruction signal.

There is no runs-on label or configuration option in GitHub Actions to pin a runner to a specific CPU microarchitecture. AVX-512 availability is effectively random per job run, making CI results non-deterministic for this use case.
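Since the runner's CPU cannot be pinned, a job can at least detect what it landed on before starting vLLM. The following is a minimal sketch, assuming a Linux x86 runner where /proc/cpuinfo exposes a "flags" line. The helper names cpu_flags and has_avx512 are illustrative and are not part of vLLM or GitHub Actions.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx512(flags: set[str]) -> bool:
    """AVX-512 Foundation (avx512f) is the baseline flag for AVX-512 support."""
    return "avx512f" in flags


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    # Fall back to an empty flag set on systems without /proc/cpuinfo.
    flags = cpu_flags(cpuinfo.read_text()) if cpuinfo.exists() else set()
    print("AVX-512:", has_avx512(flags), "| AVX2:", "avx2" in flags)
```

Running this as an early workflow step makes it visible in the job log whether a failure coincided with a host that lacks AVX-512.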

If the existing build were modified, or a second CPU image were published (vllm/vllm-openai-cpu-avx2 or similar), users who want to run vLLM CPU images in GitHub Actions could do so without building their own custom images.
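With two image variants available, a workflow could choose between them from the host's CPU flags. This is only a sketch: the repository name and the "-avx2" tag suffix follow PR #36901, while the function name pick_cpu_image and the version string "v0.0.0" are hypothetical placeholders.

```python
def pick_cpu_image(flags: set[str], version: str) -> str:
    """Choose an image tag from the host's CPU feature flags.

    The repository name and '-avx2' suffix follow PR #36901; 'version' is
    whatever release tag the workflow pins (a placeholder in this sketch).
    """
    repo = "vllm/vllm-openai-cpu"
    if "avx512f" in flags:
        return f"{repo}:{version}"
    if "avx2" in flags:
        return f"{repo}:{version}-avx2"
    raise RuntimeError("host CPU reports neither AVX-512 nor AVX2")


print(pick_cpu_image({"sse2", "avx", "avx2"}, "v0.0.0"))
# prints vllm/vllm-openai-cpu:v0.0.0-avx2
```

In CI it is usually simpler to pin the AVX2 variant unconditionally, since it is meant to run on both kinds of host; the selection logic only matters when AVX-512 performance is wanted where available.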

Alternatives

You can build the container from scratch yourself, compiling vLLM from source to run on CPU with AVX2 as the target, as I did here: https://github.com/opendatahub-io/llama-stack-distribution/pull/203

Additional context

No response


extent analysis

Fix Plan

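To resolve the issue, we need a CPU Docker image compiled for AVX2 only, with no AVX-512 instructions, so that it also runs on AMD EPYC Zen 1-3 processors. PR #36901 does this upstream by building a second image variant with VLLM_CPU_DISABLE_AVX512=true and VLLM_CPU_AVX2=1. To build such an image yourself:

  • Create a Dockerfile that compiles vLLM from source with AVX-512 disabled.
  • Build the new Docker image using the Dockerfile.
  • Push the new image to Docker Hub.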

Example Dockerfile:

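# Illustrative sketch only; upstream's docker/Dockerfile.cpu is the authoritative recipe.
FROM ubuntu:22.04

# Install build dependencies (this package list is an assumption and may be incomplete)
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential ca-certificates cmake git \
        python3 python3-dev python3-pip libnuma-dev && \
    rm -rf /var/lib/apt/lists/*

# Build flags named in PR #36901: skip AVX-512 and target AVX2 instead.
# Avoid -march=znver2 here: it would tie the binary to one AMD generation.
ENV VLLM_TARGET_DEVICE=cpu \
    VLLM_CPU_DISABLE_AVX512=true \
    VLLM_CPU_AVX2=1

# Compile vLLM from source for CPU (requirements paths assume a recent checkout)
RUN git clone https://github.com/vllm-project/vllm.git && \
    cd vllm && \
    pip3 install -r requirements/cpu-build.txt --extra-index-url https://download.pytorch.org/whl/cpu && \
    pip3 install -r requirements/cpu.txt --extra-index-url https://download.pytorch.org/whl/cpu && \
    pip3 install --no-build-isolation -v .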

Verification

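To verify that the fix worked, run the new Docker image on a GitHub Actions runner and check that it does not exit with SIGILL (exit code 132). Because runner hardware varies between jobs, repeat the run several times or confirm that at least one run landed on a host without AVX-512.

Example command (replace <version> with the release tag you want; the -avx2 suffix follows the tags published by PR #36901):

docker run --rm -it vllm/vllm-openai-cpu:<version>-avx2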
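The check for an illegal-instruction crash can be automated. A minimal sketch follows, assuming the command under test is launched from Python; is_sigill and run_and_check are illustrative names. Exit code 132 is 128 plus the SIGILL signal number (4), which is how shells and docker run report a process killed by that signal.

```python
import signal
import subprocess


def is_sigill(returncode: int) -> bool:
    """True if a return code means the process was killed by SIGILL.

    Shells and `docker run` report 128 + signal number (132 for SIGILL);
    Python's subprocess reports a directly signalled child as -signal number.
    """
    sigill = int(signal.SIGILL)
    return returncode in (128 + sigill, -sigill)


def run_and_check(cmd: list[str]) -> bool:
    """Run cmd and return True if it did NOT die with an illegal instruction."""
    result = subprocess.run(cmd)
    return not is_sigill(result.returncode)
```

For example, run_and_check(["docker", "run", "--rm", image]) with image set to the tag under test returns False only when the container was killed by SIGILL, so other failures (such as a missing model argument) are not mistaken for the AVX-512 problem.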

Extra Tips

  • Make sure to test the new image on different CPU architectures to ensure compatibility.
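  • Consider adding a startup check (for example in the image's entrypoint) that the host CPU supports the instruction set the vLLM binary was compiled for. A check that runs while the image is being built only sees the build machine's CPU, not the host that later runs the container.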
