ollama - ✅(Solved) Fix ROCm backend fails to initialize on AMD Radeon AI PRO R9700 (RDNA4, gfx1201) in Windows 11 [1 pull request, 11 comments, 5 participants]

GitHub stats
ollama/ollama#14686 (fetched 2026-04-08 00:32:56)
Comments: 11
Participants: 5
Timeline events: 13
Reactions: 0
Timeline (top): commented ×11, cross-referenced ×1, labeled ×1

Fix Action

Fixed

PR fix notes

PR #14979: Add missing hipBLASLt kernels and some gfx targets cleanup

Description (problem / solution / changelog)

ROCm build fixes: hipBLASLt support and GPU target cleanup

Changes

Add hipblaslt folder to install (CMakeLists.txt)

The hipblaslt directory was not being copied to the ollama install folder during the HIP component install, even though it contains GPU kernels required for inference on RDNA3/RDNA4 and MI300 GPUs. Fixed by adding an explicit install(DIRECTORY ... hipblaslt ...) rule alongside the existing rocblas install rule.
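One way to catch this class of packaging gap is to check an installed tree for both kernel directories. The following is a minimal sketch in Python, not part of the PR: the function name `missing_kernel_dirs` is hypothetical, and the `rocblas/library` and `hipblaslt/library` layout is assumed from the paths mentioned in the Dockerfile notes.

```python
from pathlib import Path

# Assumed layout under the ROCm lib folder (taken from the Dockerfile paths
# in the PR notes); adjust if the install tree differs.
EXPECTED_KERNEL_DIRS = ("rocblas/library", "hipblaslt/library")


def missing_kernel_dirs(rocm_dir):
    """Return the expected kernel directories that are absent or empty."""
    root = Path(rocm_dir)
    missing = []
    for rel in EXPECTED_KERNEL_DIRS:
        candidate = root / rel
        # A directory that exists but holds no files is as useless as a missing one.
        if not candidate.is_dir() or not any(candidate.iterdir()):
            missing.append(rel)
    return missing
```

Running it against the ROCm folder of an install (for the reporter, something like C:\Ollama\lib\ollama\rocm) would return an empty list when both directories are populated.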

Fix gfx906 kernel removal on Windows (build_windows.ps1, release.yaml)

The existing Remove-Item with wildcards in the path was silently failing on Windows. Replaced with Get-ChildItem -Filter | Remove-Item -Force which reliably removes the stale gfx906 rocblas kernels.
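The "enumerate, then delete" pattern that the PR adopts can be sketched in Python as follows. This is an illustration of the idea only, not the PowerShell from the PR; the helper name `remove_matching` and the file names in the usage note are hypothetical. Returning the list of removed names makes a silent no-op visible to the caller.

```python
import fnmatch
from pathlib import Path


def remove_matching(directory, pattern):
    """Delete regular files in `directory` whose names match `pattern`.

    Returns the names that were removed, so an empty result is easy to detect.
    """
    removed = []
    for entry in sorted(Path(directory).iterdir()):
        if entry.is_file() and fnmatch.fnmatch(entry.name, pattern):
            entry.unlink()
            removed.append(entry.name)
    return removed
```

A build script could call `remove_matching(library_dir, "*gfx906*")` and fail the build if the kernels it expected to strip were not found.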

Remove unsupported GPU targets (CMakePresets.json, CMakeLists.txt)

  • Removed gfx950 (MI350X) — ggml has no CDNA4 architecture support; there are no MFMA or other MI350-specific code paths in ggml. Only gfx942 is recognized as CDNA3.
  • Removed gfx940 and gfx941 — these targets do not exist in ggml. See ROCm/ROCm#4825.
  • Excluded gfx942 (MI300X/MI300A) from Windows builds — AMD Instinct MI-series cards are not supported on Windows HIP runtime.

Fix Dockerfile cleanup (Dockerfile)

The rm -f dist/lib/ollama/rocm/rocblas/library/*gfx90[06]* line was targeting gfx900 and gfx906 kernels that no longer exist in ROCm 7.2.0. Updated to remove gfx950 kernels from both rocblas/library/ and hipblaslt/library/, which are present in the ROCm distribution but correspond to an unsupported GPU target.
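To see why the old glob could never remove gfx950 kernels, the pattern can be checked with Python's `fnmatch`, which handles `*` and `[06]` the same way the shell does for these names. This is a small sketch: the file names are hypothetical, and `*gfx950*` is only an assumed shape for the updated pattern, since the exact new Dockerfile line is not quoted here.

```python
import fnmatch

OLD_PATTERN = "*gfx90[06]*"  # from the original Dockerfile line
NEW_PATTERN = "*gfx950*"     # assumed shape of the updated pattern

# Hypothetical kernel file names, for illustration only.
NAMES = [
    "TensileLibrary_lazy_gfx900.dat",
    "TensileLibrary_lazy_gfx906.dat",
    "TensileLibrary_lazy_gfx950.dat",
    "TensileLibrary_lazy_gfx1201.dat",
]


def matches(pattern, names):
    """Return the names matched by a shell-style glob pattern (case-sensitive)."""
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]
```

With these names, the old pattern selects only the gfx900 and gfx906 entries, so gfx950 files would have been left in place.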

Update GPU documentation (docs/gpu.mdx)

Removed gfx950 / MI350X from the list of supported Linux targets.

Changed files

  • .github/workflows/release.yaml (modified, +1/-1)
  • CMakeLists.txt (modified, +5/-3)
  • CMakePresets.json (modified, +2/-2)
  • Dockerfile (modified, +2/-1)
  • docs/gpu.mdx (modified, +1/-1)
  • scripts/build_windows.ps1 (modified, +1/-1)

Description

I am using Ollama v0.17.7 on Windows 11 with an AMD Radeon AI PRO R9700 (RDNA4 architecture, gfx1201). The ROCm backend fails to initialize, falling back to CPU, even though the HIP SDK 7.1 is installed and the Vulkan backend works perfectly.

Environment

  • OS: Windows 11 Pro (10.0.26200)
  • Ollama version: 0.17.7 (installed in C:\Ollama)
  • GPU: AMD Radeon AI PRO R9700 (2x, 32GB GDDR6 each) – RDNA4, gfx1201
  • Driver: AMD Software Pro Edition 26.2.2 (driver date 2026/2/17)
  • HIP SDK: 7.1.0 installed (AMD HIP SDK components)
  • ROCm files used: ollama-windows-amd64-rocm.zip (extracted to C:\Ollama\lib\ollama\)

Expected Behavior

Ollama should initialize the ROCm backend and utilize the GPU(s) for inference.

Actual Behavior

Ollama detects the GPU (description="AMD Radeon AI PRO R9700" compute=gfx1201) but then logs "filtering device which didn't fully initialize" and falls back to CPU. The "inference compute" log line shows only CPU.

Steps Already Taken

  1. Installed latest AMD driver and HIP SDK 7.1.
  2. Set HIP_VISIBLE_DEVICES=1 and OLLAMA_DEBUG=1.
  3. Replaced ollama.exe with the one from ollama-windows-amd64.zip and copied ROCm libraries from ollama-windows-amd64-rocm.zip to C:\Ollama\lib\ollama\.
  4. Tried HSA_OVERRIDE_GFX_VERSION=12.0.1, 12.0.0, 11.0.0 – all result in the same filtering.
  5. Vulkan backend (OLLAMA_VULKAN=1) works fine, utilizing both GPUs (see logs below).

Relevant Logs (from ollama serve with OLLAMA_DEBUG=1)

time=2026-03-07T09:49:00.849+08:00 level=DEBUG source=runner.go:153 msg="filtering device which didn't fully initialize" id=0 libdir=C:\Ollama\lib\ollama\rocm pci_id=0000:07:00.0 library=ROCm
time=2026-03-07T09:49:00.850+08:00 level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu ...
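When the debug log is long, the filtered devices can be pulled out programmatically. The sketch below parses the key=value format seen in the two lines above; it is an assumption-laden helper, not an Ollama tool, and the names `parse_log_line` and `filtered_devices` are hypothetical. A regular expression is used instead of `shlex` so that the backslashes in Windows paths are left untouched.

```python
import re

# key=value pairs; a value is either a double-quoted string or a run of non-spaces.
_PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')

FILTER_MSG = "filtering device which didn't fully initialize"


def parse_log_line(line):
    """Parse one log line into a dict of its key=value fields."""
    fields = {}
    for key, value in _PAIR.findall(line):
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]  # strip the surrounding quotes
        fields[key] = value
    return fields


def filtered_devices(lines):
    """Return (library, id, pci_id) for each device the runner filtered out."""
    result = []
    for line in lines:
        fields = parse_log_line(line)
        if fields.get("msg") == FILTER_MSG:
            result.append((fields.get("library"), fields.get("id"), fields.get("pci_id")))
    return result
```

Feeding it the output of ollama serve with OLLAMA_DEBUG=1 gives a quick list of which backends and PCI devices were dropped before the CPU fallback.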

The Vulkan backend successfully enumerates both GPUs:

ggml_vulkan: Found 3 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon AI PRO R9700 (AMD proprietary driver) | uma: 0 | fp16: 1 | bf16: 1 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
...

Analysis

The ROCm libraries bundled with Ollama (version 6.x, as indicated by amdhip64_6.dll) do not include precompiled kernels for gfx1201 (RDNA4). The TensileLibrary_* files from the ROCm 6.2 build path (as seen in the user's attached file list) contain kernels only up to gfx1151. Thus, even with HSA_OVERRIDE_GFX_VERSION, the HIP runtime cannot find suitable kernels for this architecture, leading to initialization failure.
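The file-list check described in this analysis can be automated. Below is a minimal sketch that assumes kernel files embed their target as a `gfx…` token in the file name (as the TensileLibrary_* names suggest); the function names are hypothetical and the heuristic may miss files that are named differently.

```python
import re
from pathlib import Path

# Matches tokens such as gfx1100, gfx1201 or gfx90a inside a file name.
_GFX = re.compile(r"gfx[0-9a-f]+")


def bundled_gfx_targets(library_dir):
    """Return the sorted set of gfx targets named by files in `library_dir`."""
    targets = set()
    for entry in Path(library_dir).iterdir():
        targets.update(_GFX.findall(entry.name.lower()))
    return sorted(targets)


def has_target(library_dir, target="gfx1201"):
    """Report whether any file in `library_dir` names the given gfx target."""
    return target in bundled_gfx_targets(library_dir)
```

Pointing `has_target` at the rocblas library folder of the bundled ROCm package would show whether gfx1201 kernels are present at all.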

Request

Please update the Windows ROCm support in Ollama to include gfx1201 (RDNA4). This could be achieved by:

  • Bundling a newer version of ROCm (e.g., 7.x) that adds support for RDNA4.
  • Adding the necessary kernel builds for gfx1201 to the existing ROCm 6.x package.

Thank you for your work on Ollama! I'm happy to provide any additional information or testing if needed.



extent analysis

Fix Plan

To resolve the issue of ROCm backend failing to initialize on AMD Radeon AI PRO R9700 (RDNA4 architecture), we need to update the ROCm support in Ollama to include gfx1201. Here are the steps:

  • Update ROCm Version: Bundle a newer version of ROCm (e.g., 7.x) that adds support for RDNA4.
  • Add Kernel Builds: Add the necessary kernel builds for gfx1201 to the existing ROCm 6.x package.

Code Changes

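The issue lies in the bundled ROCm libraries and kernel builds rather than in Ollama's application code, so the fix belongs in the build and packaging process. PR #14979 modifies CMakeLists.txt, CMakePresets.json, the Dockerfile, scripts/build_windows.ps1, and .github/workflows/release.yaml so that the hipblaslt kernel directory is installed alongside rocblas and unsupported GPU targets are cleaned up.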

Configuration Changes

No configuration changes are required, but the user may need to update their environment variables to point to the new ROCm installation.

Example Code Snippet

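No example code snippet is strictly required, as this is a library packaging issue. However, the Ollama team may need to update CMakeLists.txt or the build scripts to pick up the newer ROCm version. The line below is illustrative only; ROCM_VERSION is a placeholder name, not a variable confirmed to exist in Ollama's build files:

# Update ROCm version (illustrative placeholder)
set(ROCM_VERSION 7.1.0)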

Infra / Dependency Fixes

The Ollama team may need to update their dependencies to include the newer ROCm version. This can be done by updating the ollama-windows-amd64-rocm.zip package to include the required kernel builds for gfx1201.

Temporary Workarounds

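If the user cannot wait for the official update, the Vulkan backend (OLLAMA_VULKAN=1) is the safer interim option: the reporter confirmed that it uses both GPUs on this hardware. Manually replacing the bundled ROCm libraries and kernel builds is also possible, but it is not recommended, as it may cause compatibility issues with other components of Ollama.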

Verification

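To verify that the fix worked, the user can run ollama serve with OLLAMA_DEBUG=1 and check the logs. The DEBUG message "filtering device which didn't fully initialize" should no longer appear for the ROCm device, and the "inference compute" line should report a ROCm device instead of library=cpu. The lines below only illustrate the expected shape; they are not captured output, and the exact messages and field values may differ: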

time=2026-03-07T09:49:00.849+08:00 level=DEBUG source=runner.go:153 msg="initialized device" id=0 libdir=C:\Ollama\lib\ollama\rocm pci_id=0000:07:00.0 library=ROCm
time=2026-03-07T09:49:00.850+08:00 level=INFO source=types.go:60 msg="inference compute" id=gpu library=rocm ...
