ollama - 💡(How to fix) Fix AMD RDNA4 (gfx1201 / RX 9060 XT) not detected – Ollama falls back to CPU (0 VRAM) [33 comments, 4 participants]

GitHub stats
ollama/ollama#14927 (fetched 2026-04-08 00:57:30)
Comments: 33
Participants: 4
Timeline: 53
Reactions: 1
Timeline (top): commented ×33, subscribed ×10, mentioned ×5, unsubscribed ×2


What is the issue?

I am running Ollama on Arch Linux with an AMD RX 9060 XT (gfx1201 / RDNA4). The system detects the GPU correctly via both ROCm and Vulkan, but Ollama does not use it for inference and falls back to CPU.


🔍 Relevant Debug Info

ROCm detection:

/opt/rocm/bin/rocminfo | grep Name
Name: gfx1201
Marketing Name: AMD Radeon RX 9060 XT
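
The ROCm check above can be narrowed to just the ISA target names, which makes it easier to compare the installed GPU against whatever target list a given Ollama build supports. The following is a minimal sketch; the function name list_gfx_targets is made up for illustration, and it only assumes that rocminfo prints targets in the gfxNNNN form shown above.

```shell
#!/usr/bin/env bash
# List the unique gfx ISA targets mentioned in rocminfo output read from stdin.
# list_gfx_targets is a hypothetical helper name, not part of ROCm or Ollama.
list_gfx_targets() {
  grep -oE 'gfx[0-9a-f]+' | sort -u
}

# Usage (requires ROCm to be installed):
#   /opt/rocm/bin/rocminfo | list_gfx_targets
```

Reading from stdin keeps the helper usable on saved output as well as on a live rocminfo call.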

Vulkan detection:

vulkaninfo | grep deviceName
deviceName = AMD Radeon RX 9060 XT (RADV GFX1200)

Ollama logs:

inference compute id=cpu library=cpu
total_vram="0 B"
discovering available GPUs...

⚙️ Environment Details

  • OS: Arch Linux
  • CPU: AMD Ryzen 5 9600X
  • GPU: AMD Radeon RX 9060 XT (gfx1201 / RDNA4)
  • Ollama version: 0.17.7
  • ROCm installed and functional
  • Vulkan (RADV) working

❗ Issue

Despite proper GPU detection at the system level, Ollama does not initialize or utilize the GPU backend. No VRAM is detected, and inference defaults to CPU.


🔁 Notes / Attempts

  • Verified ROCm and Vulkan functionality independently
  • Restarted Ollama service after environment changes
  • Attempted forcing GPU/Vulkan via environment variables (no effect)

❓ Question

Is RDNA4 (gfx12 / gfx1201) currently unsupported? Are there experimental builds or flags to enable GPU acceleration on newer AMD architectures?

Relevant log output

~
❯ journalctl -u ollama
Mar 17 02:48:47 archlinux systemd[1]: Started Ollama Service.
Mar 17 02:48:47 archlinux ollama[34385]: Couldn't find '/var/lib/ollama/.ollama/id_ed25519'. Generating new private key.
Mar 17 02:48:47 archlinux ollama[34385]: Your new public key is:
Mar 17 02:48:47 archlinux ollama[34385]: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIO1H1riqQ79f9zQ0je0WsjYQVNwrWixOtF+RewdoBLF4
Mar 17 02:48:47 archlinux ollama[34385]: time=2026-03-17T02:48:47.866+06:00 level=INFO source=routes.go:1658 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HI>
Mar 17 02:48:47 archlinux ollama[34385]: time=2026-03-17T02:48:47.866+06:00 level=INFO source=routes.go:1660 msg="Ollama cloud disabled: false"
Mar 17 02:48:47 archlinux ollama[34385]: time=2026-03-17T02:48:47.866+06:00 level=INFO source=images.go:477 msg="total blobs: 0"
Mar 17 02:48:47 archlinux ollama[34385]: time=2026-03-17T02:48:47.866+06:00 level=INFO source=images.go:484 msg="total unused blobs removed: 0"
Mar 17 02:48:47 archlinux ollama[34385]: time=2026-03-17T02:48:47.866+06:00 level=INFO source=routes.go:1713 msg="Listening on 127.0.0.1:11434 (version 0.17.7)"
Mar 17 02:48:47 archlinux ollama[34385]: time=2026-03-17T02:48:47.866+06:00 level=INFO source=runner.go:67 msg="discovering available GPUs..."
Mar 17 02:48:47 archlinux ollama[34385]: time=2026-03-17T02:48:47.867+06:00 level=INFO source=server.go:430 msg="starting runner" cmd="/usr/bin/ollama runner --ollama-engine --port 39375"
Mar 17 02:48:47 archlinux ollama[34385]: time=2026-03-17T02:48:47.881+06:00 level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver=>
Mar 17 02:48:47 archlinux ollama[34385]: time=2026-03-17T02:48:47.881+06:00 level=INFO source=routes.go:1763 msg="vram-based default context" total_vram="0 B" default_num_ctx=4096
Mar 17 02:49:09 archlinux ollama[34385]: [GIN] 2026/03/17 - 02:49:09 | 404 |    3.011987ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
Mar 17 02:49:09 archlinux ollama[34385]: [GIN] 2026/03/17 - 02:49:09 | 404 |    2.930571ms |       127.0.0.1 | POST     "/v1/messages?beta=true"
Mar 17 02:49:16 archlinux ollama[34385]: [GIN] 2026/03/17 - 02:49:16 | 200 |      23.265µs |       127.0.0.1 | HEAD     "/"
Mar 17 02:49:16 archlinux ollama[34385]: [GIN] 2026/03/17 - 02:49:16 | 404 |      74.924µs |       127.0.0.1 | POST     "/api/show"
Mar 17 02:49:18 archlinux ollama[34385]: time=2026-03-17T02:49:18.525+06:00 level=INFO source=download.go:179 msg="downloading ac9bc7a69dab in 16 561 MB part(s)"
Mar 17 03:02:19 archlinux ollama[34385]: time=2026-03-17T03:02:19.406+06:00 level=INFO source=download.go:179 msg="downloading 66b9ea09bd5b in 1 68 B part(s)"
Mar 17 03:02:21 archlinux ollama[34385]: time=2026-03-17T03:02:21.932+06:00 level=INFO source=download.go:179 msg="downloading 1e65450c3067 in 1 1.6 KB part(s)"
Mar 17 03:02:23 archlinux ollama[34385]: time=2026-03-17T03:02:23.653+06:00 level=INFO source=download.go:179 msg="downloading 832dd9e00a68 in 1 11 KB part(s)"
Mar 17 03:02:25 archlinux ollama[34385]: time=2026-03-17T03:02:25.447+06:00 level=INFO source=download.go:179 msg="downloading 0578f229f23a in 1 488 B part(s)"
Mar 17 03:02:30 archlinux ollama[34385]: [GIN] 2026/03/17 - 03:02:30 | 200 |        13m13s |       127.0.0.1 | POST     "/api/pull"
Mar 17 03:02:30 archlinux ollama[34385]: [GIN] 2026/03/17 - 03:02:30 | 200 |   48.775831ms |       127.0.0.1 | POST     "/api/show"
Mar 17 03:02:30 archlinux ollama[34385]: [GIN] 2026/03/17 - 03:02:30 | 200 |   48.776051ms |       127.0.0.1 | POST     "/api/show"
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: loaded meta data with 34 key-value pairs and 579 tensors from /var/lib/ollama/blobs/sha256-ac9bc7a69dab38da1c790838955f1293420b55ab555ef6b4615efa1>
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv   0:                       general.architecture str              = qwen2
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv   1:                               general.type str              = model
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv   2:                               general.name str              = Qwen2.5 Coder 14B Instruct
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv   3:                           general.finetune str              = Instruct
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv   4:                           general.basename str              = Qwen2.5-Coder
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv   5:                         general.size_label str              = 14B
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv   6:                            general.license str              = apache-2.0
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv   7:                       general.license.link str              = https://huggingface.co/Qwen/Qwen2.5-C...
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv   8:                   general.base_model.count u32              = 1
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv   9:                  general.base_model.0.name str              = Qwen2.5 Coder 14B
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv  10:          general.base_model.0.organization str              = Qwen
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv  11:              general.base_model.0.repo_url str              = https://huggingface.co/Qwen/Qwen2.5-C...
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv  12:                               general.tags arr[str,6]       = ["code", "codeqwen", "chat", "qwen", ...
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv  13:                          general.languages arr[str,1]       = ["en"]
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv  14:                          qwen2.block_count u32              = 48
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv  15:                       qwen2.context_length u32              = 32768
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv  16:                     qwen2.embedding_length u32              = 5120
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv  17:                  qwen2.feed_forward_length u32              = 13824
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv  18:                 qwen2.attention.head_count u32              = 40
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv  19:              qwen2.attention.head_count_kv u32              = 8
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv  20:                       qwen2.rope.freq_base f32              = 1000000.000000
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv  21:     qwen2.attention.layer_norm_rms_epsilon f32              = 0.000001
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv  22:                          general.file_type u32              = 15
Mar 17 03:02:30 archlinux ollama[34385]: llama_model_loader: - kv  23:                       tokenizer.ggml.model str              = gpt2

OS: Linux
GPU: AMD
CPU: AMD
Ollama version: 0.17.7

extent analysis

Fix Plan

To fix the issue of Ollama not utilizing the AMD Radeon RX 9060 XT GPU for inference, we need to ensure that the GPU is properly detected and configured. Since the system correctly detects the GPU via ROCm and Vulkan, the issue might be related to Ollama's configuration or compatibility with the RDNA4 architecture.

Here are the steps to try:

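  • Environment Variables: Ollama's Vulkan backend is experimental and, per Ollama's GPU documentation, is enabled by setting OLLAMA_VULKAN=1 in the server's environment. Running export OLLAMA_VULKAN=1 only affects an ollama serve process started from that same shell; the systemd service shown in the journal above needs the variable in its unit environment instead.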
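  • ROCm Configuration: Ensure that the ROCm installation is correct and, if your setup relies on it, that the ROCM_PATH environment variable (upper case) points at the install prefix, which is /opt/rocm in this report. Running rocminfo should list the gfx1201 device, as it does above.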
  • Ollama Configuration: Ollama's documentation describes server settings as environment variables; a gpu section in /etc/ollama/config.json is not a documented option, so do not expect it to have an effect. On a systemd install, run systemctl edit ollama.service, add the variables as Environment= lines under [Service], then run systemctl daemon-reload and restart the service.
  • Experimental Builds: If the above steps do not work, you can try using an experimental build of Ollama that supports the RDNA4 architecture. You can check the Ollama GitHub repository for experimental builds or contact the developers for more information.
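
For a systemd install like the one in this report, the environment-variable step can be sketched as a drop-in file. This is a sketch under stated assumptions, not a confirmed fix: OLLAMA_VULKAN=1 is the switch Ollama documents for its experimental Vulkan backend, but the issue does not establish that it makes gfx1201 work on 0.17.7. The function name write_vulkan_dropin is made up for illustration, and the unit name ollama.service is taken from the journalctl -u ollama call above.

```shell
#!/usr/bin/env bash
# Write a systemd drop-in that sets OLLAMA_VULKAN=1 for the Ollama service.
# write_vulkan_dropin is a hypothetical helper; the target directory is a
# parameter so the file can be generated and reviewed before it is installed.
write_vulkan_dropin() {
  local dir="$1"
  mkdir -p "$dir"
  cat > "$dir/override.conf" <<'EOF'
[Service]
Environment="OLLAMA_VULKAN=1"
EOF
}

# Usage (as root), assuming the unit is named ollama.service:
#   write_vulkan_dropin /etc/systemd/system/ollama.service.d
#   systemctl daemon-reload
#   systemctl restart ollama
#   systemctl show ollama --property=Environment   # confirm the variable is set
```

Using a drop-in rather than editing the packaged unit file means the setting survives package upgrades and can be removed by deleting one file.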

Code Changes

No code changes are required for this fix, and there is no JSON configuration file to edit: Ollama's server settings are passed as environment variables.

Verification

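To verify that the fix worked, restart the service and check the Ollama logs: the "inference compute" line should report a library other than cpu, and total_vram should no longer be "0 B". After loading a model, ollama ps shows whether it is running on CPU or GPU. Note that rocminfo only confirms that the ROCm stack sees the device; it does not show whether Ollama is using it.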
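
The log check can be scripted. The sketch below assumes the "inference compute" line keeps the library=<name> field seen in the journal above; the function name gpu_backend_detected is made up for illustration.

```shell
#!/usr/bin/env bash
# Exit 0 if the log on stdin contains an "inference compute" line whose
# library field is anything other than "cpu"; exit non-zero otherwise.
# gpu_backend_detected is a hypothetical helper name.
gpu_backend_detected() {
  grep 'inference compute' \
    | grep -oE 'library=[^ ]+' \
    | grep -qv '^library=cpu$'
}

# Usage (current boot only; --no-pager avoids the truncated lines a pager produces):
#   if journalctl -u ollama -b --no-pager | gpu_backend_detected; then
#     echo "GPU backend found"
#   else
#     echo "CPU only"
#   fi
```

On the log in this report the function returns non-zero, because the only "inference compute" line reports library=cpu.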

Extra Tips

  • Ensure that the ROCm installation is up-to-date and compatible with the RDNA4 architecture.
  • Check the Ollama documentation for any specific requirements or configurations for using the Vulkan backend.
  • If you are using a custom Ollama build, ensure that it is compiled with the correct flags to support the RDNA4 architecture.
