ollama: CUDA_VISIBLE_DEVICES=-1 incorrectly suppresses ROCm GPU detection

GitHub stats
ollama/ollama#15752 (fetched 2026-04-23 07:23:21)
Comments: 0
Participants: 1
Timeline: 1 (labeled ×1)
Reactions: 0

Error Message

time=2026-04-22T20:37:54.241+02:00 level=WARN source=runner.go:485 msg="user overrode visible devices" CUDA_VISIBLE_DEVICES=-1
time=2026-04-22T20:37:54.242+02:00 level=WARN source=runner.go:489 msg="if GPUs are not correctly discovered, unset and try again"


What is the issue?

Setting CUDA_VISIBLE_DEVICES=-1 to hide NVIDIA GPUs unexpectedly causes the AMD ROCm GPU to disappear from inference compute as well, leaving only CPU available.

Steps to reproduce:

  1. Start ollama serve without restrictions — both CUDA and ROCm GPUs are discovered:

    inference compute ... library=CUDA ... name=CUDA0 description="NVIDIA GeForce RTX 5070 Laptop GPU"
    inference compute ... library=ROCm ... name=ROCm0 description="AMD Radeon 890M Graphics"
  2. Stop the server, then run:

    CUDA_VISIBLE_DEVICES=-1 ollama serve
  3. Observe that only CPU is listed under inference compute; the ROCm GPU is no longer detected.
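The comparison in the steps above can be sketched as a small POSIX shell helper that reports which library= values were present in the unrestricted run but are missing from the restricted one. This is only an illustration: the function names (libraries_in, lost_libraries) and the log file names are hypothetical, and it assumes each run's output was saved to a file and uses the "inference compute" log format shown in this issue.

```shell
#!/bin/sh
# Hypothetical helpers, not part of ollama.

# Print the distinct library= values from "inference compute" lines in a saved log.
libraries_in() {
    # $1: path to a saved `ollama serve` log
    grep 'msg="inference compute"' "$1" \
        | sed -n 's/.* library=\([^ ]*\).*/\1/p' \
        | LC_ALL=C sort -u
}

# Print libraries that appear in the baseline log but not in the restricted log.
lost_libraries() {
    # $1: baseline log, $2: log from the run with CUDA_VISIBLE_DEVICES=-1
    before=$(mktemp) && after=$(mktemp) || return 1
    libraries_in "$1" > "$before"
    libraries_in "$2" > "$after"
    # comm -23 keeps only lines unique to the first (sorted) input
    LC_ALL=C comm -23 "$before" "$after"
    rm -f "$before" "$after"
}

# Example usage (file names are assumptions; stop each server with Ctrl-C):
#   ollama serve > baseline.log 2>&1
#   CUDA_VISIBLE_DEVICES=-1 ollama serve > masked.log 2>&1
#   lost_libraries baseline.log masked.log
```

If only the CUDA library were listed as lost, the variable would be behaving as expected; seeing ROCm listed as well corresponds to the behavior reported here.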

Expected behavior:

CUDA_VISIBLE_DEVICES should only affect CUDA-visible devices. The AMD iGPU (Radeon 890M) should still be discovered via ROCm and available for inference when this variable is set.

Actual behavior:

The ROCm GPU is completely omitted from the compute list, forcing CPU-only inference despite the AMD iGPU being present and functional.

Relevant log output

galuszkak@Thunder:~$ ollama serve 
time=2026-04-22T20:37:41.008+02:00 level=INFO source=routes.go:1752 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:0 OLLAMA_DEBUG:INFO OLLAMA_DEBUG_LOG_REQUESTS:false OLLAMA_EDITOR: OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/galuszkak/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_CLOUD:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
[...]
time=2026-04-22T20:37:42.087+02:00 level=INFO source=types.go:42 msg="inference compute" id=GPU-0eee6825-5aa8-a022-085a-6c135272070d filter_id="" library=CUDA compute=12.0 name=CUDA0 description="NVIDIA GeForce RTX 5070 Laptop GPU" libdirs=ollama,cuda_v13 driver=13.0 pci_id=0000:c1:00.0 type=discrete total="8.0 GiB" available="7.2 GiB"
time=2026-04-22T20:37:42.087+02:00 level=INFO source=types.go:42 msg="inference compute" id=0 filter_id=0 library=ROCm compute=gfx1150 name=ROCm0 description="AMD Radeon 890M Graphics" libdirs=ollama,rocm driver=70253.21 pci_id=0000:c2:00.0 type=iGPU total="35.6 GiB" available="35.4 GiB"
time=2026-04-22T20:37:42.087+02:00 level=INFO source=routes.go:1860 msg="vram-based default context" total_vram="43.6 GiB" default_num_ctx=32768


and here with CUDA_VISIBLE_DEVICES=-1:

galuszkak@Thunder:~$ CUDA_VISIBLE_DEVICES=-1 ollama serve 
time=2026-04-22T20:37:54.239+02:00 level=INFO source=routes.go:1752 msg="server config" env="map[CUDA_VISIBLE_DEVICES:-1 GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:0 OLLAMA_DEBUG:INFO OLLAMA_DEBUG_LOG_REQUESTS:false OLLAMA_EDITOR: OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/galuszkak/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_CLOUD:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
[...]
time=2026-04-22T20:37:54.241+02:00 level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2026-04-22T20:37:54.241+02:00 level=WARN source=runner.go:485 msg="user overrode visible devices" CUDA_VISIBLE_DEVICES=-1
time=2026-04-22T20:37:54.242+02:00 level=WARN source=runner.go:489 msg="if GPUs are not correctly discovered, unset and try again"
time=2026-04-22T20:37:54.242+02:00 level=INFO source=server.go:444 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 46377"
time=2026-04-22T20:37:54.318+02:00 level=INFO source=server.go:444 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 36391"
time=2026-04-22T20:37:54.394+02:00 level=INFO source=server.go:444 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 41067"
time=2026-04-22T20:37:54.576+02:00 level=INFO source=runner.go:106 msg="experimental Vulkan support disabled.  To enable, set OLLAMA_VULKAN=1"
time=2026-04-22T20:37:54.576+02:00 level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="23.3 GiB" available="20.4 GiB"
time=2026-04-22T20:37:54.576+02:00 level=INFO source=routes.go:1860 msg="vram-based default context" total_vram="0 B" default_num_ctx=4096

OS

Linux

GPU

Nvidia, AMD

CPU

AMD

Ollama version

0.21.0

extent analysis

TL;DR

Setting CUDA_VISIBLE_DEVICES=-1 hides the AMD ROCm GPU along with the NVIDIA GPU, leaving only the CPU for inference. As an unverified workaround, try also setting ROCR_VISIBLE_DEVICES explicitly so that it includes the AMD GPU.

Guidance

  • Confirm that the AMD ROCm GPU is visible when CUDA_VISIBLE_DEVICES is not set: the log should contain an "inference compute" line with library=ROCm.
  • Set ROCR_VISIBLE_DEVICES to include the AMD GPU, for example ROCR_VISIBLE_DEVICES=0, which may keep it visible even when CUDA_VISIBLE_DEVICES=-1.
  • Restart the server with both variables set and check the log again for a line containing library=ROCm to see whether the AMD GPU is detected.
  • If the ROCm GPU is still missing, unset CUDA_VISIBLE_DEVICES and verify that the GPU is detected again.
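The log checks in the guidance above can be scripted. The following POSIX shell sketch reads a saved log on standard input and succeeds only if an "inference compute" line reports library=ROCm. The function name rocm_detected and the log file name are hypothetical, and the check assumes the log format shown in this issue.

```shell
#!/bin/sh
# Hypothetical helper, not part of ollama.

# Exit 0 if the log on stdin reports a ROCm device, non-zero otherwise.
rocm_detected() {
    grep 'msg="inference compute"' | grep -q ' library=ROCm'
}

# Example usage (workaround.log is an assumed file name):
#   ROCR_VISIBLE_DEVICES=0 CUDA_VISIBLE_DEVICES=-1 ollama serve > workaround.log 2>&1
#   (stop the server with Ctrl-C, then:)
#   if rocm_detected < workaround.log; then
#       echo "ROCm GPU detected"
#   else
#       echo "ROCm GPU still missing"
#   fi
```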

Example

ROCR_VISIBLE_DEVICES=0 CUDA_VISIBLE_DEVICES=-1 ollama serve

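This command sets ROCR_VISIBLE_DEVICES=0 to select the AMD GPU (assuming it is ROCm device 0, as the name=ROCm0 and id=0 fields in the unrestricted log suggest) and sets CUDA_VISIBLE_DEVICES=-1 to hide the NVIDIA GPU. Whether it restores the ROCm GPU has not been confirmed in the issue.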

Notes

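The issue appears to involve how CUDA_VISIBLE_DEVICES interacts with ROCm device discovery. In the logs, the ROCm device disappears as soon as CUDA_VISIBLE_DEVICES=-1 is set, while ROCR_VISIBLE_DEVICES and HIP_VISIBLE_DEVICES are empty in both runs. One possibility worth checking: ROCm's HIP runtime documents CUDA_VISIBLE_DEVICES as a compatibility alias for HIP_VISIBLE_DEVICES on AMD platforms, in which case the ROCm backend itself would honor the -1 and setting ROCR_VISIBLE_DEVICES alone might not restore the GPU. Setting ROCR_VISIBLE_DEVICES explicitly may still help, but the root cause is unconfirmed and needs further investigation.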
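When investigating how these variables interact, it can help to print exactly which device-visibility variables are set in the shell before starting the server. The sketch below does that for the visibility-related names that appear in the "server config" log line in this issue. The function name show_visibility_env is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of ollama.

# Print each device-visibility variable and its value, or note that it is unset.
show_visibility_env() {
    for name in CUDA_VISIBLE_DEVICES HIP_VISIBLE_DEVICES ROCR_VISIBLE_DEVICES \
                GPU_DEVICE_ORDINAL GGML_VK_VISIBLE_DEVICES; do
        # eval performs the indirect lookup; names come only from the fixed list above
        eval "value=\${$name-}; is_set=\${$name+yes}"
        if [ "$is_set" = yes ]; then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s is unset\n' "$name"
        fi
    done
}

# Example usage:
#   show_visibility_env
```

Distinguishing "unset" from "set to an empty string" matters here because the server config log prints both cases as an empty value.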

Recommendation

Apply the workaround by setting ROCR_VISIBLE_DEVICES explicitly to include the AMD GPU, as this may resolve the issue without requiring a code change or upgrade.
