Error Message

Ollama v0.24.0 acknowledges `CUDA_VISIBLE_DEVICES=1` in its startup log and emits a WARN about the override, yet it loads every model and runs all inference on GPU 0 (RTX 500 Ada, 4 GB), ignoring my eGPU, GPU 1 (RTX 5060 Ti, 16 GB).

Ollama sees the override and emits a WARN:

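time=2026-05-18T08:02:19.191 level=WARN source=runner.go:536 msg="user overrode visible devices" CUDA_VISIBLE_DEVICES=1
time=2026-05-18T08:02:19.191 level=WARN source=runner.go:540 msg="if GPUs are not correctly discovered, unset and try again"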

What is the issue?

Environment

| OS | Ubuntu Linux (kernel 7.0.0-15-generic) |
| CPU | Intel Core Ultra 7 155H |
| Ollama version | 0.24.0 |
| NVIDIA driver | 595.58.03 |
| CUDA version | 13.2 |

GPUs

| GPU 0 | NVIDIA RTX 500 Ada Generation Laptop GPU | 4 GB GDDR6 |
| GPU 1 | NVIDIA GeForce RTX 5060 Ti | 16 GB GDDR7 eGPU (Thunderbolt 3) |

`/etc/systemd/system/ollama.service`

[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=..."

[Install]
WantedBy=default.target

`/etc/systemd/system/ollama.service.d/override.conf`

[Service]
Environment="CUDA_VISIBLE_DEVICES=1"
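CUDA also accepts GPU UUIDs in `CUDA_VISIBLE_DEVICES`, which avoids relying on how device indices are assigned. The sketch below renders an alternative drop-in that selects the eGPU by the UUID quoted in the Evidence section of this report. `render_override` is a hypothetical helper written for illustration, the `CUDA_DEVICE_ORDER` setting is an added assumption, and whether either variable changes the behaviour of Ollama 0.24.0 has not been tested here.

```python
def render_override(env: dict[str, str]) -> str:
    """Render a systemd drop-in that sets the given environment variables."""
    lines = ["[Service]"]
    for key, value in env.items():
        lines.append(f'Environment="{key}={value}"')
    return "\n".join(lines) + "\n"


# UUID of the RTX 5060 Ti as quoted in this report.
EGPU_UUID = "GPU-4f2cde50-643c-cc25-229e-c68abe8775bd"

override = render_override({
    # Assumption: ask CUDA to enumerate in PCI bus order, the order nvidia-smi uses.
    "CUDA_DEVICE_ORDER": "PCI_BUS_ID",
    # Select the device by UUID rather than by index.
    "CUDA_VISIBLE_DEVICES": EGPU_UUID,
})
print(override)
```

The printed text could replace the contents of `override.conf`, followed by `sudo systemctl daemon-reload` and a service restart.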

Expected Behaviour

With `CUDA_VISIBLE_DEVICES=1` set in the service environment, Ollama should:

1. Discover only GPU 1 (RTX 5060 Ti, 16 GB) as the available CUDA device.
2. Load models onto the RTX 5060 Ti.
3. Run all inference on the RTX 5060 Ti.
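The expectation above follows from how index-based filtering with `CUDA_VISIBLE_DEVICES` is documented to work: the listed devices become the only visible ones and are renumbered from 0. A minimal sketch of that model is shown here. `visible_devices` is a hypothetical function, it handles integer indices only (not UUIDs), and it assumes the physical order matches the GPU numbering used in this report. The CUDA runtime's own enumeration order can differ from that numbering (its default ordering is a fastest-first heuristic), which this report does not confirm or rule out as a factor.

```python
def visible_devices(physical: list[str], cuda_visible_devices: str) -> list[str]:
    """Return the devices a process would see, in their renumbered order.

    Simplified model of index-based filtering: entries are taken in the
    order listed, and the first invalid index ends the list.
    """
    selected = []
    for token in cuda_visible_devices.split(","):
        token = token.strip()
        if not token.isdigit() or int(token) >= len(physical):
            break  # an invalid entry hides itself and everything after it
        selected.append(physical[int(token)])
    return selected


# Order as numbered in this report (an assumption about enumeration order).
gpus = ["RTX 500 Ada (4 GB)", "RTX 5060 Ti (16 GB)"]
print(visible_devices(gpus, "1"))  # ['RTX 5060 Ti (16 GB)']
```

Under this model the only visible device would be the RTX 5060 Ti, reported to the application as device 0.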

Actual Behaviour

Ollama ignores CUDA_VISIBLE_DEVICES=1 and uses GPU 0 (RTX 500 Ada, 4 GB) instead.

Evidence

1 — Ollama startup log acknowledges `CUDA_VISIBLE_DEVICES=1` then ignores it

# Server config — CUDA_VISIBLE_DEVICES:1 is present and correct:
time=2026-05-18T08:02:19.179 level=INFO source=routes.go:1802 msg="server config"
  env="map[CUDA_VISIBLE_DEVICES:1 ...]"

# Ollama sees the override and emits a WARN:
time=2026-05-18T08:02:19.191 level=WARN source=runner.go:536
  msg="user overrode visible devices" CUDA_VISIBLE_DEVICES=1
time=2026-05-18T08:02:19.191 level=WARN source=runner.go:540
  msg="if GPUs are not correctly discovered, unset and try again"

# Despite the above, Ollama selects GPU-802e459c = RTX 500 Ada (GPU 0):
time=2026-05-18T08:02:20.686 level=INFO source=types.go:42 msg="inference compute"
  id=GPU-802e459c-a2ec-5e73-5ba8-825cc61760cf
  filter_id=""
  library=CUDA compute=8.9
  name=CUDA0
  description="NVIDIA RTX 500 Ada Generation Laptop GPU"
  pci_id=0000:01:00.0
  total="4.0 GiB" available="3.6 GiB"

The RTX 5060 Ti (GPU 1, GPU-4f2cde50-643c-cc25-229e-c68abe8775bd) never appears in the inference compute log line.
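To check which device Ollama selected without reading the log by eye, the `inference compute` entry can be split into fields. The sketch below assumes each entry is a single line of `key=value` pairs with double-quoted values (the excerpt above shows it wrapped across lines); `parse_logfmt` is a hypothetical helper, not part of Ollama.

```python
import shlex


def parse_logfmt(line: str) -> dict[str, str]:
    """Split a key=value log line into a dict, honouring double quotes."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            fields[key] = value
    return fields


# The "inference compute" entry from this report, joined onto one line.
line = (
    'time=2026-05-18T08:02:20.686 level=INFO source=types.go:42 '
    'msg="inference compute" '
    'id=GPU-802e459c-a2ec-5e73-5ba8-825cc61760cf filter_id="" '
    'library=CUDA compute=8.9 name=CUDA0 '
    'description="NVIDIA RTX 500 Ada Generation Laptop GPU" '
    'pci_id=0000:01:00.0 total="4.0 GiB" available="3.6 GiB"'
)
gpu = parse_logfmt(line)
print(gpu["name"], gpu["id"], gpu["description"])
```

Comparing `gpu["id"]` against the UUIDs listed by `nvidia-smi -L` shows which physical card the entry refers to.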

Reproduction Steps

1. System has two NVIDIA GPUs: GPU 0 = smaller/internal, GPU 1 = larger/eGPU.
2. Set `CUDA_VISIBLE_DEVICES=1` in the Ollama systemd service environment (`override.conf` or inline).
3. Restart the Ollama service: `sudo systemctl restart ollama`
4. Confirm the env var is in the process environment: `sudo cat /proc/$(pidof ollama)/environ | tr '\0' '\n' | grep CUDA`
5. Run any inference: `ollama run ministral-3:3b "Hello"`
6. Monitor GPU usage: `nvidia-smi --query-gpu=index,name,memory.used,utilization.gpu --format=csv`

Observed: GPU 0 activates. GPU 1 stays idle.
Expected: GPU 1 activates. GPU 0 stays idle.
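The monitoring step can be scripted so that the observed and expected results are compared mechanically. This is a rough sketch, assuming the CSV output has a header row whose column names carry their units (for example `utilization.gpu [%]`); `parse_gpu_usage` and `active_gpus` are hypothetical helpers.

```python
import csv
import io


def parse_gpu_usage(csv_text: str) -> list[dict[str, str]]:
    """Parse `nvidia-smi --query-gpu=... --format=csv` output into row dicts."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    rows = [row for row in reader if row]
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]


def active_gpus(rows: list[dict[str, str]]) -> list[str]:
    """Return the index of every GPU reporting non-zero utilisation."""
    active = []
    for row in rows:
        # The column header includes its unit, so match on the prefix.
        util = next(v for k, v in row.items() if k.startswith("utilization.gpu"))
        parts = util.split()
        if parts and parts[0].isdigit() and int(parts[0]) > 0:
            active.append(row["index"])
    return active
```

To use it, capture the command output while a prompt is running, for example with `subprocess.run([...], capture_output=True, text=True, check=True).stdout`, and pass that string to `parse_gpu_usage`. With the behaviour described in this report, `active_gpus` would be expected to return `["0"]` rather than `["1"]`.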

Models tested

| Model | Size |
|---|---|
| `ministral-3:3b` | 3.0 GB |
| `mistral-nemo:latest` | 7.1 GB |

Both were loaded onto GPU 0 (4 GB VRAM) even though GPU 1, the eGPU, had 16 GB available.

Relevant log output

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.24.0

ollama - 💡(How to fix) Fix `CUDA_VISIBLE_DEVICES` ignored with eGPU (RTX 5060 Ti)
