ollama - Ollama 0.19.0 does not recognize VRAM on Manjaro Linux with an RTX 4060 Ti [4 comments, 2 participants]

GitHub stats
ollama/ollama#15179 · Fetched 2026-04-08 01:58:15
Comments: 4 · Participants: 2 · Timeline: 6 · Reactions: 0
Timeline (top): commented ×4, closed ×1, labeled ×1

Code Example

OLLAMA_DEBUG=2 log
time=2026-03-31T21:29:28.743+02:00 level=INFO source=routes.go:1742 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:0 OLLAMA_DEBUG:DEBUG-4 OLLAMA_DEBUG_LOG_REQUESTS:false OLLAMA_EDITOR: OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/xxxxxx/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_CLOUD:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2026-03-31T21:29:28.743+02:00 level=INFO source=routes.go:1744 msg="Ollama cloud disabled: false"
time=2026-03-31T21:29:28.746+02:00 level=INFO source=images.go:477 msg="total blobs: 70"
time=2026-03-31T21:29:28.747+02:00 level=INFO source=images.go:484 msg="total unused blobs removed: 0"
time=2026-03-31T21:29:28.747+02:00 level=INFO source=routes.go:1800 msg="Listening on 127.0.0.1:11434 (version 0.19.0)"
time=2026-03-31T21:29:28.747+02:00 level=DEBUG source=sched.go:145 msg="starting llm scheduler"
time=2026-03-31T21:29:28.747+02:00 level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2026-03-31T21:29:28.748+02:00 level=TRACE source=runner.go:440 msg="starting runner for device discovery" libDirs="[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v12]" extraEnvs=map[]
time=2026-03-31T21:29:28.748+02:00 level=INFO source=server.go:432 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 41259"
time=2026-03-31T21:29:28.748+02:00 level=DEBUG source=server.go:433 msg=subprocess CUDA_PATH=/opt/cuda PATH=/home/xxxxxx/.npm-global/bin:/home/xxxxxx/bin:/home/xxxxxx/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/opt/cuda/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/usr/lib/rustup/bin OLLAMA_DEBUG=2 LD_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/cuda_v12 OLLAMA_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/cuda_v12
time=2026-03-31T21:29:28.757+02:00 level=INFO source=runner.go:1411 msg="starting ollama engine"
time=2026-03-31T21:29:28.757+02:00 level=INFO source=runner.go:1446 msg="Server listening on 127.0.0.1:41259"
time=2026-03-31T21:29:28.760+02:00 level=DEBUG source=gguf.go:604 msg=general.architecture type=string
time=2026-03-31T21:29:28.760+02:00 level=DEBUG source=gguf.go:604 msg=tokenizer.ggml.model type=string
time=2026-03-31T21:29:28.760+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.alignment default=32
time=2026-03-31T21:29:28.760+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.alignment default=32
time=2026-03-31T21:29:28.760+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.file_type default=0
time=2026-03-31T21:29:28.760+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.name default=""
time=2026-03-31T21:29:28.760+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.description default=""
time=2026-03-31T21:29:28.760+02:00 level=INFO source=ggml.go:136 msg="" architecture=llama file_type=unknown name="" description="" num_tensors=0 num_key_values=3
time=2026-03-31T21:29:28.760+02:00 level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama
time=2026-03-31T21:29:28.760+02:00 level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama/cuda_v12
time=2026-03-31T21:29:28.762+02:00 level=INFO source=ggml.go:104 msg=system CPU.0.LLAMAFILE=1 compiler=cgo(gcc)
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.block_count default=0
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.pooling_type default=0
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.expert_count default=0
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.tokens default="&{size:0 values:[]}"
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.scores default="&{size:0 values:[]}"
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.token_type default="&{size:0 values:[]}"
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.merges default="&{size:0 values:[]}"
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.add_bos_token default=true
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.bos_token_id default=0
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.add_eos_token default=false
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.eos_token_id default=0
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.eos_token_ids default="&{size:0 values:[]}"
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.pre default=""
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.block_count default=0
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.embedding_length default=0
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.head_count default=0
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.head_count_kv default=0
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.key_length default=0
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.rope.dimension_count default=0
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.layer_norm_rms_epsilon default=0
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.rope.freq_base default=100000
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.rope.scaling.factor default=1
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=runner.go:1386 msg="dummy model load took" duration=2.251299ms
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=runner.go:1391 msg="gathering device infos took" duration=310ns
time=2026-03-31T21:29:28.762+02:00 level=TRACE source=runner.go:467 msg="runner enumerated devices" OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v12]" devices=[]
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=runner.go:437 msg="bootstrap discovery took" duration=14.54122ms OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v12]" extra_envs=map[]
time=2026-03-31T21:29:28.762+02:00 level=TRACE source=runner.go:440 msg="starting runner for device discovery" libDirs="[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v13]" extraEnvs=map[]
time=2026-03-31T21:29:28.762+02:00 level=INFO source=server.go:432 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 39517"
time=2026-03-31T21:29:28.762+02:00 level=DEBUG source=server.go:433 msg=subprocess CUDA_PATH=/opt/cuda PATH=/home/xxxxxx/.npm-global/bin:/home/xxxxxx/bin:/home/xxxxxx/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/opt/cuda/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/usr/lib/rustup/bin OLLAMA_DEBUG=2 LD_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/cuda_v13 OLLAMA_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/lib/ollama/cuda_v13
time=2026-03-31T21:29:28.771+02:00 level=INFO source=runner.go:1411 msg="starting ollama engine"
time=2026-03-31T21:29:28.771+02:00 level=INFO source=runner.go:1446 msg="Server listening on 127.0.0.1:39517"
time=2026-03-31T21:29:28.774+02:00 level=DEBUG source=gguf.go:604 msg=general.architecture type=string
time=2026-03-31T21:29:28.774+02:00 level=DEBUG source=gguf.go:604 msg=tokenizer.ggml.model type=string
time=2026-03-31T21:29:28.774+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.alignment default=32
time=2026-03-31T21:29:28.774+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.alignment default=32
time=2026-03-31T21:29:28.774+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.file_type default=0
time=2026-03-31T21:29:28.774+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.name default=""
time=2026-03-31T21:29:28.774+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=general.description default=""
time=2026-03-31T21:29:28.774+02:00 level=INFO source=ggml.go:136 msg="" architecture=llama file_type=unknown name="" description="" num_tensors=0 num_key_values=3
time=2026-03-31T21:29:28.774+02:00 level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama
time=2026-03-31T21:29:28.774+02:00 level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama/cuda_v13
time=2026-03-31T21:29:28.775+02:00 level=INFO source=ggml.go:104 msg=system CPU.0.LLAMAFILE=1 compiler=cgo(gcc)
time=2026-03-31T21:29:28.775+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.block_count default=0
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.pooling_type default=0
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.expert_count default=0
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.tokens default="&{size:0 values:[]}"
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.scores default="&{size:0 values:[]}"
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.token_type default="&{size:0 values:[]}"
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.merges default="&{size:0 values:[]}"
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.add_bos_token default=true
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.bos_token_id default=0
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.add_eos_token default=false
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.eos_token_id default=0
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.eos_token_ids default="&{size:0 values:[]}"
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=tokenizer.ggml.pre default=""
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.block_count default=0
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.embedding_length default=0
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.head_count default=0
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.head_count_kv default=0
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.key_length default=0
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.rope.dimension_count default=0
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.attention.layer_norm_rms_epsilon default=0
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.rope.freq_base default=100000
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=ggml.go:324 msg="key with type not found" key=llama.rope.scaling.factor default=1
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=runner.go:1386 msg="dummy model load took" duration=2.168022ms
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=runner.go:1391 msg="gathering device infos took" duration=271ns
time=2026-03-31T21:29:28.776+02:00 level=TRACE source=runner.go:467 msg="runner enumerated devices" OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v13]" devices=[]
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=runner.go:437 msg="bootstrap discovery took" duration=13.719426ms OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v13]" extra_envs=map[]
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=runner.go:124 msg="evaluating which, if any, devices to filter out" initial_count=0
time=2026-03-31T21:29:28.776+02:00 level=TRACE source=runner.go:174 msg="supported GPU library combinations before filtering" supported=map[]
time=2026-03-31T21:29:28.776+02:00 level=DEBUG source=runner.go:40 msg="GPU bootstrap discovery took" duration=28.459973ms
time=2026-03-31T21:29:28.776+02:00 level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="62.7 GiB" available="57.9 GiB"
time=2026-03-31T21:29:28.776+02:00 level=INFO source=routes.go:1850 msg="vram-based default context" total_vram="0 B" default_num_ctx=4096
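
The key lines in the log above are the two "runner enumerated devices" records (both with devices=[]) and the final total_vram="0 B". A minimal sketch for pulling those out of a long debug log follows, assuming the key=value layout seen in these lines. The helper names parse_logfmt and summarize_discovery are hypothetical and are not part of Ollama.

```python
import re

# key=value pairs where the value is either a double-quoted string or a bare token.
PAIR = re.compile(r'([A-Za-z_][\w.]*)=("(?:[^"\\]|\\.)*"|\S*)')


def parse_logfmt(line):
    """Split one log line into a dict of its key=value fields."""
    fields = {}
    for key, raw in PAIR.findall(line):
        if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
            raw = raw[1:-1]  # strip the surrounding quotes
        fields[key] = raw
    return fields


def summarize_discovery(lines):
    """Return ([(library_path, devices), ...], total_vram) for a debug log."""
    probes = []
    total_vram = None
    for line in lines:
        fields = parse_logfmt(line)
        msg = fields.get("msg")
        if msg == "runner enumerated devices":
            probes.append((fields.get("OLLAMA_LIBRARY_PATH", ""),
                           fields.get("devices", "")))
        elif msg == "vram-based default context":
            total_vram = fields.get("total_vram")
    return probes, total_vram


# Two lines copied from the log in this report.
SAMPLE_LINES = [
    'time=2026-03-31T21:29:28.762+02:00 level=TRACE source=runner.go:467 '
    'msg="runner enumerated devices" '
    'OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v12]" '
    'devices=[]',
    'time=2026-03-31T21:29:28.776+02:00 level=INFO source=routes.go:1850 '
    'msg="vram-based default context" total_vram="0 B" default_num_ctx=4096',
]

if __name__ == "__main__":
    probes, total_vram = summarize_discovery(SAMPLE_LINES)
    for library_path, devices in probes:
        print(f"probe {library_path}: devices={devices}")
    print(f"total_vram={total_vram}")
```

An empty device list for every probed library directory, together with a zero total_vram, matches the CPU-only behavior described in the report.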

nvidia-smi log:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 590.48.01              Driver Version: 590.48.01      CUDA Version: 13.1     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 Ti     Off |   00000000:09:00.0  On |                  N/A |
|  0%   41C    P8             11W /  165W |     642MiB /  16380MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            1440      G   /usr/lib/Xorg                            37MiB |
|    0   N/A  N/A            1877      G   /usr/bin/ksecretd                         2MiB |
|    0   N/A  N/A            1936      G   /usr/bin/kwin_wayland                    39MiB |
|    0   N/A  N/A            2021      G   /usr/bin/Xwayland                         9MiB |
|    0   N/A  N/A            2069      G   /usr/bin/ksmserver                        2MiB |
|    0   N/A  N/A            2071      G   /usr/bin/kded6                            2MiB |
|    0   N/A  N/A            2085      G   /usr/bin/plasmashell                    309MiB |
|    0   N/A  N/A            2104      G   /usr/bin/kaccess                          2MiB |
|    0   N/A  N/A            2105      G   ...it-kde-authentication-agent-1          2MiB |
|    0   N/A  N/A            2254      G   /usr/bin/kdeconnectd                      2MiB |
|    0   N/A  N/A            2334      G   /usr/bin/msm_kde_notifier                 2MiB |
|    0   N/A  N/A            2338      G   /usr/bin/yakuake                          2MiB |
|    0   N/A  N/A            2367      G   /usr/bin/pamac-tray-plasma                2MiB |
|    0   N/A  N/A            2410      G   /usr/lib/xdg-desktop-portal-kde           2MiB |
|    0   N/A  N/A            2418      G   /usr/bin/konsole                          2MiB |
|    0   N/A  N/A            3649      G   /usr/bin/kwalletd6                        2MiB |
|    0   N/A  N/A            4003      G   /opt/google/chrome/chrome                 2MiB |
|    0   N/A  N/A            4758      G   /usr/lib/baloorunner                      2MiB |
+-----------------------------------------------------------------------------------------+
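
The table above shows that the driver sees the card and that most of its memory is free, which is what makes the total_vram="0 B" in the Ollama log stand out. As a rough sketch, assuming the table layout shown here, the figures can be extracted from pasted nvidia-smi output like this. The function names are hypothetical, and the sample contains only a few rows copied from the table.

```python
import re

# "642MiB /  16380MiB" style used/total pair in the GPU summary row.
MEMORY = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB")
# One row of the "Processes" table: GPU, GI, CI, PID, type, name, memory.
PROCESS = re.compile(
    r"\|\s+\d+\s+\S+\s+\S+\s+(\d+)\s+\S+\s+.+?\s+(\d+)MiB\s*\|\s*$"
)


def parse_memory_usage(smi_text):
    """Return (used_mib, total_mib) from the first summary row, or None."""
    match = MEMORY.search(smi_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def sum_process_memory(smi_text):
    """Add up the per-process GPU memory column, in MiB."""
    total = 0
    for line in smi_text.splitlines():
        match = PROCESS.match(line)
        if match:
            total += int(match.group(2))
    return total


# A few rows copied from the nvidia-smi output in this report.
SAMPLE = """\
|  0%   41C    P8             11W /  165W |     642MiB /  16380MiB |      0%      Default |
|    0   N/A  N/A            1440      G   /usr/lib/Xorg                            37MiB |
|    0   N/A  N/A            1936      G   /usr/bin/kwin_wayland                    39MiB |
|    0   N/A  N/A            2085      G   /usr/bin/plasmashell                    309MiB |
"""

if __name__ == "__main__":
    used, total = parse_memory_usage(SAMPLE)
    print(f"used={used} MiB, total={total} MiB, free={total - used} MiB")
    print(f"listed processes account for {sum_process_memory(SAMPLE)} MiB")
```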

What is the issue?

After updating Ollama to 0.19.0 with the curl install method, it no longer recognizes the GPU's VRAM. Local models still run, but no layers are offloaded to the GPU and inference is very slow. Before the upgrade, everything worked as expected.

CUDA version: 13.1.

Driver version: 590.48.01
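
Since the problem appeared right after an upgrade, one first check is whether the library directories that the debug log reports probing (/usr/local/lib/ollama, cuda_v12 and cuda_v13) exist and contain shared libraries at all. The sketch below only lists what is there. Which files a healthy install should contain is not stated in this issue, so the script does not check for specific names, and the helper names are hypothetical.

```python
from pathlib import Path

# Directories taken from the libDirs / OLLAMA_LIBRARY_PATH values in the debug log.
PROBED_DIRS = [
    "/usr/local/lib/ollama",
    "/usr/local/lib/ollama/cuda_v12",
    "/usr/local/lib/ollama/cuda_v13",
]


def shared_lib_names(names):
    """Keep only names that look like shared objects (foo.so or foo.so.1)."""
    return sorted(n for n in names if n.endswith(".so") or ".so." in n)


def report(directories):
    """Map each directory to its shared-library names, or None if it is missing."""
    result = {}
    for directory in directories:
        path = Path(directory)
        if path.is_dir():
            result[directory] = shared_lib_names(p.name for p in path.iterdir())
        else:
            result[directory] = None
    return result


if __name__ == "__main__":
    for directory, libs in report(PROBED_DIRS).items():
        if libs is None:
            print(f"{directory}: missing")
        else:
            print(f"{directory}: {len(libs)} shared libraries")
            for name in libs:
                print(f"  {name}")
```

A missing or empty CUDA directory would be consistent with the empty device list in the log, but the issue text does not confirm that as the cause.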


nvidia-smi log:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 590.48.01              Driver Version: 590.48.01      CUDA Version: 13.1     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 Ti     Off |   00000000:09:00.0  On |                  N/A |
|  0%   41C    P8             11W /  165W |     642MiB /  16380MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            1440      G   /usr/lib/Xorg                            37MiB |
|    0   N/A  N/A            1877      G   /usr/bin/ksecretd                         2MiB |
|    0   N/A  N/A            1936      G   /usr/bin/kwin_wayland                    39MiB |
|    0   N/A  N/A            2021      G   /usr/bin/Xwayland                         9MiB |
|    0   N/A  N/A            2069      G   /usr/bin/ksmserver                        2MiB |
|    0   N/A  N/A            2071      G   /usr/bin/kded6                            2MiB |
|    0   N/A  N/A            2085      G   /usr/bin/plasmashell                    309MiB |
|    0   N/A  N/A            2104      G   /usr/bin/kaccess                          2MiB |
|    0   N/A  N/A            2105      G   ...it-kde-authentication-agent-1          2MiB |
|    0   N/A  N/A            2254      G   /usr/bin/kdeconnectd                      2MiB |
|    0   N/A  N/A            2334      G   /usr/bin/msm_kde_notifier                 2MiB |
|    0   N/A  N/A            2338      G   /usr/bin/yakuake                          2MiB |
|    0   N/A  N/A            2367      G   /usr/bin/pamac-tray-plasma                2MiB |
|    0   N/A  N/A            2410      G   /usr/lib/xdg-desktop-portal-kde           2MiB |
|    0   N/A  N/A            2418      G   /usr/bin/konsole                          2MiB |
|    0   N/A  N/A            3649      G   /usr/bin/kwalletd6                        2MiB |
|    0   N/A  N/A            4003      G   /opt/google/chrome/chrome                 2MiB |
|    0   N/A  N/A            4758      G   /usr/lib/baloorunner                      2MiB |
+-----------------------------------------------------------------------------------------+

OS

Linux

GPU

Nvidia

CPU

AMD

Ollama version

0.19.0

extent analysis

TL;DR

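Ollama 0.19.0 falls back to CPU-only inference: the debug log shows devices=[] and total_vram="0 B", even though nvidia-smi reports the RTX 4060 Ti with driver 590.48.01 and CUDA 13.1. A likely cause is an incompatibility between the installed driver/CUDA stack and the CUDA libraries Ollama loads (the log lists /usr/local/lib/ollama/cuda_v13), so the fix is to make sure the two are compatible.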

Guidance

  • Check whether the installed CUDA version (13.1 according to nvidia-smi) is compatible with Ollama 0.19.0.
  • Verify that the system itself recognizes the GPU by checking the nvidia-smi output; in this report it does list the RTX 4060 Ti.
  • If the versions are incompatible, update or reinstall the NVIDIA driver so that it meets Ollama's requirements.
  • Review the Ollama settings printed in the "server config" log line (for example CUDA_VISIBLE_DEVICES and OLLAMA_LLM_LIBRARY, both empty here) to make sure nothing hides the GPU from Ollama.
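To confirm the symptom before changing anything, the two relevant log lines can be checked with a short script. This is a minimal sketch, not part of Ollama: the helper name summarize_discovery is hypothetical, the sample lines are copied from the log above, and splitting a non-empty devices list on whitespace is an assumption, since the log only shows the empty case.

```python
import re

# Two lines copied verbatim from the debug log in this report.
LOG = '''\
time=2026-03-31T21:29:28.776+02:00 level=TRACE source=runner.go:467 msg="runner enumerated devices" OLLAMA_LIBRARY_PATH="[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v13]" devices=[]
time=2026-03-31T21:29:28.776+02:00 level=INFO source=routes.go:1850 msg="vram-based default context" total_vram="0 B" default_num_ctx=4096
'''


def summarize_discovery(log_text):
    """Extract the enumerated device list and total VRAM from log text.

    Hypothetical helper: the layout of a non-empty devices=[...] value is
    an assumption, because the report only contains the empty case.
    """
    devices = re.search(
        r'msg="runner enumerated devices".*?devices=\[(.*?)\]', log_text
    )
    vram = re.search(r'total_vram="([^"]*)"', log_text)
    return {
        "devices": devices.group(1).split() if devices else None,
        "total_vram": vram.group(1) if vram else None,
    }


summary = summarize_discovery(LOG)
# A GPU counts as found only if at least one device was enumerated
# and the reported VRAM is not zero.
gpu_found = bool(summary["devices"]) and summary["total_vram"] != "0 B"
print(summary, "GPU found:", gpu_found)
```

For the lines in this report the device list is empty and the VRAM is "0 B", so the script reports that no GPU was found, which matches the CPU-only "inference compute" line in the log.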

Example

The issue appears to stem from configuration and compatibility rather than from code, so there is no code fix to apply.

Notes

The issue may be specific to the combination of Ollama 0.19.0 and CUDA 13.1 (driver 590.48.01). Resolving it may require updating or reinstalling either the NVIDIA driver or Ollama so that the two are compatible.
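One quick way to test the version hypothesis is to compare the CUDA version in the nvidia-smi header with the library directory named in the log. The sketch below is illustrative only: the function names are hypothetical, and reading the directory name cuda_v13 as "libraries built for CUDA 13" is an assumption inferred from the name, not something the log states.

```python
import re

# Copied from the nvidia-smi output and the OLLAMA_LIBRARY_PATH in this report.
SMI_HEADER = "| NVIDIA-SMI 590.48.01              Driver Version: 590.48.01      CUDA Version: 13.1     |"
LIBRARY_PATH = "[/usr/local/lib/ollama /usr/local/lib/ollama/cuda_v13]"


def driver_cuda_major(smi_header):
    """Return the major CUDA version reported by the driver, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_header)
    return int(match.group(1)) if match else None


def bundled_cuda_majors(library_path):
    """Return the major versions found in cuda_vNN directory names.

    Assumption: the NN suffix is the CUDA major version of the libraries.
    """
    return [int(v) for v in re.findall(r"cuda_v(\d+)", library_path)]


driver_major = driver_cuda_major(SMI_HEADER)
bundled = bundled_cuda_majors(LIBRARY_PATH)
print("driver CUDA major:", driver_major)
print("bundled CUDA majors:", bundled)
print("major versions match:", driver_major in bundled)
```

With the values from this report both sides resolve to CUDA 13, so a mismatch is not visible at the major-version level. If the problem is a compatibility issue, it would have to lie in a finer detail than this comparison covers, and other causes remain possible.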

Recommendation

Treat this as a compatibility problem rather than a code defect that needs a fixed Ollama release. As a workaround, check that the CUDA version is compatible with Ollama 0.19.0, then update or reinstall the NVIDIA driver as needed.
