ollama: How to fix "500 Internal Server Error: mlx runner failed: Error: failed to initialize MLX" on Linux


What is the issue?

Hi,

I'm running Ollama 0.24.0 on Rocky Linux 9.7 with an NVIDIA RTX 4000 SFF Ada. When I try to run the x/z-image-turbo model, I get the error below, which I assume is related to Apple Silicon. I'm not sure why the MLX runner is selected on Linux.

# ollama server was started with "ollama serve" and later with
# OLLAMA_GPU_RUNNER=cuda OLLAMA_DEBUG=1 ollama serve

$ ollama run x/z-image-turbo
Error: failed to load model: 500 Internal Server Error: mlx runner failed: Error: failed to initialize MLX: libmlxc.so not found (exit: exit status 1)
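To confirm what the error reports, one option is to check whether a file named libmlxc.so exists in the directories the server hands to the MLX subprocess (the debug log lists them as LD_LIBRARY_PATH=/opt/ollama/lib/ollama:/opt/ollama/lib/ollama/cuda_v13). The following is a minimal sketch under that assumption. The helper name `find_library` is hypothetical and not part of Ollama, and the check only looks for an exact file name, so it may not match how the runner itself resolves the library.

```python
import os


def find_library(name, search_path):
    """Return every regular file called `name` found in a colon-separated path."""
    hits = []
    for directory in search_path.split(":"):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # Value copied from the "mlx subprocess library path" debug lines;
    # adjust it if your install prefix differs.
    path = "/opt/ollama/lib/ollama:/opt/ollama/lib/ollama/cuda_v13"
    matches = find_library("libmlxc.so", path)
    print(matches if matches else "libmlxc.so not found in " + path)
```

An empty result would be consistent with the "libmlxc.so not found" message, which points at the installed package rather than at the GPU or the driver.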

Relevant log output

time=2026-05-17T21:30:14.749Z level=INFO source=routes.go:1802 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:0 OLLAMA_DEBUG:DEBUG OLLAMA_DEBUG_LOG_REQUESTS:false OLLAMA_EDITOR: OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_TRANSFER_STREAMS:4 OLLAMA_MODELS:/root/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_CLOUD:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2026-05-17T21:30:14.749Z level=INFO source=routes.go:1804 msg="Ollama cloud disabled: false"
time=2026-05-17T21:30:14.760Z level=INFO source=images.go:517 msg="total blobs: 0"
time=2026-05-17T21:30:14.763Z level=INFO source=images.go:524 msg="total unused blobs removed: 0"
time=2026-05-17T21:30:14.763Z level=DEBUG source=model_recommendations.go:57 msg="starting model recommendations cache" default_recommendations=6 refresh_interval=4h0m0s fetch_timeout=3s
time=2026-05-17T21:30:14.763Z level=DEBUG source=model_show_cache.go:125 msg="starting model show cache"
time=2026-05-17T21:30:14.763Z level=INFO source=routes.go:1864 msg="Listening on 127.0.0.1:11434 (version 0.24.0)"
time=2026-05-17T21:30:14.763Z level=DEBUG source=sched.go:145 msg="starting llm scheduler"
time=2026-05-17T21:30:14.763Z level=DEBUG source=model_recommendations.go:262 msg="loaded model recommendations snapshot" path=/root/.ollama/cache/model-recommendations.json count=7
time=2026-05-17T21:30:14.764Z level=DEBUG source=model_recommendations.go:192 msg="refreshing model recommendations from remote" url=https://ollama.com/api/experimental/model-recommendations
time=2026-05-17T21:30:14.764Z level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2026-05-17T21:30:14.764Z level=INFO source=runner.go:106 msg="experimental Vulkan support disabled.  To enable, set OLLAMA_VULKAN=1"
time=2026-05-17T21:30:14.764Z level=INFO source=server.go:433 msg="starting runner" cmd="/opt/ollama/bin/ollama runner --ollama-engine --port 36061"
time=2026-05-17T21:30:14.764Z level=DEBUG source=server.go:434 msg=subprocess OLLAMA_GPU_RUNNER=cuda OLLAMA_DEBUG=1 PATH=/root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin LD_LIBRARY_PATH=/opt/ollama/lib/ollama:/opt/ollama/lib/ollama/cuda_v12 OLLAMA_LIBRARY_PATH=/opt/ollama/lib/ollama:/opt/ollama/lib/ollama/cuda_v12
time=2026-05-17T21:30:14.900Z level=DEBUG source=model_recommendations.go:225 msg="model recommendations refreshed" count=7
time=2026-05-17T21:30:14.901Z level=DEBUG source=model_recommendations.go:302 msg="persisted model recommendations snapshot" path=/root/.ollama/cache/model-recommendations.json count=7
time=2026-05-17T21:30:14.901Z level=INFO source=model_recommendations.go:177 msg="model recommendations cache sleep scheduled" wait=3h18m13.6179638s consecutive_failures=0
time=2026-05-17T21:30:14.919Z level=DEBUG source=runner.go:433 msg="bootstrap discovery took" duration=154.969084ms OLLAMA_LIBRARY_PATH="[/opt/ollama/lib/ollama /opt/ollama/lib/ollama/cuda_v12]" extra_envs=map[]
time=2026-05-17T21:30:14.919Z level=INFO source=server.go:433 msg="starting runner" cmd="/opt/ollama/bin/ollama runner --ollama-engine --port 44275"
time=2026-05-17T21:30:14.919Z level=DEBUG source=server.go:434 msg=subprocess OLLAMA_GPU_RUNNER=cuda OLLAMA_DEBUG=1 PATH=/root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin LD_LIBRARY_PATH=/opt/ollama/lib/ollama:/opt/ollama/lib/ollama/cuda_v13 OLLAMA_LIBRARY_PATH=/opt/ollama/lib/ollama:/opt/ollama/lib/ollama/cuda_v13
time=2026-05-17T21:30:15.062Z level=DEBUG source=runner.go:433 msg="bootstrap discovery took" duration=142.980362ms OLLAMA_LIBRARY_PATH="[/opt/ollama/lib/ollama /opt/ollama/lib/ollama/cuda_v13]" extra_envs=map[]
time=2026-05-17T21:30:15.062Z level=DEBUG source=runner.go:123 msg="evaluating which, if any, devices to filter out" initial_count=2
time=2026-05-17T21:30:15.062Z level=DEBUG source=runner.go:145 msg="verifying if device is supported" library=/opt/ollama/lib/ollama/cuda_v12 description="NVIDIA RTX 4000 SFF Ada Generation" compute=8.9 id=GPU-2391f19b-8d42-84e6-7831-441cd31ee84f pci_id=0000:01:00.0
time=2026-05-17T21:30:15.062Z level=DEBUG source=runner.go:145 msg="verifying if device is supported" library=/opt/ollama/lib/ollama/cuda_v13 description="NVIDIA RTX 4000 SFF Ada Generation" compute=8.9 id=GPU-2391f19b-8d42-84e6-7831-441cd31ee84f pci_id=0000:01:00.0
time=2026-05-17T21:30:15.062Z level=INFO source=server.go:433 msg="starting runner" cmd="/opt/ollama/bin/ollama runner --ollama-engine --port 34077"
time=2026-05-17T21:30:15.062Z level=DEBUG source=server.go:434 msg=subprocess OLLAMA_GPU_RUNNER=cuda OLLAMA_DEBUG=1 PATH=/root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin LD_LIBRARY_PATH=/opt/ollama/lib/ollama:/opt/ollama/lib/ollama/cuda_v12 OLLAMA_LIBRARY_PATH=/opt/ollama/lib/ollama:/opt/ollama/lib/ollama/cuda_v12 GGML_CUDA_INIT=1 CUDA_VISIBLE_DEVICES=GPU-2391f19b-8d42-84e6-7831-441cd31ee84f
time=2026-05-17T21:30:15.062Z level=INFO source=server.go:433 msg="starting runner" cmd="/opt/ollama/bin/ollama runner --ollama-engine --port 34169"
time=2026-05-17T21:30:15.062Z level=DEBUG source=server.go:434 msg=subprocess OLLAMA_GPU_RUNNER=cuda OLLAMA_DEBUG=1 PATH=/root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin LD_LIBRARY_PATH=/opt/ollama/lib/ollama:/opt/ollama/lib/ollama/cuda_v13 OLLAMA_LIBRARY_PATH=/opt/ollama/lib/ollama:/opt/ollama/lib/ollama/cuda_v13 GGML_CUDA_INIT=1 CUDA_VISIBLE_DEVICES=GPU-2391f19b-8d42-84e6-7831-441cd31ee84f
time=2026-05-17T21:30:15.192Z level=DEBUG source=runner.go:433 msg="bootstrap discovery took" duration=130.762382ms OLLAMA_LIBRARY_PATH="[/opt/ollama/lib/ollama /opt/ollama/lib/ollama/cuda_v13]" extra_envs="map[CUDA_VISIBLE_DEVICES:GPU-2391f19b-8d42-84e6-7831-441cd31ee84f GGML_CUDA_INIT:1]"
time=2026-05-17T21:30:15.196Z level=DEBUG source=runner.go:433 msg="bootstrap discovery took" duration=133.952375ms OLLAMA_LIBRARY_PATH="[/opt/ollama/lib/ollama /opt/ollama/lib/ollama/cuda_v12]" extra_envs="map[CUDA_VISIBLE_DEVICES:GPU-2391f19b-8d42-84e6-7831-441cd31ee84f GGML_CUDA_INIT:1]"
time=2026-05-17T21:30:15.196Z level=DEBUG source=runner.go:400 msg="filtering device with overlapping libraries" id=GPU-2391f19b-8d42-84e6-7831-441cd31ee84f library=/opt/ollama/lib/ollama/cuda_v12 delete_index=0 kept_library=/opt/ollama/lib/ollama/cuda_v13
time=2026-05-17T21:30:15.196Z level=DEBUG source=runner.go:40 msg="GPU bootstrap discovery took" duration=432.124384ms
time=2026-05-17T21:30:15.196Z level=INFO source=types.go:42 msg="inference compute" id=GPU-2391f19b-8d42-84e6-7831-441cd31ee84f filter_id="" library=CUDA compute=8.9 name=CUDA0 description="NVIDIA RTX 4000 SFF Ada Generation" libdirs=ollama,cuda_v13 driver=13.2 pci_id=0000:01:00.0 type=discrete total="20.0 GiB" available="19.5 GiB"
time=2026-05-17T21:30:15.196Z level=INFO source=routes.go:1914 msg="vram-based default context" total_vram="20.0 GiB" default_num_ctx=4096
time=2026-05-17T21:30:19.443Z level=DEBUG source=runner.go:263 msg="refreshing free memory"
time=2026-05-17T21:30:19.443Z level=DEBUG source=runner.go:327 msg="unable to refresh all GPUs with existing runners, performing bootstrap discovery"
time=2026-05-17T21:30:19.443Z level=INFO source=server.go:433 msg="starting runner" cmd="/opt/ollama/bin/ollama runner --ollama-engine --port 40457"
time=2026-05-17T21:30:19.443Z level=DEBUG source=server.go:434 msg=subprocess OLLAMA_GPU_RUNNER=cuda OLLAMA_DEBUG=1 PATH=/root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin LD_LIBRARY_PATH=/opt/ollama/lib/ollama:/opt/ollama/lib/ollama/cuda_v13 OLLAMA_LIBRARY_PATH=/opt/ollama/lib/ollama:/opt/ollama/lib/ollama/cuda_v13
time=2026-05-17T21:30:19.588Z level=DEBUG source=runner.go:40 msg="overall device VRAM discovery took" duration=145.206864ms
time=2026-05-17T21:30:19.588Z level=DEBUG source=sched.go:220 msg="updating default concurrency" OLLAMA_MAX_LOADED_MODELS=3 gpu_count=1
time=2026-05-17T21:30:19.588Z level=DEBUG source=sched.go:229 msg="loading first model" model=""
time=2026-05-17T21:30:19.588Z level=INFO source=sched.go:484 msg="system memory" total="62.3 GiB" free="60.2 GiB" free_swap="32.0 GiB"
time=2026-05-17T21:30:19.588Z level=INFO source=sched.go:491 msg="gpu memory" id=GPU-2391f19b-8d42-84e6-7831-441cd31ee84f library=CUDA available="19.1 GiB" free="19.5 GiB" minimum="457.0 MiB" overhead="0 B"
time=2026-05-17T21:30:19.592Z level=DEBUG source=server.go:200 msg="mlx subprocess library path" LD_LIBRARY_PATH=/opt/ollama/lib/ollama:/opt/ollama/lib/ollama/cuda_v13
time=2026-05-17T21:30:19.592Z level=DEBUG source=server.go:207 msg="mlx subprocess library path" OLLAMA_LIBRARY_PATH=/opt/ollama/lib/ollama:/opt/ollama/lib/ollama/cuda_v13
time=2026-05-17T21:30:19.592Z level=INFO source=server.go:142 msg="starting mlx runner subprocess" model=x/z-image-turbo:latest port=34055
time=2026-05-17T21:30:19.592Z level=INFO source=sched.go:561 msg="loaded runners" count=1
time=2026-05-17T21:30:19.606Z level=WARN source=server.go:135 msg=mlx-runner msg="time=2026-05-17T21:30:19.606Z level=INFO msg=\"starting mlx runner\" model=x/z-image-turbo:latest port=34055 mode=imagegen"
time=2026-05-17T21:30:19.606Z level=WARN source=server.go:135 msg=mlx-runner msg="time=2026-05-17T21:30:19.606Z level=ERROR msg=\"unable to initialize MLX\" error=\"failed to initialize MLX: libmlxc.so not found\""
time=2026-05-17T21:30:19.606Z level=WARN source=server.go:135 msg=mlx-runner msg="Error: failed to initialize MLX: libmlxc.so not found"
time=2026-05-17T21:30:19.606Z level=ERROR source=sched.go:567 msg="error loading llama server" error="mlx runner failed: Error: failed to initialize MLX: libmlxc.so not found (exit: exit status 1)"
time=2026-05-17T21:30:19.606Z level=DEBUG source=sched.go:569 msg="triggering expiration for failed load" runner.name=registry.ollama.ai/x/z-image-turbo:latest runner.size="11.9 GiB" runner.vram="11.9 GiB" runner.parallel=1 runner.pid=9914 runner.model=digest:77b78ce4e88354eaca908f96069046975e1039be6718fbcf0626b36c70abf138 runner.num_ctx=4096
time=2026-05-17T21:30:19.606Z level=DEBUG source=sched.go:330 msg="runner expired event received" runner.name=registry.ollama.ai/x/z-image-turbo:latest runner.size="11.9 GiB" runner.vram="11.9 GiB" runner.parallel=1 runner.pid=9914 runner.model=digest:77b78ce4e88354eaca908f96069046975e1039be6718fbcf0626b36c70abf138 runner.num_ctx=4096
time=2026-05-17T21:30:19.606Z level=DEBUG source=sched.go:345 msg="got lock to unload expired event" runner.name=registry.ollama.ai/x/z-image-turbo:latest runner.size="11.9 GiB" runner.vram="11.9 GiB" runner.parallel=1 runner.pid=9914 runner.model=digest:77b78ce4e88354eaca908f96069046975e1039be6718fbcf0626b36c70abf138 runner.num_ctx=4096
time=2026-05-17T21:30:19.606Z level=DEBUG source=sched.go:368 msg="starting background wait for VRAM recovery" runner.name=registry.ollama.ai/x/z-image-turbo:latest runner.size="11.9 GiB" runner.vram="11.9 GiB" runner.parallel=1 runner.pid=9914 runner.model=digest:77b78ce4e88354eaca908f96069046975e1039be6718fbcf0626b36c70abf138 runner.num_ctx=4096
time=2026-05-17T21:30:19.606Z level=DEBUG source=sched.go:726 msg="no need to wait for VRAM recovery" runner.name=registry.ollama.ai/x/z-image-turbo:latest runner.size="11.9 GiB" runner.vram="11.9 GiB" runner.parallel=1 runner.pid=9914 runner.model=digest:77b78ce4e88354eaca908f96069046975e1039be6718fbcf0626b36c70abf138 runner.num_ctx=4096
time=2026-05-17T21:30:19.606Z level=INFO source=server.go:388 msg="stopping mlx runner subprocess" pid=9914
time=2026-05-17T21:30:23.099Z level=DEBUG source=sched.go:161 msg="shutting down scheduler pending loop"
time=2026-05-17T21:30:24.608Z level=DEBUG source=model_recommendations.go:181 msg="stopping model recommendations cache"
time=2026-05-17T21:30:24.608Z level=DEBUG source=sched.go:377 msg="runner terminated and removed from list, blocking for VRAM recovery" runner.size="11.9 GiB" runner.vram="11.9 GiB" runner.parallel=1 runner.pid=9914 runner.model=digest:77b78ce4e88354eaca908f96069046975e1039be6718fbcf0626b36c70abf138
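In a debug log of this size, the relevant records are easy to miss, partly because the MLX runner's own ERROR line is wrapped inside a WARN record from the server. The sketch below is one way to pull out the ERROR entries, assuming the `key=value` layout seen above with double-quoted values. The function names `parse_fields` and `find_errors` are illustrative only and are not part of Ollama.

```python
import shlex


def parse_fields(text):
    """Parse 'key=value key="quoted value"' text into a dict (last key wins)."""
    try:
        tokens = shlex.split(text)
    except ValueError:
        return {}  # unbalanced quotes: treat the text as unparseable
    fields = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def find_errors(lines):
    """Return (time, text) for ERROR records, also looking one level into msg."""
    found = []
    for line in lines:
        outer = parse_fields(line)
        # Runner output is embedded in the outer msg field, so parse that as well.
        inner = parse_fields(outer.get("msg", ""))
        for record in (outer, inner):
            if record.get("level") == "ERROR":
                text = record.get("error", record.get("msg"))
                found.append((record.get("time"), text))
    return found
```

Calling `find_errors(open("server.log"))` on a saved copy of the output (the file name is a placeholder) should leave only the records that mention the MLX initialization failure.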

OS

Rocky Linux release 9.7 (Blue Onyx)

GPU

NVIDIA RTX 4000 SFF Ada

CPU

13th Gen Intel(R) Core(TM) i5-13500

Ollama version

0.24.0
