ollama - ✅(Solved) Fix rocblaslt error: Cannot read "TensileLibrary_lazy_gfx1200.dat": No such file or directory [1 pull requests]

Error Message

time=2026-04-09T14:34:01.253Z level=INFO source=sched.go:484 msg="system memory" total="30.6 GiB" free="15.4 GiB" free_swap="0 B"
time=2026-04-09T14:34:01.253Z level=INFO source=sched.go:491 msg="gpu memory" id=GPU-44d2e7be1f7e4650 library=ROCm available="15.4 GiB" free="15.9 GiB" minimum="457.0 MiB" overhead="0 B"
time=2026-04-09T14:34:01.253Z level=INFO source=server.go:771 msg="loading model" "model layers"=49 requested=-1
time=2026-04-09T14:34:01.266Z level=INFO source=runner.go:1417 msg="starting ollama engine"
time=2026-04-09T14:34:01.266Z level=INFO source=runner.go:1452 msg="Server listening on 127.0.0.1:39699"
time=2026-04-09T14:34:01.275Z level=INFO source=runner.go:1290 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:4 GPULayers:49[ID:GPU-44d2e7be1f7e4650 Layers:49(0..48)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-04-09T14:34:01.305Z level=INFO source=ggml.go:136 msg="" architecture=qwen3moe file_type=Q4_K_M name="Qwen3 Coder 30B A3B Instruct" description="" num_tensors=579 num_key_values=35
load_backend: loaded CPU backend from /usr/lib/ollama/libggml-cpu-haswell.so
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon Graphics, gfx1200 (0x1200), VMM: no, Wave Size: 32, ID: GPU-44d2e7be1f7e4650
load_backend: loaded ROCm backend from /usr/lib/ollama/rocm/libggml-hip.so
time=2026-04-09T14:34:03.289Z level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 ROCm.0.NO_VMM=1 ROCm.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(gcc)

rocblaslt error: Cannot read "TensileLibrary_lazy_gfx1200.dat": No such file or directory

rocblaslt error: Could not load "TensileLibrary_lazy_gfx1200.dat"
time=2026-04-09T14:34:05.908Z level=INFO source=runner.go:1290 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:4 GPULayers:43[ID:GPU-44d2e7be1f7e4650 Layers:43(5..47)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-04-09T14:34:06.001Z level=INFO source=runner.go:1290 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:4 GPULayers:42[ID:GPU-44d2e7be1f7e4650 Layers:42(6..47)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-04-09T14:34:06.087Z level=INFO source=runner.go:1290 msg=load request="{Operation:alloc LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:4 GPULayers:42[ID:GPU-44d2e7be1f7e4650 Layers:42(6..47)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-04-09T14:34:06.219Z level=INFO source=runner.go:1290 msg=load request="{Operation:commit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:4 GPULayers:42[ID:GPU-44d2e7be1f7e4650 Layers:42(6..47)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-04-09T14:34:06.219Z level=INFO source=ggml.go:482 msg="offloading 42 repeating layers to GPU"
time=2026-04-09T14:34:06.219Z level=INFO source=ggml.go:486 msg="offloading output layer to CPU"
time=2026-04-09T14:34:06.219Z level=INFO source=ggml.go:494 msg="offloaded 42/49 layers to GPU"

PR fix notes

PR #15527: llm: set ROCm Tensile lib path

Description (problem / solution / changelog)

Resolve the rocBLAS Tensile path from the bundled ROCm libs and pass it to the runner when unset. Add tests for path detection and env injection. Related issue: #15452 (rocblaslt error: Cannot read "TensileLibrary_lazy_gfx1200.dat": No such file or directory).

Summary

Fix ROCm Tensile library path resolution for runner startup on gfx1200-class systems.

On some ROCm setups, rocBLASLt first attempts a relative lookup for TensileLibrary_lazy_gfx1200.dat, logs a "No such file or directory" warning, and then falls back to an absolute path. Inference may still work, but startup is noisy and path resolution is fragile.

This patch makes lookup deterministic by resolving the Tensile library directory from gpuLibs and setting env vars at runner startup when they are not already provided.

Changes

  • In llm/server.go:
    • Add findRocmTensileLibraryPath(gpuLibs []string) string
    • Add ensureRocmTensileEnv(gpuLibs []string, extraEnvs map[string]string) map[string]string
    • Call ensureRocmTensileEnv(...) in StartRunner(...) before environment assembly
  • Inject (only when unset):
    • ROCBLAS_TENSILE_LIBPATH
    • ROCBLASLT_TENSILE_LIBPATH
    • HIPBLASLT_TENSILE_LIBPATH
  • Supported candidate layouts:
    • <lib>/rocblas/library
    • <lib>/rocm/rocblas/library
  • Do not override user/process-provided env values.

Tests

Added unit tests in llm/server_test.go for:

  • Tensile path detection from both directory layouts
  • Injection behavior when env vars are unset
  • Non-overwrite behavior when env vars are explicitly set

Why this change

This addresses path-resolution fragility behind issue #15452 and removes noisy false-negative file lookup behavior in runner startup.

Compatibility

  • No behavior change when env vars are already set by user/system.
  • No injection when a valid Tensile directory is not found.
  • Complements packaging-side fixes (e.g. hipBLASLt kernel packaging), but does not replace them.

Closes #15452

Changed files

  • CMakeLists.txt (modified, +5/-0)

What is the issue?

I'm getting an error with ollama 0.20.4 and a Radeon RX 9060 XT. I get the same error with ollama from Docker (ollama/ollama:rocm) and with ollama started from systemd (installed from https://ollama.com/install.sh).

I ran strace on the ollama process: it initially does not find the file, but it is able to access it eventually:

[pid 11959] newfstatat(AT_FDCWD, "TensileLibrary_lazy_gfx1200.dat", 0x7edce76fa260, 0) = -1 ENOENT (No such file or directory)
[pid 11959] write(2, ""TensileLibrary_lazy_gfx1200.dat"..., 33) = 33
[pid 11959] openat(AT_FDCWD, "TensileLibrary_lazy_gfx1200.dat", O_RDONLY <unfinished ...>
[pid 11959] write(2, ""TensileLibrary_lazy_gfx1200.dat"..., 33 <unfinished ...>
[pid 11959] access("/usr/local/lib/ollama/rocm/rocblas/library/TensileLibrary_lazy_gfx1200.dat", R_OK) = 0
[pid 11963] openat(AT_FDCWD, "/usr/local/lib/ollama/rocm/rocblas/library/TensileLibrary_lazy_gfx1200.dat", O_RDONLY) = 10

ollama log below:

Relevant log output

time=2026-04-09T14:34:01.253Z level=INFO source=sched.go:484 msg="system memory" total="30.6 GiB" free="15.4 GiB" free_swap="0 B"
time=2026-04-09T14:34:01.253Z level=INFO source=sched.go:491 msg="gpu memory" id=GPU-44d2e7be1f7e4650 library=ROCm available="15.4 GiB" free="15.9 GiB" minimum="457.0 MiB" overhead="0 B"
time=2026-04-09T14:34:01.253Z level=INFO source=server.go:771 msg="loading model" "model layers"=49 requested=-1
time=2026-04-09T14:34:01.266Z level=INFO source=runner.go:1417 msg="starting ollama engine"
time=2026-04-09T14:34:01.266Z level=INFO source=runner.go:1452 msg="Server listening on 127.0.0.1:39699"
time=2026-04-09T14:34:01.275Z level=INFO source=runner.go:1290 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:4 GPULayers:49[ID:GPU-44d2e7be1f7e4650 Layers:49(0..48)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-04-09T14:34:01.305Z level=INFO source=ggml.go:136 msg="" architecture=qwen3moe file_type=Q4_K_M name="Qwen3 Coder 30B A3B Instruct" description="" num_tensors=579 num_key_values=35
load_backend: loaded CPU backend from /usr/lib/ollama/libggml-cpu-haswell.so
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon Graphics, gfx1200 (0x1200), VMM: no, Wave Size: 32, ID: GPU-44d2e7be1f7e4650
load_backend: loaded ROCm backend from /usr/lib/ollama/rocm/libggml-hip.so
time=2026-04-09T14:34:03.289Z level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 ROCm.0.NO_VMM=1 ROCm.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(gcc)

rocblaslt error: Cannot read "TensileLibrary_lazy_gfx1200.dat": No such file or directory

rocblaslt error: Could not load "TensileLibrary_lazy_gfx1200.dat"
time=2026-04-09T14:34:05.908Z level=INFO source=runner.go:1290 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:4 GPULayers:43[ID:GPU-44d2e7be1f7e4650 Layers:43(5..47)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-04-09T14:34:06.001Z level=INFO source=runner.go:1290 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:4 GPULayers:42[ID:GPU-44d2e7be1f7e4650 Layers:42(6..47)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-04-09T14:34:06.087Z level=INFO source=runner.go:1290 msg=load request="{Operation:alloc LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:4 GPULayers:42[ID:GPU-44d2e7be1f7e4650 Layers:42(6..47)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-04-09T14:34:06.219Z level=INFO source=runner.go:1290 msg=load request="{Operation:commit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:4 GPULayers:42[ID:GPU-44d2e7be1f7e4650 Layers:42(6..47)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-04-09T14:34:06.219Z level=INFO source=ggml.go:482 msg="offloading 42 repeating layers to GPU"
time=2026-04-09T14:34:06.219Z level=INFO source=ggml.go:486 msg="offloading output layer to CPU"
time=2026-04-09T14:34:06.219Z level=INFO source=ggml.go:494 msg="offloaded 42/49 layers to GPU"

OS

Linux

GPU

AMD

CPU

Intel

Ollama version

0.20.4

extent analysis

TL;DR

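The file "TensileLibrary_lazy_gfx1200.dat" does not appear to be missing. According to the strace output and the PR notes, the first lookup uses the bare file name and fails with "No such file or directory", and a later lookup under /usr/local/lib/ollama/rocm/rocblas/library succeeds. The error is therefore most likely startup noise caused by fragile path resolution, which PR #15527 addresses by setting the Tensile library path variables for the runner when they are unset.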

Guidance

  • Verify that "TensileLibrary_lazy_gfx1200.dat" exists and is readable in the bundled library directory (in the report, /usr/local/lib/ollama/rocm/rocblas/library).
  • Use strace to confirm the pattern from the report: ENOENT on the bare file name, followed by a successful open of the absolute path.
  • Note that the log shows the ROCm backend loading and layers being offloaded to the GPU after the error, so the message alone does not mean the GPU is unusable.
  • On builds without the fix, setting ROCBLAS_TENSILE_LIBPATH, ROCBLASLT_TENSILE_LIBPATH, and HIPBLASLT_TENSILE_LIBPATH to the Tensile directory before starting ollama may serve as a workaround, since the PR injects the same variables and leaves user-provided values alone. The report does not confirm this.

Example

The issue concerns how the Tensile library file is located at runner startup rather than application code, so any check comes down to inspecting the file and the path used to look it up.

Notes

The report covers ollama 0.20.4 with an AMD Radeon RX 9060 XT (detected as gfx1200), reproduced with both the ollama/ollama:rocm Docker image and the systemd service installed by install.sh. Whether other versions or GPUs are affected is not established here.

Recommendation

First confirm that "TensileLibrary_lazy_gfx1200.dat" is present and readable in the bundled rocblas/library directory. If it is, as in the strace output, the error comes from the initial relative lookup and is addressed by PR #15527 rather than by changing the file.
