ollama - 💡(How to fix) Fix Gemma4 it 26B-A4B default conversion from original tensor files crashing at inference. [3 comments, 2 participants]

GitHub stats: ollama/ollama#15348 (fetched 2026-04-08 02:52:25)
Comments: 3 · Participants: 2 · Timeline events: 9 · Reactions: 0
Timeline (top): commented ×3, labeled ×2, subscribed ×2, mentioned ×1

What is the issue?

When the model is created from the original weights at https://huggingface.co/google/gemma-4-26B-A4B-it ( ollama create -f Modelfile-fromGemma4Modelfile ./gemma-4-26B-A4B-it gemma4:26b-a4b-bf16 ), it loads, but the runner crashes as soon as the first prompt is submitted.

If the model is converted/quantized to F16 ( ollama create -quantize F16 -f Modelfile-fromGemma4Modelfile ./gemma-4-26B-A4B-it gemma4:26b-a4b-f16 ), it crashes too.
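
Since both conversions start from the same downloaded checkpoint, it may be worth confirming which dtypes the source tensors actually use before suspecting the converter. The sketch below is a minimal, hedged example: it assumes the checkpoint is stored as *.safetensors shards in the ./gemma-4-26B-A4B-it directory (the issue only says "original tensorfiles"), and the function names are illustrative. It reads only the safetensors header (an 8-byte little-endian length followed by a JSON table), never the weights themselves.

```python
import json
import struct
from collections import Counter
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def dtype_counts(model_dir):
    """Count tensors per dtype across every *.safetensors shard in model_dir."""
    counts = Counter()
    for shard in sorted(Path(model_dir).glob("*.safetensors")):
        for info in read_safetensors_header(shard).values():
            counts[info["dtype"]] += 1
    return counts


# Usage (directory name taken from the commands above; adjust as needed):
# print(dict(dtype_counts("./gemma-4-26B-A4B-it")))
```

If the counts show anything other than the expected BF16 tensors, that would be a useful detail to add to the report.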

My inference server is an AMD Ryzen 7 7735HS with a Radeon 680M iGPU; 80 GB of its 96 GB main RAM is shared with the iGPU. It runs Fedora CoreOS F43 and a Podman container with a ROCm-runtime Ollama image (all around it works great! Thanks to the Ollama team and its code contributors).

Ollama logs at the time of the failure:

time=2026-04-05T16:24:03.311Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=gemma4.pooling_type default=0
time=2026-04-05T16:24:03.311Z level=TRACE source=runner.go:480 msg="forwardBatch no pending batch detected" batchID=0
time=2026-04-05T16:24:03.518Z level=INFO source=server.go:1390 msg="llama runner started in 42.20 seconds"
time=2026-04-05T16:24:03.518Z level=DEBUG source=sched.go:573 msg="finished setting up" runner.name=registry.ollama.ai/rjmalagon/gemma4:26b-a4b-bf16 runner.inference="[{ID:0 Library:ROCm}]" runner.size="68.3 GiB" runner.vram="68.3 GiB" runner.parallel=1 runner.pid=82 runner.model=/root/.ollama/models/blobs/sha256-82aee8b779e33a43efb75e4c7a4ff25ce05cd256a8cae73343f1fe6a4b5afa2f runner.num_ctx=262144
[GIN] 2026/04/05 - 16:24:03 | 200 | 42.667543514s |       127.0.0.1 | POST     "/api/generate"
time=2026-04-05T16:24:03.524Z level=DEBUG source=sched.go:581 msg="context for request finished"
time=2026-04-05T16:24:03.524Z level=DEBUG source=sched.go:309 msg="runner with non-zero duration has gone idle, adding timer" runner.name=registry.ollama.ai/rjmalagon/gemma4:26b-a4b-bf16 runner.inference="[{ID:0 Library:ROCm}]" runner.size="68.3 GiB" runner.vram="68.3 GiB" runner.parallel=1 runner.pid=82 runner.model=/root/.ollama/models/blobs/sha256-82aee8b779e33a43efb75e4c7a4ff25ce05cd256a8cae73343f1fe6a4b5afa2f runner.num_ctx=262144 duration=5m0s
time=2026-04-05T16:24:03.524Z level=DEBUG source=sched.go:327 msg="after processing request finished event" runner.name=registry.ollama.ai/rjmalagon/gemma4:26b-a4b-bf16 runner.inference="[{ID:0 Library:ROCm}]" runner.size="68.3 GiB" runner.vram="68.3 GiB" runner.parallel=1 runner.pid=82 runner.model=/root/.ollama/models/blobs/sha256-82aee8b779e33a43efb75e4c7a4ff25ce05cd256a8cae73343f1fe6a4b5afa2f runner.num_ctx=262144 refCount=0
time=2026-04-05T16:24:29.480Z level=DEBUG source=sched.go:672 msg="evaluating already loaded" model=/root/.ollama/models/blobs/sha256-82aee8b779e33a43efb75e4c7a4ff25ce05cd256a8cae73343f1fe6a4b5afa2f
time=2026-04-05T16:24:29.574Z level=TRACE source=bytepairencoding.go:287 msg=encoded string="<bos><|turn>system\n<|think|><turn|>\n<|turn>user\ntest<turn|>\n<|turn>model\n" ids="[2 105 9731 107 98 106 107 105 2364 107 2181 106 107 105 4368 107]"
time=2026-04-05T16:24:29.575Z level=DEBUG source=server.go:1538 msg="completion request" images=0 prompt=73 format=""
time=2026-04-05T16:24:29.575Z level=TRACE source=server.go:1539 msg="completion request" prompt="<bos><|turn>system\n<|think|><turn|>\n<|turn>user\ntest<turn|>\n<|turn>model\n"
time=2026-04-05T16:24:29.708Z level=TRACE source=bytepairencoding.go:287 msg=encoded string="<bos><|turn>system\n<|think|><turn|>\n<|turn>user\ntest<turn|>\n<|turn>model\n" ids="[2 105 9731 107 98 106 107 105 2364 107 2181 106 107 105 4368 107]"
time=2026-04-05T16:24:29.709Z level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=0 prompt=16 used=0 remaining=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=1 len(seq.inputs)=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=2 len(seq.inputs)=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=3 len(seq.inputs)=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=4 len(seq.inputs)=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=5 len(seq.inputs)=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=6 len(seq.inputs)=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=7 len(seq.inputs)=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=8 len(seq.inputs)=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=9 len(seq.inputs)=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=10 len(seq.inputs)=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=11 len(seq.inputs)=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=12 len(seq.inputs)=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=13 len(seq.inputs)=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=14 len(seq.inputs)=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=15 len(seq.inputs)=16
time=2026-04-05T16:24:29.709Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=0 seqIdx=0 seq.iBatch=0 i+1=16 len(seq.inputs)=16
time=2026-04-05T16:24:29.716Z level=TRACE source=runner.go:475 msg="forwardBatch waiting for compute to start" pendingBatch.id=0
time=2026-04-05T16:24:29.716Z level=TRACE source=runner.go:644 msg="computeBatch: waiting for inputs to be ready" batchID=0
time=2026-04-05T16:24:29.716Z level=TRACE source=runner.go:646 msg="computeBatch: inputs are ready" batchID=0
time=2026-04-05T16:24:29.716Z level=TRACE source=runner.go:718 msg="computeBatch: signaling computeStartedCh" batchID=0
time=2026-04-05T16:24:29.716Z level=TRACE source=runner.go:477 msg="forwardBatch compute started, setting up next batch" pendingBatch.id=0 id=1
time=2026-04-05T16:24:29.717Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=1 seqIdx=0 seq.iBatch=0 i+1=1 len(seq.inputs)=1
time=2026-04-05T16:24:29.719Z level=TRACE source=runner.go:475 msg="forwardBatch waiting for compute to start" pendingBatch.id=1
time=2026-04-05T16:24:29.719Z level=TRACE source=runner.go:644 msg="computeBatch: waiting for inputs to be ready" batchID=1
time=2026-04-05T16:24:33.811Z level=TRACE source=runner.go:726 msg="computeBatch: logits ready" batchID=0
time=2026-04-05T16:24:33.811Z level=TRACE source=runner.go:731 msg="computeBatch: decoding" batchID=0
time=2026-04-05T16:24:33.811Z level=TRACE source=runner.go:758 msg="computeBatch: vocab details" batchID=0 seqIdx=0 len(logits)=262144 len(activeBatch.batch.Outputs)=1 vocabSize=262144 iBatches=[0]
time=2026-04-05T16:24:33.813Z level=TRACE source=bytepairencoding.go:328 msg=decoded string=<|channel> from=[100]
time=2026-04-05T16:24:33.814Z level=TRACE source=runner.go:651 msg="computeBatch: outputs are ready" batchID=0
time=2026-04-05T16:24:33.814Z level=TRACE source=runner.go:646 msg="computeBatch: inputs are ready" batchID=1
time=2026-04-05T16:24:33.814Z level=TRACE source=runner.go:718 msg="computeBatch: signaling computeStartedCh" batchID=1
time=2026-04-05T16:24:33.814Z level=TRACE source=runner.go:477 msg="forwardBatch compute started, setting up next batch" pendingBatch.id=1 id=2
time=2026-04-05T16:24:33.814Z level=TRACE source=runner.go:592 msg="forwardBatch iBatch" batchID=2 seqIdx=0 seq.iBatch=0 i+1=1 len(seq.inputs)=1
time=2026-04-05T16:24:33.814Z level=TRACE source=routes.go:2440 msg="builtin parser input" parser=gemma4 content=<|channel>
time=2026-04-05T16:24:33.814Z level=TRACE source=routes.go:2467 msg="builtin parser empty output" parser=gemma4
time=2026-04-05T16:24:33.816Z level=TRACE source=runner.go:475 msg="forwardBatch waiting for compute to start" pendingBatch.id=2
time=2026-04-05T16:24:33.816Z level=TRACE source=runner.go:644 msg="computeBatch: waiting for inputs to be ready" batchID=2
time=2026-04-05T16:24:33.969Z level=TRACE source=runner.go:726 msg="computeBatch: logits ready" batchID=1
time=2026-04-05T16:24:33.970Z level=TRACE source=runner.go:731 msg="computeBatch: decoding" batchID=1
time=2026-04-05T16:24:33.970Z level=TRACE source=runner.go:758 msg="computeBatch: vocab details" batchID=1 seqIdx=0 len(logits)=262144 len(activeBatch.batch.Outputs)=1 vocabSize=262144 iBatches=[0]
time=2026-04-05T16:24:33.971Z level=TRACE source=runner.go:651 msg="computeBatch: outputs are ready" batchID=1
time=2026-04-05T16:24:33.971Z level=TRACE source=runner.go:646 msg="computeBatch: inputs are ready" batchID=2
panic: failed to sample token

goroutine 1039 [running]:
github.com/ollama/ollama/runner/ollamarunner.(*Server).computeBatch(0xc0002370e0, {0x1, {0x55a60e03e910, 0xc0000ae000}, {0x55a60e04c140, 0xc000011b90}, {0xc00007e1f0, 0x1, 0x1}, {{0x55a60e04c140, ...}, ...}, ...})
	github.com/ollama/ollama/runner/ollamarunner/runner.go:762 +0x1c25
created by github.com/ollama/ollama/runner/ollamarunner.(*Server).run in goroutine 12
	github.com/ollama/ollama/runner/ollamarunner/runner.go:459 +0x2cd
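
With TRACE logging this verbose, a small helper can pull out what happened just before the panic. The following is a rough sketch, not part of Ollama: it assumes the key=value line layout seen in the log above, and the function names are made up for illustration. It reports the panic text, the last batch whose logits were ready, and the last decoded token string.

```python
import shlex


def parse_logfmt(line):
    """Parse one key=value log line into a dict; other lines give {}."""
    try:
        tokens = shlex.split(line)
    except ValueError:  # unbalanced quotes, e.g. in free-form text
        return {}
    fields = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep and key:
            fields[key] = value
    return fields


def summarize_crash(lines):
    """Return details about the first panic in lines, or None if there is none."""
    last_ready_batch = None
    last_decoded = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("panic:"):
            return {
                "panic": line[len("panic:"):].strip(),
                "last_logits_ready_batch": last_ready_batch,
                "last_decoded": last_decoded,
            }
        fields = parse_logfmt(line)
        msg = fields.get("msg", "")
        if msg == "computeBatch: logits ready":
            last_ready_batch = int(fields["batchID"])
        elif msg == "decoded":
            last_decoded = fields.get("string")
    return None


# Usage (the file name is a placeholder for a saved copy of the server log):
# with open("ollama.log") as f:
#     print(summarize_crash(f))
```

Run against the log above, this kind of summary makes it easier to see that the first token was decoded and that the panic follows the second batch's logits.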

OS

Fedora CoreOS F43 (Linux version 6.19.7-200.fc43.x86_64)

GPU

Radeon 680M

CPU

AMD Ryzen 7 7735HS

Ollama version

Ollama 0.20.2 (Docker built)

extent analysis

TL;DR

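The runner panics with "failed to sample token" and appears to do so while sampling the second output token, after the first token (<|channel>) was decoded normally. Both the BF16 import and the F16 conversion of google/gemma-4-26B-A4B-it crash, so the fault likely lies in the conversion from the original tensor files or in how the runner handles the converted model on this ROCm setup; the logs do not pin down which.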

Guidance

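  • Verify that this Ollama version (0.20.2) and its ROCm runtime support converting Gemma 4 26B-A4B from the original tensor files, for example by checking the release notes and related issues.
  • Rule out resource pressure: the runner reports 68.3 GiB in VRAM against 80 GB of shared iGPU memory with num_ctx=262144, so retry with a much smaller context window.
  • The unquantized BF16 import and the F16 conversion both crash, so changing precision alone does not help; try a different quantization level, or compare against a prebuilt build of the same model if one is available.
  • In the logs, focus on the lines just before the panic: batch 0 decodes <|channel> (token 100), and the panic at runner.go:762 follows the "vocab details" trace for batch 1.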
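
One way to probe the resource question is to repeat the failing prompt with a context window far smaller than the num_ctx=262144 shown in the log. The sketch below builds such a request for Ollama's /api/generate endpoint using only the standard library. The host and port (127.0.0.1:11434, Ollama's default) and the model tag passed in are assumptions to adapt to the local setup, and the helper names are illustrative.

```python
import json
import urllib.request


def build_generate_request(model, prompt, num_ctx, host="http://127.0.0.1:11434"):
    """Build (not send) a non-streaming /api/generate request with a custom num_ctx."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    }
    return urllib.request.Request(
        host + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, num_ctx, timeout=600):
    """Send the request and return the generated text."""
    req = build_generate_request(model, prompt, num_ctx)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


# Usage (model tag as created in the issue; adjust if it lives under a namespace):
# print(generate("gemma4:26b-a4b-bf16", "test", num_ctx=8192))
```

If the panic still occurs at a small context size, memory pressure from the large context becomes a less likely explanation.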

Example

The issue contains no code fix. The reproduction consists of the two ollama create commands quoted under "What is the issue?", followed by a single prompt (test in the logged request).

Notes

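The logs show the panic "failed to sample token" raised from computeBatch (runner.go:762), but they do not record why sampling failed. One possibility worth checking is that the converted weights yield invalid logits (for example NaN) on this backend; nothing in the issue confirms this, so further investigation is needed to find the root cause.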
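
To illustrate why bad model output would surface at the sampling step, here is a toy softmax sampler. It is an illustration under an unconfirmed assumption (that the logits were invalid), and it is not Ollama's sampler; the function name and error text are made up. The point is only that a sampler cannot pick a token once the probabilities are undefined.

```python
import math
import random


def sample_token(logits, rng=random):
    """Sample an index from softmax(logits); raise if no valid distribution exists."""
    if not logits:
        raise ValueError("failed to sample token: empty logits")
    peak = max(logits)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    # A single NaN (or an infinite peak) makes the sum NaN, so no
    # probability distribution can be formed.
    if math.isnan(total) or total <= 0:
        raise ValueError("failed to sample token: logits are not usable")
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

A single NaN among the 262144 logits would be enough to trigger the error path here, which is why checking the converted weights and the backend's numerics is a reasonable next step.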

Recommendation

No confirmed fix is available yet. As a workaround, try a different quantization level or a smaller context window, and follow ollama/ollama#15348 for an upstream fix.
