ollama: mlx runner failed: Error: unsupported architecture: Gemma4ForConditionalGeneration [1 comment, 2 participants]

GitHub stats
ollama/ollama#15307 · Fetched 2026-04-08 02:44:22
Comments: 1 · Participants: 2 · Timeline events: 3 (closed ×1, commented ×1, labeled ×1) · Reactions: 0

Error Message

Error: 500 Internal Server Error: mlx runner failed: Error: unsupported architecture: Gemma4ForConditionalGeneration (exit: exit status 1)


What is the issue?

When I import a fine-tuned Gemma4 model and quantize it to mxfp8, ollama reports "successfully imported gemma4:26b-a4b-heretic-mxfp8 with 1018 layers". However, when I try to run the model, I get this error: Error: 500 Internal Server Error: mlx runner failed: Error: unsupported architecture: Gemma4ForConditionalGeneration (exit: exit status 1)

Relevant log output

GOMAXPROCS=1 ollama serve
ollama create gemma4:26b-a4b-heretic-mxfp8 --experimental -f gemma4-26b.modelfile -q mxfp8
importing safetensors model
importing safetensors model
importing model-00001-of-00006.safetensors (112 tensors, quantizing to mxfp8)
importing model-00002-of-00006.safetensors (131 tensors, quantizing to mxfp8)
importing model-00003-of-00006.safetensors (131 tensors, quantizing to mxfp8)
importing model-00004-of-00006.safetensors (131 tensors, quantizing to mxfp8)
importing model-00005-of-00006.safetensors (131 tensors, quantizing to mxfp8)
importing model-00006-of-00006.safetensors (377 tensors, quantizing to mxfp8)
importing config config.json
importing config generation_config.json
importing config processor_config.json
importing config tokenizer.json
importing config tokenizer_config.json
writing manifest for gemma4:26b-a4b-heretic-mxfp8
successfully imported gemma4:26b-a4b-heretic-mxfp8 with 1018 layers

ollama run gemma4:26b-a4b-heretic-mxfp8
Error: 500 Internal Server Error: mlx runner failed: Error: unsupported architecture: Gemma4ForConditionalGeneration (exit: exit status 1)

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.20.0

Extent analysis

TL;DR

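The import itself succeeds, but at run time the MLX runner in ollama 0.20.0 rejects the model with "unsupported architecture: Gemma4ForConditionalGeneration". This points to the runner not recognizing that architecture, that is, a compatibility gap between the imported model and this ollama version, rather than a failed import.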

Guidance

  • Verify whether the installed ollama version (0.20.0) supports the Gemma4ForConditionalGeneration architecture in its MLX runner.
  • Check the ollama documentation for requirements or known compatibility issues with Gemma4 models, particularly ones imported from safetensors with --experimental.
  • Check for a newer ollama release, or ask the ollama community whether support for this architecture is available or planned.
  • Review the import log to confirm that all six safetensors shards and the config files were imported.
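To see which architecture name the checkpoint itself declares, the model's config.json can be inspected before importing. The sketch below assumes the fine-tune follows the usual Hugging Face layout, where config.json carries an "architectures" list; the function name `declared_architectures` is illustrative and not part of ollama.

```python
import json
from pathlib import Path


def declared_architectures(model_dir):
    """Return the names listed under "architectures" in <model_dir>/config.json.

    Assumes a Hugging Face style checkpoint directory. A config without
    the key yields an empty list instead of raising.
    """
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as fh:
        config = json.load(fh)
    return list(config.get("architectures", []))
```

Calling `declared_architectures(...)` on the directory that holds the safetensors shards should return the same name that appears in the error message if the runner is reading it from this file. That would confirm the string comes from the checkpoint, not from the quantization step.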

Notes

The issue may be specific to the combination of ollama version, model architecture, and hardware (Apple GPU and CPU).
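As a quick check on the import step, the counts in the log can be tallied against the reported layer total. The sketch below parses an excerpt of the log from this issue. It assumes, without confirmation from the ollama documentation, that each tensor and each config file is stored as one layer; the numbers in the log are consistent with that reading.

```python
import re

# Excerpt of the import log quoted in the issue.
IMPORT_LOG = """\
importing model-00001-of-00006.safetensors (112 tensors, quantizing to mxfp8)
importing model-00002-of-00006.safetensors (131 tensors, quantizing to mxfp8)
importing model-00003-of-00006.safetensors (131 tensors, quantizing to mxfp8)
importing model-00004-of-00006.safetensors (131 tensors, quantizing to mxfp8)
importing model-00005-of-00006.safetensors (131 tensors, quantizing to mxfp8)
importing model-00006-of-00006.safetensors (377 tensors, quantizing to mxfp8)
importing config config.json
importing config generation_config.json
importing config processor_config.json
importing config tokenizer.json
importing config tokenizer_config.json
successfully imported gemma4:26b-a4b-heretic-mxfp8 with 1018 layers
"""


def tally(log):
    """Return (tensor_count, config_count, reported_layers) for an import log."""
    tensors = sum(int(n) for n in re.findall(r"\((\d+) tensors", log))
    configs = len(re.findall(r"^importing config \S+", log, flags=re.MULTILINE))
    match = re.search(r"with (\d+) layers", log)
    reported = int(match.group(1)) if match else None
    return tensors, configs, reported


tensors, configs, reported = tally(IMPORT_LOG)
print(f"{tensors} tensors + {configs} config files = {tensors + configs}; "
      f"log reports {reported} layers")
```

For this log the tally is 1013 tensors plus 5 config files, which matches the 1018 layers reported. That suggests nothing was dropped during import and the failure lies in loading the architecture at run time.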

Recommendation

Workaround: use a model whose architecture the installed ollama version supports, or ask the ollama community for guidance on running this one.
