ollama - 💡(How to fix) Fix mlx runner failed: Error: unsupported architecture: Gemma4ForConditionalGeneration [1 comments, 2 participants]

ollama • 2026-04-03 22:05:47

GitHub stats

ollama/ollama#15307 • Fetched 2026-04-08 02:44:22

Author

chigkim

Participants

chigkim

rick-github

Timeline (top)

closed ×1, commented ×1, labeled ×1

Error Message

ollama run gemma4:26b-a4b-heretic-mxfp8
Error: 500 Internal Server Error: mlx runner failed: Error: unsupported architecture: Gemma4ForConditionalGeneration (exit: exit status 1)

Code Example

GOMAXPROCS=1 ollama serve
ollama create gemma4:26b-a4b-heretic-mxfp8 --experimental -f gemma4-26b.modelfile -q mxfp8
importing safetensors model
importing safetensors model
importing model-00001-of-00006.safetensors (112 tensors, quantizing to mxfp8)
importing model-00002-of-00006.safetensors (131 tensors, quantizing to mxfp8)
importing model-00003-of-00006.safetensors (131 tensors, quantizing to mxfp8)
importing model-00004-of-00006.safetensors (131 tensors, quantizing to mxfp8)
importing model-00005-of-00006.safetensors (131 tensors, quantizing to mxfp8)
importing model-00006-of-00006.safetensors (377 tensors, quantizing to mxfp8)
importing config config.json
importing config generation_config.json
importing config processor_config.json
importing config tokenizer.json
importing config tokenizer_config.json
writing manifest for gemma4:26b-a4b-heretic-mxfp8
successfully imported gemma4:26b-a4b-heretic-mxfp8 with 1018 layers

ollama run gemma4:26b-a4b-heretic-mxfp8
Error: 500 Internal Server Error: mlx runner failed: Error: unsupported architecture: Gemma4ForConditionalGeneration (exit: exit status 1)

What is the issue?

When I import a fine-tuned Gemma4 model and quantize it to mxfp8, Ollama reports "successfully imported gemma4:26b-a4b-heretic-mxfp8 with 1018 layers". However, when I try to run the model I get this error: Error: 500 Internal Server Error: mlx runner failed: Error: unsupported architecture: Gemma4ForConditionalGeneration (exit: exit status 1)
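For anyone scripting around this failure, the error line can be picked apart to recover the runner name and the rejected architecture. The sketch below is only an illustration: the function name parse_unsupported_architecture is hypothetical, and the pattern is based solely on the error text quoted in this report, so other Ollama versions may word the message differently.

```python
import re

# Pattern derived only from the error line quoted in this report (assumption:
# other Ollama versions may phrase the message differently).
_ERROR_RE = re.compile(
    r"Error: (?P<status>\d{3}) [^:]+: (?P<runner>\w+) runner failed: "
    r"Error: unsupported architecture: (?P<arch>\w+)"
)


def parse_unsupported_architecture(message):
    """Return status, runner and architecture from the error text, or None if it does not match."""
    match = _ERROR_RE.search(message)
    if match is None:
        return None
    return {
        "status": int(match["status"]),
        "runner": match["runner"],
        "architecture": match["arch"],
    }


if __name__ == "__main__":
    line = (
        "Error: 500 Internal Server Error: mlx runner failed: Error: "
        "unsupported architecture: Gemma4ForConditionalGeneration (exit: exit status 1)"
    )
    print(parse_unsupported_architecture(line))
```

A None result simply means the line is some other error, so a wrapper script can fall through to its normal error handling.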

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.20.0

extent analysis

TL;DR

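The import step (ollama create) completes, but at run time the MLX runner rejects the model because it does not recognize the architecture name Gemma4ForConditionalGeneration. This points to missing architecture support in the runner shipped with this Ollama version (0.20.0) rather than to a failed import.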

Guidance

Verify whether the MLX runner in the installed Ollama version (0.20.0) can load the Gemma4ForConditionalGeneration architecture; the error suggests it cannot.
Check the Ollama documentation and release notes for requirements or known limitations that apply to Gemma4 models imported from safetensors.
Check for a newer Ollama release, or ask the Ollama community, to find out whether support for this architecture exists or is planned.
Review the import log. Here all six safetensors shards and five config files were imported and the manifest was written, so an incomplete import looks unlikely.
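One way to act on the checks above is to look at what the checkpoint itself declares before importing it. Hugging Face-style checkpoints conventionally list their model class in the architectures field of config.json, and the name in the error message presumably comes from there (an assumption, since the report does not show config.json). The sketch below reads that field; the supported set passed to is_supported is a hypothetical, caller-supplied list and not Ollama's real list of supported architectures.

```python
import json
from pathlib import Path


def declared_architectures(model_dir):
    """Return the `architectures` list from <model_dir>/config.json, or [] if the field is absent."""
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as handle:
        config = json.load(handle)
    return list(config.get("architectures", []))


def is_supported(model_dir, supported):
    """True only if the config declares at least one architecture and all of them are in `supported`.

    `supported` is supplied by the caller; it is a placeholder, not Ollama's actual list.
    """
    declared = declared_architectures(model_dir)
    return bool(declared) and all(name in supported for name in declared)
```

Running declared_architectures on the directory referenced by the Modelfile shows the exact name the runner would have to recognize, which is also the string to search for in release notes or existing issues.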

Notes

The issue may be specific to the combination of ollama version, model architecture, and hardware (Apple GPU and CPU).

Recommendation

Apply workaround: Try to find an alternative model architecture that is supported by the current ollama version, or seek guidance from the ollama community to resolve the compatibility issue.
