ollama: gemma4:e2b and gemma4:e4b fail to load on Apple M5 (exit status 2 crash)

Environment

macOS: 15.0 (Darwin 25.0.0)
Chip: Apple M5
RAM: 16 GB unified memory
Ollama version: 0.23.0

Models affected

gemma4:e2b (7.2 GB)
gemma4:e4b (9.6 GB)
gemma3:4b works fine

Steps to reproduce

ollama pull gemma4:e2b
ollama run gemma4:e2b "say hello"
# Also tried:
OLLAMA_NUM_GPU=0 ollama run gemma4:e2b "say hello"
OLLAMA_FLASH_ATTENTION=0 ollama run gemma4:e2b "say hello"
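
The reproduction above can be wrapped in a small loop so the working and failing models are compared in one pass. This is only a sketch: `try_models` and the `OLLAMA_CMD` variable are hypothetical names introduced here, and the only assumption about the CLI is that `ollama run` exits non-zero when the model fails to load.

```shell
#!/bin/sh
# Hypothetical helper: run each model once and report whether it loaded.
# OLLAMA_CMD is an assumption of this sketch so the binary can be swapped out.
OLLAMA_CMD="${OLLAMA_CMD:-ollama}"

try_models() {
    for model in "$@"; do
        # Discard the model output; only the exit status matters here.
        if "$OLLAMA_CMD" run "$model" "say hello" >/dev/null 2>&1; then
            echo "$model: ok"
        else
            echo "$model: FAILED"
        fi
    done
}
```

Calling `try_models gemma3:4b gemma4:e2b gemma4:e4b` on the affected machine would be expected to report the first model as ok and the two Gemma 4 models as failed, if the behaviour described in this report holds.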

Error

Error: 500 Internal Server Error: model failed to load, this may be due to resource limitations or an internal error

Server log

llama runner terminated  error="exit status 2"
fault   0x19eae65b0
pc      0x19eae65b0
do load request  error="Post http://127.0.0.1:51189/load: EOF"
Load failed: model failed to load
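
To pull just these lines out of a longer server log, a filter along the following lines may help. It is a sketch, not an official tool: `crash_lines` is a hypothetical name, the default path assumes the macOS log location `~/.ollama/logs/server.log`, and the patterns are taken from the excerpt above rather than from any documented log format.

```shell
#!/bin/sh
# Hypothetical helper: print only the crash-related lines from a server log.
# The default path assumes the macOS location of the Ollama server log.
crash_lines() {
    log="${1:-$HOME/.ollama/logs/server.log}"
    # Match the runner exit message, the fault/pc register lines,
    # and the failed load request, as seen in the excerpt.
    grep -E 'llama runner terminated|^(fault|pc)[[:space:]]|do load request' "$log"
}
```

Running `crash_lines` with no argument reads the default log; passing a path, for example `crash_lines ./server.log`, reads a saved copy instead.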

Notes

Crash happens consistently at the same fault address regardless of GPU/CPU mode.
gemma3:4b loads and runs perfectly on the same machine.
Gemma 4 runs correctly via the Google AI Studio API, so this is specific to the Ollama llama runner.
The M5 chip is new (2025).
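
The claim that every crash hits the same fault address can be checked against a log that covers several attempts. The sketch below assumes the `fault   0x...` line layout shown in the server log excerpt; `distinct_fault_addresses` is a hypothetical name, and the default path assumes the macOS log location.

```shell
#!/bin/sh
# Hypothetical helper: list each distinct fault address found in a server log.
# Assumes lines of the form "fault   0x..." as in the excerpt from this report.
distinct_fault_addresses() {
    log="${1:-$HOME/.ollama/logs/server.log}"
    # Keep the second field of every line whose first field is "fault",
    # then collapse repeats so one output line means one unique address.
    awk '$1 == "fault" { print $2 }' "$log" | sort -u
}
```

A single line of output after several failed runs would be consistent with the note above; more than one line would suggest the crash site varies between runs.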
