ollama: gemma4:e2b and gemma4:e4b fail to load on Apple M5 (exit status 2 crash)

ollama, 2026-05-12 16:44:02


Error Message

Error: 500 Internal Server Error: model failed to load, this may be due to resource limitations or an internal error
llama runner terminated error="exit status 2"
do load request error="Post http://127.0.0.1:51189/load: EOF"


Environment

macOS: 15.0 (Darwin 25.0.0)
Chip: Apple M5
RAM: 16 GB unified memory
Ollama version: 0.23.0
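
For anyone trying to reproduce this, the values above can be re-collected straight from the machine. The following is a minimal sketch, not part of the original report: `report_env` is a hypothetical helper name, and the macOS-only tools (`sw_vers`, the `machdep.cpu.brand_string` and `hw.memsize` sysctl keys) are guarded so the script still runs where they are missing.

```shell
# Print the environment details listed in this report.
# report_env is a hypothetical helper name, not an Ollama command.
report_env() {
  # sw_vers exists only on macOS
  if command -v sw_vers >/dev/null 2>&1; then
    echo "macOS: $(sw_vers -productVersion)"
  fi
  echo "Kernel: $(uname -s) $(uname -r)"
  if command -v sysctl >/dev/null 2>&1; then
    # These sysctl keys are macOS-specific; skip silently if absent
    chip=$(sysctl -n machdep.cpu.brand_string 2>/dev/null) && echo "Chip: $chip"
    mem=$(sysctl -n hw.memsize 2>/dev/null) && echo "RAM: $((mem / 1073741824)) GB"
  fi
  if command -v ollama >/dev/null 2>&1; then
    echo "Ollama: $(ollama --version 2>&1)"
  else
    echo "Ollama: not found on PATH"
  fi
}

report_env
```

Pasting the output into the issue keeps the macOS version and the Darwin kernel version together as the machine reports them.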

Models affected

gemma4:e2b (7.2 GB)
gemma4:e4b (9.6 GB)
gemma3:4b works fine

Steps to reproduce

ollama pull gemma4:e2b
ollama run gemma4:e2b "say hello"
# Also tried:
OLLAMA_NUM_GPU=0 ollama run gemma4:e2b "say hello"
OLLAMA_FLASH_ATTENTION=0 ollama run gemma4:e2b "say hello"
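
One caveat on the two "also tried" commands: `ollama run` is a client that talks to an already running server, and settings such as `OLLAMA_FLASH_ATTENTION` are read by the server process. If a server was already running, the prefixed variables may have had no effect. A hedged alternative is to request CPU-only loading per request through the `num_gpu` option of the REST API. The sketch below assumes the server listens on the default 127.0.0.1:11434; `build_cpu_request` and `run_cpu_only` are hypothetical helper names.

```shell
# Build a JSON body for /api/generate that asks for zero GPU layers.
# Note: this does no JSON escaping, so keep the prompt free of quotes.
build_cpu_request() {
  printf '{"model":"%s","prompt":"%s","stream":false,"options":{"num_gpu":0}}\n' "$1" "$2"
}

# Send the request to a running Ollama server.
# The address is the default one and is an assumption about your setup.
run_cpu_only() {
  curl -s http://127.0.0.1:11434/api/generate -d "$(build_cpu_request "$1" "$2")"
}

# Usage (requires a running server):
#   run_cpu_only gemma4:e2b "say hello"
# To disable flash attention, restart the server with the variable set:
#   OLLAMA_FLASH_ATTENTION=0 ollama serve
```

If the runner still dies with the same fault address under this request, that would support the note that the crash is independent of GPU/CPU mode.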

Error

Error: 500 Internal Server Error: model failed to load, this may be due to resource limitations or an internal error

Server log

llama runner terminated  error="exit status 2"
fault   0x19eae65b0
pc      0x19eae65b0
do load request  error="Post http://127.0.0.1:51189/load: EOF"
Load failed: model failed to load
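
To check whether the fault address really is identical across runs, the distinct addresses can be pulled out of the server log. This is a minimal sketch: `fault_addrs` is a hypothetical helper, the `~/.ollama/logs/server.log` path is the usual macOS location and may differ on your install, and the match assumes `fault` is the first field on its line, as in the excerpt above.

```shell
# Print each distinct fault address found in a log file.
# Matches lines whose first whitespace-separated field is "fault".
fault_addrs() {
  awk '$1 == "fault" { print $2 }' "$1" | sort -u
}

# Default log path is an assumption; pass another path as the first argument.
log="${1:-$HOME/.ollama/logs/server.log}"
if [ -r "$log" ]; then
  fault_addrs "$log"
fi
```

A single line of output after several failed loads means every crash hit the same address.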

Notes

Crash happens consistently at the same fault address regardless of GPU/CPU mode
gemma3:4b loads and runs perfectly on the same machine
Gemma 4 runs correctly via Google AI Studio API, so this is specific to the Ollama llama runner
M5 chip is new (2025)
