ollama - ✅(Solved) Fix Gemma 4 E2B fails to load on Apple M5 — Metal library compilation error [2 pull requests, 1 comment, 2 participants]

ollama · 2026-04-13 15:55:36


ollama/ollama#15548 • Fetched 2026-04-15 06:20:15


Timeline (top)

cross-referenced ×2, referenced ×2, assigned ×1, closed ×1

Error Message

ggml_metal_device_init: simdgroup reduction   = true
ggml_metal_device_init: simdgroup matrix mul. = true
ggml_metal_device_init: has unified memory    = true
ggml_metal_device_init: has bfloat            = true
ggml_metal_device_init: has tensor            = true
ggml_metal_device_init: use residency sets    = true
ggml_metal_device_init: use shared buffers    = true
ggml_metal_device_init: recommendedMaxWorkingSetSize  = 19069.67 MB
ggml_metal_init: allocating
ggml_metal_init: picking default device: Apple M5
ggml_metal_init: the device does not have a precompiled Metal library - this is unexpected
ggml_metal_init: will try to compile it on the fly
ggml_metal_library_init: using embedded metal library
ggml_metal_library_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "program_source:7131:28: warning: variable 'theta_base' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
ggml_metal_init: error: failed to initialize the Metal library
2   ollama                              0x0000000105bfe544 ggml_backend_get_default_buffer_type + 76
3   ollama                              0x0000000105b7a838 _cgo_c81fd19bee02_Cfunc_ggml_backend_get_default_buffer_type + 36
SIGABRT: abort
github.com/ollama/ollama/ml/backend/ggml._Cfunc_ggml_backend_get_default_buffer_type(0x0)
fault   0x1836b95b0

Fix Action

Fixed

Fixed by PR: gemma4: fix compiler error on metal (https://github.com/ollama/ollama/pull/15550)
Related PR (open, not merged): app: restore macOS runtime env for managed server (https://github.com/ollama/ollama/pull/15551)

PR fix notes

PR #15550: gemma4: fix compiler error on metal

Repository: ollama/ollama
Author: dhiltgen
State: closed | merged: True
Link: https://github.com/ollama/ollama/pull/15550

Description (problem / solution / changelog)

On some systems, the metal runtime compiler is failing due to an uninitialized variable from #15378.

Fixes #15548

Changed files

llama/patches/0027-interleave-multi-rope.patch (modified, +11/-8)
llama/patches/0032-ggml-enable-MLA-flash-attention-for-GLM-4.7-flash.patch (modified, +1/-1)
llama/patches/0033-ggml-metal-solve_tri.patch (modified, +1/-1)
llama/patches/0034-ggml-metal-guard-mul_mat_id-map0-and-add-ne20-22-spe.patch (modified, +1/-1)
llama/patches/0036-backport-kernels-for-gemma4.patch (modified, +1/-1)
ml/backend/ggml/ggml/src/ggml-metal/ggml-metal-embed.metal (modified, +1/-1)
ml/backend/ggml/ggml/src/ggml-metal/ggml-metal.metal (modified, +1/-1)

PR #15551: app: restore macOS runtime env for managed server

Repository: ollama/ollama
Author: Horacehxw
State: open | merged: False
Link: https://github.com/ollama/ollama/pull/15551

Description (problem / solution / changelog)

Summary

Restore selected runtime environment variables from launchctl getenv when the macOS desktop app launches a managed ollama serve process.

On Apple Silicon macOS setups, users can configure runtime workarounds such as:

GGML_METAL_TENSOR_DISABLE=1
OLLAMA_KV_CACHE_TYPE=q8_0
OLLAMA_FLASH_ATTENTION=1

These may be present in the user's launchd session, but not in the desktop app process environment. In that case the managed serve child never sees them, even though server.cmd() already copies os.Environ().

This patch adds a small darwin-only fallback: for a narrow allowlist of runtime env vars, if a key is missing from the app process env, read it from launchctl getenv and inject it into the managed server env.
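The fallback described above can be sketched in Go roughly as follows. This is not the code from PR #15551: the names `fillMissingEnv`, `launchctlGetenv`, and the contents of `allowlist` are hypothetical, chosen only to illustrate the "fill a missing key from launchctl getenv, never override the process env" rule. The lookup is passed in as a function so the merge logic can be exercised without calling launchctl.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// allowlist is illustrative only; the PR's exact list is not shown on this page.
var allowlist = []string{
	"GGML_METAL_TENSOR_DISABLE",
	"OLLAMA_KV_CACHE_TYPE",
	"OLLAMA_FLASH_ATTENTION",
}

// launchctlGetenv (hypothetical helper) reads one variable from the user's
// launchd session by running `launchctl getenv KEY`. macOS only.
func launchctlGetenv(key string) (string, error) {
	out, err := exec.Command("launchctl", "getenv", key).Output()
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(out)), nil
}

// fillMissingEnv returns a copy of env with allowlisted keys appended when
// they are absent from env and lookup yields a non-empty value.
// Keys already present in env are left untouched, so the process env wins.
func fillMissingEnv(env, keys []string, lookup func(string) (string, error)) []string {
	present := make(map[string]bool, len(env))
	for _, kv := range env {
		if k, _, ok := strings.Cut(kv, "="); ok {
			present[k] = true
		}
	}
	out := append([]string(nil), env...)
	for _, k := range keys {
		if present[k] {
			continue // explicit process env wins
		}
		v, err := lookup(k)
		if err != nil || v == "" {
			continue // ignore missing or empty launchctl values
		}
		out = append(out, k+"="+v)
	}
	return out
}

func main() {
	// A fake lookup stands in for launchctlGetenv so the sketch runs anywhere.
	fake := func(key string) (string, error) {
		if key == "GGML_METAL_TENSOR_DISABLE" {
			return "1", nil
		}
		return "", nil
	}
	env := []string{"PATH=/usr/bin", "OLLAMA_FLASH_ATTENTION=0"}
	for _, kv := range fillMissingEnv(env, allowlist, fake) {
		fmt.Println(kv)
	}
}
```

On macOS, passing `launchctlGetenv` as the lookup and assigning the result to the `Env` field of an `exec.Cmd` would give the child process the recovered values; whether the real patch is structured this way is an assumption.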

Why this change

I reproduced a case on Apple M5 Pro + macOS 26.4 + Ollama 0.20.6 where:

launchctl print gui/$(id -u)/com.ollama.ollama showed GGML_METAL_TENSOR_DISABLE=1
but /Applications/Ollama.app/Contents/Resources/ollama serve did not have that env in ps eww
and model loads failed with Metal compile errors and 500

Typical failing log sequence:

ggml_metal_device_init: has tensor = true
ggml_metal_init: the device does not have a precompiled Metal library - this is unexpected
ggml_metal_init: will try to compile it on the fly
MPPTensorOpsMatMul2dImpl.h ... static_assert failed due to requirement '__is_same_v<bfloat, half>'
SIGABRT: abort

With GGML_METAL_TENSOR_DISABLE=1, the same machine can load the model successfully.

Scope

This intentionally does not introduce generic env passthrough logic.

It only restores a small allowlist of documented runtime env vars plus GGML_METAL_TENSOR_DISABLE, and only when the variable is absent from the current process env.

Explicit process env still wins.

Tests

Added coverage for:

filling missing allowlisted variables from launchctl
preserving explicit process env over launchctl values
ignoring missing/empty launchctl values
ensuring Server.cmd() includes recovered values in cmd.Env

Verification

Passed locally:

go test ./app/server

Closes or helps with: #15548

Changed files

app/server/server.go (modified, +1/-0)
app/server/server_darwin_env.go (added, +58/-0)
app/server/server_test.go (modified, +129/-0)
app/server/server_windows_env.go (added, +5/-0)

What is the issue?

Chip: Apple M5
Ollama model: gemma4:e2b

Running ollama run gemma4:e2b results in a 500 Internal Server Error. The model never loads.

❯ ollama run gemma3:4b
Error: 500 Internal Server Error: model failed to load, this may be due to resource limitations or an internal error, check ollama server logs for details

Relevant log output

ggml_metal_device_init: simdgroup reduction   = true
ggml_metal_device_init: simdgroup matrix mul. = true
ggml_metal_device_init: has unified memory    = true
ggml_metal_device_init: has bfloat            = true
ggml_metal_device_init: has tensor            = true
ggml_metal_device_init: use residency sets    = true
ggml_metal_device_init: use shared buffers    = true
ggml_metal_device_init: recommendedMaxWorkingSetSize  = 19069.67 MB
ggml_metal_init: allocating
ggml_metal_init: picking default device: Apple M5
ggml_metal_init: the device does not have a precompiled Metal library - this is unexpected
ggml_metal_init: will try to compile it on the fly
ggml_metal_library_init: using embedded metal library
ggml_metal_library_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "program_source:7131:28: warning: variable 'theta_base' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
ggml_metal_init: error: failed to initialize the Metal library
2   ollama                              0x0000000105bfe544 ggml_backend_get_default_buffer_type + 76
3   ollama                              0x0000000105b7a838 _cgo_c81fd19bee02_Cfunc_ggml_backend_get_default_buffer_type + 36
SIGABRT: abort
github.com/ollama/ollama/ml/backend/ggml._Cfunc_ggml_backend_get_default_buffer_type(0x0)
fault   0x1836b95b0

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.20.6

extent analysis

TL;DR

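Gemma 4 E2B fails to load on Apple M5 because the embedded Metal library does not compile at runtime: the compiler reports theta_base as possibly uninitialized, Metal library initialization fails, and the runner aborts with SIGABRT. PR #15550 (merged) fixes the Metal source, so the practical remedy is to update to an Ollama build that includes it.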

Guidance

The error message points at Metal library initialization: the compiler output flags the variable theta_base as used uninitialized when an 'if' condition is false.
The device has no precompiled Metal library, so ggml_metal_init compiles the embedded library on the fly, and that compilation is what fails.
To resolve this, update Ollama to a release that includes PR #15550, which was merged to fix this compiler error.
Fixing it by hand means initializing theta_base in the Metal source and rebuilding Ollama; PR #15550 touches ggml-metal.metal, ggml-metal-embed.metal, and the related llama/patches files.
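This page quotes two different Metal compile failures: the theta_base warning from the issue and the static_assert reported in PR #15551. A minimal Go sketch for telling them apart in a server log is shown here. The marker strings are copied from the logs on this page; the `signature` type and the `classify` function are hypothetical names, and plain substring matching is an assumption about how stable the log text is.

```go
package main

import (
	"fmt"
	"strings"
)

// signature pairs a human-readable label with a log substring to look for.
type signature struct {
	name   string
	marker string
}

// Markers are taken verbatim from the log excerpts quoted on this page.
var signatures = []signature{
	{
		name:   "uninitialized theta_base (issue #15548, fixed by PR #15550)",
		marker: "variable 'theta_base' is used uninitialized",
	},
	{
		name:   "bfloat/half static_assert (reported in PR #15551)",
		marker: "static_assert failed due to requirement '__is_same_v<bfloat, half>'",
	},
}

// classify returns the names of all known signatures found in the log text.
func classify(log string) []string {
	var hits []string
	for _, s := range signatures {
		if strings.Contains(log, s.marker) {
			hits = append(hits, s.name)
		}
	}
	return hits
}

func main() {
	sample := `ggml_metal_init: will try to compile it on the fly
ggml_metal_library_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "program_source:7131:28: warning: variable 'theta_base' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
ggml_metal_init: error: failed to initialize the Metal library`
	for _, name := range classify(sample) {
		fmt.Println("matched:", name)
	}
}
```

Distinguishing the two matters because the GGML_METAL_TENSOR_DISABLE=1 workaround is reported only for the static_assert case.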

Example

No source patch is reproduced here. The actual change is in PR #15550 (https://github.com/ollama/ollama/pull/15550); applying it yourself requires rebuilding Ollama from source.

Notes

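The log ties this failure to an Apple M5 running Ollama 0.20.6. PR #15550 attributes the uninitialized variable to #15378 and notes that it breaks the Metal runtime compiler only "on some systems". PR #15551 describes a different Metal compile failure on an Apple M5 Pro (a static_assert on '__is_same_v<bfloat, half>') that is avoided with GGML_METAL_TENSOR_DISABLE=1; that PR is still open, and nothing on this page shows that the variable helps with the theta_base error.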

Recommendation

Apply the fix: update to an Ollama release that includes PR #15550. Patching the Metal source yourself means rebuilding Ollama and is only worth considering if no such release is available to you.


#GPU compatibility #latency issue #model loading #dependency error #configuration error
