ollama - ✅ (Solved) Gemma 4 E2B fails to load on Apple M5: Metal library compilation error [2 pull requests, 1 comment, 2 participants]

GitHub stats

ollama/ollama#15548 · Fetched 2026-04-15 06:20:15

  • Comments: 1
  • Participants: 2
  • Timeline events: 8
  • Reactions: 0
  • Timeline (top): cross-referenced ×2, referenced ×2, assigned ×1, closed ×1

Error Message

ggml_metal_device_init: simdgroup reduction   = true
ggml_metal_device_init: simdgroup matrix mul. = true
ggml_metal_device_init: has unified memory    = true
ggml_metal_device_init: has bfloat            = true
ggml_metal_device_init: has tensor            = true
ggml_metal_device_init: use residency sets    = true
ggml_metal_device_init: use shared buffers    = true
ggml_metal_device_init: recommendedMaxWorkingSetSize  = 19069.67 MB
ggml_metal_init: allocating
ggml_metal_init: picking default device: Apple M5
ggml_metal_init: the device does not have a precompiled Metal library - this is unexpected
ggml_metal_init: will try to compile it on the fly
ggml_metal_library_init: using embedded metal library
ggml_metal_library_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "program_source:7131:28: warning: variable 'theta_base' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
ggml_metal_init: error: failed to initialize the Metal library
2   ollama                              0x0000000105bfe544 ggml_backend_get_default_buffer_type + 76
3   ollama                              0x0000000105b7a838 _cgo_c81fd19bee02_Cfunc_ggml_backend_get_default_buffer_type + 36
SIGABRT: abort
github.com/ollama/ollama/ml/backend/ggml._Cfunc_ggml_backend_get_default_buffer_type(0x0)
fault   0x1836b95b0

Fix Action

Fixed

PR fix notes

PR #15550: gemma4: fix compiler error on metal

Description (problem / solution / changelog)

On some systems, the metal runtime compiler is failing due to an uninitialized variable from #15378.

Fixes #15548

Changed files

  • llama/patches/0027-interleave-multi-rope.patch (modified, +11/-8)
  • llama/patches/0032-ggml-enable-MLA-flash-attention-for-GLM-4.7-flash.patch (modified, +1/-1)
  • llama/patches/0033-ggml-metal-solve_tri.patch (modified, +1/-1)
  • llama/patches/0034-ggml-metal-guard-mul_mat_id-map0-and-add-ne20-22-spe.patch (modified, +1/-1)
  • llama/patches/0036-backport-kernels-for-gemma4.patch (modified, +1/-1)
  • ml/backend/ggml/ggml/src/ggml-metal/ggml-metal-embed.metal (modified, +1/-1)
  • ml/backend/ggml/ggml/src/ggml-metal/ggml-metal.metal (modified, +1/-1)

PR #15551: app: restore macOS runtime env for managed server

Description (problem / solution / changelog)

Summary

Restore selected runtime environment variables from launchctl getenv when the macOS desktop app launches a managed ollama serve process.
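As a rough illustration of the lookup step described above, the following Go sketch reads one variable through launchctl getenv. It is not the PR's code: the names runner, execRunner, and launchdGetenv are hypothetical, and the command runner is injected only so the lookup can be exercised on machines without launchctl.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// runner abstracts command execution (hypothetical helper type for this sketch).
type runner func(name string, args ...string) ([]byte, error)

// execRunner runs the real command and returns its standard output.
func execRunner(name string, args ...string) ([]byte, error) {
	return exec.Command(name, args...).Output()
}

// launchdGetenv asks launchctl for key and returns the trimmed value.
// It returns "" when the command fails or the variable is unset or empty.
func launchdGetenv(run runner, key string) string {
	out, err := run("launchctl", "getenv", key)
	if err != nil {
		return ""
	}
	return strings.TrimSpace(string(out))
}

func main() {
	// On systems without launchctl this prints an empty quoted string.
	fmt.Printf("%q\n", launchdGetenv(execRunner, "GGML_METAL_TENSOR_DISABLE"))
}
```

Treating errors and empty output the same way keeps the caller simple: a missing value is just "nothing to restore".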

On Apple Silicon macOS setups, users can configure runtime workarounds such as:

  • GGML_METAL_TENSOR_DISABLE=1
  • OLLAMA_KV_CACHE_TYPE=q8_0
  • OLLAMA_FLASH_ATTENTION=1

These may be present in the user's launchd session, but not in the desktop app process environment. In that case the managed serve child never sees them, even though server.cmd() already copies os.Environ().

This patch adds a small darwin-only fallback: for a narrow allowlist of runtime env vars, if a key is missing from the app process env, read it from launchctl getenv and inject it into the managed server env.
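The fallback rule in the paragraph above can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the code from server_darwin_env.go: fillMissingEnv and the allowlist variable are hypothetical names, the allowlist contents are just the three variables mentioned on this page, and the lookup function stands in for a launchctl getenv call.

```go
package main

import (
	"fmt"
	"strings"
)

// allowlist is an illustrative allowlist; the PR's actual list may differ.
var allowlist = []string{
	"GGML_METAL_TENSOR_DISABLE",
	"OLLAMA_KV_CACHE_TYPE",
	"OLLAMA_FLASH_ATTENTION",
}

// fillMissingEnv returns a copy of env with allowlisted keys appended when
// they are absent from env and lookup yields a non-empty value.
func fillMissingEnv(env, keys []string, lookup func(string) string) []string {
	present := make(map[string]bool, len(env))
	for _, kv := range env {
		if k, _, ok := strings.Cut(kv, "="); ok {
			present[k] = true
		}
	}
	out := append([]string(nil), env...)
	for _, k := range keys {
		if present[k] {
			continue // explicit process env wins over the fallback
		}
		if v := strings.TrimSpace(lookup(k)); v != "" {
			out = append(out, k+"="+v)
		}
	}
	return out
}

func main() {
	// Fake launchd session values, used instead of a real launchctl call.
	session := map[string]string{
		"GGML_METAL_TENSOR_DISABLE": "1",
		"OLLAMA_FLASH_ATTENTION":    "0",
	}
	env := []string{"PATH=/usr/bin", "OLLAMA_FLASH_ATTENTION=1"}
	got := fillMissingEnv(env, allowlist, func(k string) string { return session[k] })
	fmt.Println(strings.Join(got, "\n"))
}
```

In this run OLLAMA_FLASH_ATTENTION keeps its process value of 1, GGML_METAL_TENSOR_DISABLE=1 is added from the fake session, and OLLAMA_KV_CACHE_TYPE is skipped because the lookup returns nothing for it.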

Why this change

I reproduced a case on Apple M5 Pro + macOS 26.4 + Ollama 0.20.6 where:

  • launchctl print gui/$(id -u)/com.ollama.ollama showed GGML_METAL_TENSOR_DISABLE=1
  • but /Applications/Ollama.app/Contents/Resources/ollama serve did not have that env in ps eww
  • and model loads failed with Metal compile errors and 500

Typical failing log sequence:

  • ggml_metal_device_init: has tensor = true
  • ggml_metal_init: the device does not have a precompiled Metal library - this is unexpected
  • ggml_metal_init: will try to compile it on the fly
  • MPPTensorOpsMatMul2dImpl.h ... static_assert failed due to requirement '__is_same_v<bfloat, half>'
  • SIGABRT: abort

With GGML_METAL_TENSOR_DISABLE=1, the same machine can load the model successfully.
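The failing sequence above differs from the log in the linked issue, which reports an uninitialized theta_base rather than a static_assert. A small Go sketch like the one below could tell the two signatures apart in a server log. The function name classifyMetalLog and the label strings are hypothetical; only the matched substrings come from the logs quoted on this page.

```go
package main

import (
	"fmt"
	"strings"
)

// Labels are hypothetical names chosen for this sketch.
const (
	sigThetaBase    = "uninitialized theta_base (issue #15548 log)"
	sigTensorAssert = "bfloat/half static_assert (PR #15551 log)"
	sigOtherCompile = "other Metal library initialization failure"
	sigNone         = "no Metal library failure found"
)

// classifyMetalLog matches substrings quoted from the two logs on this page.
func classifyMetalLog(log string) string {
	switch {
	case strings.Contains(log, "variable 'theta_base' is used uninitialized"):
		return sigThetaBase
	case strings.Contains(log, "static_assert failed") &&
		strings.Contains(log, "__is_same_v<bfloat, half>"):
		return sigTensorAssert
	case strings.Contains(log, "failed to initialize the Metal library"):
		return sigOtherCompile
	default:
		return sigNone
	}
}

func main() {
	log := `ggml_metal_init: will try to compile it on the fly
ggml_metal_library_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "program_source:7131:28: warning: variable 'theta_base' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
ggml_metal_init: error: failed to initialize the Metal library`
	fmt.Println(classifyMetalLog(log))
}
```

Distinguishing the two matters because, per the PR descriptions, GGML_METAL_TENSOR_DISABLE=1 was verified only against the static_assert failure, while the theta_base failure is addressed by PR #15550.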

Scope

This intentionally does not introduce generic env passthrough logic.

It only restores a small allowlist of documented runtime env vars plus GGML_METAL_TENSOR_DISABLE, and only when the variable is absent from the current process env.

Explicit process env still wins.

Tests

Added coverage for:

  • filling missing allowlisted variables from launchctl
  • preserving explicit process env over launchctl values
  • ignoring missing/empty launchctl values
  • ensuring Server.cmd() includes recovered values in cmd.Env

Verification

Passed locally:

go test ./app/server

Related

  • Closes or helps with: #15548

Changed files

  • app/server/server.go (modified, +1/-0)
  • app/server/server_darwin_env.go (added, +58/-0)
  • app/server/server_test.go (modified, +129/-0)
  • app/server/server_windows_env.go (added, +5/-0)


What is the issue?

Chip: Apple M5
Ollama model: gemma4:e2b

Running ollama run gemma4:e2b results in a 500 Internal Server Error. The model never loads.

❯ ollama run gemma3:4b
Error: 500 Internal Server Error: model failed to load, this may be due to resource limitations or an internal error, check ollama server logs for details

Relevant log output

ggml_metal_device_init: simdgroup reduction   = true
ggml_metal_device_init: simdgroup matrix mul. = true
ggml_metal_device_init: has unified memory    = true
ggml_metal_device_init: has bfloat            = true
ggml_metal_device_init: has tensor            = true
ggml_metal_device_init: use residency sets    = true
ggml_metal_device_init: use shared buffers    = true
ggml_metal_device_init: recommendedMaxWorkingSetSize  = 19069.67 MB
ggml_metal_init: allocating
ggml_metal_init: picking default device: Apple M5
ggml_metal_init: the device does not have a precompiled Metal library - this is unexpected
ggml_metal_init: will try to compile it on the fly
ggml_metal_library_init: using embedded metal library
ggml_metal_library_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "program_source:7131:28: warning: variable 'theta_base' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
ggml_metal_init: error: failed to initialize the Metal library
2   ollama                              0x0000000105bfe544 ggml_backend_get_default_buffer_type + 76
3   ollama                              0x0000000105b7a838 _cgo_c81fd19bee02_Cfunc_ggml_backend_get_default_buffer_type + 36
SIGABRT: abort
github.com/ollama/ollama/ml/backend/ggml._Cfunc_ggml_backend_get_default_buffer_type(0x0)
fault   0x1836b95b0

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.20.6

extent analysis

TL;DR

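The model fails to load because on-the-fly compilation of the embedded Metal library fails: the log shows MTLLibraryErrorDomain Code=3 together with a -Wsometimes-uninitialized diagnostic for theta_base. PR #15550 attributes this to an uninitialized variable introduced by #15378 and fixes it, so the practical remedy is to update to an Ollama build that includes that PR.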

Guidance

  • The log shows Metal library initialization failing with MTLLibraryErrorDomain Code=3; the quoted diagnostic is a -Wsometimes-uninitialized warning for the variable theta_base at program_source:7131:28.
  • The device has no precompiled Metal library, so ggml_metal_init falls back to compiling the embedded library on the fly, and that compilation fails. The process then aborts with SIGABRT, with ggml_backend_get_default_buffer_type in the stack trace.
  • PR #15550 ("gemma4: fix compiler error on metal") addresses this failure. Updating to an Ollama build that includes it is the most direct fix.
  • Alternatively, the Metal source can be patched so that theta_base is initialized on every path before use, but this means building Ollama from modified source.

Example

No code snippet is provided as it would require modifying the Ollama source code, which is not recommended without further investigation.

Notes

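The log in this report comes from an Apple M5 running gemma4:e2b on Ollama 0.20.6, but PR #15550 describes the failure as occurring "on some systems" and traces it to an uninitialized variable introduced by #15378. PR #15551 reports a different Metal compile failure on an Apple M5 Pro (a static_assert on bfloat versus half in MPPTensorOpsMatMul2dImpl.h), for which GGML_METAL_TENSOR_DISABLE=1 was a working workaround. The two signatures should not be conflated.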

Recommendation

Apply fix: Update to an Ollama build that includes PR #15550. Patching the Metal library by hand is a fallback only, since it requires building Ollama from modified source.
