Ollama 0.21.0 fails to initialize Metal on Apple M5 / macOS 26.2 while 0.18.0 works

GitHub issue: ollama/ollama#15748 (fetched 2026-04-23 07:23:25). Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0

Summary

Ollama 0.21.0 fails to load local models on Apple M5 with macOS 26.2: Metal initialization fails while the embedded Metal library is being compiled, and the runner then panics. Rolling back to 0.18.0 fixes the issue immediately on the same machine with the same models.

Environment

  • Ollama 0.21.0
  • macOS 26.2
  • Darwin 25.2.0
  • Apple M5
  • Mac17,2

Regression

This machine works again after downgrading to Ollama 0.18.0.

  • 0.21.0: local model load fails
  • 0.18.0: ollama run llama3.2:latest ... succeeds and loads on GPU
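To make the regression boundary explicit, an installed version can be classified against the two data points in this report. This is a minimal sketch, not an Ollama feature: the function name `classify_ollama_version` is hypothetical, and it assumes only that 0.18.0 works and 0.21.0 fails on this machine. It makes no claim about any other release.

```shell
#!/bin/sh
# Hypothetical helper: classify an Ollama version string using only the two
# data points from this report (0.18.0 works, 0.21.0 fails on Apple M5).
classify_ollama_version() {
    case "$1" in
        0.18.0) echo "known-good" ;;
        0.21.0) echo "known-bad" ;;
        *)      echo "untested" ;;
    esac
}

# Example use. The exact output format of `ollama --version` is an assumption
# here; the pattern just takes the first x.y.z token it finds.
# version=$(ollama --version 2>/dev/null | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
# classify_ollama_version "$version"
```

Versions between 0.18.0 and 0.21.0 are reported as "untested" on purpose, since the report does not say where the regression was introduced.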

Repro

ollama run "llama3.2:latest" "Return exactly this JSON and nothing else: {\"ok\":true}" --format json --nowordwrap

Actual result on 0.21.0

Model load fails with Metal initialization errors.

From ~/.ollama/logs/server.log:

ggml_metal_init: the device does not have a precompiled Metal library - this is unexpected
ggml_metal_library_init: using embedded metal library
... MPPTensorOpsMatMul2dImpl.h ...
error: static_assert failed due to requirement '__tensor_ops_detail::__is_same_v<bfloat, half>' "Input types must match cooperative tensor types"
error: static_assert failed due to requirement '__tensor_ops_detail::__is_same_v<half, bfloat>' "Input types must match cooperative tensor types"
ggml_metal_init: error: failed to initialize the Metal library
ggml_backend_metal_device_init: error: failed to allocate context
llama_init_from_model: failed to initialize the context: failed to initialize Metal backend
panic: unable to create llama context

Expected result

The model should load and run locally, as it does on 0.18.0 on the same machine.

Additional notes

  • This does not appear to be model-specific; it reproduces with at least llama3.2:latest
  • Setting OLLAMA_LLM_LIBRARY=cpu does not help; Metal initialization still occurs in 0.21.0
  • The issue appears specific to the newer runtime on Apple M5 / current Metal toolchain

Extent analysis

TL;DR

Downgrade to Ollama 0.18.0 to work around the local model loading failure on Apple M5 with macOS 26.2.

Guidance

  • The log shows the embedded Metal library failing to compile: static_assert checks in MPPTensorOpsMatMul2dImpl.h reject mixed bfloat/half inputs ("Input types must match cooperative tensor types"). This suggests an incompatibility between the Metal code shipped with Ollama 0.21.0 and the Metal toolchain on Apple M5 with macOS 26.2.
  • To verify, run the repro command ollama run "llama3.2:latest" "Return exactly this JSON and nothing else: {\"ok\":true}" --format json --nowordwrap, then check ~/.ollama/logs/server.log for the Metal initialization errors quoted in the report.
  • The reporter confirms that downgrading to Ollama 0.18.0 restores model loading on the same machine, so it can serve as a temporary workaround.
  • Because the failure appears specific to the newer runtime on Apple M5 and the current Metal toolchain, watch for Ollama or Metal toolchain updates that address it.
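The log check described in the guidance can be scripted. The following is a minimal sketch, not an official Ollama tool: the function name `check_metal_init_failure` and its return codes are hypothetical, and the matched strings are copied from the log excerpt in this report, so other versions may word them differently.

```shell
#!/bin/sh
# Hypothetical helper: scan an Ollama server log for the Metal
# initialization failure signature quoted in this report.
# Usage: check_metal_init_failure [path-to-server.log]
# Returns 0 if the failure is found, 1 if not, 2 if the log is unreadable.
check_metal_init_failure() {
    log_file="${1:-$HOME/.ollama/logs/server.log}"

    if [ ! -r "$log_file" ]; then
        echo "cannot read log: $log_file" >&2
        return 2
    fi

    # -F matches the pattern as a fixed string; -q suppresses output.
    if grep -Fq "failed to initialize the Metal library" "$log_file"; then
        echo "Metal initialization failure found in $log_file"
        # Print any static_assert lines so the failure can be compared
        # with the one in this report.
        grep -F "static_assert failed" "$log_file" || true
        return 0
    fi

    echo "no Metal initialization failure found in $log_file"
    return 1
}
```

Calling `check_metal_init_failure` with no argument reads the default log path named in the report. A match only shows that the same error text is present; it does not confirm the cause.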

Notes

The report does not state a root cause. The regression between 0.18.0 and 0.21.0, together with the compile errors in the embedded Metal library, suggests a change in the Metal backend, but this is unconfirmed.

Recommendation

Apply workaround: Downgrade to Ollama 0.18.0, as this has been confirmed to fix the issue and allows for local model loading to work as expected.
