ollama - 💡(How to fix) Fix Metal backend crash on Apple M5 - bfloat/half type mismatch in MPPTensorOpsMatMul2d (v0.21.2) [1 participants]

ollama · 2026-04-25 16:34:03

ollama/ollama#15813•Fetched 2026-04-26 05:06:04

Author

egalarza007

Participants

egalarza007

Ollama fails to run any model on Apple M5 hardware. The Metal shader compilation fails with a bfloat/half type mismatch in MetalPerformancePrimitives.

Error Message

From ~/.ollama/logs/server.log:

ggml_metal_init: the device does not have a precompiled Metal library - this is unexpected
ggml_metal_init: will try to compile it on the fly
ggml_metal_library_init: error: Error Domain=MTLLibraryErrorDomain Code=3 static_assert failed due to requirement '__tensor_ops_detail::__is_same_v<bfloat, half>' "Input types must match cooperative tensor types"
ggml_metal_init: error: failed to initialize the Metal library
ggml_backend_metal_device_init: error: failed to allocate context
llama_init_from_model: failed to initialize the context: failed to initialize Metal backend
panic: unable to create llama context
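
To check whether another machine's log shows the same failure, the signatures quoted in the log excerpt can be searched for as fixed strings. This is a minimal sketch, not part of the original report: the function name scan_metal_errors is hypothetical, and the default path is simply the log location named in the issue.

```shell
#!/bin/sh
# Hypothetical helper: print (with line numbers) any log lines that contain
# the failure signatures quoted in this issue.
# Defaults to the log path named in the report when no argument is given.
scan_metal_errors() {
    log_file="${1:-$HOME/.ollama/logs/server.log}"
    # -F treats each pattern as a literal string, so '<' and ',' need no escaping.
    grep -nF \
        -e 'MTLLibraryErrorDomain' \
        -e '__is_same_v<bfloat, half>' \
        -e 'failed to initialize the Metal library' \
        "$log_file"
}

# Usage (exit status is non-zero when nothing matches):
#   scan_metal_errors
#   scan_metal_errors /path/to/other/server.log
```

A non-zero exit status with no output suggests the crash, if any, has a different cause than the one described here.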

The server reports the GPU as MTLGPUFamilyApple10 (1010), which is M5-specific. The embedded Metal library contains no precompiled build for this GPU family, so the backend tries to compile the shaders on the fly, and that compilation fails on a bfloat16/half type mismatch in the MetalPerformancePrimitives framework.

Fix Action

Workaround

Setting GGML_METAL_BF16_DISABLE=1 before starting the server allows models to load and run (CPU fallback, no GPU acceleration):

GGML_METAL_BF16_DISABLE=1 ollama serve
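
One way to confirm where a loaded model actually runs under the workaround is to read the processor column of ollama ps. The sketch below assumes that ollama ps prints values such as 100% CPU, 100% GPU, or a split like 48%/52% CPU/GPU in that column; the exact layout may differ between releases, and the helper name report_processor is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `ollama ps` output on stdin and print only the
# processor field of each loaded model (e.g. "100% CPU").
# Assumes the field looks like "<n>% CPU", "<n>% GPU" or "<n>%/<m>% CPU/GPU".
report_processor() {
    grep -oE '[0-9]+%(/[0-9]+%)? (CPU/GPU|CPU|GPU)'
}

# Usage, with the server started as shown in the workaround and a model loaded:
#   ollama ps | report_processor
```

If the helper prints a CPU-only value, the model is running without GPU acceleration, which matches the behavior the report describes for this workaround.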

Environment

Ollama version: 0.21.2
macOS: Sequoia 15.3+ (Apple M5)
GPU: Apple M5 (MTLGPUFamilyApple10)
VRAM: ~19GB available

Impact

All models fail with: 500 Internal Server Error: llama runner process has terminated
OLLAMA_NO_METAL=1 is silently ignored in v0.21.2.

Expected Behavior

Ollama should either:

Ship a precompiled Metal library for MTLGPUFamilyApple10 (M5), or
Correctly fall back to CPU when Metal compilation fails, or
Honor OLLAMA_NO_METAL=1 to explicitly disable Metal

extent analysis

TL;DR

Setting the environment variable GGML_METAL_BF16_DISABLE=1 before starting the Ollama server allows models to load and run, albeit with CPU fallback and no GPU acceleration.

Guidance

The error is caused by a bfloat16/half type mismatch in the MetalPerformancePrimitives framework, which prevents Metal shader compilation on Apple M5 hardware.
To apply and verify the workaround, start the server with GGML_METAL_BF16_DISABLE=1 ollama serve, then check that models load and run successfully.
Note that this workaround disables GPU acceleration, so performance may be impacted.

Example

GGML_METAL_BF16_DISABLE=1 ollama serve

Notes

The provided workaround is a temporary solution until a precompiled Metal library for MTLGPUFamilyApple10 (M5) is available or the Metal compilation fallback issue is resolved.

Recommendation

Apply the workaround by setting GGML_METAL_BF16_DISABLE=1, which enables CPU fallback and lets models run. It is the only workaround reported in the issue, and it remains a stopgap until the underlying problem is addressed.
