ollama - 💡(How to fix) Fix Metal backend crash on Apple Silicon M5: MTLLibrary bfloat/half mismatch causes llama runner termination (500) [3 comments, 2 participants]

GitHub stats
ollama/ollama#15541 (fetched 2026-04-15 06:20:20)
Comments: 3
Participants: 2
Timeline events: 4 (commented ×3, closed ×1)
Reactions: 1


Summary

Ollama consistently fails to run any model due to a Metal backend initialization failure. The runner process terminates and the API returns 500.

Environment

  • Ollama version: 0.20.6
  • Platform: macOS 26.3.1 (build 25D771280a)
  • Hardware: Apple Silicon M5
  • Install type: Ollama desktop app + CLI
  • Reproducibility: 100% (all tested models)

Steps to Reproduce

  1. Start Ollama.
  2. Run any model prompt, for example:
    • ollama run gemma2:2b "Hello"
  3. Observe server-side runner crash and client-side 500 error.
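
The same reproduction can be driven through Ollama's local HTTP API instead of the CLI, which makes the client-side status code easy to capture. The following is a minimal sketch, assuming the server listens on the default address `127.0.0.1:11434` and exposes `/api/generate`; the helper names `build_request` and `run_prompt` are hypothetical and not part of the original report.

```python
import json
import urllib.error
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://127.0.0.1:11434/api/generate"


def build_request(model: str, prompt: str, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming generate request for the given model and prompt."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_prompt(model: str, prompt: str, timeout: float = 120.0) -> tuple[int, str]:
    """Send the prompt and return (HTTP status, response body text)."""
    request = build_request(model, prompt)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # A crashed runner surfaces here as an HTTP error status with an error body.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Calling `run_prompt("gemma2:2b", "Hello")` on an affected machine should, per this report, return status 500 with a body mentioning that the llama runner process has terminated. If the server is not running at all, `urllib.error.URLError` is raised instead, which distinguishes "server down" from "runner crashed".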

Expected Behavior

Model loads successfully and returns a response.

Actual Behavior

Model load fails during Metal initialization; llama runner exits; client receives:

  • 500 Internal Server Error: llama runner process has terminated: %!w(<nil>)

Error Signature (sanitized)

  • static_assert failed: Input types must match cooperative tensor types
  • bfloat/half mismatch in MetalPerformancePrimitives matmul path
  • ggml_metal_init: error: failed to initialize the Metal library
  • llama_init_from_model: failed to initialize the context: failed to initialize Metal backend
  • panic: unable to create llama context
  • llama runner terminated: exit status 2

Representative framework references:

  • /System/Library/Frameworks/MetalPerformancePrimitives.framework/Headers/__impl/MPPTensorOpsMatMul2dImpl.h:3266
  • /System/Library/Frameworks/MetalPerformancePrimitives.framework/Headers/__impl/MPPTensorOpsMatMul2dImpl.h:3267
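
To check whether another machine is hitting the same failure, the signature lines above can be matched against a server log. This is a rough sketch only: the fragments are copied from the sanitized output in this report, and the function names are hypothetical.

```python
from pathlib import Path

# Signature fragments copied from the sanitized error output in this report.
SIGNATURES = (
    "Input types must match cooperative tensor types",
    "failed to initialize the Metal library",
    "failed to initialize Metal backend",
    "unable to create llama context",
)


def find_signatures(log_text: str) -> list[str]:
    """Return the signature fragments that occur in the log text, in list order."""
    return [sig for sig in SIGNATURES if sig in log_text]


def matches_report(log_text: str) -> bool:
    """True when the log contains the Metal library initialization failure."""
    return "failed to initialize the Metal library" in log_text


def scan_log(path: Path) -> list[str]:
    """Read a log file and return the signature fragments found in it."""
    return find_signatures(path.read_text(encoding="utf-8", errors="replace"))
```

A typical call would be `scan_log(Path.home() / ".ollama" / "logs" / "server.log")`. That path is an assumption about where the macOS build writes its server log; the report itself redacts local paths, so adjust it to the actual log location.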

Additional Observations

  • The failure occurs across multiple models; it is not model-specific.
  • Model blob download and verification succeed.
  • Runtime fails at backend initialization stage.

Workarounds Tried

  • Reinstall Ollama
  • Install/update Xcode and Metal toolchain components
  • Restart app/processes
  • Alternate host/port setup
  • Disable flash attention
  • KV cache type adjustments

Result: same failure path in Metal backend.
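
For anyone retrying the configuration-based workarounds, the settings can be expressed as an environment for the server process. The sketch below assumes the environment variables `OLLAMA_FLASH_ATTENTION`, `OLLAMA_KV_CACHE_TYPE`, and `OLLAMA_HOST` are the ones that control these options; the report does not say which variables or values the reporter actually used, and the helper name `workaround_env` is hypothetical.

```python
import os


def workaround_env(
    flash_attention: bool = False,
    kv_cache_type: str = "f16",
    host: str = "127.0.0.1:11434",
    base_env=None,
) -> dict[str, str]:
    """Return a copy of the environment with the workaround settings applied.

    Assumption: these variable names and values are illustrative, not taken
    from the original report.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OLLAMA_FLASH_ATTENTION"] = "1" if flash_attention else "0"
    env["OLLAMA_KV_CACHE_TYPE"] = kv_cache_type
    env["OLLAMA_HOST"] = host
    return env
```

The resulting dictionary can be passed as `env=` to `subprocess.run(["ollama", "serve"], env=workaround_env())`. According to the report, none of these settings changed the outcome, so this is mainly useful for confirming that a given machine behaves the same way.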

Impact

Ollama is unusable for inference in this environment because all model runs terminate before serving.

Privacy Note

All user-identifying information and local absolute paths have been redacted.

Extended analysis

TL;DR

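The Metal library fails to initialize at startup, apparently because of a bfloat/half type mismatch in the MetalPerformancePrimitives matmul path, so no model can load. The report records no confirmed fix: reinstalling Ollama, updating Xcode and the Metal toolchain, and changing flash attention or KV cache settings all ended in the same failure.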

Guidance

  • Confirm that Xcode and the Metal toolchain components are current. The reporter already did this without effect, so an outdated local toolchain is unlikely to be the only cause.
  • Check whether Ollama 0.20.6 is compatible with macOS 26.3.1 on Apple Silicon M5. The failure reproduces on every tested model, which points to a version or hardware combination rather than a model problem.
  • Treat the bfloat/half mismatch as a type error raised while the Metal library is being built, not as a tunable precision setting. The static_assert fires in MPPTensorOpsMatMul2dImpl.h, which is consistent with runtime options such as flash attention and KV cache type having no effect.
  • Share the full error signature with the Ollama developers or community, since the failure sits in a low-level interaction between the Metal backend and MetalPerformancePrimitives.

Example

The issue does not include a code-level fix. The failure occurs in a low-level library interaction during backend initialization, not in user code.

Notes

The issue might be specific to the combination of Ollama version, macOS version, and Apple Silicon M5 hardware. Further investigation is needed to determine the root cause and find a suitable solution.

Recommendation

No workaround is confirmed in this report. The Metal backend initialization failure is the point to investigate. A resolution may require an update to Ollama or to the Metal framework and its dependencies, since the local reconfiguration attempts listed under Workarounds Tried did not change the outcome.
