ollama - ✅(Solved) Fix Ollama not working on Mac M5 [1 pull requests, 2 comments, 3 participants]

ollama · 2026-04-21 18:48:14


GitHub stats

ollama/ollama#15734•Fetched 2026-04-22 07:43:38


Timeline (top): commented ×2, labeled ×1

Error Message

Keep getting this error - 500 Internal Server Error: llama runner process has terminated: %!w(<nil>)
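The trailing `%!w(<nil>)` is a formatting artifact rather than a description of the failure: Go's `fmt.Errorf` prints it when the `%w` verb is given a nil error. The sketch below reproduces the message shape. It is not Ollama's code, and `wrapExit` is a hypothetical helper used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// wrapExit mimics a caller that wraps a process-exit error without first
// checking whether that error is nil. Hypothetical helper, not Ollama's API.
func wrapExit(err error) error {
	return fmt.Errorf("llama runner process has terminated: %w", err)
}

func main() {
	// A nil operand for %w is rendered as the bad-verb marker %!w(<nil>).
	fmt.Println(wrapExit(nil))

	// A real error is rendered normally and stays unwrappable.
	fmt.Println(wrapExit(errors.New("exit status 2")))
}
```

In other words, the message suggests the runner exited before a specific error was captured, so the underlying cause has to come from the runner's own log output.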

PR fix notes

PR #15755: metal: harden for ggml initialization failures

Repository: ollama/ollama
Author: dhiltgen
State: open | merged: False
Link: https://github.com/ollama/ollama/pull/15755

Description (problem / solution / changelog)

ggml_metal_device_init performs a probe to verify the tensor API compiles. On some systems this passes, even though kernel coverage isn't complete, which results in a later crash when compiling the real kernels. This change adds a single retry if any of the error strings match this failure mode to disable the tensor API. It also hardens an error case in the Go initDevices to detect device initialization failures and panic instead of crashing later on a nil array entry.
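The retry described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `initWithRetry`, `flakyInit`, `brokenInit`, and the strings in `tensorFailurePatterns` are all hypothetical, since the actual error strings matched by the PR are not quoted on this page.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// tensorFailurePatterns holds placeholder substrings. The real strings the
// PR matches are not shown in this page, so these are assumptions.
var tensorFailurePatterns = []string{"tensor API", "failed to compile"}

// matchesTensorFailure reports whether err looks like the known failure mode.
func matchesTensorFailure(err error) bool {
	for _, p := range tensorFailurePatterns {
		if strings.Contains(err.Error(), p) {
			return true
		}
	}
	return false
}

// initWithRetry starts the backend with the tensor API enabled. If that fails
// with a matching error, it retries exactly once with the feature disabled.
// It returns whether the tensor API ended up enabled.
func initWithRetry(initBackend func(useTensorAPI bool) error) (bool, error) {
	err := initBackend(true)
	if err == nil {
		return true, nil
	}
	if !matchesTensorFailure(err) {
		return false, err // unrelated failure: do not retry
	}
	if retryErr := initBackend(false); retryErr != nil {
		return false, fmt.Errorf("retry without tensor API failed: %w (first error: %v)", retryErr, err)
	}
	return false, nil
}

// flakyInit simulates a system where the probe passed but kernels do not compile.
func flakyInit(useTensorAPI bool) error {
	if useTensorAPI {
		return errors.New("failed to compile kernel using tensor API")
	}
	return nil
}

// brokenInit simulates a failure that the retry should not mask.
func brokenInit(useTensorAPI bool) error {
	return errors.New("device not found")
}

func main() {
	enabled, err := initWithRetry(flakyInit)
	fmt.Println("tensor API enabled:", enabled, "err:", err)

	_, err = initWithRetry(brokenInit)
	fmt.Println("unrelated failure:", err)
}
```

Limiting the retry to a single attempt and to matching errors keeps unrelated failures visible instead of hiding them behind a fallback.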

Fixes #15734

On my test system the probe test disables the feature, so the crash behavior isn't seen. To simulate the bug, I temporarily bypassed the probe so the API was enabled, and verified the crash, then the retry kicked in properly and got models running. While running that repro, I uncovered some other rough edges and hardened those as well.

GPU discovery used to hammer the /info API in failure cases and trigger multiple concurrent dummy loads, which broke the backend and led to a 30s hang/timeout instead of failing fast. This timeout has been a long-running problem; I now understand why it happens. The fix synchronizes the dummy load inside the runner, so multiple /info calls queue up and return the device list after the initial load finishes (or fails once). This will likely help many users with unsupported AMD GPUs.
StatusWriter was getting concurrent writes from goroutine copies, leading to odd interleaving of the stdout/stderr data. It now uses a common cmd.Stdout/Stderr object, which causes os/exec to serialize writes.
StatusWriter was only capturing the last error, which was sometimes a generic log message that overwrote a more pertinent error detected earlier. It now accumulates all lines that match the known patterns.
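Two of the ideas above, running the device load once while concurrent callers wait and accumulating every matching error line, can be sketched as follows. This is a minimal sketch under assumptions, not the code in the PR: `deviceLoader`, `statusWriter`, their fields, and the patterns are hypothetical names chosen for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"sync"
)

// deviceLoader runs an expensive probe at most once. Concurrent callers block
// until the first call finishes, then share its result (or its error).
type deviceLoader struct {
	once    sync.Once
	load    func() ([]string, error) // hypothetical probe function
	devices []string
	err     error
}

func (d *deviceLoader) Devices() ([]string, error) {
	d.once.Do(func() { d.devices, d.err = d.load() })
	return d.devices, d.err
}

// statusWriter keeps every line that matches a known pattern instead of
// only the most recent one. The mutex guards against concurrent writers.
type statusWriter struct {
	mu       sync.Mutex
	patterns []string
	matches  []string
}

func (w *statusWriter) Write(p []byte) (int, error) {
	w.mu.Lock()
	defer w.mu.Unlock()
	for _, line := range bytes.Split(p, []byte("\n")) {
		for _, pat := range w.patterns {
			if bytes.Contains(line, []byte(pat)) {
				w.matches = append(w.matches, string(bytes.TrimSpace(line)))
				break
			}
		}
	}
	return len(p), nil
}

// Errors returns all captured lines in the order they were written.
func (w *statusWriter) Errors() string {
	w.mu.Lock()
	defer w.mu.Unlock()
	return strings.Join(w.matches, "; ")
}

func main() {
	calls := 0
	d := &deviceLoader{load: func() ([]string, error) {
		calls++
		return []string{"gpu0"}, nil
	}}

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			d.Devices()
		}()
	}
	wg.Wait()
	fmt.Println("probe calls:", calls)

	w := &statusWriter{patterns: []string{"error"}}
	fmt.Fprint(w, "error: kernel compile failed\ninfo: shutting down\nerror: exit\n")
	fmt.Println(w.Errors())
}
```

Because `sync.Once` also remembers a failed first attempt, later callers receive the same error immediately rather than each triggering a new load.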

Changed files

discover/runner.go (modified, +44/-24)
llm/server.go (modified, +126/-40)
llm/server_wait_test.go (added, +31/-0)
llm/status.go (modified, +56/-6)
llm/status_test.go (added, +44/-0)
ml/backend/ggml/ggml.go (modified, +11/-3)
runner/ollamarunner/runner.go (modified, +44/-14)


What is the issue?

Keep getting this error - 500 Internal Server Error: llama runner process has terminated: %!w(<nil>)

Relevant log output

OS

No response

GPU

No response

CPU

No response

Ollama version

No response

extent analysis

TL;DR

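The 500 Internal Server Error is returned when the llama runner process terminates. PR #15755, which targets this issue and was open and unmerged at fetch time, attributes the crash to a Metal initialization failure: the ggml tensor API probe passes, but compiling the real kernels fails later.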

Guidance

Check the relevant log output for any error messages or stack traces that could indicate why the llama runner process terminated.
Investigate potential issues with the system resources, such as memory or CPU usage, that could be causing the process to crash.
Verify that the llama runner process is properly configured and that all dependencies are met.
Consider running the llama runner process in a debug mode or with increased logging to gather more information about the termination cause.
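As a starting point for the log check, the sketch below filters a server log for lines that look relevant. It rests on assumptions: the search terms in `suspectTerms` are guesses rather than a list from Ollama, `suspectLines` is a hypothetical helper, and the path `~/.ollama/logs/server.log` is the macOS location given in Ollama's troubleshooting documentation, which may differ on other setups. That documentation also describes setting `OLLAMA_DEBUG=1` for more verbose logs.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// suspectTerms are illustrative search terms, not an official list.
var suspectTerms = []string{"error", "panic", "ggml_metal", "terminated"}

// suspectLines returns the lines of text that contain any suspect term,
// compared case-insensitively, in their original order.
func suspectLines(text string) []string {
	var out []string
	for _, line := range strings.Split(text, "\n") {
		lower := strings.ToLower(line)
		for _, term := range suspectTerms {
			if strings.Contains(lower, term) {
				out = append(out, line)
				break
			}
		}
	}
	return out
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot find home directory:", err)
		return
	}
	// Assumed macOS log location; adjust for other platforms.
	path := filepath.Join(home, ".ollama", "logs", "server.log")
	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read log:", err)
		return
	}
	for _, line := range suspectLines(string(data)) {
		fmt.Println(line)
	}
}
```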

Notes

The issue body leaves the OS, GPU, CPU, and Ollama version fields blank and includes no log output; only the page title identifies the machine as a Mac M5. Without those details, the match between this report and the failure mode described in PR #15755 cannot be confirmed from this page alone.

Recommendation

No confirmed workaround is listed. Follow PR #15755, which is marked as fixing this issue but was still open and unmerged when this page was fetched, and collect the runner logs in the meantime to confirm the cause of the termination.
