ollama - 💡(How to fix) Fix macOS App: web process using over 100% CPU during thinking [1 participants]

ollama · 2026-04-24 14:21:49

ollama/ollama#15797 • Fetched 2026-04-25 06:03:32

Author

xmddmx

Participants

xmddmx

What is the issue?

When running a model that produces extensive "thinking" output, the Ollama web process uses 100% CPU or more.
This did not happen in older versions.

A sample of the process (attached) suggests that the UI is thrashing. Could it be refreshing the entire view after every single character of output?

To reproduce:

on macOS 15.7.5
Ollama latest (0.21.2)
run a model and prompt which generates a ton of thinking. I'm using qwen3.6:35b-a3b-nvfp4 with this prompt: "Puzzle game: there are 16 words, grouped into 4 categories, which can include semantic, phonetic, or more tricky (such as adding or removing a letter or word). The 4 categories vary in difficulty, from obvious to obscure. Please tell me the 4 words in each category, and define the category, and list the categories in order from easiest to hardest. Here are the 16 words: calliope superiority ringmaster atlas oedipus buzzard echo electra trace dialect inferiority dictionary thesaurus reminder encyclopedia vestige"
as the model thinks, watch the CPU usage of the web process (it appears as http://127.0.0.1:xxxxx in Activity Monitor)

Expected result: there should be some CPU usage to refresh the window, but it should not be 100%.

Actual result: 100% or more. Other performance problems also appear; for example, dragging the Ollama window around is slow and choppy.

Sample of Ollama Web Content.txt

Regression testing: I feel like this was not as bad a few versions ago, but I'm not sure exactly where this issue came in.
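One way to narrow down where a regression like this came in is a binary search over installed releases. The sketch below is illustrative only: `firstBadIndex`, the build labels, and the `observedBad` set are hypothetical names, and the check itself would be manual (run the prompt, watch the web process in Activity Monitor). It assumes the releases are ordered oldest to newest and that the problem persists in every release after it first appears.

```typescript
// Returns the index of the first entry for which isBad is true.
// Assumes entries are ordered oldest to newest and that every entry
// after the first bad one is also bad.
function firstBadIndex<T>(versions: readonly T[], isBad: (v: T) => boolean): number {
  let lo = 0;
  let hi = versions.length; // everything before lo is good, everything from hi on is bad
  while (lo < hi) {
    const mid = lo + Math.floor((hi - lo) / 2);
    if (isBad(versions[mid])) {
      hi = mid; // the first bad entry is at mid or earlier
    } else {
      lo = mid + 1; // the first bad entry is after mid
    }
  }
  return lo; // equals versions.length when no entry is bad
}

// Hypothetical build labels, oldest first; substitute the releases actually tested.
const builds = ["build-A", "build-B", "build-C", "build-D"];
// Stand-in for the manual observation of high CPU usage.
const observedBad = new Set(["build-C", "build-D"]);
const idx = firstBadIndex(builds, (b) => observedBad.has(b));
console.log(builds[idx]); // build-C
```

With n candidate releases this needs roughly log2(n) manual checks rather than n.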

Relevant log output

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.21.2

extent analysis

TL;DR

The issue can likely be mitigated by reducing how often the UI refreshes while the model streams long output.

Guidance

Review the UI code to identify the cause of the thrashing issue, potentially related to refreshing the entire view after every single character of output.
Consider implementing a debouncing or throttling mechanism to limit the frequency of UI updates during model output.
Investigate the performance impact of the qwen3.6:35b-a3b-nvfp4 model and prompt, as it may be contributing to the high CPU usage.
Test earlier versions of Ollama to identify when the issue was introduced and compare the UI behavior.

Example

The issue includes no UI code, so no app-specific snippet is provided.
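The throttling idea from the guidance can still be illustrated generically, without reference to the app's actual code. The following is a minimal TypeScript sketch, assuming the UI receives streamed tokens one at a time and re-renders on each one. `BatchedRenderer`, `render`, `intervalMs`, and `schedule` are hypothetical names, not taken from the Ollama source, and the 50 ms default is an arbitrary illustrative value.

```typescript
// Hypothetical sketch: buffers streamed tokens and renders them in batches,
// so the view updates at most once per interval instead of once per token.
type Schedule = (callback: () => void, delayMs: number) => void;

class BatchedRenderer {
  private pending = ""; // tokens received since the last flush
  private flushScheduled = false;
  private readonly render: (chunk: string) => void;
  private readonly intervalMs: number;
  private readonly schedule: Schedule;

  constructor(
    render: (chunk: string) => void,
    intervalMs: number = 50,
    schedule: Schedule = (cb, ms) => { setTimeout(cb, ms); },
  ) {
    this.render = render;
    this.intervalMs = intervalMs;
    this.schedule = schedule;
  }

  // Called once per streamed token; only appends to the buffer.
  push(token: string): void {
    this.pending += token;
    if (!this.flushScheduled) {
      this.flushScheduled = true;
      this.schedule(() => this.flush(), this.intervalMs);
    }
  }

  // Renders everything buffered so far in a single update.
  // Call it directly when the stream ends to show any remaining text.
  flush(): void {
    this.flushScheduled = false;
    if (this.pending.length === 0) return;
    const chunk = this.pending;
    this.pending = "";
    this.render(chunk);
  }
}
```

The scheduler is injectable so the batching logic can be exercised without real timers; in a browser view, `requestAnimationFrame` could be passed in place of the `setTimeout` default. Whether this matches the actual cause in the app remains to be confirmed by profiling.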

Notes

The issue may be specific to the combination of the qwen3.6:35b-a3b-nvfp4 model, prompt, and Ollama version 0.21.2. Further investigation is needed to determine the root cause and develop a comprehensive fix.

Recommendation

Throttle or batch UI updates during model output, since the issue is likely caused by the UI thrashing while the model streams.
