ollama: macOS App: web process using over 100% CPU during thinking

GitHub stats
ollama/ollama#15797 (fetched 2026-04-25 06:03:32)
Comments: 0
Participants: 1
Timeline: 1
Reactions: 0

What is the issue?

When running a model that has extensive "thinking" output, the Ollama Web Process is using 100% CPU.
This didn't use to happen on older versions.

A sample of the process (attached) suggests to me that the UI is thrashing. Could it be refreshing the entire view after every single character of output?

To reproduce:

  • on macOS 15.7.5
  • Ollama latest (0.21.2)
  • run a model and prompt which generates a ton of thinking. I'm using qwen3.6:35b-a3b-nvfp4 with this prompt: "Puzzle game: there are 16 words, grouped into 4 categories, which can include semantic, phonetic, or more tricky (such as adding or removing a letter or word). The 4 categories vary in difficulty, from obvious to obscure. Please tell me the 4 words in each category, and define the category, and list the categories in order from easiest to hardest. Here are the 16 words: calliope superiority ringmaster atlas oedipus buzzard echo electra trace dialect inferiority dictionary thesaurus reminder encyclopedia vestige"
  • as the model thinks, watch the CPU usage of the web process (which will look like this: http://127.0.0.1:xxxxx) in Activity Monitor

Expected result: there should be some CPU usage to refresh the window, but it should not be 100%.

Actual result: 100% or more. You will also notice other performance issues, such as dragging the Ollama window around being slow and choppy.

Sample of Ollama Web Content.txt

Regression testing: I feel like this was not as bad a few versions ago, but I'm not sure exactly where this issue came in.

Relevant log output

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.21.2

extent analysis

TL;DR

The high CPU usage can likely be mitigated by reducing how often the UI refreshes while the model streams extensive thinking output.

Guidance

  • Review the UI code that renders streamed output to confirm whether the entire view is refreshed after every single character, as the reporter suspects.
  • Consider debouncing or throttling UI updates so that several streamed chunks are rendered together instead of one at a time.
  • Check whether the problem reproduces with other models and prompts that produce long thinking output, to rule out something specific to qwen3.6:35b-a3b-nvfp4.
  • Test earlier versions of Ollama with the same model and prompt to identify the release in which the regression was introduced.

Example

The issue includes no UI code, so no snippet from the Ollama source is available.
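
The throttling idea from the guidance can still be illustrated generically. The following TypeScript is a minimal sketch, not Ollama's code: the `StreamBatcher` class, its `render` callback, and the 50 ms default interval are all hypothetical names and values chosen for illustration. It buffers streamed chunks and hands them to the renderer at most once per interval, so a stream of single characters no longer triggers one view update per character.

```typescript
// Hypothetical helper (not part of Ollama): coalesces streamed text chunks
// so the view is updated at most once per `intervalMs`.
type Render = (text: string) => void;

class StreamBatcher {
  private pending = "";
  private lastFlush = -Infinity;
  private timer: ReturnType<typeof setTimeout> | null = null;
  private readonly render: Render;
  private readonly intervalMs: number;
  private readonly now: () => number;

  constructor(render: Render, intervalMs = 50, now: () => number = () => Date.now()) {
    this.render = render;
    this.intervalMs = intervalMs;
    this.now = now;
  }

  // Called for every streamed chunk, which may be a single character.
  push(chunk: string): void {
    this.pending += chunk;
    const elapsed = this.now() - this.lastFlush;
    if (elapsed >= this.intervalMs) {
      this.flush();
    } else if (this.timer === null) {
      // Schedule a trailing flush so buffered chunks are not left unrendered.
      this.timer = setTimeout(() => this.flush(), this.intervalMs - elapsed);
    }
  }

  // Deliver everything buffered so far in a single render call.
  flush(): void {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.pending.length === 0) return;
    const text = this.pending;
    this.pending = "";
    this.lastFlush = this.now();
    this.render(text);
  }
}

// Usage with a fake clock so the behaviour is deterministic.
let fakeNow = 0;
const rendered: string[] = [];
const batcher = new StreamBatcher((text) => { rendered.push(text); }, 50, () => fakeNow);
for (const ch of "thinking...".split("")) {
  batcher.push(ch); // 11 single-character chunks
  fakeNow += 1;     // chunks arrive 1 ms apart
}
batcher.flush();    // end of stream: deliver whatever is still buffered
console.log(rendered); // [ 't', 'hinking...' ]: 2 render calls instead of 11
```

In a browser-based UI, scheduling the trailing flush with `requestAnimationFrame` instead of `setTimeout` would be a reasonable alternative, since it ties updates to the display refresh. Whether either approach matches the actual cause in the Ollama app is not established by the issue.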

Notes

The issue may be specific to the combination of the qwen3.6:35b-a3b-nvfp4 model, prompt, and Ollama version 0.21.2. Further investigation is needed to determine the root cause and develop a comprehensive fix.

Recommendation

Reduce the UI refresh frequency during streamed model output, since the reporter's process sample points to UI thrashing as the likely cause.
