ollama - (How to fix) Fix Parallel task managing

GitHub stats
ollama/ollama#15195 · Fetched 2026-04-08 02:22:41
Comments: 0 · Participants: 1 · Timeline: 2 (cross-referenced ×1, subscribed ×1) · Reactions: 0


What is the issue?

https://github.com/lfnovo/open-notebook/issues/711

I don't want parallelism in my setup, so I start the server with this config, but it doesn't work:

powershell -NoExit -Command "$host.UI.RawUI.WindowTitle = 'Ollama LLMs';$env:OLLAMA_GPU='1';$env:OLLAMA_NUM_THREADS='10';$env:OMP_NUM_THREADS='10';$env:OLLAMA_GPU_MEMORY_FRACTION='1';$env:OLLAMA_MAX_LOADED_MODELS='1';$env:OLLAMA_NUM_PARALLEL='1';$env:OLLAMA_KV_CACHE_TYPE='q8_0';$env:OLLAMA_NEW_ENGINE='1';$env:OLLAMA_VULKAN='0';$env:OLLAMA_MAX_QUEUE='1';$env:OLLAMA_HOST='127.0.0.1:11435';$env:OLLAMA_SCHED_SPREAD='1';$env:OLLAMA_FLASH_ATTENTION='1';ollama serve"

Relevant log output

[GIN] 2026/03/31 - 19:55:44 | 200 | 84.3503ms | 127.0.0.1 | HEAD "/"
[GIN] 2026/03/31 - 19:55:45 | 200 | 111.554ms | 127.0.0.1 | GET "/api/ps"
[GIN] 2026/03/31 - 20:13:45 | 200 | 1.4590673s | 127.0.0.1 | GET "/api/tags"
[GIN] 2026/03/31 - 20:14:56 | 200 | 412.5284ms | 127.0.0.1 | GET "/api/tags"
[GIN] 2026/03/31 - 20:15:08 | 200 | 194.5318ms | 127.0.0.1 | GET "/api/tags"
[GIN] 2026/03/31 - 20:15:11 | 200 | 267.7857ms | 127.0.0.1 | GET "/api/tags"
[GIN] 2026/03/31 - 20:15:14 | 200 | 530.0361ms | 127.0.0.1 | GET "/api/tags"
[GIN] 2026/03/31 - 20:15:56 | 200 | 635.156ms | 127.0.0.1 | GET "/api/tags"
[GIN] 2026/03/31 - 20:16:04 | 200 | 301.8981ms | 127.0.0.1 | GET "/api/tags"
[GIN] 2026/03/31 - 20:19:23 | 200 | 4.327851s | 127.0.0.1 | GET "/api/tags"
[GIN] 2026/03/31 - 20:31:18 | 200 | 127.1235ms | 127.0.0.1 | HEAD "/"
[GIN] 2026/03/31 - 20:31:18 | 200 | 32.8794ms | 127.0.0.1 | GET "/api/ps"
[GIN] 2026/03/31 - 22:17:53 | 200 | 2h24m6s | 127.0.0.1 | POST "/api/chat"
[GIN] 2026/04/01 - 00:19:53 | 200 | 1.2835584s | 127.0.0.1 | HEAD "/"
[GIN] 2026/04/01 - 00:19:56 | 200 | 888.8263ms | 127.0.0.1 | GET "/api/ps"
[GIN] 2026/04/01 - 00:43:01 | 200 | 4h48m46s | 127.0.0.1 | POST "/api/chat"
[GIN] 2026/04/01 - 01:03:51 | 200 | 7.1998ms | 127.0.0.1 | HEAD "/"
[GIN] 2026/04/01 - 01:03:51 | 200 | 13.6305ms | 127.0.0.1 | GET "/api/ps"
[GIN] 2026/04/01 - 02:22:52 | 200 | 6h28m36s | 127.0.0.1 | POST "/api/chat"
[GIN] 2026/04/01 - 02:30:31 | 200 | 6h36m16s | 127.0.0.1 | POST "/api/chat"
[GIN] 2026/04/01 - 02:42:20 | 200 | 6h48m5s | 127.0.0.1 | POST "/api/chat"
[GIN] 2026/04/01 - 02:51:54 | 200 | 6h58m7s | 127.0.0.1 | POST "/api/chat"
[GIN] 2026/04/01 - 03:35:48 | 200 | 7h41m33s | 127.0.0.1 | POST "/api/chat"
[GIN] 2026/04/01 - 03:40:55 | 200 | 7h46m40s | 127.0.0.1 | POST "/api/chat"
[GIN] 2026/04/01 - 03:52:03 | 200 | 7h57m47s | 127.0.0.1 | POST "/api/chat"
[GIN] 2026/04/01 - 04:52:13 | 200 | 8h57m57s | 127.0.0.1 | POST "/api/chat"
[GIN] 2026/04/01 - 05:12:38 | 200 | 9h18m22s | 127.0.0.1 | POST "/api/chat"
[GIN] 2026/04/01 - 05:52:32 | 200 | 9h58m5s | 127.0.0.1 | POST "/api/chat"
[GIN] 2026/04/01 - 06:15:38 | 200 | 10h21m28s | 127.0.0.1 | POST "/api/chat"
[GIN] 2026/04/01 - 06:34:57 | 200 | 10h40m29s | 127.0.0.1 | POST "/api/chat"
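
One way to read this log is to estimate when each request started by subtracting the logged duration from the completion timestamp. The Python sketch below does that. The helper names parse_go_duration and request_start are hypothetical, and the sketch assumes that the duration column uses Go's duration notation (for example 2h24m6s or 84.3503ms) and that the timestamp marks the moment the response finished.

```python
import re
from datetime import datetime, timedelta

# Seconds per unit for Go-style duration strings such as "2h24m6s" or "84.3503ms".
_UNIT_SECONDS = {
    "h": 3600.0, "m": 60.0, "s": 1.0,
    "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9,
}
# Multi-letter units come first so "ms" is not read as "m" followed by "s".
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|h|m|s)")


def parse_go_duration(text: str) -> timedelta:
    """Convert a Go-style duration string into a timedelta."""
    parts = _DURATION_PART.findall(text.strip())
    if not parts:
        raise ValueError(f"unrecognized duration: {text!r}")
    return timedelta(seconds=sum(float(n) * _UNIT_SECONDS[u] for n, u in parts))


def request_start(line: str) -> tuple[datetime, str, str]:
    """Return (estimated start time, method, path) for one [GIN] log line."""
    fields = [f.strip() for f in line.split("|")]
    stamp = fields[0].replace("[GIN]", "").strip()
    finished = datetime.strptime(stamp, "%Y/%m/%d - %H:%M:%S")
    method, path = fields[4].split(maxsplit=1)
    return finished - parse_go_duration(fields[2]), method, path.strip('"')


if __name__ == "__main__":
    # Two lines copied from the log above.
    sample = [
        '[GIN] 2026/03/31 - 22:17:53 | 200 | 2h24m6s | 127.0.0.1 | POST "/api/chat"',
        '[GIN] 2026/04/01 - 00:43:01 | 200 | 4h48m46s | 127.0.0.1 | POST "/api/chat"',
    ]
    for line in sample:
        started, method, path = request_start(line)
        print(started, method, path)
```

Running this over every POST "/api/chat" line gives the estimated start times, which can then be compared with the completion times to see whether requests overlapped or waited in a queue.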

OS

Windows

GPU

Nvidia

CPU

Intel

Ollama version

0.18.3

extent analysis

TL;DR

  • The issue may come down to which environment variables in the PowerShell command the server actually honors. OLLAMA_NUM_PARALLEL is the setting that governs how many requests each loaded model processes at the same time.

Guidance

  • Compare every variable set in the PowerShell command against the list printed by ollama serve --help for the installed version. A variable the server does not read is ignored without any error, so a misspelled or unsupported name fails silently.
  • Verify that OLLAMA_NUM_PARALLEL='1' and OLLAMA_MAX_LOADED_MODELS='1' actually reach the ollama serve process, for example by checking the configuration the server logs at startup.
  • Check the documentation for Ollama version 0.18.3 to confirm the expected behavior of these environment variables.
  • OLLAMA_NUM_THREADS and OMP_NUM_THREADS, as their names indicate, target CPU thread counts rather than request concurrency (OMP_NUM_THREADS is an OpenMP setting). Vary them only to measure their effect on performance, not to disable parallel requests.
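
The comparison in the first point can be partly automated. The sketch below, in Python, pulls the $env:NAME='value' assignments out of a PowerShell command string and reports the names missing from a reference set. The function names extract_env and unlisted are hypothetical, and the reference set is something you supply yourself from the help output of your Ollama version. The one-element set in the usage example is only a placeholder, not the real list.

```python
import re

# Matches PowerShell assignments of the form $env:NAME='value'.
_ENV_ASSIGNMENT = re.compile(r"\$env:([A-Za-z_][A-Za-z0-9_]*)\s*=\s*'([^']*)'")


def extract_env(command: str) -> dict[str, str]:
    """Return the environment variables assigned in a PowerShell command string."""
    return dict(_ENV_ASSIGNMENT.findall(command))


def unlisted(command: str, listed: set[str]) -> list[str]:
    """Return assigned variable names that are absent from the reference set."""
    return sorted(name for name in extract_env(command) if name not in listed)


if __name__ == "__main__":
    # Shortened command for illustration; paste the full command from the issue instead.
    command = "$env:OLLAMA_NUM_PARALLEL='1';$env:OMP_NUM_THREADS='10';ollama serve"
    # ASSUMPTION: placeholder reference set; replace it with the names your server lists.
    listed = {"OLLAMA_NUM_PARALLEL"}
    print(extract_env(command))
    print(unlisted(command, listed))
```

A name reported by unlisted is not necessarily wrong, since it may be read by another component, but it is a candidate for a setting the server never sees.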

Example

No configuration example is given, because the issue does not state which behavior was observed or which behavior was expected.

Notes

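  • The log output does not by itself show parallel execution. Subtracting each logged duration from its completion timestamp places the start of all 14 POST "/api/chat" requests between about 19:53:47 and 19:54:28 on 2026/03/31, and they then complete one after another between 22:17:53 and 06:34:57 the next morning. That pattern is consistent with a burst of requests being queued and served one at a time, with the waiting time included in the logged duration, although the log alone cannot rule out overlapping execution.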
  • The effectiveness of the suggested adjustments depends on the specific requirements and constraints of the Ollama setup and the system it is running on.
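
Long durations in a request log can arise without any parallel execution. The minimal Python model below assumes that all requests arrive at the same moment, that a single slot serves them in arrival order, and that the logged duration includes time spent waiting. The function name fifo_latencies and the service times are made up for illustration and are not measurements from this issue.

```python
def fifo_latencies(service_seconds: list[float]) -> list[float]:
    """Latency of each request when all arrive at t=0 and one slot serves them in order."""
    latencies = []
    clock = 0.0
    for service in service_seconds:
        clock += service          # this request finishes after everything ahead of it
        latencies.append(clock)   # latency = waiting time + own service time
    return latencies


if __name__ == "__main__":
    # Hypothetical service times in seconds for three queued requests.
    print(fifo_latencies([10, 20, 30]))
```

Under these assumptions the recorded latency grows with each request's position in the queue, even though only one request runs at any time.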

Recommendation

  • Apply workaround: Keep OLLAMA_NUM_PARALLEL='1' and OLLAMA_MAX_LOADED_MODELS='1', verify or remove any variable the server does not list, and then monitor the request log and GET "/api/ps" while requests run to determine whether the issue is resolved.
