ollama - 💡(How to fix) Fix 0.18.x idle VRAM usage and power consumption [1 comments, 1 participants]

GitHub stats
ollama/ollama#15055 (fetched 2026-04-08 01:26:29)
Comments: 1, Participants: 1, Timeline events: 3 (commented ×1, labeled ×1, renamed ×1), Reactions: 2


What is the issue?

I was using Ollama 0.17.7 under Windows 11 and everything was fine. However, after I updated to 0.18.2, my fans became noisy even when idle. The output of nvidia-smi shows that an ollama process is using 262 MiB of VRAM even though Ollama is idle (not running any models, only the system tray icon).

Wed Mar 25 10:32:19 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 591.59                 Driver Version: 591.59         CUDA Version: 13.1     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A40                   TCC   |   00000000:01:00.0 Off |                  Off |
|  0%   58C    P0             85W /  300W |     272MiB /  49140MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A           15276      C   ...al\Programs\Ollama\ollama.exe        262MiB |
+-----------------------------------------------------------------------------------------+

At the same time, ollama ps says no model is running.

After downgrading to 0.18.0, the same problem occurs.

Downgrading to 0.17.7, everything is OK again. The output of nvidia-smi is normal.

Wed Mar 25 11:02:20 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 591.59                 Driver Version: 591.59         CUDA Version: 13.1     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A40                   TCC   |   00000000:01:00.0 Off |                  Off |
|  0%   45C    P8             14W /  300W |      10MiB /  49140MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

So why is VRAM used in 0.18.x when idle? Is this a new feature (if yes, can I manually turn it off?) or just a bug? I can't accept 70 Watts of additional idle power!
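
To put numbers on the difference between the two snapshots above, the summary rows can be compared directly. The following is a minimal Python sketch; the `parse_summary_row` helper and its regular expression are illustrative assumptions about the nvidia-smi table layout shown in this report, not part of any NVIDIA or Ollama tooling.

```python
import re

# Summary rows copied from the two nvidia-smi outputs in this report.
ROW_0_18_2 = "|  0%   58C    P0             85W /  300W |     272MiB /  49140MiB |      0%      Default |"
ROW_0_17_7 = "|  0%   45C    P8             14W /  300W |      10MiB /  49140MiB |      0%      Default |"

# Hypothetical pattern for the layout above: temperature, P-state, power draw, memory used.
ROW_RE = re.compile(r"(\d+)C\s+(P\d+)\s+(\d+)W\s*/\s*\d+W\s*\|\s*(\d+)MiB")


def parse_summary_row(row):
    """Return temperature (C), performance state, power (W), and memory used (MiB)."""
    match = ROW_RE.search(row)
    if match is None:
        raise ValueError("row does not look like an nvidia-smi summary row")
    temp, pstate, power, mem = match.groups()
    return {"temp_c": int(temp), "pstate": pstate, "power_w": int(power), "mem_mib": int(mem)}


new = parse_summary_row(ROW_0_18_2)
old = parse_summary_row(ROW_0_17_7)

print("extra idle power (W):", new["power_w"] - old["power_w"])
print("extra idle VRAM (MiB):", new["mem_mib"] - old["mem_mib"])
print("performance state:", old["pstate"], "->", new["pstate"])
```

With the rows above, this reports 71 W of extra idle power and 262 MiB of extra VRAM, and shows that the GPU stays in P0 instead of dropping to P8.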

Relevant log output

I don't know whether it is relevant, but the following error only exists in the 0.18.x log (about 3 seconds after server start, reproducible). In 0.17.7, there is no such error.

Error #01: write tcp 127.0.0.1:11434->127.0.0.1:54305: wsasend: An established connection was aborted by the software in your host machine.

OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.18.2

extent analysis

Fix Plan

To address the idle VRAM usage of Ollama 0.18.x, the steps below look for a way to reduce or avoid the GPU allocation that stays resident while no model is loaded.

  1. Check for Configuration Options: Review the Ollama documentation, release notes, and configuration for any settings related to GPU resource management or idle behavior.
  2. Disable GPU Acceleration: If possible, try disabling GPU use for Ollama when it is not needed. This might be achievable through a configuration setting or environment variable, at the cost of CPU-only inference while it is in effect.
  3. Implement a Workaround: Create a simple script or batch file that periodically checks whether Ollama is idle and still holds GPU memory. nvidia-smi only reports usage, and GPU memory can be freed only by the process that owns it, so the practical remedy is to quit or restart Ollama (or to stay on 0.17.7) rather than to release the memory from outside.

Example code snippet (batch file) to inspect the GPU resources held by an idle Ollama:

@echo off
setlocal
set "ollama_pid="

rem Find the PID of ollama.exe (the last match wins if several are running)
for /f "tokens=2 delims=," %%a in ('tasklist /fi "imagename eq ollama.exe" /fo csv /nh') do set "ollama_pid=%%~a"

if not defined ollama_pid (
    echo ollama.exe is not running.
    exit /b 0
)
echo ollama.exe PID: %ollama_pid%

rem List the models Ollama currently has loaded (an empty list means idle)
ollama ps

rem List GPU memory held by each compute process, including ollama.exe.
rem nvidia-smi only reports usage; it cannot free memory owned by another process.
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv,noheader

rem If ollama.exe still holds VRAM while idle, add logic here to quit or restart Ollama

Note: The above script is a basic example and may require modifications to work correctly in your environment.
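
The same idle check can be sketched in Python, which makes the two signals easier to combine: whether Ollama reports any loaded model, and how much VRAM its process holds. This is a minimal sketch under assumptions: it uses Ollama's `/api/ps` endpoint on the default local address seen in the log (127.0.0.1:11434), and the helper names `loaded_models`, `ollama_vram_mib`, and `check_once` are hypothetical.

```python
import csv
import io
import json
import subprocess
import urllib.request

# Default local address; adjust if OLLAMA_HOST points elsewhere (assumption).
OLLAMA_PS_URL = "http://127.0.0.1:11434/api/ps"


def loaded_models(ps_json):
    """Names of the models listed in an /api/ps response body."""
    return [m.get("name", "?") for m in json.loads(ps_json).get("models", [])]


def ollama_vram_mib(csv_text):
    """Sum used_memory (MiB) over rows whose process name mentions ollama."""
    total = 0
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        # Rows look like: pid, process_name, used_memory (may be [N/A] on some setups)
        if len(row) == 3 and "ollama" in row[1].lower() and row[2].strip().isdigit():
            total += int(row[2])
    return total


def check_once():
    """Query Ollama and nvidia-smi once; returns (model names, VRAM in MiB)."""
    with urllib.request.urlopen(OLLAMA_PS_URL, timeout=5) as resp:
        models = loaded_models(resp.read().decode("utf-8"))
    out = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=pid,process_name,used_memory",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return models, ollama_vram_mib(out)


if __name__ == "__main__":
    try:
        models, vram = check_once()
    except (OSError, subprocess.CalledProcessError) as exc:
        print(f"could not query Ollama or nvidia-smi: {exc}")
    else:
        if not models and vram > 0:
            print(f"Ollama is idle but still holds {vram} MiB of VRAM")
        else:
            print(f"models loaded: {models}, Ollama VRAM: {vram} MiB")
```

The parsing is kept in pure functions so it can be checked against pasted output without a GPU. Like the batch file, the script only detects the condition; freeing the memory still requires quitting or restarting Ollama.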

Verification

To verify that the fix worked:

  1. Run the modified configuration or script.
  2. Monitor the nvidia-smi output to check if VRAM usage decreases when Ollama is idle.
  3. Verify that the fans return to a normal noise level.
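
The nvidia-smi check in step 2 can be automated by comparing a live reading against the 0.17.7 baseline from this report (P8, 14 W, 10 MiB). The sketch below is a rough illustration: the thresholds `MAX_IDLE_POWER_W` and `MAX_IDLE_MEM_MIB` are assumed tolerances rather than measured values, and the function names are hypothetical.

```python
import subprocess

# Baseline from the 0.17.7 snapshot in this report: P8, 14 W, 10 MiB.
# The margins below are assumed tolerances, not measured values.
MAX_IDLE_POWER_W = 25.0
MAX_IDLE_MEM_MIB = 50


def parse_gpu_line(line):
    """Parse one 'pstate, power.draw, memory.used' CSV line (no units)."""
    pstate, power, mem = [field.strip() for field in line.split(",")]
    return pstate, float(power), int(mem)


def looks_idle(line):
    """True when power draw and memory use are near the idle baseline."""
    _pstate, power, mem = parse_gpu_line(line)
    return power <= MAX_IDLE_POWER_W and mem <= MAX_IDLE_MEM_MIB


def read_gpu_lines():
    """Return one CSV line per GPU from nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=pstate,power.draw,memory.used",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [ln for ln in out.splitlines() if ln.strip()]


if __name__ == "__main__":
    try:
        for index, line in enumerate(read_gpu_lines()):
            print(f"GPU {index}: {line.strip()} -> idle={looks_idle(line)}")
    except (OSError, subprocess.CalledProcessError, ValueError) as exc:
        print(f"could not query nvidia-smi: {exc}")
```

Run it a few minutes after Ollama starts with no model loaded; a reading like the 0.18.2 snapshot (P0, 85 W, 272 MiB) would be flagged as not idle.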

Extra Tips

  • Regularly review the Ollama documentation and release notes for updates on GPU resource management and idle behavior.
  • Consider reporting the issue to the Ollama development team to request a permanent fix or additional configuration options.
  • If you're experiencing similar issues with other GPU-intensive applications, investigate whether they have similar configuration options or workarounds to manage GPU resource allocation.
