ollama - 💡(How to fix) Fix Running deepseek-r1:32b doesn't use my 3090s for inferencing anymore [3 comments, 2 participants]

ollama/ollama#14720 · Fetched 2026-04-08 00:32:36

What is the issue?

I am running the deepseek-r1:32b model. It loads into VRAM, split between my two 3090s, but it no longer uses either of them for the actual inference (generating the response).

I am not sure what changed or when, but I only noticed this behaviour today.

Relevant log output

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.17.7

extent analysis

Fix Plan

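The model loads into VRAM but the GPUs appear idle while a response is generated. Ollama detects GPUs and decides model placement itself when the model loads, so the likely fix is on the server side: confirm which devices the Ollama server sees and where it placed the model, rather than changing the model.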

Steps to Fix

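  • Check that the installed Nvidia driver is recent enough for the CUDA libraries that ship with Ollama, and that nvidia-smi lists both 3090s.
  • Run ollama ps while deepseek-r1:32b is loaded and read the PROCESSOR column, which reports whether the model is placed on GPU, on CPU, or split between the two.
  • If the server does not see both cards, set CUDA_VISIBLE_DEVICES (for example 0,1) in the environment of the Ollama server process and restart the server.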
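
The last step can be sketched in Python. This is a minimal illustration, assuming the server is started by hand rather than by a service manager; the helper name ollama_server_env is hypothetical, while CUDA_VISIBLE_DEVICES is the standard CUDA variable that Ollama's GPU documentation describes for limiting which Nvidia devices are used.

```python
import os


def ollama_server_env(gpu_ids, base_env=None):
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES set.

    gpu_ids is a list of integer GPU indices as reported by nvidia-smi.
    The helper name is an assumption made for this example.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


if __name__ == "__main__":
    env = ollama_server_env([0, 1])
    print("CUDA_VISIBLE_DEVICES =", env["CUDA_VISIBLE_DEVICES"])
    # To launch a server with this environment (only when no other
    # Ollama server is already running on the machine):
    #   import subprocess
    #   subprocess.Popen(["ollama", "serve"], env=env)
```

If Ollama runs as a systemd service, the variable has to be set in the service's environment instead, since a variable exported in an interactive shell does not reach the service.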

Example Code

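import json
import urllib.request

# Ollama does not use PyTorch and has no "gpu_devices" setting. GPU selection
# is controlled by environment variables on the server process, for example
# CUDA_VISIBLE_DEVICES=0,1 set before the server starts.

# Ask the local server (default port 11434) which models are currently loaded
with urllib.request.urlopen("http://localhost:11434/api/ps") as resp:
    loaded = json.load(resp)["models"]

# size_vram equal to size means the whole model is resident in GPU memory
for m in loaded:
    size = m.get("size", 0)
    share = m.get("size_vram", 0) / size if size else 0.0
    print(f'{m["name"]}: {share:.0%} of the model is in VRAM')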

Verification

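To verify that the fix worked, watch GPU utilization with nvidia-smi (for example watch -n 1 nvidia-smi) while the model is generating a response. Utilization on the 3090s should rise during generation; memory use alone only shows that the model is loaded, which was already the case in this report.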
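
The same check can be scripted. The sketch below assumes nvidia-smi is on the PATH and uses its CSV query mode; the function names and the two thresholds are illustrative choices, not values taken from the issue.

```python
import subprocess

# nvidia-smi query: one CSV row per GPU, without header or unit suffixes
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Parse rows such as '0, 3, 19870' into dictionaries."""
    rows = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        index, util, mem = (field.strip() for field in line.split(","))
        # int() raises ValueError if a field is reported as "[N/A]"
        rows.append({"index": int(index), "util_pct": int(util), "mem_mib": int(mem)})
    return rows


def sample_gpus():
    """Run nvidia-smi once and return the parsed rows."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)


def idle_but_loaded(rows, util_floor=5, mem_floor_mib=1024):
    """Return indices of GPUs that hold memory but show almost no compute."""
    return [
        r["index"]
        for r in rows
        if r["util_pct"] < util_floor and r["mem_mib"] > mem_floor_mib
    ]
```

Calling idle_but_loaded(sample_gpus()) a few times while a prompt is being answered gives a rough signal: a GPU that is listed on every sample matches the symptom described in the issue. A single sample is not conclusive, because utilization fluctuates between tokens.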

Extra Tips

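  • Ensure that the Nvidia drivers are up to date, and restart the Ollama server after a driver update.
  • Check the GPU section of the Ollama documentation for the supported ways to select devices, and check the server log (journalctl -u ollama on systemd installs) for GPU detection messages, since the report includes no log output.
  • On a multi-GPU setup, confirm that both GPUs are visible to the server. Ollama decides how to split the model across them when it loads the model; there is no per-model GPU list to configure.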
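
Generation speed is another signal worth recording before and after any change. The sketch below assumes a local server on the default port 11434 and relies on the eval_count and eval_duration (nanoseconds) fields that the /api/generate response reports; the helper names are hypothetical.

```python
import json
import urllib.request


def tokens_per_second(response):
    """Decode throughput from an /api/generate response dictionary."""
    # eval_duration is reported in nanoseconds
    return response["eval_count"] * 1e9 / response["eval_duration"]


def generate(model, prompt, host="http://localhost:11434"):
    """Send one non-streaming generate request and return the parsed JSON."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Example, with the server running and the model pulled:
#   result = generate("deepseek-r1:32b", "Say hello.")
#   print(f"{tokens_per_second(result):.1f} tokens/s")
```

A 32B model running on CPU is typically far slower than on GPU, so a large jump in this number after a change suggests inference has moved back to the GPUs. The exact figures depend on quantization and hardware, so compare runs on the same machine only.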
