ollama - 💡(How to fix) Fix Running deepseek-r1:32b doesn't use my 3090s for inferencing anymore [3 comments, 2 participants]

ollama • 2026-03-08 17:40:45

GitHub stats

ollama/ollama#14720•Fetched 2026-04-08 00:32:36

Author

alpha754293

Participants

alpha754293

rick-github

Timeline (top)

commented ×3, closed ×1, labeled ×1

What is the issue?

I am running the deepseek-r1:32b model. It loaded the LLM into VRAM, split between my two 3090s, but it now uses neither of them for the actual inferencing/responding part.

I am not sure what changed or when it changed, but I just noticed this behaviour today.

Relevant log output

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.17.7

extent analysis

Fix Plan

The model weights load into VRAM across both 3090s, but the GPUs appear idle while a response is generated. To narrow this down, first confirm where Ollama is actually running the model, then check the driver and which GPUs the Ollama server can see.

Steps to Fix

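Check that the installed Nvidia driver is recent enough for the CUDA runtime that Ollama uses; nvidia-smi reports the driver version and the highest CUDA version it supports.
Run ollama ps while deepseek-r1:32b is loaded and read the PROCESSOR column, which shows how the model is split between CPU and GPU.
If a GPU is missing, check the CUDA_VISIBLE_DEVICES environment variable of the Ollama server process, which limits the Nvidia GPUs Ollama can see.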

Example Code

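import json
import urllib.request

# Ollama does not run on PyTorch, so torch.device() and model.to() have no effect on it.
# Instead, ask the Ollama server how much of each loaded model sits in VRAM.
# Assumes the server listens on the default address.
OLLAMA_URL = "http://localhost:11434"

with urllib.request.urlopen(f"{OLLAMA_URL}/api/ps") as resp:
    loaded = json.load(resp).get("models", [])

for m in loaded:
    size, size_vram = m.get("size", 0), m.get("size_vram", 0)
    pct = 100 * size_vram / size if size else 0.0
    print(f"{m.get('name')}: {pct:.0f}% of {size} bytes in VRAM")

# GPU selection happens on the server, not per request: to expose both cards,
# set CUDA_VISIBLE_DEVICES=0,1 in the environment of the Ollama server process.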

Verification

To verify that the fix worked, check the GPU utilization using tools like nvidia-smi while running the model. The GPU usage should increase when the model is running.
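The nvidia-smi check can also be scripted so that an idle card is easy to spot while a prompt is being answered. The following is a minimal sketch, not an official tool: the helper names parse_gpu_stats, read_gpu_stats and idle_gpus and the 5% idle threshold are assumptions made for illustration, and the parser assumes every field is numeric (nvidia-smi can print [N/A] on some hardware).

```python
import subprocess

# Fields requested from nvidia-smi: GPU index, utilization (%), memory used (MiB).
QUERY = "index,utilization.gpu,memory.used"


def parse_gpu_stats(csv_text):
    """Parse nvidia-smi CSV output (noheader, nounits) into a list of dicts."""
    stats = []
    for line in csv_text.strip().splitlines():
        index, util, mem = (field.strip() for field in line.split(","))
        stats.append({"index": int(index), "util_pct": int(util), "mem_mib": int(mem)})
    return stats


def read_gpu_stats():
    """Run nvidia-smi once and return the parsed per-GPU stats."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_stats(out)


def idle_gpus(stats, threshold_pct=5):
    """Return the indices of GPUs whose utilization is below the threshold (hypothetical cutoff)."""
    return [g["index"] for g in stats if g["util_pct"] < threshold_pct]
```

Calling idle_gpus(read_gpu_stats()) a few times during generation gives a rough picture. High memory use on a card that stays in the idle list matches the symptom described in the issue: the model is resident in VRAM but the GPU is not doing the work.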

Extra Tips

Ensure that the Nvidia drivers are up-to-date.
Check the Ollama GPU documentation for the environment variables that control GPU selection.
If using a multi-GPU setup, confirm that both GPUs appear in nvidia-smi and are visible to the Ollama server process.
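For the multi-GPU tip, one way to make sure both cards are exposed is to start the server with CUDA_VISIBLE_DEVICES set explicitly. This is a sketch under assumptions: build_serve_env and start_ollama are hypothetical helper names, the GPU indices 0 and 1 are assumed to be the two 3090s, and the approach only applies when the server is launched by hand. If Ollama runs as a system service, the variable has to be set in the service's environment instead.

```python
import os
import subprocess


def build_serve_env(gpu_ids, base_env=None):
    """Copy the environment and restrict the visible Nvidia GPUs to gpu_ids."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def start_ollama(gpu_ids):
    """Launch `ollama serve` with only the given GPUs visible (assumes ollama is on PATH)."""
    return subprocess.Popen(["ollama", "serve"], env=build_serve_env(gpu_ids))
```

For example, start_ollama([0, 1]) starts a server that can see both cards. Building the environment in a separate function keeps the caller's own environment unchanged and makes the setting easy to inspect before launching.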
