ollama - 💡(How to fix) Why does running deepseek-r1:32b require over 300 GB of memory? [1 comment, 2 participants]

ollama · 2026-03-25 01:26:39


GitHub stats

ollama/ollama#15049 • Fetched 2026-04-08 01:26:35


Author

yangy996

Participants

rick-github

yangy996

Timeline (top)

closed ×1, commented ×1, labeled ×1

Error Message

{"error":"model requires more system memory (338.1 GiB) than is available (328.0 GiB)"}

Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.821+08:00 level=WARN source=server.go:1044 msg="model request too large for system" requested="338.1 GiB" available="337.2>


What is the issue?

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:32b",
  "prompt": "Hello, how are you?",
  "stream": false
}'

{"error":"model requires more system memory (338.1 GiB) than is available (328.0 GiB)"}
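The two figures in the error body can be pulled out programmatically when triaging reports like this one. The following is a minimal sketch that assumes the message keeps the exact wording shown above; `parse_memory_error` is a hypothetical helper, not part of Ollama.

```python
import json
import re


def parse_memory_error(body: str) -> tuple[float, float]:
    """Return (required_gib, available_gib) from an Ollama memory error body."""
    message = json.loads(body)["error"]
    # Assumes the wording "... (X GiB) than is available (Y GiB)" seen in this issue.
    match = re.search(r"\(([\d.]+) GiB\).*\(([\d.]+) GiB\)", message)
    if match is None:
        raise ValueError(f"unrecognized error message: {message!r}")
    return float(match.group(1)), float(match.group(2))


body = '{"error":"model requires more system memory (338.1 GiB) than is available (328.0 GiB)"}'
required, available = parse_memory_error(body)
shortfall = required - available
print(f"required={required} GiB available={available} GiB shortfall={shortfall:.1f} GiB")
```

The shortfall itself is small; the notable part is that a 32B model is requesting hundreds of GiB at all.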

Relevant log output

● ollama.service - Ollama Service
     Loaded: loaded (/etc/systemd/system/ollama.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2026-03-24 17:47:32 CST; 15h ago
   Main PID: 2816880 (ollama)
      Tasks: 24 (limit: 618634)
     Memory: 1.1G
     CGroup: /system.slice/ollama.service
             └─2816880 /usr/local/bin/ollama serve

Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.821+08:00 level=DEBUG source=server.go:976 msg="available gpu" id=GPU-fdfb4273-9c05-65a3-6882-21542da92ff6 library=CUDA "a>
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.821+08:00 level=DEBUG source=server.go:976 msg="available gpu" id=GPU-e42673e2-1cc5-b890-6193-de099240bffe library=CUDA "a>
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.821+08:00 level=WARN source=server.go:1044 msg="model request too large for system" requested="338.1 GiB" available="337.2>
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.821+08:00 level=INFO source=sched.go:516 msg="Load failed" model=/usr/share/ollama/.ollama/models/blobs/sha256-6150cb38231>
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.853+08:00 level=INFO source=runner.go:965 msg="starting go runner"
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.853+08:00 level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.873+08:00 level=DEBUG source=server.go:1830 msg="stopping llama server" pid=325701
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.873+08:00 level=DEBUG source=server.go:1836 msg="waiting for llama server to exit" pid=325701
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.882+08:00 level=DEBUG source=server.go:1840 msg="llama server stopped" pid=325701
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: [GIN] 2026/03/25 - 09:28:05 | 500 | 1.744758145s | 127.0.0.1 | POST "/api/generate"

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.18.0

extent analysis

Fix Plan

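The requested 338.1 GiB is far more than the weights of a 32B model need (roughly 20 GB in the default 4-bit quantization), so the likely cause is a very large context length or parallel-request setting inflating the KV cache, not the model itself. Check the configuration before adding hardware. Here are the steps:

Increase System Memory:
1. Check whether more physical RAM can be added to the system.
2. If not, consider a cloud instance with more memory. Treat this as a last resort, since a 32B model should not need hundreds of GiB.
Optimize Model Memory Usage:
1. Try a smaller tag such as "deepseek-r1:14b" or "deepseek-r1:8b" (there is no "deepseek-r1:16b" tag).
2. Models in the Ollama library are already quantized, so pruning or re-quantizing is unlikely to help. Reducing the context length (num_ctx in the request options, or the OLLAMA_CONTEXT_LENGTH environment variable) and OLLAMA_NUM_PARALLEL shrinks the KV cache, which is the usual source of requests this large.
Configure Ollama to Use GPU:
1. Ensure that the Nvidia driver is properly installed; the log above already shows two CUDA GPUs being detected.
2. Ollama has no OLLAMA_GPU variable. To restrict which GPUs it uses, set CUDA_VISIBLE_DEVICES in the service environment (e.g., CUDA_VISIBLE_DEVICES=0).
Update Ollama Configuration:
1. Ollama does not read an ollama.config file; it is configured through environment variables. On a systemd install, run "sudo systemctl edit ollama.service" and add lines such as the following under [Service] (the values are illustrative):

Environment="OLLAMA_CONTEXT_LENGTH=8192"
Environment="OLLAMA_NUM_PARALLEL=1"

2. Restart the Ollama service after updating the configuration (sudo systemctl restart ollama).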
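A memory request of this size is more plausibly explained by the KV cache than by the model weights. The sketch below is a back-of-the-envelope estimate of KV cache size as a function of context length. It assumes an f16 cache that scales with context length times parallel slots, and it assumes architecture values (64 layers, 8 KV heads, head dimension 128) for the Qwen2.5-32B base that deepseek-r1:32b is distilled from. None of these numbers come from the issue, and the estimate ignores weights and compute buffers.

```python
GIB = 1024 ** 3


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len,
                   n_parallel=1, bytes_per_elem=2):
    """Rough KV cache size in bytes (f16 by default)."""
    # Factor 2: one key tensor and one value tensor per layer.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * context_len * n_parallel


# Assumed architecture values, not taken from the issue.
ASSUMED = dict(n_layers=64, n_kv_heads=8, head_dim=128)

for ctx in (4096, 32768, 131072):
    size = kv_cache_bytes(context_len=ctx, **ASSUMED)
    print(f"context={ctx:>6}  kv_cache={size / GIB:.1f} GiB")
```

Under these assumptions each token costs 256 KiB of cache, so a large context multiplied by several parallel slots can reach hundreds of GiB even though the weights stay the same size.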

Example Code

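The following Python code illustrates global magnitude pruning in PyTorch on a small stand-in network. It does not apply directly to Ollama, which loads GGUF blobs rather than .pth checkpoints, and unstructured pruning only zeroes weights; it does not shrink the dense tensors in memory.

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Stand-in model (hypothetical); Ollama models are GGUF blobs, not .pth files
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 8))

# Prune the 20% smallest-magnitude weights across both Linear layers
parameters_to_prune = (
    (model[0], 'weight'),
    (model[2], 'weight'),
)
prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.2,
)

# Make the pruning permanent, then save only the weights
for module, name in parameters_to_prune:
    prune.remove(module, name)
torch.save(model.state_dict(), "pruned_model.pth")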

Verification

To verify that the fix worked, restart the service and repeat the original request:

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:32b",
  "prompt": "Hello, how are you?",
  "stream": false
}'

A successful response returns generated text instead of the memory error. Running "ollama ps" afterwards shows the loaded model's size and how it is split between CPU and GPU.
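The same check can be scripted with an explicit per-request context length, which helps confirm whether context size is what drives the memory request. This is a minimal sketch using only the standard library; `build_generate_request` and `generate` are hypothetical helper names, and the `num_ctx` value of 4096 is an illustrative choice.

```python
import json
import urllib.request


def build_generate_request(model, prompt, num_ctx, host="http://localhost:11434"):
    """Build a POST to /api/generate with an explicit context length."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},  # per-request context override
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(request, timeout=600):
    """Send the request; urlopen raises HTTPError on a 500 such as the memory error."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


request = build_generate_request("deepseek-r1:32b", "Hello, how are you?", num_ctx=4096)
# print(generate(request))  # uncomment when an Ollama server is running locally
```

If the request succeeds with a small `num_ctx` but fails with the server default, the context length setting is the place to look.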
