ollama - 💡(How to fix) Fix Incorrectly calculates available system memory for qwen3.6:35b-a3b [1 comments, 2 participants]

ollama · 2026-04-17 17:08:03

GitHub stats

ollama/ollama#15650•Fetched 2026-04-18 05:51:56

Author

fullheart

Participants

fullheart

rick-github

Timeline (top)

commented ×1, renamed ×1

Error Message

Error: 500 Internal Server Error: model requires more system memory (9.7 GiB) than is available (8.1 GiB)

What is the issue?

Ollama fails to load qwen3.6:35b-a3b with a memory error even though enough RAM is available. Its calculation of available system memory appears to be incorrect or overly conservative.

System specs:

OS: Ubuntu 22.04.3 LTS (Jammy Jellyfish)
Ollama version: 0.20.7
Host RAM: 15 GiB total, 13 GiB available (as reported by free -h)
GPU: 16 GB VRAM
Docker: No memory limits set (Memory: 0)
Container sees correct RAM: MemAvailable: 13829416 kB (~13.2 GiB)

Ollama's claim:

Error: 500 Internal Server Error: model requires more system memory (9.7 GiB) than is available (8.1 GiB)

Problem: Ollama reports only 8.1 GiB available when the container actually has about 13.2 GiB available (MemAvailable: 13829416 kB), a discrepancy of roughly 5.1 GiB.
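
The size of the gap depends on the unit conversion. A minimal Python sketch of the arithmetic, assuming the `kB` unit in /proc/meminfo means 1024 bytes (as it does on Linux) and taking the 8.1 GiB and 9.7 GiB figures from the error message at face value:

```python
KIB_PER_GIB = 1024 * 1024  # /proc/meminfo "kB" values are KiB (1024 bytes)


def kib_to_gib(kib: int) -> float:
    """Convert a /proc/meminfo value in kB (KiB) to GiB."""
    return kib / KIB_PER_GIB


mem_available_kib = 13829416  # MemAvailable reported inside the container
required_gib = 9.7            # "requires" figure from the Ollama error
reported_gib = 8.1            # "available" figure from the Ollama error

available_gib = kib_to_gib(mem_available_kib)
gap_gib = available_gib - reported_gib        # kernel view minus Ollama view
headroom_gib = available_gib - required_gib   # kernel view minus requirement

print(f"MemAvailable: {available_gib:.2f} GiB")
print(f"Gap to Ollama's figure: {gap_gib:.2f} GiB")
print(f"Headroom over requirement: {headroom_gib:.2f} GiB")
```

In binary units MemAvailable comes to about 13.2 GiB (a figure of 13.8 results from dividing by 10^6 instead of 1024^2), so the gap to Ollama's 8.1 GiB is roughly 5.1 GiB. The conclusion is unchanged: the kernel's figure still exceeds the 9.7 GiB requirement by about 3.5 GiB.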

Relevant log output

Host memory (outside container):

$ free -h
               total        used        free      shared  buff/cache   available
Mem:            15Gi       1,9Gi       1,0Gi       1,0Mi        12Gi        13Gi
Swap:          4,0Gi       1,3Gi       2,7Gi

Container memory (inside ollama container):

$ docker exec -it ollama-hosting-ollama-1 cat /proc/meminfo | grep MemAvailable
MemAvailable:   13829416 kB  (~13.2 GiB)

Docker memory limits:

$ docker inspect ollama-hosting-ollama-1 | grep -i memory
            Memory: 0,
            MemoryReservation: 0,
            MemorySwap: 0,
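
`docker inspect` shows the configured limits, but the limit a process actually sees can also be read from inside the container. The sketch below is not Ollama's code. It parses /proc/meminfo and reads the cgroup memory limit from the standard cgroup v2 path (/sys/fs/cgroup/memory.max) or the cgroup v1 path (/sys/fs/cgroup/memory/memory.limit_in_bytes). The helper names and the "unlimited" threshold for cgroup v1 are illustrative assumptions.

```python
from pathlib import Path
from typing import Dict, Optional


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo content into {field name: numeric value}."""
    fields: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def parse_cgroup_limit(raw: str) -> Optional[int]:
    """Return the memory limit in bytes, or None when no limit is set."""
    raw = raw.strip()
    if raw == "max":  # cgroup v2 spelling of "unlimited"
        return None
    value = int(raw)
    # Assumption: cgroup v1 reports a very large number when unlimited,
    # so anything at or above 2**60 bytes is treated as "no limit".
    return None if value >= 1 << 60 else value


def read_cgroup_limit() -> Optional[int]:
    """Try the cgroup v2 path first, then the cgroup v1 path."""
    candidates = (
        Path("/sys/fs/cgroup/memory.max"),
        Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),
    )
    for path in candidates:
        try:
            return parse_cgroup_limit(path.read_text())
        except (OSError, ValueError):
            continue
    return None


if __name__ == "__main__":
    meminfo_path = Path("/proc/meminfo")
    if meminfo_path.exists():
        info = parse_meminfo(meminfo_path.read_text())
        print("MemAvailable (kB):", info.get("MemAvailable"))
    print("cgroup memory limit (bytes):", read_cgroup_limit())
```

Running this with `docker exec` inside the Ollama container would show whether a cgroup limit is in effect even though `docker inspect` reports `Memory: 0`.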

Ollama error:

$ docker-compose exec ollama ollama run qwen3.6:35b-a3b
Error: 500 Internal Server Error: model requires more system memory (9.7 GiB) than is available (8.1 GiB)

Steps to reproduce

1. Run Ollama 0.20.7 in Docker on Ubuntu 22.04.3 LTS without memory limits
2. Ensure host has >13 GiB available RAM
3. Execute: ollama run qwen3.6:35b-a3b
4. Observe memory error despite sufficient RAM

Expected behavior

Model should load successfully since:

Available RAM (~13.2 GiB, from MemAvailable: 13829416 kB) > Required RAM (9.7 GiB)
GPU has 16 GB VRAM for offloading

Related issues

This appears to be related to:

#14501 - Similar issue with qwen3.5:35b-a3b and qwen3.5:27b-q4_K_M
#14557 - Memory calculation error (595 MB required, 19.7 MB reported available)
#14719 - deepseek-r1:70b memory requirements increased unexpectedly

OS

Ubuntu 22.04.3 LTS (Jammy Jellyfish)

GPU

Nvidia

Ollama version

0.20.7

Additional context

Swap is enabled (4.0 GiB total, 2.7 GiB free)
buff/cache: 12 GiB (Linux counts most of this as available; Ollama may not)
Setting OLLAMA_GPU_LAYERS=999 does not resolve the pre-flight memory check failure
Issue persists regardless of Docker memory configuration
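
To see which kernel counters come close to the 8.1 GiB that Ollama reports, the relevant /proc/meminfo fields can be compared side by side. The sketch below is a diagnostic aid only. It assumes, without confirmation from the issue, that Ollama's figure is derived from some combination of these counters, and it does not reproduce Ollama's actual logic. The function names and the set of candidate formulas are illustrative.

```python
from pathlib import Path
from typing import Dict

KIB_PER_GIB = 1024 * 1024  # /proc/meminfo "kB" values are KiB


def candidate_estimates(meminfo_kib: Dict[str, int]) -> Dict[str, float]:
    """Return several 'available memory' estimates in GiB from meminfo fields."""

    def get(key: str) -> int:
        return meminfo_kib.get(key, 0)

    estimates_kib = {
        "MemFree": get("MemFree"),
        "MemFree+Buffers+Cached": get("MemFree") + get("Buffers") + get("Cached"),
        "MemAvailable": get("MemAvailable"),
        "MemAvailable+SwapFree": get("MemAvailable") + get("SwapFree"),
    }
    return {name: kib / KIB_PER_GIB for name, kib in estimates_kib.items()}


def closest_to(target_gib: float, estimates: Dict[str, float]) -> str:
    """Name of the estimate whose value is nearest to target_gib."""
    return min(estimates, key=lambda name: abs(estimates[name] - target_gib))


if __name__ == "__main__":
    path = Path("/proc/meminfo")
    if path.exists():
        fields: Dict[str, int] = {}
        for line in path.read_text().splitlines():
            name, _, rest = line.partition(":")
            parts = rest.split()
            if parts and parts[0].isdigit():
                fields[name.strip()] = int(parts[0])
        estimates = candidate_estimates(fields)
        for name, gib in estimates.items():
            print(f"{name:<24} {gib:6.2f} GiB")
        # 8.1 GiB is the "available" figure from the error message.
        print("Closest to 8.1 GiB:", closest_to(8.1, estimates))
```

If none of the candidates lands near 8.1 GiB at the moment the error occurs, the figure probably comes from a different source, which would be useful to note in the issue.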

Note: This is related to #14501, but for the newer qwen3.6:35b-a3b model on Ollama 0.20.7 with detailed memory analysis.

extent analysis

TL;DR

The likely fix lies in Ollama's own memory accounting: the available-memory figure it computes (8.1 GiB) is well below what the kernel reports inside the container, so that calculation needs to be investigated and corrected.

Guidance

Verify that Ollama's memory calculation is correctly accounting for the buff/cache memory, as Linux reports this as available but Ollama may not.
Check if there are any environment variables or configuration options in Ollama that can be adjusted to influence the memory calculation or to override the pre-flight memory check.
Consider testing with a different model or version to see if the issue is specific to qwen3.6:35b-a3b or a more general problem with Ollama's memory management.
Review related issues (#14501, #14557, #14719) for any insights or fixes that might apply to this scenario.
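
One per-request setting that influences how much of a model is placed on the GPU is the `num_gpu` option of the Ollama HTTP API. The sketch below builds such a request against the default local endpoint. Whether this option changes the outcome of the system memory check is not established by this issue, and the value 20 is an arbitrary placeholder rather than a recommendation.

```python
import json
import urllib.request

# Default local Ollama endpoint; adjust if the container maps another port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str, num_gpu: int) -> urllib.request.Request:
    """Build a non-streaming generate request with an explicit num_gpu option."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_gpu": num_gpu},  # number of layers to offload to the GPU
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the request; call send(req) manually against a live server.
    req = build_request("qwen3.6:35b-a3b", "Hello", num_gpu=20)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Trying a few values and noting whether the error message changes would help separate the pre-flight system memory check from GPU offload planning.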

Example

No concrete snippet can be given without knowing Ollama's internal memory calculation logic. If Ollama exposed a setting for this check, it might take the form of an environment variable or a config-file entry that reflects the actual available memory. The name OLLAMA_MEMORY_THRESHOLD_OVERRIDE=true is a hypothetical illustration only, not a documented Ollama setting.

Notes

The issue seems to stem from a discrepancy in how Ollama calculates available memory versus the actual available memory as reported by the system. The fact that setting OLLAMA_GPU_LAYERS=999 does not resolve the issue suggests that the problem lies in the pre-flight memory check rather than GPU-specific memory allocation.

Recommendation

As a workaround, try adjusting Ollama's configuration or environment variables so that available system memory is accounted for correctly; the provided context does not mention a fixed version to upgrade to. This targets the observed discrepancy in the memory calculation, which appears to be the cause of the failure.
