ollama - 💡(How to fix) Fix Incorrectly calculates available system memory for qwen3.6:35b-a3b [1 comments, 2 participants]

ollama/ollama#15650, fetched 2026-04-18 05:51:56

Error Message

Error: 500 Internal Server Error: model requires more system memory (9.7 GiB) than is available (8.1 GiB)


What is the issue?

Ollama fails to load qwen3.6:35b-a3b with a memory error even though enough RAM is available. Its estimate of available system memory appears to be incorrect or overly conservative.

System specs:

  • OS: Ubuntu 22.04.3 LTS (Jammy Jellyfish)
  • Ollama version: 0.20.7
  • Host RAM: 15 GiB total, 13 GiB available (as reported by free -h)
  • GPU: 16 GB VRAM
  • Docker: No memory limits set (Memory: 0)
  • Container sees correct RAM: MemAvailable: 13829416 kB (~13.8 GiB)

Ollama's claim:

Error: 500 Internal Server Error: model requires more system memory (9.7 GiB) than is available (8.1 GiB)

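Problem: Ollama reports only 8.1 GiB available, while the container's MemAvailable is 13829416 kB. Because /proc/meminfo's "kB" means KiB, that is about 13.2 GiB (the ~13.8 figure used elsewhere in this report divides by 10^6 instead), so the discrepancy is roughly 5.1 GiB.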
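
The arithmetic behind this discrepancy can be checked with a short sketch. It uses only the numbers quoted in this report and assumes the usual /proc/meminfo convention that "kB" means KiB, which puts MemAvailable at about 13.2 GiB rather than 13.8.

```python
KIB_PER_GIB = 1024 ** 2  # /proc/meminfo labels its values "kB" but they are KiB


def kib_to_gib(kib: int) -> float:
    """Convert a /proc/meminfo value (KiB) to GiB."""
    return kib / KIB_PER_GIB


# Figures copied from the report.
mem_available_kib = 13829416   # MemAvailable inside the container
ollama_available_gib = 8.1     # "available" in Ollama's error message
required_gib = 9.7             # "requires" in Ollama's error message

available_gib = kib_to_gib(mem_available_kib)
gap_gib = available_gib - ollama_available_gib

print(f"kernel MemAvailable: {available_gib:.1f} GiB")
print(f"gap vs. Ollama's figure: {gap_gib:.1f} GiB")
print(f"fits by the kernel's estimate: {available_gib > required_gib}")
```

Even with the smaller GiB figure, the kernel's estimate exceeds the 9.7 GiB requirement by a comfortable margin.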

Relevant log output

Host memory (outside container):

$ free -h
               total        used        free      shared  buff/cache   available
Mem:            15Gi       1,9Gi       1,0Gi       1,0Mi        12Gi        13Gi
Swap:          4,0Gi       1,3Gi       2,7Gi

Container memory (inside ollama container):

$ docker exec -it ollama-hosting-ollama-1 cat /proc/meminfo | grep MemAvailable
MemAvailable:   13829416 kB  (~13.8 GiB)

Docker memory limits:

$ docker inspect ollama-hosting-ollama-1 | grep -i memory
            Memory: 0,
            MemoryReservation: 0,
            MemorySwap: 0,

Ollama error:

$ docker-compose exec ollama ollama run qwen3.6:35b-a3b
Error: 500 Internal Server Error: model requires more system memory (9.7 GiB) than is available (8.1 GiB)

Steps to reproduce

  1. Run Ollama 0.20.7 in Docker on Ubuntu 22.04.3 LTS without memory limits
  2. Ensure host has >13 GiB available RAM
  3. Execute: ollama run qwen3.6:35b-a3b
  4. Observe memory error despite sufficient RAM

Expected behavior

Model should load successfully since:

  • Available RAM (13.8 GiB) > Required RAM (9.7 GiB)
  • GPU has 16 GB VRAM for offloading
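
The expectation above can be expressed as a small check that pulls both figures out of the error text and compares the requirement with the kernel's own estimate. This is only a sketch: the regular expression matches the message wording quoted in this report, and `parse_memory_error` is a hypothetical helper, not part of Ollama.

```python
import re

# Matches the wording of the error quoted in this report.
ERROR_RE = re.compile(
    r"requires more system memory \((?P<required>[\d.]+) GiB\) "
    r"than is available \((?P<available>[\d.]+) GiB\)"
)


def parse_memory_error(message: str) -> tuple[float, float]:
    """Return (required_gib, available_gib) as stated in the error message."""
    match = ERROR_RE.search(message)
    if match is None:
        raise ValueError("not a system-memory error message")
    return float(match["required"]), float(match["available"])


message = (
    "Error: 500 Internal Server Error: model requires more system memory "
    "(9.7 GiB) than is available (8.1 GiB)"
)
required_gib, ollama_available_gib = parse_memory_error(message)
kernel_available_gib = 13829416 / 1024 ** 2  # MemAvailable from the report, KiB to GiB

print(f"shortfall by Ollama's figure: {required_gib - ollama_available_gib:.1f} GiB")
print(f"fits by the kernel's figure:  {required_gib <= kernel_available_gib}")
```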

Related issues

This appears to be related to:

  • #14501 - Similar issue with qwen3.5:35b-a3b and qwen3.5:27b-q4_K_M
  • #14557 - Memory calculation error (595 MB required, 19.7 MB reported available)
  • #14719 - deepseek-r1:70b memory requirements increased unexpectedly

OS

Ubuntu 22.04.3 LTS (Jammy Jellyfish)

GPU

Nvidia

Ollama version

0.20.7

Additional context

  • Swap is enabled (4.0 GiB total, 2.7 GiB free)
  • buff/cache: 12 GiB (Linux counts most of this as reclaimable and therefore available; Ollama may not)
  • Setting OLLAMA_GPU_LAYERS=999 does not resolve the pre-flight memory check failure
  • Issue persists regardless of Docker memory configuration
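
To see how much the buff/cache point matters, it helps to compare MemFree with MemAvailable directly. The sketch below parses /proc/meminfo-style text. Only the MemAvailable value comes from the report; the MemTotal and MemFree values in `SAMPLE` are illustrative stand-ins chosen to resemble the `free -h` output, and how Ollama itself derives its figure is not known from this issue.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Map each /proc/meminfo field name to its value in KiB."""
    fields: dict[str, int] = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def to_gib(kib: int) -> float:
    return kib / 1024 ** 2


# MemAvailable is from the report; MemTotal and MemFree are illustrative.
SAMPLE = """\
MemTotal:       16252928 kB
MemFree:         1048576 kB
MemAvailable:   13829416 kB
"""

info = parse_meminfo(SAMPLE)
for key in ("MemFree", "MemAvailable"):
    print(f"{key}: {to_gib(info[key]):.1f} GiB")
```

On the affected machine, `parse_meminfo(open("/proc/meminfo").read())` gives the live values. In this sample neither figure matches the 8.1 GiB in the error, so ignoring cache alone may not fully explain Ollama's number.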

Note: This is related to #14501, but for the newer qwen3.6:35b-a3b model on Ollama 0.20.7 with detailed memory analysis.

extent analysis

TL;DR

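Ollama's pre-flight check reports 8.1 GiB of available system memory while the kernel reports about 13 GiB inside the container. No confirmed fix appears in the issue; the likely next step is to find out how Ollama derives its figure (for example, whether it counts reclaimable buff/cache) and correct that calculation.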

Guidance

  • Verify that Ollama's memory calculation is correctly accounting for the buff/cache memory, as Linux reports this as available but Ollama may not.
  • Check if there are any environment variables or configuration options in Ollama that can be adjusted to influence the memory calculation or to override the pre-flight memory check.
  • Consider testing with a different model or version to see if the issue is specific to qwen3.6:35b-a3b or a more general problem with Ollama's memory management.
  • Review related issues (#14501, #14557, #14719) for any insights or fixes that might apply to this scenario.
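
One further diagnostic in the same spirit is to confirm from inside the container that no cgroup memory limit applies, which complements the `docker inspect` output in the report. The sketch below reads the standard cgroup v2 and v1 limit files. Whether Ollama consults these files is an assumption and is not stated in the issue.

```python
from pathlib import Path
from typing import Optional

CGROUP_V2_LIMIT = Path("/sys/fs/cgroup/memory.max")
CGROUP_V1_LIMIT = Path("/sys/fs/cgroup/memory/memory.limit_in_bytes")
# cgroup v1 reports "no limit" as a very large number; treat such values as unlimited.
UNLIMITED_THRESHOLD = 1 << 60


def parse_cgroup_limit(raw: str) -> Optional[int]:
    """Return the memory limit in bytes, or None when the cgroup is unlimited."""
    value = raw.strip()
    if value == "max":  # cgroup v2 spelling of "no limit"
        return None
    limit = int(value)
    return None if limit >= UNLIMITED_THRESHOLD else limit


def read_cgroup_limit() -> Optional[int]:
    """Try the cgroup v2 file first, then v1; None if neither is readable."""
    for path in (CGROUP_V2_LIMIT, CGROUP_V1_LIMIT):
        try:
            return parse_cgroup_limit(path.read_text())
        except (OSError, ValueError):
            continue
    return None


limit = read_cgroup_limit()
if limit is None:
    print("no cgroup memory limit detected")
else:
    print(f"cgroup memory limit: {limit / 1024 ** 3:.1f} GiB")
```

A reported limit near 8 GiB would point at the container configuration rather than at Ollama; no limit keeps the focus on Ollama's own calculation.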

Example

No specific code snippet can be provided without more details on Ollama's internal memory calculation logic. The issue does not identify any documented setting that overrides the pre-flight memory check. A variable such as OLLAMA_MEMORY_THRESHOLD_OVERRIDE=true is purely hypothetical, named here only to show what such an override might look like, and should not be expected to exist.

Notes

The issue seems to stem from a discrepancy in how Ollama calculates available memory versus the actual available memory as reported by the system. The fact that setting OLLAMA_GPU_LAYERS=999 does not resolve the issue suggests that the problem lies in the pre-flight memory check rather than GPU-specific memory allocation.

Recommendation

The issue mentions no fixed version to upgrade to, so the practical option is a workaround: look for Ollama configuration or environment variables that affect how available system memory is estimated. This targets the observed discrepancy in the memory figure, which is the most likely cause of the failure, although the issue does not confirm the root cause.
