ollama - ✅(Solved) Fix False "insufficient memory" error in LXC: Ollama appears to use MemFree instead of MemAvailable when loading models [1 pull requests, 1 participants]

ollama · 2026-04-19 15:54:28

GitHub stats

ollama/ollama#15704 • Fetched 2026-04-20 11:59:13

Author

avgex

Participants

avgex

Timeline (top)

cross-referenced ×1 · labeled ×1 · subscribed ×1

Error Message

Apr 19 17:32:08 ollama ollama[199195]: time=2026-04-19T17:32:08.568+02:00 level=WARN source=server.go:1058 msg="model request too large for system" requested="18.3 GiB" available="12.6 GiB" total="32.0 GiB" free="12.1 GiB" swap="506.2 MiB"
Apr 19 17:32:08 ollama ollama[199195]: time=2026-04-19T17:32:08.568+02:00 level=INFO source=sched.go:511 msg="Load failed" model=/usr/share/ollama/.ollama/models/blobs/sha256-7121486771cbfe218851513210c40b35dbdee93ab1ef43fe36283c883980f0df error="model requires more system memory (18.3 GiB) than is available (12.6 GiB)"

Root Cause

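Per the description of PR #15713, the cgroup memory check computed free memory as memory.max - memory.current. Because memory.current includes reclaimable page cache, the result is closer to MemFree than to MemAvailable.

Notes: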

Restarting only the ollama service does not help
Restarting the whole container may temporarily help because cache is cleared
This looks especially relevant for containerized environments such as LXC

Fix Action

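Fix proposed (not yet merged)

Proposed fix: PR "fix: use MemAvailable equivalent in cgroup memory check" (https://github.com/ollama/ollama/pull/15713). The PR was open and unmerged when this page was fetched on 2026-04-20.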

PR fix notes

PR #15713: fix: use MemAvailable equivalent in cgroup memory check

Repository: ollama/ollama
Author: ssam18
State: open | merged: False
Link: https://github.com/ollama/ollama/pull/15713

Description (problem / solution / changelog)

In LXC and Docker containers, memory.current includes reclaimable page cache, so computing free memory as memory.max - memory.current was giving a result closer to MemFree than MemAvailable, causing false "insufficient memory" errors even when Linux could reclaim enough cache to load the model.

The fix reads inactive_file from /sys/fs/cgroup/memory.stat and subtracts it from the used total before computing free memory, matching how the kernel calculates MemAvailable. Fixes #15704
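
The adjustment described above can be sketched in Python. This is an illustration of the arithmetic only, not the PR's Go code: the function names are hypothetical, and the memory.current value is an assumption approximated as MemTotal - MemFree from the issue's /proc/meminfo output, because the issue does not quote memory.current directly.

```python
def parse_memory_stat(text):
    """Parse cgroup v2 memory.stat content ("key value" per line) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def cgroup_free_bytes(limit, current, inactive_file=0):
    """Free bytes in a cgroup, not counting inactive file cache as used memory."""
    used = max(current - inactive_file, 0)
    return max(limit - used, 0)


# Values copied from the issue's /sys/fs/cgroup/memory.stat output.
MEMORY_STAT = """\
anon 36884480
file 21266915328
inactive_file 21189849088
active_file 77045760
"""

KIB = 1024
GIB = 1024 ** 3

limit = 33554432 * KIB                 # MemTotal from the issue (32 GiB)
current = (33554432 - 12726260) * KIB  # ASSUMPTION: MemTotal - MemFree stands in for memory.current

stats = parse_memory_stat(MEMORY_STAT)
naive = cgroup_free_bytes(limit, current)                             # limit - current
adjusted = cgroup_free_bytes(limit, current, stats["inactive_file"])  # cache excluded from "used"

print(f"naive:    {naive / GIB:.1f} GiB")
print(f"adjusted: {adjusted / GIB:.1f} GiB")
```

Under these assumptions the naive figure lands near the 12.1 GiB that Ollama logged as free, while the adjusted figure is well above the 18.3 GiB the model requested.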

Changed files

discover/cpu_linux.go (modified, +23/-1)
discover/cpu_linux_test.go (modified, +55/-0)


What is the issue?

Ollama refuses to load a model in an LXC container even though the container has enough reclaimable memory available.

Environment:

Proxmox LXC container
Debian guest
Container memory: 32 GiB
Container swap: 512 MiB
CPU inference only

Observed behavior: When I try to load gemma4:26b, Ollama returns:

model requires more system memory (18.3 GiB) than is available (12.6 GiB)

However, inside the container:

MemFree is about 12 GiB
MemAvailable is about 31.9 GiB
almost no RAM is used by processes
about 21 GiB is file cache, mostly inactive_file, so it should be reclaimable by Linux

This suggests Ollama is using something close to MemFree + SwapFree for its pre-load memory check, while ignoring reclaimable page cache / MemAvailable.
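
The MemFree + SwapFree hypothesis can be checked against the numbers reported in this issue. The sketch below is a minimal Python illustration using the /proc/meminfo values quoted in the issue; the helper names are hypothetical and it makes no claim about Ollama's actual code path.

```python
# /proc/meminfo values as reported inside the container (kB here means KiB).
MEMINFO = """\
MemTotal:       33554432 kB
MemFree:        12726260 kB
MemAvailable:   33508662 kB
SwapTotal:        524288 kB
SwapFree:         518352 kB
"""


def parse_meminfo(text):
    """Return /proc/meminfo fields as a dict mapping field name to KiB."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest:
            fields[key.strip()] = int(rest.split()[0])
    return fields


def kib_to_gib(kib):
    return kib / (1024 * 1024)


info = parse_meminfo(MEMINFO)
free_plus_swap = kib_to_gib(info["MemFree"] + info["SwapFree"])
available_plus_swap = kib_to_gib(info["MemAvailable"] + info["SwapFree"])

print(f"MemFree + SwapFree:      {free_plus_swap:.1f} GiB")
print(f"MemAvailable + SwapFree: {available_plus_swap:.1f} GiB")
```

MemFree + SwapFree rounds to 12.6 GiB, which matches the available="12.6 GiB" value in the log, while the MemAvailable-based sum exceeds the 18.3 GiB requested.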

Why I believe this is incorrect:

free -h shows:
- free: ~12 GiB
- available: ~31 GiB
/sys/fs/cgroup/memory.stat shows:
- anon: ~35 MiB
- file: ~21.2 GiB
- inactive_file: ~21.1 GiB
ps aux --sort=-%mem shows no large memory-consuming processes

So the memory is not actually occupied by applications. It is mostly page cache.

Expected behavior: Ollama should either:

use MemAvailable instead of only MemFree, or
account for reclaimable file cache, or
provide an override / more aggressive loading option

Actual behavior: Ollama refuses to load the model even though Linux should be able to reclaim enough cache to make room.

Notes:

Restarting only the ollama service does not help
Restarting the whole container may temporarily help because cache is cleared
This looks especially relevant for containerized environments such as LXC

Relevant log output

Apr 19 17:31:46 ollama ollama[199195]: time=2026-04-19T17:31:46.564+02:00 level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="32.0 GiB" available="12.1 GiB"
Apr 19 17:32:08 ollama ollama[199195]: time=2026-04-19T17:32:08.309+02:00 level=INFO source=sched.go:484 msg="system memory" total="32.0 GiB" free="12.1 GiB" free_swap="506.2 MiB"
Apr 19 17:32:08 ollama ollama[199195]: time=2026-04-19T17:32:08.568+02:00 level=WARN source=server.go:1058 msg="model request too large for system" requested="18.3 GiB" available="12.6 GiB" total="32.0 GiB" free="12.1 GiB" swap="506.2 MiB"
Apr 19 17:32:08 ollama ollama[199195]: time=2026-04-19T17:32:08.568+02:00 level=INFO source=device.go:245 msg="model weights" device=CPU size="17.3 GiB"
Apr 19 17:32:08 ollama ollama[199195]: time=2026-04-19T17:32:08.568+02:00 level=INFO source=device.go:256 msg="kv cache" device=CPU size="880.0 MiB"
Apr 19 17:32:08 ollama ollama[199195]: time=2026-04-19T17:32:08.568+02:00 level=INFO source=device.go:267 msg="compute graph" device=CPU size="133.7 MiB"
Apr 19 17:32:08 ollama ollama[199195]: time=2026-04-19T17:32:08.568+02:00 level=INFO source=device.go:272 msg="total memory" size="18.3 GiB"
Apr 19 17:32:08 ollama ollama[199195]: time=2026-04-19T17:32:08.568+02:00 level=INFO source=sched.go:511 msg="Load failed" model=/usr/share/ollama/.ollama/models/blobs/sha256-7121486771cbfe218851513210c40b35dbdee93ab1ef43fe36283c883980f0df error="model requires more system memory (18.3 GiB) than is available (12.6 GiB)"
Apr 19 17:32:08 ollama ollama[199195]: [GIN] 2026/04/19 - 17:32:08 | 500 | 595.882098ms | 192.168.1.239 | POST "/api/chat"

Inside the container:

free -h
---------------
               total        used        free      shared  buff/cache   available
Mem:            32Gi        44Mi        12Gi        72Ki        19Gi        31Gi
Swap:          512Mi       5.8Mi       506Mi

/proc/meminfo
---------------
MemTotal:       33554432 kB
MemFree:        12726260 kB
MemAvailable:   33508662 kB
SwapTotal:        524288 kB
SwapFree:         518352 kB

/sys/fs/cgroup/memory.stat
---------------
anon 36884480
file 21266915328
inactive_file 21189849088
active_file 77045760

OS

Linux

GPU

No response

CPU

Intel

Ollama version

0.20.5

extent analysis

TL;DR

Ollama's memory check seems to ignore reclaimable page cache, causing it to refuse loading a model despite sufficient available memory.

Guidance

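Verify the diagnosis: Run sync; echo 3 > /proc/sys/vm/drop_caches as root to drop the page cache, then retry the load. If the model loads afterward, the failure was caused by page cache being counted as used memory. In an unprivileged LXC container this file may not be writable, in which case the command has to be run on the host.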
Check Ollama configuration: Investigate if there's a configuration option or environment variable that can adjust Ollama's memory checking behavior to account for reclaimable memory.
Monitor system memory: Use tools like sysdig or systemd-cgtop to monitor memory usage and cgroup limits in real-time, ensuring no other processes are consuming memory unexpectedly.
Consider a temporary workaround: If the model can fit into memory after clearing the cache, consider implementing a script to periodically clear the cache before loading the model, though this is not a permanent solution.
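
Before resorting to the cache-clearing workaround, it may help to confirm that a failed load is actually cache-bound. The Python sketch below is one hypothetical way to classify a request from /proc/meminfo-style values; the function name and the three category labels are assumptions for illustration, not Ollama behavior.

```python
def diagnose(requested_kib, mem_free_kib, mem_available_kib, swap_free_kib=0):
    """Classify a model load request against free and available memory (all in KiB).

    Returns:
      "fits"        - the request fits in MemFree + SwapFree already
      "cache-bound" - it only fits once reclaimable cache (MemAvailable) is counted
      "too-large"   - it does not fit even with reclaimable cache counted
    """
    if requested_kib <= mem_free_kib + swap_free_kib:
        return "fits"
    if requested_kib <= mem_available_kib + swap_free_kib:
        return "cache-bound"
    return "too-large"


# Values from this issue: 18.3 GiB requested, /proc/meminfo readings in KiB.
requested_kib = int(18.3 * 1024 * 1024)
result = diagnose(
    requested_kib,
    mem_free_kib=12726260,
    mem_available_kib=33508662,
    swap_free_kib=518352,
)
print(result)
```

For the values in this issue the request falls in the "cache-bound" category, which is the only case where dropping the page cache would be expected to change the outcome.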

Example

No specific code example is provided due to the nature of the issue, which seems related to how Ollama interprets system memory availability rather than a code snippet that can be directly applied.

Notes

The observed behavior points to how Ollama calculates available memory, specifically that it does not count reclaimable page cache. The issue was reported against version 0.20.5, and PR #15713 proposes a fix, so check whether a release containing that change is available.

Recommendation

Apply a workaround, such as periodically clearing the page cache before attempting to load the model, until a more permanent solution or update to Ollama that correctly accounts for reclaimable memory is available.
