ollama - ✅(Solved) Fix Quantization bugs [1 pull request, 1 participant]

ollama, 2026-05-01 22:01:22

GitHub stats

ollama/ollama#15925 • Fetched 2026-05-02 05:27:36

Author

levicki

Participants

levicki

Timeline (top)

cross-referenced ×1, labeled ×1

Root Cause

Given an unsupported quantization type, ollama.exe would:

1. Import all model files into the blobs folder
2. Merge the layers into a single GGUF (58.3 GB)
3. Fail the quantization only at the very end, because Q5_K_M is not supported
4. Leave all the files it created in the blobs folder

PR fix notes

PR #15927: fix(server/create): validate quantize up-front before importing blobs

Repository: ollama/ollama
Author: SAY-5
State: open | merged: False
Link: https://github.com/ollama/ollama/pull/15927

Description (problem / solution / changelog)

Closes #15925.

CreateHandler accepted any string as the quantization argument and deferred validation until quantizeLayer -> ggml.ParseFileType ran at the end of the create flow, after all source files had already been imported into blobs/ and merged into a single GGUF. A user who typed Q5_K_M instead of Q4_K_M therefore lost ~116GB of disk writes and several minutes of CPU before seeing unsupported quantization type Q5_K_M, with the orphan blobs left behind.

Patch

Run the same ggml.ParseFileType check next to the existing r.Model / r.Files validations in CreateHandler so a typo returns HTTP 400 before any I/O. The downstream quantizeLayer call still runs the same parse, so the success path is unchanged.
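The fail-fast ordering described in the patch can be illustrated with a minimal Python sketch. It is not the Go code from the PR: `validate_quantize`, `create_model`, `BadRequest`, and the `import_blobs` callback are hypothetical names, and the list of supported types is copied from the error message quoted in the reproduction. The sketch also assumes that an empty value means "no quantization requested".

```python
# Supported types, copied from the error message quoted in the PR description.
SUPPORTED_QUANT_TYPES = ("F32", "F16", "Q4_K_S", "Q4_K_M", "Q8_0")


class BadRequest(Exception):
    """Hypothetical stand-in for an HTTP 400 response."""


def validate_quantize(quantize: str) -> None:
    """Raise BadRequest if a quantization type was given but is not supported."""
    if not quantize:
        return  # assumption: an empty value means no quantization was requested
    if quantize not in SUPPORTED_QUANT_TYPES:
        supported = ", ".join(SUPPORTED_QUANT_TYPES)
        raise BadRequest(
            f"unsupported quantization type {quantize} - supported types are {supported}"
        )


def create_model(model: str, files: list, quantize: str, import_blobs) -> None:
    """Hypothetical create flow: every cheap check runs before any disk I/O."""
    if not model:
        raise BadRequest("model is required")
    validate_quantize(quantize)  # fails here, before anything is written
    import_blobs(files)          # expensive step: only reached with valid input
```

With this ordering a typo such as `Q5_K_M` is rejected before `import_blobs` is ever called, which is the behaviour the PR description reports after the fix.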

Verification

go vet ./server/
go test ./server/ -run TestCreate

both clean.

Reproduction (after the fix):

$ curl -X POST http://localhost:11434/api/create -d '{"model":"my-q5", "from":"safetensors-path", "quantize":"Q5_K_M"}'
{"error":"unsupported quantization type Q5_K_M - supported types are F32, F16, Q4_K_S, Q4_K_M, Q8_0"}

No blobs imported, no GGUF merge attempted.

Changed files

server/create.go (modified, +9/-0)

What is the issue?

I tried creating a model from safetensors with quantization.

I accidentally typed Q5_K_M instead of Q4_K_M.

ollama.exe proceeded to:

1. Import all model files into the blobs folder
2. Merge the layers into a single GGUF (58.3 GB)
3. Fail the quantization only at the very end, because Q5_K_M is not supported
4. Leave all the files it created in the blobs folder

TL;DR: It wasted a total of 116.6 GB of space and NVMe disk writes, and about 5 minutes of CPU churning and RAM paging, only to tell me "I can't do that Dave" and leave the mess behind.

And why?

Because someone wrote code that:

- does not validate command-line arguments up front
- does not use a temp folder or delete-on-close semantics for files that should not persist
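The second point can be sketched in Python as a staging directory that is deleted automatically when the work fails. This is only an illustration of the idea under stated assumptions, not how Ollama manages its blobs: `build_in_staging` and the `produce` callback are hypothetical names, and the staging directory is created inside the destination folder so that the final rename stays on one filesystem.

```python
import os
import tempfile
from pathlib import Path


def build_in_staging(blobs_dir: Path, name: str, produce) -> Path:
    """Run produce(staging_dir), which returns the path of the finished file,
    and move that file into blobs_dir only if produce succeeds."""
    blobs_dir.mkdir(parents=True, exist_ok=True)
    # Stage inside blobs_dir so os.replace below never crosses filesystems.
    with tempfile.TemporaryDirectory(dir=blobs_dir, prefix=".staging-") as tmp:
        result = produce(Path(tmp))   # may write many large intermediate files
        final = blobs_dir / name
        os.replace(result, final)     # promote the finished file
        return final
    # If produce raises, the context manager removes the staging directory
    # and everything in it, so no intermediate files are left behind.
```

A failed quantization step inside `produce` would then leave the blobs folder exactly as it was before the run.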

Finally, there's no option to do the simplest housekeeping on the blobs folder:

1. Enumerate manifests
2. Build a list of hashes
3. Enumerate blobs to find files
4. Remove all files found in manifests from the list
5. Present the remaining files and their sizes to the user
6. Offer the user to delete them
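The housekeeping steps above can be sketched in Python as follows. This is not the script mentioned in the issue and not an Ollama feature. It assumes a models directory containing a `manifests` tree of JSON files whose `config` and `layers` entries carry a `digest` such as `sha256:<hex>`, and a `blobs` folder whose files are named `sha256-<hex>`. The function names are hypothetical, and the sketch only reports candidates; deletion is left to the caller.

```python
import json
from pathlib import Path


def referenced_digests(manifests_dir: Path) -> set:
    """Collect blob file names referenced by any manifest (assumed JSON layout)."""
    names = set()
    for path in manifests_dir.rglob("*"):
        if not path.is_file():
            continue
        try:
            manifest = json.loads(path.read_text(encoding="utf-8"))
        except (ValueError, OSError):
            continue  # skip files that are not readable JSON
        if not isinstance(manifest, dict):
            continue
        entries = list(manifest.get("layers") or [])
        if manifest.get("config"):
            entries.append(manifest["config"])
        for entry in entries:
            digest = entry.get("digest") if isinstance(entry, dict) else None
            if digest:
                # assumption: "sha256:<hex>" in manifests maps to "sha256-<hex>" on disk
                names.add(digest.replace(":", "-"))
    return names


def find_orphans(models_dir: Path) -> list:
    """Return (path, size_in_bytes) for blobs that no manifest references."""
    used = referenced_digests(models_dir / "manifests")
    orphans = []
    for blob in sorted((models_dir / "blobs").iterdir()):
        if blob.is_file() and blob.name not in used:
            orphans.append((blob, blob.stat().st_size))
    return orphans
```

A caller could print each path with its size and ask for confirmation before calling `Path.unlink()`. Such a check should only run while no model is being created or pulled, because blobs written by an in-progress operation are not yet referenced by any manifest and would be reported as orphans.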

Claude 4.6 (free version) wrote this workflow in Python for me in less than 10 seconds. I see no good reason not to have this as a command-line option (for example, ollama.exe gc).

Relevant log output

OS

Windows 10

GPU

RTX 5090

CPU

Xeon w5-2455X

Ollama version

0.22.1

extent analysis

TL;DR

Implement input validation for command-line arguments and consider adding a temporary folder with delete-on-close semantics to prevent unnecessary file creation and disk usage.

Guidance

Validate command-line arguments at the beginning of the process to prevent unnecessary computations and file creations.
Consider implementing a temporary folder with delete-on-close semantics to automatically remove files that should not persist.
Add a command-line option (e.g., ollama.exe gc) to perform housekeeping on the blobs folder, allowing users to easily remove unnecessary files.
When implementing the housekeeping option, follow the suggested steps: enumerate manifests, build a list of hashes, enumerate blobs, remove files found in manifests, and present the remaining files and their sizes to the user for potential deletion.

Example

The issue itself contains no code. The linked PR #15927 adds the up-front check to server/create.go (+9/-0).

Notes

The suggested solution focuses on improving the input validation and file management within the ollama.exe tool. The exact implementation details may vary depending on the tool's internal architecture and the programming language used.

Recommendation

Workaround until a fix ships: PR #15927 was still open and unmerged when this page was fetched, so check the quantize value against the supported types (F32, F16, Q4_K_S, Q4_K_M, Q8_0) before running create. The longer-term fixes are up-front validation in the server and temporary storage with delete-on-close semantics, so that a failed run leaves no files behind.
