ollama - ✅(Solved) Fix Quantization bugs [1 pull requests, 1 participants]

GitHub stats
ollama/ollama#15925 (fetched 2026-05-02 05:27:36)
Comments: 0 · Participants: 1 · Timeline: 2 · Reactions: 0
Timeline (top): cross-referenced ×1, labeled ×1

Root Cause

  1. Import all model files into blobs folder
  2. Merge layers to single GGUF (58.3 GB)
  3. Only at the very end fail the quantization because Q5_K_M is not supported
  4. Leave all the files it created in blobs folder

PR fix notes

PR #15927: fix(server/create): validate quantize up-front before importing blobs

Description (problem / solution / changelog)

Closes #15925.

CreateHandler accepted any string as the quantization argument and deferred validation until quantizeLayer -> ggml.ParseFileType ran at the end of the create flow, after all source files had already been imported into blobs/ and merged into a single GGUF. A user who typed Q5_K_M instead of Q4_K_M therefore lost ~116GB of disk writes and several minutes of CPU before seeing unsupported quantization type Q5_K_M, with the orphan blobs left behind.

Patch

Run the same ggml.ParseFileType check next to the existing r.Model / r.Files validations in CreateHandler so a typo returns HTTP 400 before any I/O. The downstream quantizeLayer call still runs the same parse, so the success path is unchanged.
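The ordering described in the patch can be illustrated with a minimal sketch. This is not ollama's Go code: `validate_quantize`, `create_model`, `BadRequest`, and the two callback parameters are hypothetical names chosen for illustration, and treating an empty `quantize` value as "no quantization requested" is an assumption. The list of supported types is copied from the error message quoted in the PR.

```python
# Hypothetical sketch of "validate first, then do I/O"; not ollama's actual code.

# Supported types as listed in the error message quoted in PR #15927.
SUPPORTED_QUANT_TYPES = ("F32", "F16", "Q4_K_S", "Q4_K_M", "Q8_0")


class BadRequest(ValueError):
    """Stands in for an HTTP 400 response."""


def validate_quantize(quantize: str) -> None:
    """Cheap check that touches no files. Empty means 'do not quantize' (assumption)."""
    if quantize and quantize not in SUPPORTED_QUANT_TYPES:
        raise BadRequest(
            f"unsupported quantization type {quantize} - "
            f"supported types are {', '.join(SUPPORTED_QUANT_TYPES)}"
        )


def create_model(quantize, import_blobs, merge_and_quantize):
    """Run the expensive steps only after the request has been validated."""
    validate_quantize(quantize)          # fails fast, before any disk writes
    blobs = import_blobs()               # expensive: copies source files
    return merge_and_quantize(blobs, quantize)  # expensive: merge + quantize
```

With this ordering, a call such as `create_model("Q5_K_M", ...)` raises `BadRequest` before `import_blobs` is ever invoked, which mirrors the "no blobs imported" outcome the PR describes.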

Verification

go vet ./server/
go test ./server/ -run TestCreate

Both commands run clean.

Reproduction (after the fix):

$ curl -X POST http://localhost:11434/api/create -d '{"model":"my-q5", "from":"safetensors-path", "quantize":"Q5_K_M"}'
{"error":"unsupported quantization type Q5_K_M - supported types are F32, F16, Q4_K_S, Q4_K_M, Q8_0"}

No blobs imported, no GGUF merge attempted.

Changed files

  • server/create.go (modified, +9/-0)

What is the issue?

I tried creating a model from safetensors with quantization.

I accidentally typed Q5_K_M instead of Q4_K_M.

ollama.exe proceeded to:

  1. Import all model files into blobs folder
  2. Merge layers to single GGUF (58.3 GB)
  3. Only at the very end fail the quantization because Q5_K_M is not supported
  4. Leave all the files it created in blobs folder

TL;DR: It wasted a total of 116.6 GB of disk space and NVMe writes, plus about 5 minutes of CPU churning and RAM paging, only to tell me "I can't do that Dave" and leave the mess behind.

And why?

Because someone wrote the code that:

  • does not validate command-line arguments up front
  • does not use a temp folder nor delete-on-close semantics for files that should not persist
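The second point can be sketched as follows. This is a generic pattern, not how ollama manages its blobs folder: intermediate output goes into a scoped temporary directory and is moved to its final location only on success. `build_artifact` and `write_fn` are hypothetical names, and a scoped temporary directory is used here in place of true OS-level delete-on-close handles.

```python
import os
import tempfile


def build_artifact(final_path, write_fn):
    """Write into a temp dir next to final_path; publish only on success.

    write_fn receives an open binary file and may raise (for example on an
    unsupported quantization type). Whatever happens, the temp dir is removed.
    """
    parent = os.path.dirname(os.path.abspath(final_path))
    # Same parent directory keeps the final rename on one filesystem.
    with tempfile.TemporaryDirectory(dir=parent) as tmp:
        tmp_path = os.path.join(tmp, "artifact.partial")
        with open(tmp_path, "wb") as f:
            write_fn(f)
        # Reached only if write_fn succeeded; replaces any existing file.
        os.replace(tmp_path, final_path)
    return final_path
```

If `write_fn` raises, the exception propagates to the caller and nothing is left behind in the target directory, so a failed run costs the writes but not the cleanup.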

Finally, there's no option to do the simplest housekeeping on blobs folder:

  • Enumerate manifests
  • Build a list of hashes
  • Enumerate blobs to find files
  • Remove all files found in manifests from the list
  • Present the remaining files and their sizes to user
  • Offer the user to delete them
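The steps above can be sketched in a few lines of Python. This is an illustration of the workflow, not an ollama feature, and it rests on assumptions that should be checked against a real installation: that the models directory contains `manifests/` and `blobs/` subfolders, that each manifest is a JSON file with a `config` entry and a `layers` list carrying `digest` fields such as `sha256:<hex>`, and that blob files are named `sha256-<hex>`. The function names are hypothetical.

```python
import json
from pathlib import Path


def referenced_digests(manifests_dir: Path) -> set[str]:
    """Collect blob file names referenced by any manifest (layout is assumed)."""
    names = set()
    for path in manifests_dir.rglob("*"):
        if not path.is_file():
            continue
        try:
            manifest = json.loads(path.read_text(encoding="utf-8"))
        except (ValueError, OSError):
            continue  # not a readable JSON manifest; ignore it
        if not isinstance(manifest, dict):
            continue
        entries = list(manifest.get("layers") or [])
        if manifest.get("config"):
            entries.append(manifest["config"])
        for entry in entries:
            digest = entry.get("digest") if isinstance(entry, dict) else None
            if digest:
                # Assumed mapping: digest "sha256:abc" -> file "sha256-abc".
                names.add(digest.replace(":", "-"))
    return names


def find_orphans(models_dir: Path) -> list[tuple[Path, int]]:
    """Return (path, size_in_bytes) for blobs that no manifest references."""
    keep = referenced_digests(models_dir / "manifests")
    orphans = []
    for blob in sorted((models_dir / "blobs").iterdir()):
        if blob.is_file() and blob.name not in keep:
            orphans.append((blob, blob.stat().st_size))
    return orphans
```

A caller would print the returned list, ask for confirmation, and call `path.unlink()` on each confirmed entry. Deletion is deliberately left out of the sketch: a blob belonging to a create or pull that is still in progress would also appear unreferenced, so the scan should only be run while nothing else is writing to the folder.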

Claude 4.6 (free version) wrote this workflow in Python for me in less than 10 seconds, so I see no good reason not to have it as a command-line option (for example, ollama.exe gc).

Relevant log output

OS

Windows 10

GPU

RTX 5090

CPU

Xeon w5-2455X

Ollama version

0.22.1

extent analysis

TL;DR

Implement input validation for command-line arguments and consider adding a temporary folder with delete-on-close semantics to prevent unnecessary file creation and disk usage.

Guidance

  • Validate command-line arguments at the beginning of the process to prevent unnecessary computations and file creations.
  • Consider implementing a temporary folder with delete-on-close semantics to automatically remove files that should not persist.
  • Add a command-line option (e.g., ollama.exe gc) to perform housekeeping on the blobs folder, allowing users to easily remove unnecessary files.
  • When implementing the housekeeping option, follow the suggested steps: enumerate manifests, build a list of hashes, enumerate blobs, remove files found in manifests, and present the remaining files and their sizes to the user for potential deletion.

Example

The issue itself includes no code. The linked PR #15927 adds the up-front ggml.ParseFileType check to CreateHandler in server/create.go (+9/-0).

Notes

The suggested solution focuses on improving input validation and file management in ollama. PR #15927 covers only the validation part (server/create.go, +9/-0); temporary-file handling and a gc command are not part of that change.

Recommendation

Validate the quantization type before any I/O, as PR #15927 does, and consider temporary files with delete-on-close semantics for intermediate artifacts. On a build without the fix, check the quantize value against the supported types (F32, F16, Q4_K_S, Q4_K_M, Q8_0) before running create, since a typo is only reported after the import and merge have finished.
