ollama - 💡(How to fix) Fix CRITICAL: GGUF Models Have Corrupted F32 Norm Weights (upstream llama.cpp bug) [1 comments, 1 participants]

ollama/ollama#15908 · Fetched 2026-05-01 05:33:21
Comments: 1 · Participants: 1 · Timeline: 2 (closed ×1, commented ×1) · Reactions: 0

Upstream Bug Affecting All Ollama Models

Upstream Issue: https://github.com/ggml-org/llama.cpp/issues/22565

Summary

We discovered that F32 norm weights in GGUF files are corrupted during SafeTensors→GGUF conversion. This affects all Ollama models because Ollama uses GGUF format.

Confirmed Affected Models

  • Qwen2.5-3B: norm weights differ from the SafeTensors values by a factor of 1.18x-4x
  • Gemma 4-E4B: norm weights differ from the SafeTensors values by a factor of 0.31x-9.58x

Likely affects all models in the Ollama library.

Impact

  • ✅ Models run without crashing
  • Significantly degraded output quality (token predictions 3-10x off)
  • ❌ Users blame 'model quality' when it's actually file corruption
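
To illustrate why a wrong norm weight degrades output without causing a crash, here is a minimal sketch of RMSNorm in plain Python. The input vector, the all-ones reference weight, and the 3x scaling error are hypothetical values chosen for illustration; they are not taken from the issue.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root-mean-square, then scale elementwise by weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]

# Hypothetical activations and reference gains (illustration only).
x = [0.5, -1.0, 2.0, 0.25]
reference_weight = [1.0, 1.0, 1.0, 1.0]

# Hypothetical corruption: every gain is 3x too small.
scaled_weight = [w / 3.0 for w in reference_weight]

ref_out = rms_norm(x, reference_weight)
bad_out = rms_norm(x, scaled_weight)

# The layer still produces finite values, but each one is off by the same factor.
ratios = [r / b for r, b in zip(ref_out, bad_out)]
print(ratios)  # each ratio is approximately 3
```

Because the weight enters as a plain elementwise multiplier, an error in it passes straight through to the layer output and nothing in the forward pass flags it.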

Reproduction

See full details in upstream issue: https://github.com/ggml-org/llama.cpp/issues/22565

We compared GGUF F32 weights to original SafeTensors numerically and found systematic corruption in RMSNorm weights.
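
A numeric comparison of this kind could look like the sketch below. It assumes both tensors are already loaded as flat lists of floats and that "Nx off" means the elementwise ratio of the GGUF value to the SafeTensors value; the function name `ratio_range` and the sample values are hypothetical. Tensor names generally differ between the two formats, so pairing up the right tensors is left to the caller.

```python
def ratio_range(reference, candidate, tiny=1e-12):
    """Return (min, max) of candidate/reference over elements whose reference is not ~0."""
    if len(reference) != len(candidate):
        raise ValueError("tensor sizes differ")
    ratios = [c / r for r, c in zip(reference, candidate) if abs(r) > tiny]
    if not ratios:
        raise ValueError("no usable reference elements")
    return min(ratios), max(ratios)

# Hypothetical values for illustration; not the actual model weights.
safetensors_norm = [1.0, 2.0, 0.5, 4.0]
gguf_norm = [1.18, 4.0, 1.0, 16.0]

lo, hi = ratio_range(safetensors_norm, gguf_norm)
print(f"{lo:.2f}x-{hi:.2f}x")  # prints "1.18x-4.00x"
```

A range that collapses to 1.00x-1.00x would indicate the two files agree for that tensor.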

Recommended Actions

  1. Acknowledge the issue - users should know that current models may be corrupted
  2. Monitor the upstream fix - the llama.cpp team needs to fix the conversion
  3. Re-publish models - once fixed, regenerate all GGUFs

Priority

P0 CRITICAL - Affects the entire Ollama library and millions of users.


Discovered while debugging an inference engine. Expected logit 19.46, got 6.44 (about 3x lower). Root cause: GGUF weights don't match the ground-truth SafeTensors weights.

Extent Analysis

TL;DR

The most likely resolution is to wait for the upstream fix in llama.cpp and then re-publish all Ollama models with corrected GGUF files.

Guidance

  • Monitor the upstream issue https://github.com/ggml-org/llama.cpp/issues/22565 for a fix to the SafeTensors→GGUF conversion corruption.
  • Acknowledge the issue to users, as current models may be corrupted, affecting output quality.
  • Prepare to re-publish all Ollama models once the upstream fix is available, to ensure corrected GGUF files are used.
  • Verify the fix by comparing the norm weights of the re-published GGUF files to the original SafeTensors, ensuring they match within an acceptable margin.

Example

The issue itself does not include a code snippet.
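
Independently of the issue, the verification step from the guidance (checking that re-published norm weights match the originals within a margin) could be sketched as follows. This assumes each file's norm tensors have already been loaded into a dict mapping a shared tensor name to a flat list of floats; the function name, the tensor names, the values, and the tolerances are all hypothetical.

```python
import math

def mismatched_tensors(reference, candidate, rel_tol=1e-6, abs_tol=1e-8):
    """Return sorted names of tensors that are missing, differently sized, or out of tolerance."""
    bad = []
    for name, ref in reference.items():
        cand = candidate.get(name)
        if cand is None or len(cand) != len(ref):
            bad.append(name)
            continue
        if not all(math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
                   for r, c in zip(ref, cand)):
            bad.append(name)
    return sorted(bad)

# Hypothetical tensors for illustration; real names differ between formats.
original = {"layer0.norm": [1.0, 0.5], "layer1.norm": [2.0, 4.0]}
republished = {"layer0.norm": [1.0, 0.5], "layer1.norm": [2.0, 1.0]}

print(mismatched_tensors(original, republished))  # prints ['layer1.norm']
```

An empty list would mean every norm tensor matched within the chosen tolerances; which tolerance is acceptable depends on the storage precision and is a judgment call.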

Notes

This solution relies on the upstream fix being implemented and released. The exact timeline for this fix is uncertain and depends on the llama.cpp team's progress.

Recommendation

Wait for the upstream fix, then re-publish the models. This is the most reliable way to ensure corrected GGUF files are used and to address the significant degradation in output quality.
