ollama - 💡 (How to fix) panic: mlx.Unpin: negative pin count on array "CONTIGUOUS" [1 comment, 1 participant]

GitHub stats
ollama/ollama#15341 · Fetched 2026-04-08 02:52:31
Comments: 1 · Participants: 1 · Timeline events: 5 · Reactions: 0
Timeline (top): labeled ×3, closed ×1, commented ×1

Error Message

I tried to import a fine-tuned Qwen3.5-35b safetensors model to MLX mxfp8, but got this error: panic: mlx.Unpin: negative pin count on array "CONTIGUOUS"

Code Example

Screen 1:
GOMAXPROCS=1 ollama serve

Screen 2:
GOMAXPROCS=1 ollama create qwen3.5:35b-a3b-heretic-mxfp8 --experimental -f q35-35b.modelfile -q mxfp8
importing safetensors model
importing model-00001-of-00002.safetensors (22199 tensors, quantizing to mxfp8)
importing model-00002-of-00002.safetensors (9467 tensors, quantizing to mxfp8)
packing language_model.model.layers.0.mlp.experts (768 tensors)
panic: mlx.Unpin: negative pin count on array "CONTIGUOUS"

goroutine 1 [running, locked to thread]:
github.com/ollama/ollama/x/mlxrunner/mlx.Unpin({0x1400165e008?, 0x10500a404?, 0x140006e1ce0?})
        /Users/runner/work/ollama/ollama/x/mlxrunner/mlx/array.go:141 +0xa8
github.com/ollama/ollama/x/create/client.stackAndQuantizeExpertGroup.func1()
        /Users/runner/work/ollama/ollama/x/create/client/quantize.go:342 +0x2c
github.com/ollama/ollama/x/create/client.stackAndQuantizeExpertGroup({0x140004b8ab0, 0x29?}, 0x140030b8ba0, {0x1054c92b0, 0x5})
        /Users/runner/work/ollama/ollama/x/create/client/quantize.go:435 +0x984
github.com/ollama/ollama/x/create/client.quantizePackedGroup({0x140004b8ab0, 0x29}, {0x14000714000, 0x300, 0x14000044958?})
        /Users/runner/work/ollama/ollama/x/create/client/quantize.go:178 +0xd8
github.com/ollama/ollama/x/create/client.CreateModel.newPackedTensorLayerCreator.func5({0x140004b8ab0, 0x29}, {0x14000714000?, 0x2?, 0x3a2})
        /Users/runner/work/ollama/ollama/x/create/client/create.go:293 +0xc8
github.com/ollama/ollama/x/create.CreateSafetensorsModel({0x16bba785e, 0x1d}, {0x14000396340, 0x32}, {0x16bba78a3, 0x5}, 0x105a7c378, 0x105a7c390, 0x140000457c8, 0x14000045840, ...)
        /Users/runner/work/ollama/ollama/x/create/create.go:849 +0x950
github.com/ollama/ollama/x/create/client.CreateModel({{0x16bba785e, 0x1d}, {0x14000396340, 0x32}, {0x16bba78a3, 0x5}, 0x14000202000}, 0x1400046de40)
        /Users/runner/work/ollama/ollama/x/create/client/create.go:161 +0x2d8
github.com/ollama/ollama/cmd.CreateHandler(0x1400046f208, {0x14000202660, 0x1, 0x6?})
        /Users/runner/work/ollama/ollama/cmd/cmd.go:206 +0x874
github.com/spf13/cobra.(*Command).execute(0x1400046f208, {0x14000202600, 0x6, 0x6})
        /Users/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940 +0x648
github.com/spf13/cobra.(*Command).ExecuteC(0x1400046ef08)
        /Users/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x320
github.com/spf13/cobra.(*Command).Execute(...)
        /Users/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
        /Users/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:985
main.main()
        /Users/runner/work/ollama/ollama/main.go:12 +0x54

What is the issue?

I tried to import a fine-tuned Qwen3.5-35b safetensors model to MLX mxfp8, but got the panic shown in the log output. Importing to q8_0 worked fine.


OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.20.2

extent analysis

TL;DR

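The panic is raised by mlx.Unpin (x/mlxrunner/mlx/array.go:141), called from a closure inside stackAndQuantizeExpertGroup (x/create/client/quantize.go:342) while the importer packs the expert tensors of layer 0. The message indicates that an array's pin count dropped below zero, which most likely means the array was unpinned more often than it was pinned in the mxfp8 expert-packing path of Ollama 0.20.2. Possible workarounds are to import with q8_0, which the reporter says works, or to retry on a newer Ollama release if one is available.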

Guidance

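  • The panic panic: mlx.Unpin: negative pin count on array "CONTIGUOUS" is raised at x/mlxrunner/mlx/array.go:141. The wording indicates that the array's pin count dropped below zero, which usually means Unpin was called more often than the array was pinned.
  • The caller is a closure in stackAndQuantizeExpertGroup (x/create/client/quantize.go:342, invoked from line 435), and the last progress line is "packing language_model.model.layers.0.mlp.experts (768 tensors)". The failure therefore occurs while stacking and quantizing the first layer's expert tensors, after the importer has listed both safetensors shards.
  • Importing to q8_0 worked, so the model files are probably fine and the problem is likely confined to the mxfp8 path of the experimental importer.
  • To mitigate, import with q8_0 if that format is acceptable, or update to a newer Ollama release if one is available. The issue is marked closed, but this page does not say which release, if any, contains a fix.
  • To verify, rerun the same ollama create command with a different -q value, or on another machine, and check whether the panic still appears at the same stack frame.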
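
To make the "negative pin count" wording concrete, here is a minimal sketch of how an unbalanced release can trip such a check. It is not Ollama's code and not a fix: the array type and the pin, unpin, and process functions are hypothetical stand-ins, and the real mlx.Unpin may track pins differently. The sketch only assumes that pinning increments a counter and unpinning decrements it.

```go
package main

import "fmt"

// array is a hypothetical stand-in for an MLX array handle; it is not
// Ollama's mlx.Array type.
type array struct {
	name string
	pins int
}

// pin records one more holder of the array.
func pin(a *array) { a.pins++ }

// unpin releases one holder and panics if there was none left to release,
// mirroring the wording of the message in the report.
func unpin(a *array) {
	a.pins--
	if a.pins < 0 {
		panic(fmt.Sprintf("unpin: negative pin count on array %q", a.name))
	}
}

// process pins the array and releases it in a deferred cleanup. When
// releaseEarly is true the array is also released inside the body, so a
// single pin ends up being released twice.
func process(a *array, releaseEarly bool) (err error) {
	// Convert the panic into an error so both cases can be compared.
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	pin(a)
	defer unpin(a)
	if releaseEarly {
		unpin(a) // second release of the same pin: the deferred unpin goes negative
	}
	return nil
}

func main() {
	fmt.Println(process(&array{name: "CONTIGUOUS"}, false)) // balanced
	fmt.Println(process(&array{name: "CONTIGUOUS"}, true))  // unbalanced
}
```

In the balanced call the counter returns to zero; in the unbalanced call the deferred release is the one that panics, which is the same shape as the trace above, where the panic surfaces in a cleanup closure rather than at the line that caused the imbalance.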

Example

No patch is proposed here. A fix would belong in Ollama's x/create/client and x/mlxrunner/mlx packages, and changing the pin and unpin logic without the surrounding source would be guesswork.

Notes

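The report covers a single model and a single failing run, made with GOMAXPROCS=1 and the --experimental flag on macOS with Ollama 0.20.2. The evidence points to the mxfp8 expert-packing code in x/create/client/quantize.go, but confirming the root cause requires the source around quantize.go:342 and quantize.go:435 for that build, which this page does not include.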

Recommendation

Apply workaround: import with q8_0, which the reporter confirms works, until a release that fixes the mxfp8 path is available. If mxfp8 is required, retry on the latest Ollama release; if the panic persists at the same stack frame, the problem is in the importer rather than in the model files.
