ollama - Fix panic: mlx.Unpin: negative pin count on array "CONTIGUOUS" [1 comment, 1 participant]

ollama • 2026-04-05 12:37:48


GitHub stats

ollama/ollama#15341 • Fetched 2026-04-08 02:52:31


Author

chigkim

Participants

chigkim

Timeline (top)

labeled ×3, closed ×1, commented ×1

Error Message

Importing a fine-tuned Qwen3.5-35b safetensors model with mxfp8 quantization fails with: panic: mlx.Unpin: negative pin count on array "CONTIGUOUS"

Code Example

Screen 1:
GOMAXPROCS=1 ollama serve

Screen 2:
GOMAXPROCS=1 ollama create qwen3.5:35b-a3b-heretic-mxfp8 --experimental -f q35-35b.modelfile -q mxfp8
importing safetensors model
importing model-00001-of-00002.safetensors (22199 tensors, quantizing to mxfp8)
importing model-00002-of-00002.safetensors (9467 tensors, quantizing to mxfp8)
packing language_model.model.layers.0.mlp.experts (768 tensors)
panic: mlx.Unpin: negative pin count on array "CONTIGUOUS"

goroutine 1 [running, locked to thread]:
github.com/ollama/ollama/x/mlxrunner/mlx.Unpin({0x1400165e008?, 0x10500a404?, 0x140006e1ce0?})
        /Users/runner/work/ollama/ollama/x/mlxrunner/mlx/array.go:141 +0xa8
github.com/ollama/ollama/x/create/client.stackAndQuantizeExpertGroup.func1()
        /Users/runner/work/ollama/ollama/x/create/client/quantize.go:342 +0x2c
github.com/ollama/ollama/x/create/client.stackAndQuantizeExpertGroup({0x140004b8ab0, 0x29?}, 0x140030b8ba0, {0x1054c92b0, 0x5})
        /Users/runner/work/ollama/ollama/x/create/client/quantize.go:435 +0x984
github.com/ollama/ollama/x/create/client.quantizePackedGroup({0x140004b8ab0, 0x29}, {0x14000714000, 0x300, 0x14000044958?})
        /Users/runner/work/ollama/ollama/x/create/client/quantize.go:178 +0xd8
github.com/ollama/ollama/x/create/client.CreateModel.newPackedTensorLayerCreator.func5({0x140004b8ab0, 0x29}, {0x14000714000?, 0x2?, 0x3a2})
        /Users/runner/work/ollama/ollama/x/create/client/create.go:293 +0xc8
github.com/ollama/ollama/x/create.CreateSafetensorsModel({0x16bba785e, 0x1d}, {0x14000396340, 0x32}, {0x16bba78a3, 0x5}, 0x105a7c378, 0x105a7c390, 0x140000457c8, 0x14000045840, ...)
        /Users/runner/work/ollama/ollama/x/create/create.go:849 +0x950
github.com/ollama/ollama/x/create/client.CreateModel({{0x16bba785e, 0x1d}, {0x14000396340, 0x32}, {0x16bba78a3, 0x5}, 0x14000202000}, 0x1400046de40)
        /Users/runner/work/ollama/ollama/x/create/client/create.go:161 +0x2d8
github.com/ollama/ollama/cmd.CreateHandler(0x1400046f208, {0x14000202660, 0x1, 0x6?})
        /Users/runner/work/ollama/ollama/cmd/cmd.go:206 +0x874
github.com/spf13/cobra.(*Command).execute(0x1400046f208, {0x14000202600, 0x6, 0x6})
        /Users/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940 +0x648
github.com/spf13/cobra.(*Command).ExecuteC(0x1400046ef08)
        /Users/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x320
github.com/spf13/cobra.(*Command).Execute(...)
        /Users/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
        /Users/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:985
main.main()
        /Users/runner/work/ollama/ollama/main.go:12 +0x54


What is the issue?

I tried to import a fine-tuned Qwen3.5-35b safetensors model as MLX mxfp8, but the import panicked. Importing as q8_0 worked fine.


OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.20.2

extent analysis

TL;DR

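The panic is raised in mlx.Unpin, called from a closure in stackAndQuantizeExpertGroup (quantize.go:342) while packing the 768 tensors of language_model.model.layers.0.mlp.experts for mxfp8. The message indicates that an array was unpinned more often than it was pinned, which likely points to a pin-accounting bug in Ollama's experimental MLX create path rather than to the model files. Importing as q8_0 is the workaround the reporter confirmed; updating Ollama beyond 0.20.2 may also help if a fix has been released.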

Guidance

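The error message panic: mlx.Unpin: negative pin count on array "CONTIGUOUS" indicates a pin-count bookkeeping problem in the mlx package: Unpin appears to have been called on an array with no outstanding pin.
The stack trace places that call in stackAndQuantizeExpertGroup, reached through quantizePackedGroup, so the failure occurs while expert tensors are being stacked and quantized, not while the safetensors files are being read.
Importing to q8_0 worked but mxfp8 did not, which suggests the problem is specific to the mxfp8 quantization path.
To mitigate the issue, import with q8_0, which the reporter confirmed works, or update to an Ollama release newer than 0.20.2 if one is available.
To verify the issue, rerun the import with a different quantization setting or in a different environment.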

Example

No patch is provided: the fault appears to lie inside Ollama's experimental MLX code, and changing it without further information may not be safe.
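For illustration only, and not as a patch, the condition named in the panic can be sketched in Go. The pinnedArray type and its pin and unpin methods are hypothetical and are not Ollama's implementation; the sketch assumes only that every pin must be matched by exactly one unpin.

```go
package main

import "fmt"

// pinnedArray is a hypothetical stand-in for an array handle with a pin counter.
// It is NOT the type used in github.com/ollama/ollama/x/mlxrunner/mlx.
type pinnedArray struct {
	name string
	pins int
}

// pin records one more holder of the array.
func (a *pinnedArray) pin() { a.pins++ }

// unpin releases one holder. It returns an error when there is no outstanding
// pin, which is the situation the panic message describes.
func (a *pinnedArray) unpin() error {
	if a.pins == 0 {
		return fmt.Errorf("negative pin count on array %q", a.name)
	}
	a.pins--
	return nil
}

func main() {
	a := &pinnedArray{name: "CONTIGUOUS"}
	a.pin()

	// Balanced: one pin, one unpin.
	fmt.Println(a.unpin())

	// Unbalanced: a second release, for example from a cleanup closure that
	// runs after the array was already released elsewhere.
	fmt.Println(a.unpin())
}
```

The second call fails because nothing is left to release. In the reported trace the equivalent check panics instead of returning an error.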

Notes

The evidence points to the mxfp8 quantization path in the mlx package. The upstream issue (ollama/ollama#15341) is marked closed, but this page does not record how it was resolved, so a more specific fix cannot be given here.

Recommendation

Apply workaround: import with q8_0, which the reporter confirmed works, until a release containing a fix is available. Trying other quantization settings will also show whether the failure is limited to mxfp8 or is a more general issue with the mlx package.
