ollama - 💡(How to fix) Fix unknown model architecture: 'qwen35moe' when loading imported GGUF with mmproj (vision projector) [1 participants]

mirifiuto135-debug · 2026-03-09T10:05:43Z

ollama/ollama#14730 • Fetched 2026-04-08 00:32:25

Timeline (top)

closed ×1

Imported Qwen3.5-35B-A3B GGUF models fail to load when a vision projector (mmproj) file is attached. The same model loads fine for text-only (without mmproj), and loads fine with mmproj via llama.cpp's --mmproj flag.

Ollama version

0.17.7

Steps to reproduce

1. Download a community Qwen 3.5 GGUF (e.g., from llmfan46/Qwen3.5-35B-A3B-heretic-v2-GGUF) and its mmproj file (Qwen3.5-35B-A3B-mmproj-BF16.gguf).
2. Create a Modelfile:

    FROM Qwen3.5-35B-A3B-heretic-v2-Q5_K_M.gguf
    FROM Qwen3.5-35B-A3B-mmproj-BF16.gguf
    TEMPLATE """{{ .Prompt }}"""

3. ollama create qwen3.5:test -f Modelfile → succeeds
4. ollama run qwen3.5:test → fails

Also tried ADAPTER instead of second FROM — same result.
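The reproduction steps can be scripted so that the text-only and mmproj variants are generated the same way each time. This is a minimal sketch, assuming both GGUF files sit in the working directory; `build_modelfile` is a hypothetical helper, not part of Ollama, and the two-FROM layout simply mirrors the reporter's Modelfile.

```python
from typing import Optional


def build_modelfile(model_gguf: str, mmproj_gguf: Optional[str] = None) -> str:
    """Return Modelfile text in the layout used in the report.

    Hypothetical helper: with mmproj_gguf=None it produces the text-only
    variant, which the reporter says loads correctly.
    """
    lines = [f"FROM {model_gguf}"]
    if mmproj_gguf:
        # A second FROM attaches the vision projector, as in the report.
        lines.append(f"FROM {mmproj_gguf}")
    lines.append('TEMPLATE """{{ .Prompt }}"""')
    return "\n".join(lines) + "\n"


print(build_modelfile(
    "Qwen3.5-35B-A3B-heretic-v2-Q5_K_M.gguf",
    "Qwen3.5-35B-A3B-mmproj-BF16.gguf",
), end="")
```

Redirect the output to a file named Modelfile, then run the ollama create and ollama run commands from steps 3 and 4.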

Error

llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'qwen35moe'
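The architecture name in this error is read from the GGUF metadata key general.architecture. To confirm what each file declares before importing it, the header can be inspected directly. The sketch below assumes GGUF version 2 or later (64-bit counts and string lengths, little-endian); `gguf_architecture` is a name chosen for illustration and is not part of Ollama or llama.cpp.

```python
import struct

# GGUF metadata value type IDs mapped to struct formats for fixed-size scalars.
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(f, fmt):
    """Read one little-endian scalar described by a struct format."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(f):
    """Read a GGUF string: uint64 length followed by UTF-8 bytes."""
    length = _read(f, "<Q")
    return f.read(length).decode("utf-8")


def _read_value(f, vtype):
    """Read one metadata value of the given type ID (arrays recurse)."""
    if vtype in _SCALARS:
        return _read(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        item_type = _read(f, "<I")
        count = _read(f, "<Q")
        return [_read_value(f, item_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def gguf_architecture(path):
    """Return the general.architecture string of a GGUF file, or None."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        _read(f, "<I")             # format version
        _read(f, "<Q")             # tensor count
        kv_count = _read(f, "<Q")  # number of metadata key/value pairs
        for _ in range(kv_count):
            key = _read_string(f)
            value = _read_value(f, _read(f, "<I"))
            if key == "general.architecture":
                return value
    return None
```

Calling `gguf_architecture("Qwen3.5-35B-A3B-heretic-v2-Q5_K_M.gguf")` should return the same string that appears in the error; running it on the mmproj file shows what that file declares.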

Expected behavior

The model should load with vision support, same as it does with llama.cpp:

    llama-server -m Qwen3.5-35B-A3B-heretic-v2-Q5_K_M.gguf --mmproj Qwen3.5-35B-A3B-mmproj-BF16.gguf -c 4096

This works perfectly: text and vision are both functional.
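For anyone falling back to llama.cpp while the Ollama path fails, the working command can be assembled programmatically. This is a small sketch using only the flags shown in the report (-m, --mmproj, -c); `llama_server_args` is a hypothetical helper name.

```python
import shlex
from typing import List


def llama_server_args(model: str, mmproj: str, ctx: int = 4096) -> List[str]:
    """Build the llama-server argument list shown in the report."""
    return ["llama-server", "-m", model, "--mmproj", mmproj, "-c", str(ctx)]


args = llama_server_args(
    "Qwen3.5-35B-A3B-heretic-v2-Q5_K_M.gguf",
    "Qwen3.5-35B-A3B-mmproj-BF16.gguf",
)
# Print a shell-safe form of the command for copy and paste.
print(shlex.join(args))
# To launch it instead: import subprocess; subprocess.run(args, check=True)
```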

Notes

- Without mmproj, the model loads fine for text (families: ['qwen35moe'])
- With mmproj, families becomes ['qwen35moe', 'clip'] and loading fails
- The official qwen3.5:35b works with vision because it has native qwen35moe.vision.* tensors embedded in the main GGUF, with no clip involved
- PR #14517 fixed text-only loading of imported qwen35moe GGUFs but the multimodal/clip runner path was not updated for this architecture
- GPU: 2x RTX 5060 16GB
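The families values in these notes can be checked on a created model through Ollama's /api/show endpoint. The sketch below assumes a local server on the default port 11434 and that the response carries the list under details.families; the helper names are hypothetical.

```python
import json
import urllib.request


def families_from_show(show_response: dict) -> list:
    """Extract details.families from an /api/show response body."""
    return list((show_response.get("details") or {}).get("families") or [])


def uses_clip_path(show_response: dict) -> bool:
    """True when 'clip' is listed, the state the report ties to the failure."""
    return "clip" in families_from_show(show_response)


def fetch_show(model: str, host: str = "http://localhost:11434") -> dict:
    """POST to /api/show on a running Ollama server and parse the JSON reply."""
    body = json.dumps({"model": model}).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/show", data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With a server running, `uses_clip_path(fetch_show("qwen3.5:test"))` reports whether the imported model would take the path that the report says fails.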

Error Message

llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'qwen35moe'

Root Cause

According to the reporter's notes, the failure is tied to the multimodal/clip runner path rather than to the model file itself:

- Without mmproj, the model loads fine for text (families: ['qwen35moe'])
- With mmproj, families becomes ['qwen35moe', 'clip'] and loading fails
- The official qwen3.5:35b works with vision because it has native qwen35moe.vision.* tensors embedded in the main GGUF, so the clip path is never used
- PR #14517 fixed text-only loading of imported qwen35moe GGUFs but the multimodal/clip runner path was not updated for this architecture

Reported hardware (not a cause): 2x RTX 5060 16GB

extent analysis

Fix Plan

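The failure occurs inside Ollama's multimodal/clip runner path, which, according to the issue, was not updated for the 'qwen35moe' architecture when PR #14517 fixed text-only loading. The fix therefore belongs in Ollama's loader code, not in the Modelfile or any user-side setting.

Steps to Fix

1. In Ollama itself: make the multimodal/clip runner path recognize 'qwen35moe', so that a model whose families are ['qwen35moe', 'clip'] loads the same way the text-only import does.
2. Until a release containing that change is installed, use one of the configurations the reporter confirmed working: llama.cpp's llama-server with --mmproj, the official qwen3.5:35b (vision tensors embedded in the main GGUF), or the imported GGUF without the mmproj file for text-only use.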

Example Code

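# Illustrative Python sketch only. Ollama is written in Go, and neither these
# functions nor this table exist in it; all names here are hypothetical.

# Architectures the (hypothetical) clip runner path knows how to load.
# Per the issue, the real equivalent lacked 'qwen35moe'; adding it is the fix.
CLIP_PATH_ARCHITECTURES = {'qwen35moe'}

def resolve_families(model_families, mmproj_file=None):
    # Attaching an mmproj adds 'clip' to the families, as the notes describe.
    families = list(model_families)
    if mmproj_file and 'clip' not in families:
        families.append('clip')
    return families

def can_load(families):
    # Text-only models bypass the clip path entirely.
    if 'clip' not in families:
        return True
    # With 'clip' present, the base architecture must be known to the clip path.
    return any(f in CLIP_PATH_ARCHITECTURES for f in families if f != 'clip')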

Verification

To verify the fix, run the following commands:

ollama create qwen3.5:test -f Modelfile
ollama run qwen3.5:test

The model should now load with vision support, and both text and vision functionality should work as expected.
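The verification can be automated by running the model once and scanning the output for the load error. This is a minimal sketch, assuming the error text reaches the CLI output; it may appear only in the Ollama server log, in which case pass the log text to `unknown_architecture` directly. Both function names are hypothetical.

```python
import re
import subprocess

_ARCH_ERROR = re.compile(r"unknown model architecture: '([^']+)'")


def unknown_architecture(log_text: str):
    """Return the architecture named in the load error, or None if absent."""
    match = _ARCH_ERROR.search(log_text)
    return match.group(1) if match else None


def check_model(name: str, prompt: str = "hello") -> bool:
    """Run `ollama run NAME PROMPT` once and report whether it loaded."""
    proc = subprocess.run(
        ["ollama", "run", name, prompt],
        capture_output=True, text=True,
    )
    arch = unknown_architecture(proc.stdout + proc.stderr)
    if arch:
        print(f"load failed: architecture {arch!r} not recognized")
        return False
    return proc.returncode == 0
```

A passing `check_model("qwen3.5:test")` only shows that the model loads and answers a text prompt; vision still needs to be tested with an image input.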

Extra Tips

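Update Ollama to the latest release before retesting. The issue was filed against 0.17.7; the source does not state which release, if any, contains the fix.
If loading still fails after updating, fall back to llama.cpp's llama-server with --mmproj, which the reporter confirmed works for both text and vision, or use the official qwen3.5:35b model.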
