ollama - 💡(How to fix) Fix unknown model architecture: 'qwen35moe' when loading imported GGUF with mmproj (vision projector) [1 participants]

GitHub stats
ollama/ollama#14730 · Fetched 2026-04-08 00:32:25
Comments: 0 · Participants: 1 · Timeline events: 1 (closed ×1) · Reactions: 0

Imported Qwen3.5-35B-A3B GGUF models fail to load when a vision projector (mmproj) file is attached. The same model loads fine for text-only (without mmproj), and loads fine with mmproj via llama.cpp's --mmproj flag.

Ollama version

0.17.7

Steps to reproduce

  1. Download a community Qwen 3.5 GGUF (e.g., from llmfan46/Qwen3.5-35B-A3B-heretic-v2-GGUF) and its mmproj file (Qwen3.5-35B-A3B-mmproj-BF16.gguf)
  2. Create a Modelfile:
       FROM Qwen3.5-35B-A3B-heretic-v2-Q5_K_M.gguf
       FROM Qwen3.5-35B-A3B-mmproj-BF16.gguf
       TEMPLATE """{{ .Prompt }}"""
  3. ollama create qwen3.5:test -f Modelfile → succeeds
  4. ollama run qwen3.5:test → fails

Also tried ADAPTER instead of second FROM — same result.
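
The reproduction Modelfile can be generated programmatically, which makes it easy to compare the text-only and mmproj variants. This is a minimal sketch: `build_modelfile` is a hypothetical helper, not part of Ollama, and it emits only the `FROM` and `TEMPLATE` lines used in the steps above.

```python
def build_modelfile(model_gguf, mmproj_gguf=None, template="{{ .Prompt }}"):
    """Return Modelfile text for an imported GGUF, optionally with an mmproj file.

    Hypothetical helper for reproducing the report; it only mirrors the
    FROM/TEMPLATE lines listed in the steps to reproduce.
    """
    lines = [f"FROM {model_gguf}"]
    if mmproj_gguf:
        # A second FROM line attaches the vision projector, as in the report.
        lines.append(f"FROM {mmproj_gguf}")
    lines.append(f'TEMPLATE """{template}"""')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_modelfile(
        "Qwen3.5-35B-A3B-heretic-v2-Q5_K_M.gguf",
        "Qwen3.5-35B-A3B-mmproj-BF16.gguf",
    ))
```

Saving the output as `Modelfile` and running `ollama create qwen3.5:test -f Modelfile` reproduces step 3; omitting the second argument yields the text-only variant that loads correctly.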

Error

llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'qwen35moe'

Expected behavior

The model should load with vision support, as it does with llama.cpp:

    llama-server -m Qwen3.5-35B-A3B-heretic-v2-Q5_K_M.gguf --mmproj Qwen3.5-35B-A3B-mmproj-BF16.gguf -c 4096

With llama.cpp, both text and vision work.

Notes

  • Without mmproj, the model loads fine for text (families: ['qwen35moe'])
  • With mmproj, families becomes ['qwen35moe', 'clip'] and loading fails
  • The official qwen3.5:35b works with vision because it has native qwen35moe.vision.* tensors embedded in the main GGUF — no clip involved
  • PR #14517 fixed text-only loading of imported qwen35moe GGUFs but the multimodal/clip runner path was not updated for this architecture
  • GPU: 2x RTX 5060 16GB
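
The family lists quoted in these notes can be checked on a local install. The sketch below assumes Ollama's REST API is listening on its default address (`http://localhost:11434`) and that the `/api/show` response carries the list under `details.families`; `fetch_families` and `classify_families` are hypothetical helper names.

```python
import json
import urllib.request


def fetch_families(model, host="http://localhost:11434"):
    """Ask a running Ollama server for a model's families (assumed: details.families)."""
    req = urllib.request.Request(
        f"{host}/api/show",
        data=json.dumps({"model": model}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body.get("details", {}).get("families") or []


def classify_families(families):
    """Label a family list according to the cases described in the notes."""
    families = list(families)
    if "clip" in families:
        # An mmproj file was attached at create time; this is the failing case.
        return "external-projector"
    if families:
        # Text-only import, or a model whose vision tensors live in the main GGUF.
        return "no-external-projector"
    return "unknown"
```

For example, `classify_families(fetch_families("qwen3.5:test"))` should return `"external-projector"` for the model built with the two-`FROM` Modelfile.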

Error Message

llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'qwen35moe'

Root Cause

  • Without mmproj, the model loads fine for text (families: ['qwen35moe'])
  • With mmproj, families becomes ['qwen35moe', 'clip'] and loading fails
  • The official qwen3.5:35b works with vision because it has native qwen35moe.vision.* tensors embedded in the main GGUF, so no clip is involved
  • PR #14517 fixed text-only loading of imported qwen35moe GGUFs but the multimodal/clip runner path was not updated for this architecture
  • GPU: 2x RTX 5060 16GB
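
To confirm which architecture string each file declares, the `general.architecture` metadata key can be read straight from the GGUF header. This is a minimal sketch that assumes GGUF version 2 or later (little-endian, 64-bit counts); `read_architecture` is a hypothetical helper, and the expectation that the mmproj file reports `clip` is an assumption based on the family list above.

```python
import io
import struct

# Byte sizes of fixed-width GGUF metadata value types, keyed by type id.
_SCALAR_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
_STRING, _ARRAY = 8, 9


def _read_string(f):
    """Read a GGUF string: uint64 length followed by UTF-8 bytes."""
    (n,) = struct.unpack("<Q", f.read(8))
    return f.read(n).decode("utf-8")


def _skip_value(f, vtype):
    """Advance past one metadata value of the given type."""
    if vtype in _SCALAR_SIZES:
        f.seek(_SCALAR_SIZES[vtype], io.SEEK_CUR)
    elif vtype == _STRING:
        (n,) = struct.unpack("<Q", f.read(8))
        f.seek(n, io.SEEK_CUR)
    elif vtype == _ARRAY:
        elem_type, count = struct.unpack("<IQ", f.read(12))
        if elem_type in _SCALAR_SIZES:
            f.seek(_SCALAR_SIZES[elem_type] * count, io.SEEK_CUR)
        else:
            for _ in range(count):
                _skip_value(f, elem_type)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")


def read_architecture(f):
    """Return the general.architecture string from an open binary GGUF stream, or None."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    _version, _tensor_count, kv_count = struct.unpack("<IQQ", f.read(20))
    for _ in range(kv_count):
        key = _read_string(f)
        (vtype,) = struct.unpack("<I", f.read(4))
        if key == "general.architecture" and vtype == _STRING:
            return _read_string(f)
        _skip_value(f, vtype)
    return None
```

Usage: `with open("Qwen3.5-35B-A3B-heretic-v2-Q5_K_M.gguf", "rb") as f: print(read_architecture(f))`, then repeat for the mmproj file and compare the two strings with the family list.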

extent analysis

Fix Plan

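Imported Qwen3.5-35B-A3B GGUF models fail only when a separate vision projector (mmproj) file is attached, so the fix belongs in Ollama's multimodal/clip runner path, which has to recognize the 'qwen35moe' architecture.

Steps to Fix

  • Leave the family list alone: Ollama already records ['qwen35moe', 'clip'] when an mmproj file is attached, so this appears to need a code change in Ollama rather than a user-side configuration change.
  • Modify the model loading code so that the path taken for models with a 'clip' family accepts 'qwen35moe', as PR #14517 did for text-only loading.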

Example Code

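# Illustrative Python sketch only. Ollama itself is written in Go, and the names
# below (load_model, PROJECTOR_ARCHS) are hypothetical, not Ollama's API.

# Architectures the projector ('clip') loading path accepts; the fix adds 'qwen35moe'.
PROJECTOR_ARCHS = {"qwen35moe"}

def load_model(model_families, mmproj_file=None):
    """Return the family list the loader must accept, or raise for an unknown architecture."""
    families = list(model_families)
    if mmproj_file and "clip" not in families:
        families.append("clip")  # attaching an mmproj file adds the 'clip' family
    arch = families[0]
    if "clip" in families and arch not in PROJECTOR_ARCHS:
        # Mirrors the reported failure when the projector path does not know the architecture
        raise ValueError(f"unknown model architecture: {arch!r}")
    return families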

Verification

To verify the fix, run the following commands:

  1. ollama create qwen3.5:test -f Modelfile
  2. ollama run qwen3.5:test

The model should now load with vision support, and both text and vision functionality should work as expected.
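
Vision can also be exercised through Ollama's REST API instead of the interactive prompt. The sketch below assumes the `/api/generate` endpoint accepts base64-encoded images in an `images` list and that the server runs at the default `http://localhost:11434`; `build_vision_request` and `send_request` are hypothetical helper names.

```python
import base64
import json
import urllib.request


def build_vision_request(model, prompt, image_bytes):
    """Build a non-streaming /api/generate payload with one base64-encoded image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def send_request(payload, host="http://localhost:11434"):
    """POST the payload to a running Ollama server and return the parsed JSON reply."""
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For example, `send_request(build_vision_request("qwen3.5:test", "Describe this image.", open("photo.png", "rb").read()))` should return a reply that describes the image if the projector loaded; an error or a reply that ignores the image suggests the vision path is still not working.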

Extra Tips

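  • Update Ollama to the latest release so that any upstream fix for this path is included; the issue was reported against 0.17.7.
  • If loading still fails after upgrading, remove and recreate the model (ollama rm qwen3.5:test, then ollama create qwen3.5:test -f Modelfile). As a workaround, drop the mmproj line from the Modelfile for text-only use, or serve the model with llama.cpp's llama-server and its --mmproj flag.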
