hermes - 💡(How to fix) Fix [Bug]: `vision_analyze` fails on Ollama Cloud because vision auto-detection falls through to non-vision models [1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
NousResearch/hermes-agent#14744Fetched 2026-04-24 06:14:54
View on GitHub
Comments
1
Participants
2
Timeline
6
Reactions
1
Author
Participants
Timeline (top)
labeled ×5commented ×1

Error Message

When using ollama-cloud as the main provider with a vision-capable model (kimi-k2.6), the vision_analyze tool fails with Error code: 400 - this model does not support image input.

Additional Logs / Traceback (optional)

Root Cause

In agent/auxiliary_client.py, line ~2059:

  1. main_provider = "ollama-cloud"
  2. main_model = "minimax-m2.7" (from config.yaml model.default)
  3. vision_model = _PROVIDER_VISION_MODELS.get(main_provider, main_model) -> "minimax-m2.7" (because ollama-cloud is not in _PROVIDER_VISION_MODELS)
  4. The non-vision model minimax-m2.7 gets used for the vision API call
  5. Ollama Cloud returns 400 because minimax-m2.7 does not support image input

Fix Action

Fix / Workaround

Current Workaround

  • I can reproduce this consistently with the steps above
  • I have tested the workaround
  • I have searched existing issues for duplicates

Code Example

model:
  api_key: ollama
  base_url: https://ollama.com/v1
  default: minimax-m2.7    # <-- This is what gets used for vision
  provider: ollama-cloud

auxiliary:
  vision:
    provider: auto         # DEFAULT — causes the bug
    model: ''

---

auxiliary:
  vision:
    provider: ollama-cloud
    model: kimi-k2.6

---

N/A

---
RAW_BUFFERClick to expand / collapse

Bug Description

When using ollama-cloud as the main provider with a vision-capable model (kimi-k2.6), the vision_analyze tool fails with Error code: 400 - this model does not support image input.

The root cause is in resolve_vision_provider_client() in agent/auxiliary_client.py. When the main provider (ollama-cloud) is NOT in _VISION_AUTO_PROVIDER_ORDER (which only contains ("openrouter", "nous")), the code tries resolve_provider_client(main_provider, vision_model) at line ~2060. It then uses whatever model is configured as default in config.yaml instead of the actually active model.

Environment

  • Hermes Agent v0.10.0 (2026.4.16)
  • macOS
  • Provider: ollama-cloud
  • Main model: kimi-k2.6
  • Config default: minimax-m2.7 (a text-only model)

Relevant Config

model:
  api_key: ollama
  base_url: https://ollama.com/v1
  default: minimax-m2.7    # <-- This is what gets used for vision
  provider: ollama-cloud

auxiliary:
  vision:
    provider: auto         # DEFAULT — causes the bug
    model: ''

Root Cause Analysis

In agent/auxiliary_client.py, line ~2059:

  1. main_provider = "ollama-cloud"
  2. main_model = "minimax-m2.7" (from config.yaml model.default)
  3. vision_model = _PROVIDER_VISION_MODELS.get(main_provider, main_model) -> "minimax-m2.7" (because ollama-cloud is not in _PROVIDER_VISION_MODELS)
  4. The non-vision model minimax-m2.7 gets used for the vision API call
  5. Ollama Cloud returns 400 because minimax-m2.7 does not support image input

Evidence

Direct API tests with openai SDK against https://ollama.com/v1:

ModelImage support
minimax-m2.7 (config default)FAIL: "this model does not support image input"
kimi-k2.6 (actually used)SUCCESS

The user's main chat model is kimi-k2.6 (as shown in models_dev_cache.json and the active model header), but _read_main_model() reads config.yaml's model.default: minimax-m2.7.

Expected Behavior

The vision tool should respect the ACTUAL currently active model, not the static model.default from config. When a user has switched to kimi-k2.6 via hermes model, vision should use kimi-k2.6.

Current Workaround

Explicitly configure vision auxiliary to use the correct model:

auxiliary:
  vision:
    provider: ollama-cloud
    model: kimi-k2.6

However, this is fragile — users must manually sync their vision model with their active chat model.

Suggested Fix

One of two approaches:

Option A (preferred): In _resolve_task_provider_model() for task="vision", fallback to the runtime active model instead of config default when auto-detecting.

Option B: Add ollama-cloud to _VISION_AUTO_PROVIDER_ORDER and/or add a vision-aware default model mapping for ollama-cloud in _API_KEY_PROVIDER_AUX_MODELS.

Option C: In resolve_vision_provider_client(), when the main provider is ollama-cloud, check models_dev_cache.json for the actually active model instead of _read_main_model().

Related Code

  • agent/auxiliary_client.py line ~2059: vision_model = _PROVIDER_VISION_MODELS.get(main_provider, main_model)
  • agent/auxiliary_client.py line ~985: def _read_main_model() reads config.yaml model.default (not the runtime active model)
  • agent/auxiliary_client.py line ~1939: _VISION_AUTO_PROVIDER_ORDER = ("openrouter", "nous") — ollama-cloud is not included
  • agent/auxiliary_client.py line ~146: "ollama-cloud": "nemotron-3-nano:30b" in _API_KEY_PROVIDER_AUX_MODELS

Checklist

  • I can reproduce this consistently with the steps above
  • I have tested the workaround
  • I have searched existing issues for duplicates

Steps to Reproduce

See description

Expected Behavior

Auto detection should work

Actual Behavior

See desc

Affected Component

CLI (interactive chat)

Messaging Platform (if gateway-related)

N/A (CLI only)

Debug Report

N/A

Operating System

MacOS 26.4.1

Python Version

No response

Hermes Version

No response

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

No response

Proposed Fix (optional)

No response

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

extent analysis

TL;DR

The most likely fix is to modify the _resolve_task_provider_model() function to fallback to the runtime active model instead of the config default when auto-detecting for the vision task.

Guidance

  • Identify the agent/auxiliary_client.py file and locate the _resolve_task_provider_model() function to apply the suggested fix.
  • Verify that the models_dev_cache.json file contains the correct active model information.
  • Consider adding ollama-cloud to _VISION_AUTO_PROVIDER_ORDER as an alternative solution.
  • Test the fix by running the vision tool with the modified code and verifying that it respects the actual active model.

Example

# In agent/auxiliary_client.py
def _resolve_task_provider_model(task, main_provider, main_model):
    # ...
    if task == "vision":
        # Fallback to runtime active model instead of config default
        vision_model = get_runtime_active_model()
        # ...
    # ...

Notes

The provided fix assumes that the get_runtime_active_model() function is available and correctly returns the active model. If this function is not available, an alternative approach may be needed.

Recommendation

Apply workaround by explicitly configuring the vision auxiliary to use the correct model, and consider submitting a PR to implement the suggested fix in the _resolve_task_provider_model() function.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING