hermes - ✅(Solved) Fix [Bug] vision_analyze times out with ollama-cloud — proxy lacks OpenAI Vision format support [1 pull requests, 1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
NousResearch/hermes-agent#23422Fetched 2026-05-11 03:29:33
View on GitHub
Comments
1
Participants
2
Timeline
7
Reactions
0
Author
Timeline (top)
labeled ×5commented ×1cross-referenced ×1

Error Message

Error Output

Error analyzing image: Request timed out.

Root Cause

Root Cause Analysis

1. ollama-cloud is NOT in _PROVIDERS_WITHOUT_VISION

In agent/auxiliary_client.py (line 303):

_PROVIDERS_WITHOUT_VISION: frozenset = frozenset({
    "kimi-coding",
    "kimi-coding-cn",
})

ollama-cloud is missing, so Hermes attempts to route vision through it instead of falling back to an aggregator.

Fix Action

Fix / Workaround

Workarounds Found

  • Switching to OpenRouter (google/gemini-3-flash-preview) works: 2.4s response time, correct image analysis
  • Using a local vision model (e.g. minicpm-v) via local-ollama also works

PR fix notes

PR #23511: fix(vision): skip ollama-cloud in auto vision routing

Description (problem / solution / changelog)

What does this PR do?

Skips ollama-cloud in auto vision routing so vision_analyze falls through to the aggregator chain instead of trying a provider path that does not accept image input.

Related Issue

Fixes #23422

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • add ollama-cloud to _PROVIDERS_WITHOUT_VISION in agent/auxiliary_client.py
  • add a regression test proving auto vision routing skips ollama-cloud and falls through to OpenRouter
  • extend the skip-set guard test to cover the new provider

How to Test

  1. uv run --frozen pytest -q -o addopts='' tests/agent/test_auxiliary_client.py -k 'VisionAutoSkipsKimiCoding or ollama_cloud_skips_to_aggregator_chain'
  2. uv run --frozen ruff check agent/auxiliary_client.py tests/agent/test_auxiliary_client.py
  3. env -i HOME="$HOME" PATH="$PATH" TERM="${TERM:-xterm-256color}" UV_CACHE_DIR="$HOME/.cache/uv" uv run --frozen pytest -q -o addopts='' --collect-only tests/agent/test_auxiliary_client.py -k 'VisionAutoSkipsKimiCoding or ollama_cloud_skips_to_aggregator_chain'

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: macOS 15.4.1

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

Screenshots / Logs

  • 5 passed, 143 deselected in 1.78s
  • All checks passed! from ruff check
  • collect-only smoke passed with 5/148 tests collected (143 deselected)

Changed files

  • agent/auxiliary_client.py (modified, +1/-0)
  • tests/agent/test_auxiliary_client.py (modified, +27/-0)

Code Example

_PROVIDERS_WITHOUT_VISION: frozenset = frozenset({
    "kimi-coding",
    "kimi-coding-cn",
})

---

{"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}

---

Error analyzing image: Request timed out.

---

_PROVIDERS_WITHOUT_VISION: frozenset = frozenset({
    "kimi-coding",
    "kimi-coding-cn",
    "ollama-cloud",  # <-- ADD
})
RAW_BUFFERClick to expand / collapse

Bug Summary

vision_analyze fails with Request timed out when using ollama-cloud as the vision provider, even though Ollama Cloud models (e.g. glm-5.1:cloud, kimi-k2.6:cloud) are multimodal-capable.

Environment

  • OS: macOS 15.7.4 (Mac mini 2018 Intel)
  • Hermes Version: v0.12.x
  • Provider: ollama-cloud via https://ollama.com/v1
  • Models tested: glm-5.1:cloud, kimi-k2.6:cloud
  • Config: auxiliary.vision.provider: auto (also explicit ollama-cloud)

Reproduction Steps

  1. Set auxiliary.vision.provider: ollama-cloud or leave as auto with ollama-cloud as main provider
  2. Call vision_analyze with any local image file
  3. Observe: Request times out after 120-180 seconds

Root Cause Analysis

1. ollama-cloud is NOT in _PROVIDERS_WITHOUT_VISION

In agent/auxiliary_client.py (line 303):

_PROVIDERS_WITHOUT_VISION: frozenset = frozenset({
    "kimi-coding",
    "kimi-coding-cn",
})

ollama-cloud is missing, so Hermes attempts to route vision through it instead of falling back to an aggregator.

2. Ollama Cloud Proxy does not support OpenAI Vision Format

Hermes vision_analyze_tool() sends images in OpenAI-compatible format:

{"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}

But Ollama Cloud's /v1/chat/completions proxy does not accept this format for multimodal requests:

  • Direct curl test returns 401 Unauthorized on multimodal requests with base64 image data
  • The OAuth token that works for text chat does not authenticate vision requests
  • Ollama's native API uses images: [base64] array in /api/chat, not image_url

3. Cloud-side hard timeout

Ollama Cloud has a documented ~182-second hard server-side timeout (ollama/ollama#15973) that kills in-progress requests regardless of whether tokens are being generated.

Error Output

Error analyzing image: Request timed out.

Workarounds Found

  • Switching to OpenRouter (google/gemini-3-flash-preview) works: 2.4s response time, correct image analysis
  • Using a local vision model (e.g. minicpm-v) via local-ollama also works

Proposed Fix

Add "ollama-cloud" to _PROVIDERS_WITHOUT_VISION in agent/auxiliary_client.py:

_PROVIDERS_WITHOUT_VISION: frozenset = frozenset({
    "kimi-coding",
    "kimi-coding-cn",
    "ollama-cloud",  # <-- ADD
})

This ensures Hermes skips ollama-cloud for vision and falls through to OpenRouter or local providers that actually support multimodal image input.

Related

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING