hermes - 💡(How to fix) Fix [Bug] Ollama Cloud vision models return 500 on /v1/chat/completions — need native /api/chat adapter [1 comments, 2 participants]

GitHub stats
NousResearch/hermes-agent#14592 · Fetched 2026-04-24 06:16:09
Comments: 1 · Participants: 2 · Timeline: 6 · Reactions: 0
Timeline (top): labeled ×4, commented ×1, cross-referenced ×1

Error Message

When using Ollama Cloud as the vision provider (auxiliary.vision.provider: ollama-cloud), vision-capable models (e.g. qwen3-vl:235b) return 500 Internal Server Error when images are included in the request payload. Text-only requests to the same models work fine.

  • Result: 500 Internal Server Error on image payloads, text-only works fine

Root Cause

Hermes sends all API calls through the OpenAI-compatible endpoint (/v1/chat/completions) using the OpenAI Python SDK. The DEFAULT_OLLAMA_CLOUD_BASE_URL is set to https://ollama.com/v1 (in hermes_cli/auth.py:74).

For vision, vision_tools.py constructs messages in OpenAI format:

"content": [
    {"type": "text", "text": "..."},
    {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
]

However, Ollama Cloud's /v1/chat/completions endpoint has inconsistent vision support. The native Ollama API (/api/chat) works reliably with all vision models using a simpler format:

{"role": "user", "content": "Describe this image.", "images": ["<base64>"]}

This is a known Ollama upstream issue:

  • ollama/ollama#13462 — Feature request to enable image input for all vision-capable cloud models (closed Dec 2025)
  • ollama/ollama#13464 — Cloud gemma3 models fail on /v1/chat/completions with images but work on /api/chat (fixed Jan 2026 for gemma3 only)
  • ollama/ollama#13468 — Library incorrectly showed "Text" only for vision-capable cloud models (fixed Mar 2026)

Other models (e.g. qwen3-vl:235b) may still return 500 on the OpenAI endpoint despite being reported as working.
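
The difference between the two message shapes above can be sketched as a small converter. This is an illustration only, not code from Hermes: the function name to_native_message is hypothetical, and it assumes images arrive as base64 data URLs, as in the vision_tools.py format shown earlier.

```python
import re
from typing import Any

# Matches "data:<mime>;base64,<payload>" and captures the payload.
_DATA_URL = re.compile(r"^data:[^;,]+;base64,(?P<b64>.+)$", re.DOTALL)


def to_native_message(message: dict[str, Any]) -> dict[str, Any]:
    """Convert one OpenAI-style chat message to Ollama's native shape (hypothetical helper)."""
    content = message.get("content")
    if isinstance(content, str):
        # Text-only messages already match the native format.
        return {"role": message["role"], "content": content}

    texts: list[str] = []
    images: list[str] = []
    for block in content or []:
        if block.get("type") == "text":
            texts.append(block.get("text", ""))
        elif block.get("type") == "image_url":
            match = _DATA_URL.match(block["image_url"]["url"])
            if match is None:
                # Remote http(s) URLs would need downloading first; out of scope here.
                raise ValueError("only base64 data URLs are handled in this sketch")
            images.append(match.group("b64"))

    native: dict[str, Any] = {"role": message["role"], "content": "\n".join(texts)}
    if images:
        native["images"] = images
    return native
```

Text blocks are joined with newlines because the native format carries a single content string per message; whether that joining matches what Hermes would want is an open design choice.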

Steps to Reproduce

  1. Configure auxiliary.vision.provider: ollama-cloud and auxiliary.vision.model: qwen3-vl:235b in config.yaml
  2. Have a valid OLLAMA_API_KEY set
  3. Call vision_analyze with any image
  4. Observe 500 Internal Server Error from Ollama Cloud
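
To reproduce outside Hermes, the same OpenAI-format request can be built and posted by hand. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the image is assumed to be a PNG, and Bearer authentication with OLLAMA_API_KEY is assumed for the Ollama Cloud endpoint.

```python
import base64
import json
import urllib.error
import urllib.request


def build_openai_vision_body(model: str, image_bytes: bytes, prompt: str) -> dict:
    """Build an OpenAI-format chat body with one inline PNG image (hypothetical helper)."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }


def post_json(url: str, body: dict, api_key: str) -> int:
    """POST a JSON body and return the HTTP status code, including error codes."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        # Assumption: the endpoint accepts a Bearer token.
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {api_key}"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=120) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code
```

Calling post_json("https://ollama.com/v1/chat/completions", body, api_key) with a body from build_openai_vision_body should return the status code that step 4 describes, if the issue reproduces in your environment.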

Expected Behavior

Vision analysis should work with Ollama Cloud vision models, either by:

  1. Using the native /api/chat endpoint with the images array format for Ollama Cloud vision requests
  2. Or correctly formatting the OpenAI-compatible request so Ollama Cloud accepts it

Proposed Solution

Add an Ollama native API adapter for vision requests in auxiliary_client.py:

  1. In call_llm/async_call_llm, detect when resolved_provider == "ollama-cloud" and task == "vision"
  2. Instead of calling client.chat.completions.create() (which hits /v1/chat/completions), make a direct HTTP POST to https://ollama.com/api/chat with the native payload:
    {
      "model": "qwen3-vl:235b",
      "messages": [{"role": "user", "content": "Describe this image.", "images": ["<base64>"]}],
      "stream": false
    }
  3. Convert the OpenAI-format image_url content blocks to the native images array
  4. Parse the native response back to a format extract_content_or_reasoning() can handle

Key files to modify:

  • agent/auxiliary_client.py: resolve_vision_provider_client(), call_llm(), async_call_llm()
  • tools/vision_tools.py — message construction (lines 532-548)
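
Step 4 of the proposal (parsing the native response) might look like the sketch below. It assumes extract_content_or_reasoning() reads response.choices[0].message.content as it would from an OpenAI SDK object; that is a guess about Hermes internals, and native_to_openai_like is a hypothetical name. The message and done_reason fields are those returned by a non-streaming native /api/chat call.

```python
from types import SimpleNamespace
from typing import Any


def native_to_openai_like(native: dict[str, Any]) -> SimpleNamespace:
    """Wrap a non-streaming /api/chat response in an OpenAI-like object (hypothetical helper)."""
    message = native.get("message") or {}
    return SimpleNamespace(
        model=native.get("model"),
        choices=[
            SimpleNamespace(
                index=0,
                # Fall back to "stop" when the server omits done_reason.
                finish_reason=native.get("done_reason", "stop"),
                message=SimpleNamespace(
                    role=message.get("role", "assistant"),
                    content=message.get("content", ""),
                ),
            )
        ],
    )
```

SimpleNamespace keeps the sketch dependency-free; a real adapter might instead build the SDK's own response types if downstream code relies on them.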

Environment

  • Hermes Agent: latest (v0.8+)
  • Provider: Ollama Cloud
  • Models tested: qwen3-vl:235b, qwen3.5:397b
  • Result: 500 Internal Server Error on image payloads, text-only works fine

Related

  • ollama/ollama#13462, ollama/ollama#13464, ollama/ollama#13468

extent analysis

TL;DR

Implement an Ollama native API adapter in auxiliary_client.py to handle vision requests directly through the /api/chat endpoint.

Guidance

  1. Detect Ollama Cloud vision requests: In auxiliary_client.py, modify call_llm and async_call_llm to check for resolved_provider == "ollama-cloud" and task == "vision".
  2. Construct native payload: Convert OpenAI-format image_url content blocks to the native images array format for the Ollama Cloud /api/chat endpoint.
  3. Make direct HTTP POST: Instead of using client.chat.completions.create(), make a direct HTTP POST to https://ollama.com/api/chat with the native payload.
  4. Parse native response: Convert the native /api/chat response into a form that extract_content_or_reasoning() can handle.
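
Steps 1 and 3 can be sketched as two small helpers: one decides whether a call should take the native path, the other derives the native URL from the configured OpenAI-compatible base URL instead of hard-coding it. Both function names are hypothetical and not part of Hermes.

```python
from urllib.parse import urlsplit, urlunsplit


def use_native_vision(provider: str, task: str) -> bool:
    """Return True when a call should bypass /v1/chat/completions (hypothetical helper)."""
    return provider == "ollama-cloud" and task == "vision"


def native_chat_url(openai_base_url: str) -> str:
    """Swap the path of an OpenAI-compatible base URL for the native /api/chat path."""
    parts = urlsplit(openai_base_url)
    # Keep scheme and host, drop the "/v1" path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/api/chat", "", ""))
```

Deriving the URL from the base URL means a user-overridden host would still be respected, which a literal https://ollama.com/api/chat would not do.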

Example

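# auxiliary_client.py (sketch; resolved_provider and task come from the surrounding call_llm code)
import os
import requests

if resolved_provider == "ollama-cloud" and task == "vision":
    # Native payload: base64 images go in a per-message "images" array
    native_payload = {
        "model": "qwen3-vl:235b",
        "messages": [{"role": "user", "content": "Describe this image.", "images": ["<base64>"]}],
        "stream": False,
    }
    # Direct HTTP POST to /api/chat instead of client.chat.completions.create()
    response = requests.post(
        "https://ollama.com/api/chat",
        json=native_payload,
        # Assumption: Bearer auth with the same OLLAMA_API_KEY used for the /v1 endpoint
        headers={"Authorization": f"Bearer {os.environ['OLLAMA_API_KEY']}"},
        timeout=120,
    )
    response.raise_for_status()  # surface 4xx/5xx instead of parsing an error body
    # Parse native response; the reply text is under response_data["message"]["content"]
    response_data = response.json()
    # ...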

Notes

This solution assumes that the native /api/chat endpoint handles vision requests for every affected model. The issue reports that models such as qwen3-vl:235b may still return 500 on the OpenAI-compatible endpoint despite being reported as working, so test the adapter against each model you rely on.

Recommendation

Apply the proposed solution by implementing the Ollama native API adapter in auxiliary_client.py.
