hermes - [Bug] Ollama Cloud vision models return 500 on /v1/chat/completions - need native /api/chat adapter [1 comment, 2 participants]

connieka266 · 2026-04-23T14:47:18Z



GitHub stats

NousResearch/hermes-agent#14592•Fetched 2026-04-24 06:16:09


Author: connieka266

Participants: connieka266, ennian001

Timeline (top): labeled ×4, commented ×1, cross-referenced ×1

Error Message

When using Ollama Cloud as the vision provider (auxiliary.vision.provider: ollama-cloud), vision-capable models (e.g. qwen3-vl:235b) return 500 Internal Server Error when images are included in the request payload. Text-only requests to the same models work fine.

Root Cause

Hermes sends all API calls through the OpenAI-compatible endpoint (/v1/chat/completions) using the OpenAI Python SDK. The DEFAULT_OLLAMA_CLOUD_BASE_URL is set to https://ollama.com/v1 (in hermes_cli/auth.py:74).

For vision, vision_tools.py constructs messages in OpenAI format:

"content": [
    {"type": "text", "text": "..."},
    {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
]

However, Ollama Cloud's /v1/chat/completions endpoint has inconsistent vision support. The native Ollama API (/api/chat) works reliably with all vision models using a simpler format:

{"role": "user", "content": "Describe this image.", "images": ["<base64>"]}

This is a known Ollama upstream issue:

ollama/ollama#13462 — Feature request to enable image input for all vision-capable cloud models (closed Dec 2025)
ollama/ollama#13464 — Cloud gemma3 models fail on /v1/chat/completions with images but work on /api/chat (fixed Jan 2026 for gemma3 only)
ollama/ollama#13468 — Library incorrectly showed "Text" only for vision-capable cloud models (fixed Mar 2026)

Other models (e.g. qwen3-vl:235b) may still return 500 on the OpenAI endpoint despite being reported as working.
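The difference between the two message formats can be made concrete with a small converter. This is a minimal sketch, not Hermes code: the function name `to_native_message` is hypothetical, and it assumes images arrive as base64 `data:` URLs, as in the snippet above. Remote image URLs are rejected rather than fetched.

```python
from typing import Any


def to_native_message(message: dict[str, Any]) -> dict[str, Any]:
    """Convert one OpenAI-format chat message to the native /api/chat shape.

    Hypothetical helper for illustration; not part of Hermes or Ollama.
    """
    content = message.get("content")
    if isinstance(content, str):
        # Text-only messages already match the native format.
        return {"role": message["role"], "content": content}

    texts: list[str] = []
    images: list[str] = []
    for block in content or []:
        if block.get("type") == "text":
            texts.append(block.get("text", ""))
        elif block.get("type") == "image_url":
            url = block["image_url"]["url"]
            if not url.startswith("data:"):
                # The native format carries base64 data, not a remote URL.
                raise ValueError("only data: URLs can be converted without fetching")
            # "data:image/png;base64,AAAA" -> header "data:image/png;base64", payload "AAAA"
            header, _, payload = url.partition(",")
            if ";base64" not in header:
                raise ValueError("data URL is not base64-encoded")
            images.append(payload)

    native: dict[str, Any] = {"role": message["role"], "content": "\n".join(texts)}
    if images:
        native["images"] = images
    return native
```

Joining multiple text blocks with newlines is a design choice of this sketch; the native format has a single `content` string, so some flattening is unavoidable.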


Steps to Reproduce

1. Configure auxiliary.vision.provider: ollama-cloud and auxiliary.vision.model: qwen3-vl:235b in config.yaml
2. Have a valid OLLAMA_API_KEY set
3. Call vision_analyze with any image
4. Observe 500 Internal Server Error from Ollama Cloud
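To reproduce outside Hermes, it may help to build the same OpenAI-format request body by hand. The sketch below only constructs the body; the helper name `build_openai_vision_request` and the default MIME type are assumptions for illustration, and the exact body Hermes sends may contain additional fields.

```python
import base64


def build_openai_vision_request(model: str, prompt: str, image_bytes: bytes,
                                mime: str = "image/png") -> dict:
    """Build an OpenAI-format chat body with one text block and one image block.

    Hypothetical helper; it mirrors the content structure quoted in the issue.
    """
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{b64}"}},
            ],
        }],
    }


# Example: body for the model named in the issue (image bytes are a stand-in).
body = build_openai_vision_request("qwen3-vl:235b", "Describe this image.", b"\x89PNG")
```

Sending this body to https://ollama.com/v1/chat/completions with the OLLAMA_API_KEY as a bearer token should correspond to step 3, assuming Hermes adds nothing else that affects the result.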

Expected Behavior

Vision analysis should work with Ollama Cloud vision models, either by:

1. Using the native /api/chat endpoint with the images array format for Ollama Cloud vision requests
2. Or correctly formatting the OpenAI-compatible request so Ollama Cloud accepts it

Proposed Solution

Add an Ollama native API adapter for vision requests in auxiliary_client.py:

1. In call_llm/async_call_llm, detect when resolved_provider == "ollama-cloud" and task == "vision"
2. Instead of calling client.chat.completions.create() (which hits /v1/chat/completions), make a direct HTTP POST to https://ollama.com/api/chat with the native payload:
```
{
  "model": "qwen3-vl:235b",
  "messages": [{"role": "user", "content": "Describe this image.", "images": ["<base64>"]}],
  "stream": false
}
```
3. Convert the OpenAI-format image_url content blocks to the native images array
4. Parse the native response back to a format extract_content_or_reasoning() can handle
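The routing decision and request setup in the first two steps could look roughly like the following. This is a sketch under assumptions, not the Hermes implementation: the helper names are hypothetical, bearer-token authentication with OLLAMA_API_KEY is assumed to apply to the native endpoint as well, and the request is only prepared, not sent.

```python
import json
import urllib.request

# Value the issue cites for DEFAULT_OLLAMA_CLOUD_BASE_URL.
OPENAI_BASE_URL = "https://ollama.com/v1"


def should_use_native(provider: str, task: str) -> bool:
    """Route only Ollama Cloud vision calls to the native endpoint."""
    return provider == "ollama-cloud" and task == "vision"


def native_chat_url(base_url: str = OPENAI_BASE_URL) -> str:
    """Derive <host>/api/chat from an OpenAI-style base URL ending in /v1."""
    root = base_url.rstrip("/")
    if root.endswith("/v1"):
        root = root[: -len("/v1")]
    return root + "/api/chat"


def build_native_request(model: str, messages: list, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a non-streaming POST to /api/chat."""
    body = json.dumps({"model": model, "messages": messages, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        native_chat_url(),
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: the native endpoint accepts the same bearer token.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Deriving the native URL from the configured base URL, rather than hard-coding it, would keep a user-overridden base URL working; whether Hermes wants that behavior is a design decision for the maintainers.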

Key files to modify:

agent/auxiliary_client.py — resolve_vision_provider_client(), call_llm(), async_call_llm()
tools/vision_tools.py — message construction (lines 532-548)

Environment

Hermes Agent: latest (v0.8+)
Provider: Ollama Cloud
Models tested: qwen3-vl:235b, qwen3.5:397b
Result: 500 Internal Server Error on image payloads, text-only works fine

Related

ollama/ollama#13462, ollama/ollama#13464, ollama/ollama#13468

extent analysis

TL;DR

Implement an Ollama native API adapter in auxiliary_client.py to handle vision requests directly through the /api/chat endpoint.

Guidance

1. Detect Ollama Cloud vision requests: In auxiliary_client.py, modify call_llm and async_call_llm to check for resolved_provider == "ollama-cloud" and task == "vision".
2. Construct native payload: Convert OpenAI-format image_url content blocks to the native images array format for the Ollama Cloud /api/chat endpoint.
3. Make direct HTTP POST: Instead of using client.chat.completions.create(), make a direct HTTP POST to https://ollama.com/api/chat with the native payload.
4. Parse native response: Convert the native /api/chat response into a shape that extract_content_or_reasoning() can handle, or extend that function to accept the native format.
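For the response-parsing step, one option is to wrap the native response in an object that exposes the attribute path OpenAI SDK responses use (`choices[0].message.content`). The sketch below assumes that extract_content_or_reasoning() reads that path; its real signature is not shown in the issue, so treat the wrapper name and the `reasoning` attribute as hypothetical.

```python
from types import SimpleNamespace


def native_to_openai_like(native: dict) -> SimpleNamespace:
    """Wrap a native /api/chat response so .choices[0].message.content works.

    Hypothetical adapter; the attribute names Hermes expects are an assumption.
    """
    msg = native.get("message") or {}
    message = SimpleNamespace(
        role=msg.get("role", "assistant"),
        content=msg.get("content", ""),
        # Some models return reasoning text in a separate "thinking" field.
        reasoning=msg.get("thinking"),
    )
    # Fall back to "stop" when the response is done but gives no reason.
    finish = native.get("done_reason") or ("stop" if native.get("done") else None)
    return SimpleNamespace(
        model=native.get("model"),
        choices=[SimpleNamespace(index=0, message=message, finish_reason=finish)],
    )
```

Wrapping the response keeps the change local to the adapter, at the cost of mimicking only the subset of the SDK object that downstream code touches.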

Example

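# auxiliary_client.py
import os
import requests

# resolved_provider and task come from the surrounding call_llm() context
if resolved_provider == "ollama-cloud" and task == "vision":
    # Construct native payload
    native_payload = {
        "model": "qwen3-vl:235b",
        "messages": [{"role": "user", "content": "Describe this image.", "images": ["<base64>"]}],
        "stream": False,
    }
    # Make direct HTTP POST to /api/chat, authenticating with the Ollama Cloud key
    response = requests.post(
        "https://ollama.com/api/chat",
        json=native_payload,
        headers={"Authorization": f"Bearer {os.environ['OLLAMA_API_KEY']}"},
        timeout=120,
    )
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    # Parse native response: the reply text is under message.content
    response_data = response.json()
    content = response_data["message"]["content"]
    # ...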

Notes

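This solution assumes that Ollama Cloud's native /api/chat endpoint handles vision requests for the target model. The issue reports the remaining 500 errors on the OpenAI-compatible endpoint (/v1/chat/completions), where some models may still fail despite being reported as working, so confirm the native path for each model before relying on it.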

Recommendation

Apply the proposed solution by implementing the Ollama native API adapter in auxiliary_client.py.
