hermes - 💡(How to fix) Fix Bug: image routing bypassed on api_server /v1/chat/completions; non-vision models receive raw image_url (400)

Root Cause

_prepare_messages_for_non_vision_model (run_agent.py:8971, introduced by PR #16506) is called in _build_chat_kwargs at run_agent.py:9389, inside the legacy branch:

# Legacy flag path — reached only when get_provider_profile() returns None.
...
# Strip image parts for non-vision models (no-op when vision-capable).
_msgs_for_chat = self._prepare_messages_for_non_vision_model(api_messages)
return _ct.build_kwargs(model=..., messages=_msgs_for_chat, ...)

The profile branch above it (run_agent.py:9354), added by the later provider-modules refactor ("transport single-path"), has no equivalent call:

if _profile:
    ...
    return _ct.build_kwargs(
        model=self.model,
        messages=api_messages,        # ← image_url parts pass through unchanged
        ...
        provider_profile=_profile,
        ...
    )

Downstream, agent/transports/chat_completions.py::_build_kwargs_from_profile (line 393) calls profile.prepare_messages(sanitized) (line 402), but the default ProviderProfile.prepare_messages (providers/base.py:80) is pass-through and no bundled profile overrides it for vision handling. So there is no substitute path.

Because get_provider_profile() now returns a profile for every major provider, the non-vision fallback is effectively unreachable for real users on /v1/chat/completions.
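
The asymmetry between the two branches can be reduced to a toy model. The sketch below is not the Hermes code: build_chat_kwargs_sketch, its fallback parameter, and the shape of the returned dict are illustrative assumptions, chosen only to show that the early return on the profile branch skips the preprocessing that the legacy branch performs.

```python
from typing import Any, Callable, Optional

Messages = list[dict[str, Any]]


def build_chat_kwargs_sketch(
    messages: Messages,
    profile: Optional[dict[str, Any]],
    fallback: Callable[[Messages], Messages],
) -> dict[str, Any]:
    """Toy model of the pre-fix control flow (hypothetical names throughout)."""
    if profile is not None:
        # Profile branch: returns early, so the fallback is never consulted
        # and any image_url parts are forwarded unchanged.
        return {"messages": messages, "provider_profile": profile}

    # Legacy branch: only reached when no profile is registered.
    return {"messages": fallback(messages)}
```

In this model, any caller whose provider has a profile never reaches the fallback, which mirrors why the legacy-only call is unreachable once every major provider is registered.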

Fix Action

Fixed

Fixed by PR: fix(api-server): apply non-vision image fallback (https://github.com/NousResearch/hermes-agent/pull/23743)
Fixed by PR: fix(agent): strip image parts for non-vision models on provider profile path (https://github.com/NousResearch/hermes-agent/pull/23750)
Fixed by PR: fix(agent): call _prepare_messages_for_non_vision_model on chat_completions profile path (https://github.com/NousResearch/hermes-agent/pull/23818)
Fixed by PR: fix: prepare messages for non-vision model in provider profile code path (#23733) (https://github.com/NousResearch/hermes-agent/pull/23904)

Bug Description

/v1/chat/completions accepts OpenAI-style multimodal content with image_url parts. When the active main model does not support vision (e.g. deepseek-v4-pro), AIAgent._prepare_messages_for_non_vision_model (run_agent.py:8971) is supposed to replace each image part with a cached vision_analyze text description so the request doesn't fail at the provider.

That fallback runs on the legacy (unregistered-provider) branch of _build_chat_kwargs but not on the provider-profile branch — which is the path every registered provider takes (opencode-zen, opencode-go, deepseek, kimi, openrouter, gemini, anthropic, etc.). The profile branch forwards image_url parts unchanged to the upstream provider, which fails with HTTP 400 on text-only models.

The codex_responses path (run_agent.py:9267) does still call the fallback, and the gateway-adapter path (gateway/run.py → _decide_image_input_mode → _enrich_message_with_vision) also handles it correctly — so the inconsistency is specifically on the chat_completions profile branch.
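
To make the intended transformation concrete, here is a minimal sketch of replacing image parts with text. It is an illustration under assumptions, not the actual _prepare_messages_for_non_vision_model: replace_image_parts is a made-up name, and describe_image is a hypothetical stand-in for however Hermes looks up the cached vision_analyze description.

```python
from typing import Any, Callable

Messages = list[dict[str, Any]]


def replace_image_parts(
    messages: Messages,
    describe_image: Callable[[str], str],
) -> Messages:
    """Return a copy of messages with each image_url part turned into text.

    describe_image is a hypothetical callback mapping an image URL (or data
    URL) to a text description.
    """
    result: Messages = []
    for msg in messages:
        content = msg.get("content")
        if not isinstance(content, list):
            # Plain string content carries no image parts; keep it as-is.
            result.append(msg)
            continue

        new_parts = []
        for part in content:
            if isinstance(part, dict) and part.get("type") == "image_url":
                image = part.get("image_url")
                url = image.get("url", "") if isinstance(image, dict) else ""
                new_parts.append({"type": "text", "text": describe_image(url)})
            else:
                new_parts.append(part)
        # Build a new message dict so the caller's input is not mutated.
        result.append({**msg, "content": new_parts})
    return result
```

After such a pass, a text-only provider sees only parts of type text, which is the shape DeepSeek's deserializer expects according to the error quoted in this report.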

Steps to Reproduce

Configure Hermes with a non-vision main model on a registered provider, e.g. provider=opencode-go, model=deepseek-v4-pro.

POST /v1/chat/completions with a user message in the OpenAI multimodal array form:

{
  "model": "hermes-agent",
  "messages": [
    {"role": "user", "content": [
      {"type": "text", "text": "what's in this image?"},
      {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}
    ]}
  ]
}

The agent reaches run_conversation → _build_chat_kwargs → profile branch (run_agent.py:9354), then ships the image_url part to DeepSeek via opencode-go.
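
The request above can be sent with a short standard-library script. This is a sketch under assumptions: build_payload and post_chat_completion are illustrative helper names, the base URL passed in must match wherever your api_server is listening (the report does not state a host or port), and no authentication header is added, so include one if your deployment requires it.

```python
import base64
import json
import urllib.error
import urllib.request
from typing import Any


def build_payload(image_bytes: bytes, prompt: str = "what's in this image?") -> dict[str, Any]:
    """Build the OpenAI-style multimodal request body used in the reproduction."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "hermes-agent",
        "messages": [
            {"role": "user", "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{encoded}"}},
            ]}
        ],
    }


def post_chat_completion(base_url: str, payload: dict[str, Any]) -> tuple[int, str]:
    """POST the payload and return (status code, response body).

    base_url is an assumption supplied by the caller, e.g. the address of a
    locally running api_server.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # Error responses (such as the 502 in this report) arrive as HTTPError.
        return err.code, err.read().decode("utf-8", "replace")
```

Calling post_chat_completion(base_url, build_payload(open("photo.jpg", "rb").read())) against an affected build should return the 502 status seen in the access log; photo.jpg is a placeholder for any JPEG file.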

Expected Behavior

Same as the legacy branch and the gateway-adapter path: image parts are replaced with vision_analyze text inline, the provider sees text-only content, and the turn succeeds. Whether a provider is registered should not change the user-visible behavior of /v1/chat/completions.

Actual Behavior

Provider returns HTTP 400:

run_agent: conversation turn: ... model=deepseek-v4-pro provider=opencode-go platform=api_server history=1
  msg='[1 image] <recent_messages> ... </recent_messages> <cur...'

run_agent: Streaming failed before delivery: Error code: 400 -
  {'error': {'message': "Error from provider (DeepSeek): Failed to deserialize the
   JSON body into the target type: messages[2]: unknown variant `image_url`,
   expected `text` at line 1 column 7586", ...}}

aiohttp.access: ... "POST /v1/chat/completions HTTP/1.1" 502 802 ...

The retry loop classifies the 400 as a non-retryable BadRequestError, so the turn is not retried and the api_server answers the client with HTTP 502.

Proposed Fix

Add the same _prepare_messages_for_non_vision_model call to the profile branch, just before build_kwargs:

 if _profile:
     _ephemeral_out = getattr(self, "_ephemeral_max_output_tokens", None)
     if _ephemeral_out is not None:
         self._ephemeral_max_output_tokens = None

+    # Strip image parts for non-vision models (no-op when vision-capable).
+    api_messages = self._prepare_messages_for_non_vision_model(api_messages)
+
     return _ct.build_kwargs(
         model=self.model,
         messages=api_messages,
         ...
         provider_profile=_profile,
         ...
     )

_prepare_messages_for_non_vision_model short-circuits when no message contains image parts and again when _model_supports_vision() returns True, so the change is free for vision-capable / no-image cases.
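
The two short-circuits described above can be sketched as follows. This is a simplified model rather than the real method: prepare_for_model, has_image_parts, and the supports_vision and strip_images callables are hypothetical names introduced for illustration, and the guard order follows the description in this report.

```python
from typing import Any, Callable

Messages = list[dict[str, Any]]


def has_image_parts(messages: Messages) -> bool:
    """True if any message carries an image_url content part."""
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list) and any(
            isinstance(part, dict) and part.get("type") == "image_url"
            for part in content
        ):
            return True
    return False


def prepare_for_model(
    messages: Messages,
    supports_vision: Callable[[], bool],
    strip_images: Callable[[Messages], Messages],
) -> Messages:
    """Apply the fallback only when it is actually needed."""
    # Guard 1: no image parts, so return the input object untouched.
    if not has_image_parts(messages):
        return messages
    # Guard 2: a vision-capable model can take the image parts as-is.
    if supports_vision():
        return messages
    # Only a non-vision model with images in the request pays for the rewrite.
    return strip_images(messages)
```

Because both guards return the original list object, the common text-only and vision-capable cases cost one scan of the messages and no copying in this model.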

Environment

Verified on origin/main commit 64145a199 and v0.13.0 (2026-05-07). Profile-branch refactor and _prepare_messages_for_non_vision_model both landed before v0.13.0; no fix in the 94 commits between v0.13.0 and current main.
OS: Debian GNU/Linux 13 (trixie)
