hermes - Bug: image routing bypassed on api_server /v1/chat/completions: non-vision models receive raw image_url (400) [4 pull requests]

Fix Action

Fixed


Bug Description

/v1/chat/completions accepts OpenAI-style multimodal content with image_url parts. When the active main model does not support vision (e.g. deepseek-v4-pro), AIAgent._prepare_messages_for_non_vision_model (run_agent.py:8971) is supposed to replace each image part with a cached vision_analyze text description so the request doesn't fail at the provider.

That fallback runs on the legacy (unregistered-provider) branch of _build_chat_kwargs but not on the provider-profile branch — which is the path every registered provider takes (opencode-zen, opencode-go, deepseek, kimi, openrouter, gemini, anthropic, etc.). The profile branch forwards image_url parts unchanged to the upstream provider, which fails with HTTP 400 on text-only models.

The codex_responses path (run_agent.py:9267) still calls the fallback, and the gateway-adapter path (gateway/run.py: _decide_image_input_mode, _enrich_message_with_vision) also handles images correctly, so the inconsistency is specific to the chat_completions profile branch.

Steps to Reproduce

  1. Configure Hermes with a non-vision main model on a registered provider, e.g. provider=opencode-go, model=deepseek-v4-pro.
  2. POST /v1/chat/completions with a user message in the OpenAI multimodal array form:
    {
      "model": "hermes-agent",
      "messages": [
        {"role": "user", "content": [
          {"type": "text", "text": "what's in this image?"},
          {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}
        ]}
      ]
    }
  3. The agent goes through run_conversation → _build_chat_kwargs → profile branch (run_agent.py:9354), then ships the image_url part to DeepSeek via opencode-go.
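
For convenience, the request body from step 2 can be generated with a short script. This is a minimal sketch: the helper name build_multimodal_payload is hypothetical, and the placeholder bytes stand in for a real JPEG. The host and port of the api_server are deployment-specific and not given in this report, so the sketch only builds and prints the JSON body.

```python
import base64
import json


def build_multimodal_payload(image_bytes: bytes, prompt: str, mime: str = "image/jpeg") -> dict:
    """Build an OpenAI-style chat request with one text part and one image_url part.

    Hypothetical helper for reproducing the report; not part of Hermes.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime};base64,{encoded}"
    return {
        "model": "hermes-agent",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": data_url}},
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder bytes only; substitute the contents of a real JPEG file.
    payload = build_multimodal_payload(b"\xff\xd8\xff\xe0", "what's in this image?")
    print(json.dumps(payload, indent=2))
```

The printed JSON can then be sent as the body of a POST to /v1/chat/completions on whatever address the api_server listens on.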

Expected Behavior

Same as the legacy branch and the gateway-adapter path: image parts are replaced with vision_analyze text inline, the provider sees text-only content, and the turn succeeds. Whether a provider is registered should not change the user-visible behavior of /v1/chat/completions.

Actual Behavior

Provider returns HTTP 400:

run_agent: conversation turn: ... model=deepseek-v4-pro provider=opencode-go platform=api_server history=1
  msg='[1 image] <recent_messages> ... </recent_messages> <cur...'

run_agent: Streaming failed before delivery: Error code: 400 -
  {'error': {'message': "Error from provider (DeepSeek): Failed to deserialize the
   JSON body into the target type: messages[2]: unknown variant `image_url`,
   expected `text` at line 1 column 7586", ...}}

aiohttp.access: ... "POST /v1/chat/completions HTTP/1.1" 502 802 ...

The retry loop classifies the 400 as a non-retryable BadRequestError, so the API server returns 502 to the client.

Root Cause Analysis

_prepare_messages_for_non_vision_model (run_agent.py:8971, introduced by PR #16506) is called in _build_chat_kwargs at run_agent.py:9389, inside the legacy branch:

# Legacy flag path — reached only when get_provider_profile() returns None.
...
# Strip image parts for non-vision models (no-op when vision-capable).
_msgs_for_chat = self._prepare_messages_for_non_vision_model(api_messages)
return _ct.build_kwargs(model=..., messages=_msgs_for_chat, ...)

The profile branch above it (run_agent.py:9354), added by the later provider-modules refactor ("transport single-path"), has no equivalent call:

if _profile:
    ...
    return _ct.build_kwargs(
        model=self.model,
        messages=api_messages,        # ← image_url parts pass through unchanged
        ...
        provider_profile=_profile,
        ...
    )

Downstream, agent/transports/chat_completions.py::_build_kwargs_from_profile (line 393) calls profile.prepare_messages(sanitized) (line 402), but the default ProviderProfile.prepare_messages (providers/base.py:80) is pass-through and no bundled profile overrides it for vision handling. So there is no substitute path.

Because get_provider_profile() now returns a profile for every major provider, the non-vision fallback is effectively unreachable for real users on /v1/chat/completions.
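
The divergence between the two branches can be modelled in a few lines. This is a toy sketch, not the real _build_chat_kwargs: the function signature and the injected non_vision_fallback callable are assumptions made for illustration, with the callable standing in for _prepare_messages_for_non_vision_model.

```python
from typing import Any, Callable, Optional


def build_chat_kwargs(
    messages: list[dict],
    profile: Optional[Any],
    non_vision_fallback: Callable[[list[dict]], list[dict]],
) -> dict:
    """Toy model of the branch structure described above (hypothetical signature)."""
    if profile is not None:
        # Profile branch: messages are forwarded untouched, so image_url
        # parts reach the provider. This mirrors the reported bug.
        return {"messages": messages, "provider_profile": profile}
    # Legacy branch: the non-vision fallback runs before the request is built.
    return {"messages": non_vision_fallback(messages)}


if __name__ == "__main__":
    calls = []

    def spy(msgs: list[dict]) -> list[dict]:
        calls.append("fallback")
        return msgs

    msgs = [{"role": "user", "content": "hello"}]
    build_chat_kwargs(msgs, profile="opencode-go", non_vision_fallback=spy)
    print("after profile branch:", calls)
    build_chat_kwargs(msgs, profile=None, non_vision_fallback=spy)
    print("after legacy branch:", calls)
```

In this model the fallback is invoked only when no profile is present, which is why registering a provider silently disables it.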

Proposed Fix

Add the same call to the profile branch:

 if _profile:
     _ephemeral_out = getattr(self, "_ephemeral_max_output_tokens", None)
     if _ephemeral_out is not None:
         self._ephemeral_max_output_tokens = None

+    # Strip image parts for non-vision models (no-op when vision-capable).
+    api_messages = self._prepare_messages_for_non_vision_model(api_messages)
+
     return _ct.build_kwargs(
         model=self.model,
         messages=api_messages,
         ...
         provider_profile=_profile,
         ...
     )

_prepare_messages_for_non_vision_model short-circuits when no message contains image parts and again when _model_supports_vision() returns True, so the change adds no meaningful cost for vision-capable models or image-free requests.
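
The two short-circuits can be illustrated with a standalone sketch. It is not the code from run_agent.py: prepare_for_non_vision, supports_vision and describe are hypothetical names, with describe standing in for the cached vision_analyze step. The point is only that the expensive description step is never reached for image-free requests or vision-capable models.

```python
from typing import Callable


def _is_image_part(part: object) -> bool:
    return isinstance(part, dict) and part.get("type") == "image_url"


def _has_image_parts(messages: list[dict]) -> bool:
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list) and any(_is_image_part(p) for p in content):
            return True
    return False


def prepare_for_non_vision(
    messages: list[dict],
    supports_vision: Callable[[], bool],
    describe: Callable[[str], str],
) -> list[dict]:
    """Replace image_url parts with text, with two early exits (illustrative only)."""
    if not _has_image_parts(messages):
        return messages  # short-circuit 1: nothing to replace
    if supports_vision():
        return messages  # short-circuit 2: the model can take images directly
    out = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):
            content = [
                {"type": "text", "text": describe((p.get("image_url") or {}).get("url", ""))}
                if _is_image_part(p)
                else p
                for p in content
            ]
        out.append({**msg, "content": content})
    return out
```

A regression test for the fix could follow the same shape: build kwargs through the profile branch with a non-vision model and assert that no image_url part survives in the outgoing messages.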

Environment

  • Verified on origin/main commit 64145a199 and v0.13.0 (2026-05-07). Profile-branch refactor and _prepare_messages_for_non_vision_model both landed before v0.13.0; no fix in the 94 commits between v0.13.0 and current main.
  • OS: Debian GNU/Linux 13 (trixie)
