hermes - 💡(How to fix) Fix auxiliary vision: explicit base_url routes through generic "custom" branch, leaking main model name + OPENAI_API_KEY to the configured backend (Gemini)

StepCodex · 2026-06-05T05:24:30Z


When auxiliary.vision is pointed at a non-OpenAI provider via an explicit base_url (e.g. Google Gemini's https://generativelanguage.googleapis.com/v1beta), the request is routed through the generic "custom" OpenAI-compatible branch of resolve_provider_client(). That branch back-fills two values from OpenAI/main-session defaults that are invalid for the target provider:

1. Model name leak: if auxiliary.vision.model is empty, the main session model (e.g. gpt-5.5) is sent to Gemini → 404 models/gpt-5.5 is not found for API version v1main.
2. API-key leak: if auxiliary.vision.api_key is empty, OPENAI_API_KEY (sk-proj-…) is sent as the Bearer token to Gemini → 500 INTERNAL (Gemini can't parse a foreign key and returns 500 rather than a clean 401, so it looks like a transient outage).

Net effect: a user who configures a dedicated Gemini vision backend with a paid AI-Studio key gets cryptic failures that look like Gemini-side problems but are actually local mis-routing.

Error Message

Bug 1 (model name leak): 404 - models/gpt-5.5 is not found for API version v1main
Bug 2 (API-key leak): 500 - {"error":{"code":500,"status":"INTERNAL"}}

Root Cause

Root cause: the "use my main model for side tasks too" fallback (intended for same-provider side tasks like title generation) is applied even when the call targets a different provider's explicit base_url.

Auxiliary vision routing sends the main session model + `OPENAI_API_KEY` to a configured non-OpenAI backend (Gemini)

Component: auxiliary task routing / vision_analyze
Version: Hermes Agent v0.15.1 (2026.5.29), Python 3.11
Severity: High. A correctly configured auxiliary.vision Gemini backend silently fails; errors masquerade as upstream 404/500s.

Related issues

#33389 (open) — "auxiliary.vision.provider: gemini … not honored — falls through to main provider." Its root-cause analysis (gemini missing from _VISION_AUTO_PROVIDER_ORDER / _resolve_strict_vision_backend) is stale for v0.15.1: gemini is now present in both (auxiliary_client.py:3955-3961 and :4017). The fall-through still happens, but for a different reason — described below.
#35454 (open) — Gemini routed through native adapter even when OpenAI-compatible endpoint configured.
#31179 (closed) — vision_analyze/browser_vision route images to main model. The Bug-1 leak here is the same symptom via the explicit-base_url path, which #31179's fix did not cover.

Environment / config to reproduce

~/.hermes/config.yaml:

```yaml
model:
  provider: openai-codex
  default: gpt-5.5
auxiliary:
  vision:
    provider: gemini
    model: gemini-2.5-flash          # Bug 1 triggers when this is empty
    base_url: https://generativelanguage.googleapis.com/v1beta
    api_key: ''                      # Bug 2 triggers when this is empty
agent:
  image_input_mode: auto
```

Env: OPENAI_API_KEY=sk-proj-… present (any unrelated OpenAI key), GOOGLE_API_KEY/GEMINI_API_KEY present and valid.

Send any image; image_input_mode: auto + an explicit auxiliary.vision.provider routes to text mode (agent/image_routing.py: decide_image_input_mode), which calls vision_analyze → call_llm(task="vision", …).

Bug 1 — main session model name leaks into the aux call

Trigger: auxiliary.vision.model empty/unset.

Path:

_resolve_task_provider_model("vision", …) → resolved_model = model or cfg_model (agent/auxiliary_client.py:4627). Empty cfg_model ⇒ resolved_model = None.
resolve_provider_client() then hits its universal model fallback (agent/auxiliary_client.py:3345-3346):
```python
if not model:
    model = _get_aux_model_for_provider(provider) or _read_main_model() or model
```
For Gemini there is no registered aux default, so the chain falls through to _read_main_model(), which injects gpt-5.5. The "custom" branch has a second copy of this default (auxiliary_client.py:3514-3515): model or main_runtime.model or "gpt-4o-mini".

Actual: POST …/v1beta/chat/completions {"model":"gpt-5.5", …} → 404 - models/gpt-5.5 is not found for API version v1main.

Expected: the configured auxiliary.vision.model is used; if genuinely unset, fall back to the target provider's default (or raise a clear config error) — never the main session model, which belongs to a different provider.
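
The expected behavior could be sketched as a fallback that reuses the main session model only when the auxiliary call targets the same provider. This is a minimal illustration, not the Hermes implementation: `pick_aux_model`, `AUX_DEFAULTS`, and all parameter names are hypothetical, and the Gemini default is an assumption copied from the repro config.

```python
from typing import Optional

# Hypothetical per-provider defaults for auxiliary tasks. The Gemini entry is
# an assumption taken from the repro config, not a value read from Hermes.
AUX_DEFAULTS = {"gemini": "gemini-2.5-flash"}


def pick_aux_model(
    configured: Optional[str],
    aux_provider: str,
    main_provider: str,
    main_model: Optional[str],
    task: str = "vision",
) -> str:
    """Resolve the model for an auxiliary call without crossing providers."""
    if configured:
        return configured
    default = AUX_DEFAULTS.get(aux_provider)
    if default:
        return default
    # Reuse the main model only when both calls go to the same provider.
    if main_model and aux_provider == main_provider:
        return main_model
    raise ValueError(
        f"auxiliary.{task}.model is required for provider {aux_provider!r}"
    )
```

With the repro values, `pick_aux_model("", "gemini", "openai-codex", "gpt-5.5")` resolves to the Gemini default instead of `gpt-5.5`; a provider with no default and a different main provider raises a config error rather than sending a foreign model name.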

Bug 2 — `OPENAI_API_KEY` is sent to a non-OpenAI base_url

Trigger: auxiliary.vision.api_key empty/unset (relying on env), with a non-OpenAI base_url.

Path: the explicit-base_url vision path resolves through the custom-endpoint branch (agent/auxiliary_client.py:3499-3507):

```python
if provider == "custom":
    if explicit_base_url:
        custom_key = (
            (explicit_api_key or "").strip()
            or os.getenv("OPENAI_API_KEY", "").strip()
            or "no-key-required"
        )
```

With explicit_api_key=None, this yields OPENAI_API_KEY. (Confirmed by capturing the outbound request: Authorization: Bearer sk-proj-… to generativelanguage.googleapis.com.)

Actual: Gemini returns 500 - {"error":{"code":500,"status":"INTERNAL"}}.

Expected: key resolution should be provider-aware. For a generativelanguage.googleapis.com base_url (or provider: gemini), prefer GEMINI_API_KEY / GOOGLE_API_KEY before OPENAI_API_KEY. A wrong/foreign key should surface as a clear auth error, not a 500.
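
One way to picture provider-aware key resolution is a per-provider list of environment variables. The sketch below is hypothetical (`resolve_aux_api_key` and `PROVIDER_KEY_ENV` are not Hermes names); the env var names come from this report. It is also stricter than "prefer": it never falls back to another provider's key, which is an assumption about the desired policy.

```python
import os
from typing import Mapping, Optional

# Env vars to try per provider, in priority order. The variable names appear
# in this report; the table itself is a hypothetical structure.
PROVIDER_KEY_ENV = {
    "gemini": ("GEMINI_API_KEY", "GOOGLE_API_KEY"),
    "openai": ("OPENAI_API_KEY",),
}


def resolve_aux_api_key(
    provider: str,
    explicit_api_key: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the explicit key, else the first non-empty env var for provider."""
    explicit = (explicit_api_key or "").strip()
    if explicit:
        return explicit
    for name in PROVIDER_KEY_ENV.get(provider, ()):
        value = env.get(name, "").strip()
        if value:
            return value
    # Fail locally instead of sending a foreign key to the backend.
    raise LookupError(f"no API key configured for provider {provider!r}")
```

Passing `env` explicitly keeps the function testable; with both `OPENAI_API_KEY` and `GEMINI_API_KEY` set, a `gemini` lookup returns the Gemini key.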

Why both happen: non-OpenAI base_url → generic "custom" branch

resolve_vision_provider_client() (auxiliary_client.py:4082) treats any explicit base_url as a custom endpoint, and resolve_provider_client funnels it into the provider == "custom" branch (logged as Auxiliary vision: using custom (…)). That branch is OpenAI-shaped end-to-end: it defaults the model to OpenAI/main values and the key to OPENAI_API_KEY. When the endpoint is actually Gemini, both defaults are wrong.
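
As an illustration of the alternative, an explicit base_url could be classified by host before it is treated as a generic custom endpoint. The helper and table names below are hypothetical, and the host list is deliberately tiny.

```python
from urllib.parse import urlparse

# Hypothetical host-to-provider table; only hosts named in this report or
# otherwise well known are listed.
KNOWN_HOSTS = {
    "generativelanguage.googleapis.com": "gemini",
    "api.openai.com": "openai",
}


def provider_for_base_url(base_url: str) -> str:
    """Map a base_url to a known provider, or 'custom' when the host is unknown."""
    host = (urlparse(base_url).hostname or "").lower()
    return KNOWN_HOSTS.get(host, "custom")
```

Only endpoints that come back as `custom` would then take the OpenAI-shaped defaults; a Gemini base_url would be resolved with Gemini's model and key rules.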

Suggested fixes

Don't inject the main session model across providers. In resolve_provider_client (:3345-3346 and the custom branch :3514-3515), gate the _read_main_model() / main_runtime.model fallback so it only applies when the resolved endpoint belongs to the same provider as the main session. For a cross-provider aux base_url, use the target provider's default model or raise "auxiliary.<task>.model is required for this backend".
Provider-aware key resolution for known hosts. When base_url host matches a known provider (e.g. generativelanguage.googleapis.com → Gemini), resolve the key from that provider's env vars (GEMINI_API_KEY/GOOGLE_API_KEY) before OPENAI_API_KEY, instead of unconditionally treating an explicit base_url as an OpenAI-compatible "custom" endpoint.
Surface foreign-key failures clearly. A 500 from a downstream provider after a key/model substitution should be annotated ("sent main-model/OPENAI_API_KEY to <provider>") so it isn't mistaken for an upstream outage.
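
The third suggestion might look like the following sketch, which appends a note to a downstream error when a fallback substituted the model or key. `annotate_aux_error` and its flags are hypothetical; how Hermes would track the substitutions is not shown here.

```python
def annotate_aux_error(
    status: int,
    message: str,
    provider: str,
    model_substituted: bool,
    key_substituted: bool,
) -> str:
    """Format a downstream error, noting any fallback values that were sent."""
    sent = []
    if model_substituted:
        sent.append("main session model")
    if key_substituted:
        sent.append("OPENAI_API_KEY")
    base = f"{status} - {message}"
    if not sent:
        return base
    detail = " and ".join(sent)
    return f"{base} (note: sent {detail} to {provider}; check the auxiliary config)"
```

A 500 from Gemini would then read as a local configuration problem rather than an upstream outage.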

Related, lower priority

vision_analyze rejects video (tools/vision_tools.py): returns "Only real image files are supported for vision analysis." Gemini 2.5 Flash supports video, but there is no server-side frame-extraction (or Files-API upload) path, so video attachments can't be analyzed without an out-of-band workaround. Consider extracting a frame (or uploading via the provider's media API) before the vision call.
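
A possible out-of-band workaround is to extract a single frame with ffmpeg and pass that image to the vision call. This is a sketch under the assumption that an `ffmpeg` binary is on PATH; the function names are hypothetical and nothing here is part of `tools/vision_tools.py`.

```python
import subprocess
from typing import List


def first_frame_command(
    video_path: str, image_path: str, at_seconds: float = 0.0
) -> List[str]:
    """Build an ffmpeg command that writes one frame of the video to image_path."""
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-ss", str(at_seconds),  # seek to the requested timestamp
        "-i", video_path,
        "-frames:v", "1",        # stop after a single video frame
        image_path,
    ]


def extract_frame(video_path: str, image_path: str, at_seconds: float = 0.0) -> str:
    """Run ffmpeg (assumed to be installed) and return the image path."""
    subprocess.run(
        first_frame_command(video_path, image_path, at_seconds),
        check=True,
        capture_output=True,
    )
    return image_path
```

The resulting image file can then be sent through the normal image path; a single frame loses motion and audio, so this only helps for questions one still image can answer.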

Workaround (until fixed)

Pin both fields explicitly in auxiliary.vision so neither fallback fires:

```yaml
auxiliary:
  vision:
    provider: gemini
    model: gemini-2.5-flash
    base_url: https://generativelanguage.googleapis.com/v1beta
    api_key: AIza…            # inline AI-Studio key; do NOT leave empty
```

Verified: with both pinned, the outbound request carries {"model":"gemini-2.5-flash"} + Authorization: Bearer AIza… and returns 200 with a correct image description.
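
Until a fix lands, a small pre-flight check over the loaded config could flag the two empty fields before any request is sent. The sketch assumes the YAML has already been parsed into a dict; `check_aux_vision` is a hypothetical helper, not a Hermes function.

```python
from typing import Any, Dict, List


def check_aux_vision(config: Dict[str, Any]) -> List[str]:
    """Return warnings for an auxiliary.vision block that would hit either fallback."""
    vision = (config.get("auxiliary") or {}).get("vision") or {}
    problems: List[str] = []
    if not str(vision.get("base_url") or "").strip():
        # This report only covers the explicit base_url path.
        return problems
    for field in ("model", "api_key"):
        if not str(vision.get(field) or "").strip():
            problems.append(f"auxiliary.vision.{field} is empty; pin it explicitly")
    return problems
```

Run against the repro config, this reports the empty `api_key`; with both fields pinned it returns an empty list.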
