hermes: Allow video analysis to use a separate auxiliary backend from image vision

hermes · 2026-05-17 06:12:48

Fix Action

Fix / Workaround

Local workaround tested

A local patch in tools/vision_tools.py makes video_analyze_tool() read the auxiliary.video settings (provider, model, base_url, api_key, timeout) and pass them to the auxiliary client. With this patch, images stayed on the main model's native vision path while videos were routed to Gemini.

Code Example

agent:
  image_input_mode: native
auxiliary:
  vision:
    provider: auto
    model: ''
  video:
    provider: gemini
    model: gemini-3.1-flash-lite
    timeout: 180
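
A minimal sketch of the config-reading half of such a patch, assuming the config has already been loaded into a plain dict. The helper name video_overrides is hypothetical and is not taken from the Hermes codebase; how the values are handed to the auxiliary client is not shown in the issue and is left out here.

```python
from typing import Any

# Keys the workaround reads from auxiliary.video (as named in the issue).
VIDEO_KEYS = ("provider", "model", "base_url", "api_key", "timeout")


def video_overrides(config: dict[str, Any]) -> dict[str, Any]:
    """Collect the auxiliary.video settings that are actually set.

    Hypothetical helper: missing keys, None, and empty strings are skipped
    so the caller can fall back to its own defaults for them.
    """
    video = (config.get("auxiliary") or {}).get("video") or {}
    return {k: video[k] for k in VIDEO_KEYS if video.get(k) not in (None, "")}


# The same settings as the YAML example, written as a dict.
config = {
    "agent": {"image_input_mode": "native"},
    "auxiliary": {
        "vision": {"provider": "auto", "model": ""},
        "video": {
            "provider": "gemini",
            "model": "gemini-3.1-flash-lite",
            "timeout": 180,
        },
    },
}

overrides = video_overrides(config)
```

Here overrides holds only provider, model, and timeout, since base_url and api_key are not set in the example config.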

Feature description

Hermes currently routes image, screenshot, and video-like analysis through the same auxiliary.vision block. In practice, image and native video workloads often need different backends:

Images/screenshots may work best on the active main model's native vision path.
Native video analysis may require a provider/model that explicitly supports video input, such as Gemini video-capable models.

It would be useful for video_analyze to support a separate auxiliary.video config block while leaving auxiliary.vision and main-model native image routing untouched.

Motivation

Without a separate video backend, users must either route all vision tasks to a video-capable auxiliary model or risk image-capable-but-not-video-capable models receiving video inputs. This is especially awkward for profiles whose main model supports images but not native video.

Proposed solution

Support config such as:

agent:
  image_input_mode: native
auxiliary:
  vision:
    provider: auto
    model: ''
  video:
    provider: gemini
    model: gemini-3.1-flash-lite
    timeout: 180

Then video_analyze should prefer auxiliary.video when present, while image analysis continues to use auxiliary.vision / native image routing.
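
The preference rule could be sketched as below. This is an illustration under stated assumptions, not the proposed implementation: pick_aux_block is a hypothetical name, "present" is taken to mean that auxiliary.video names a provider, and the fallback to auxiliary.vision for video is assumed from the current behavior described in this issue.

```python
from typing import Any, Optional


def pick_aux_block(config: dict[str, Any], task: str) -> Optional[dict[str, Any]]:
    """Return the auxiliary config block for a task ("image" or "video").

    Video prefers auxiliary.video when it names a provider (assumption);
    otherwise it falls back to auxiliary.vision. Image tasks never consult
    auxiliary.video, so their routing is unchanged.
    """
    aux = config.get("auxiliary") or {}
    if task == "video":
        video = aux.get("video") or {}
        if video.get("provider"):
            return video
    return aux.get("vision")


example = {
    "auxiliary": {
        "vision": {"provider": "auto", "model": ""},
        "video": {"provider": "gemini", "model": "gemini-3.1-flash-lite", "timeout": 180},
    }
}

video_block = pick_aux_block(example, "video")  # the auxiliary.video block
image_block = pick_aux_block(example, "image")  # the auxiliary.vision block
```

Keeping the choice in one small function means existing configs without an auxiliary.video block behave as before.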

Environment

Hermes checkout commit: 9f182bd7b
OS: Ubuntu 24.04 / Linux
