hermes - Allow video analysis to use a separate auxiliary backend from image vision

Fix Action

Fix / Workaround

Local workaround tested

A local patch in tools/vision_tools.py makes video_analyze_tool() read auxiliary.video.provider, model, base_url, api_key, and timeout, and pass those values to the auxiliary client. With the patch applied, images stayed on the main model's native vision path while videos were routed to Gemini.
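
The patch itself is not included in the issue, so the following is only a minimal sketch of the idea, assuming the loaded config is a plain nested dict. The helper name video_client_overrides and the VIDEO_KEYS constant are hypothetical and are not taken from the Hermes codebase; how the real auxiliary client accepts these values is not shown here.

```python
from typing import Any, Mapping

# Keys the issue says the local patch reads from auxiliary.video.
VIDEO_KEYS = ("provider", "model", "base_url", "api_key", "timeout")


def video_client_overrides(config: Mapping[str, Any]) -> dict[str, Any]:
    """Collect the auxiliary.video settings that are actually set (hypothetical helper)."""
    video = (config.get("auxiliary") or {}).get("video") or {}
    # Skip missing, None, and empty-string values so unset keys can fall back
    # to whatever defaults the auxiliary client applies (an assumption).
    return {k: video[k] for k in VIDEO_KEYS if video.get(k) not in (None, "")}


# Config dict mirroring the YAML from the issue.
config = {
    "agent": {"image_input_mode": "native"},
    "auxiliary": {
        "vision": {"provider": "auto", "model": ""},
        "video": {"provider": "gemini", "model": "gemini-3.1-flash-lite", "timeout": 180},
    },
}

overrides = video_client_overrides(config)
print(overrides)
```

Only keys that are explicitly set are returned, so base_url and api_key stay out of the result unless the user configures them.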

Code Example

agent:
  image_input_mode: native
auxiliary:
  vision:
    provider: auto
    model: ''
  video:
    provider: gemini
    model: gemini-3.1-flash-lite
    timeout: 180

Feature description

Hermes currently routes image, screenshot, and video-like analysis through a single auxiliary.vision block. In practice, image and native video workloads often need different backends:

  • Images/screenshots may work best on the active main model's native vision path.
  • Native video analysis may require a provider/model that explicitly supports video input, such as Gemini video-capable models.

It would be useful for video_analyze to support a separate auxiliary.video config block while leaving auxiliary.vision and main-model native image routing untouched.

Motivation

Without a separate video backend, users must either route all vision tasks to a video-capable auxiliary model or risk image-capable-but-not-video-capable models receiving video inputs. This is especially awkward for profiles whose main model supports images but not native video.

Proposed solution

Support config such as:

agent:
  image_input_mode: native
auxiliary:
  vision:
    provider: auto
    model: ''
  video:
    provider: gemini
    model: gemini-3.1-flash-lite
    timeout: 180

Then video_analyze should prefer auxiliary.video when present, while image analysis continues to use auxiliary.vision / native image routing.
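
One way the proposed preference could look is sketched below. It is not the Hermes implementation: select_aux_block is a hypothetical helper, the config is assumed to be a plain nested dict, and "present" is interpreted here as "auxiliary.video has a non-empty provider", which is an assumption the issue does not spell out.

```python
from typing import Any, Mapping


def select_aux_block(config: Mapping[str, Any], task: str) -> tuple[str, dict[str, Any]]:
    """Return (block_name, settings) for task "image" or "video" (hypothetical helper)."""
    aux = config.get("auxiliary") or {}
    video = aux.get("video") or {}
    # Video tasks prefer auxiliary.video when it names a provider (assumed rule).
    if task == "video" and video.get("provider"):
        return "video", dict(video)
    # Everything else, including video without a video block, uses auxiliary.vision.
    return "vision", dict(aux.get("vision") or {})


# Config dict mirroring the proposed YAML.
config = {
    "auxiliary": {
        "vision": {"provider": "auto", "model": ""},
        "video": {"provider": "gemini", "model": "gemini-3.1-flash-lite", "timeout": 180},
    }
}

print(select_aux_block(config, "video"))
print(select_aux_block(config, "image"))
```

Falling back to auxiliary.vision when no video block is configured keeps existing setups working unchanged, which matches the request to leave image routing untouched.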

Environment

  • Hermes checkout commit: 9f182bd7b
  • OS: Ubuntu 24.04 / Linux
