hermes - [Bug]: native image generation (image_generate) doesn't accept images as input

Hermes currently lacks consistent end-to-end support for reference_images in image_generate across its current backends. In practice, images can be generated only from text prompts; reference images cannot be supplied as input.

At a high level, Hermes needs to support true image-conditioned editing across its image-generation backends. In current upstream main, that support is incomplete: the tool/backend pipeline does not consistently normalize, forward, and route reference images all the way to the provider-specific edit path.

As a result, a request that looks like an image edit at the agent/tool layer can degrade into prompt-only generation, or fail to use the backend's required edit-specific request shape.

Expected behavior

If reference_images is present, Hermes should do exactly one of these:

  1. Supported backend/model: normalize and forward the references, and use the correct edit endpoint/model variant if required.
  2. Unsupported backend/model: fail clearly with a structured error.

It should not silently degrade to text-only generation.
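
The either/or rule above can be sketched as a small gate in front of the backend call. This is only an illustration under stated assumptions, not Hermes code: `Backend`, `supports_reference_images`, `generate`, the `image_generate` signature shown here, and the shape of the error dictionary are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Backend:
    """Hypothetical backend descriptor; not an actual Hermes class."""
    name: str
    supports_reference_images: bool
    # Hypothetical callable: (prompt, reference_images) -> result dict.
    generate: Callable[[str, Sequence[str]], dict]


def image_generate(backend: Backend, prompt: str,
                   reference_images: Optional[Sequence[str]] = None) -> dict:
    """Forward references when supported, otherwise return a structured error."""
    refs = list(reference_images or [])
    if refs and not backend.supports_reference_images:
        # Fail loudly instead of dropping the references and
        # falling back to prompt-only generation.
        return {
            "ok": False,
            "error": {
                "code": "reference_images_unsupported",  # assumed error code
                "backend": backend.name,
                "message": f"Backend {backend.name!r} does not accept reference images.",
            },
        }
    return {"ok": True, "result": backend.generate(prompt, refs)}
```

The point of the gate is that there is no third branch: a request with references either reaches the backend with those references or is rejected before any generation happens.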

Why this is a bug

The missing contract is end-to-end backend handling for reference-image edits:

  • the tool layer needs a stable reference_images contract
  • plugin-backed providers need explicit passthrough
  • native backends need provider-specific payload shaping
  • models that require edit endpoints need edit-path routing when refs are present

Without that, Hermes cannot reliably perform identity-preserving or composition-preserving edits from reference inputs.
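
To make "provider-specific payload shaping" concrete, here is a minimal sketch in which one tool-level request is turned into two different provider payloads. The provider names, the `image_urls` and `input_image` field names, and the builder registry are assumptions made up for illustration; real providers define their own request shapes.

```python
from typing import Callable, Dict, List


def build_payload_multi(prompt: str, refs: List[str]) -> dict:
    """Hypothetical provider that accepts a list of references."""
    payload: dict = {"prompt": prompt}
    if refs:
        payload["image_urls"] = list(refs)  # assumed field name
    return payload


def build_payload_single(prompt: str, refs: List[str]) -> dict:
    """Hypothetical provider that accepts at most one reference."""
    if len(refs) > 1:
        raise ValueError("this provider accepts at most one reference image")
    payload: dict = {"prompt": prompt}
    if refs:
        payload["input_image"] = refs[0]  # assumed field name
    return payload


# Hypothetical registry mapping provider ids to payload builders.
PAYLOAD_BUILDERS: Dict[str, Callable[[str, List[str]], dict]] = {
    "provider_multi": build_payload_multi,
    "provider_single": build_payload_single,
}


def shape_payload(provider: str, prompt: str, refs: List[str]) -> dict:
    """Look up the provider's builder and produce its request payload."""
    try:
        builder = PAYLOAD_BUILDERS[provider]
    except KeyError:
        raise ValueError(f"no payload builder registered for {provider!r}") from None
    return builder(prompt, refs)
```

Keeping the builders separate means the tool layer can hold one stable reference_images contract while each backend decides how, or whether, references fit into its request.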

Repro shape

Call image_generate with:

  • a text prompt describing an edit
  • reference_images: [base_image, edit_reference]

Expected:

  • the backend receives actual image refs
  • Hermes uses the edit-capable request path when required

Observed on upstream/main during code inspection:

  • image-gen dispatch paths do not consistently carry reference images end-to-end
  • backend-specific edit routing/payload shaping is incomplete
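
One way to turn this repro shape into a check is a recording test double that captures what the dispatch layer actually hands to the backend. Everything below is hypothetical scaffolding: `RecordingBackend` and the two dispatch functions are stand-ins that model the reported and the expected behavior, not the real Hermes code paths.

```python
from typing import List, Optional, Sequence


class RecordingBackend:
    """Test double that records every call it receives."""

    def __init__(self) -> None:
        self.calls: List[dict] = []

    def generate(self, prompt: str,
                 reference_images: Optional[Sequence[str]] = None) -> dict:
        self.calls.append({
            "prompt": prompt,
            "reference_images": list(reference_images or []),
        })
        return {"image": "stub"}


def lossy_dispatch(backend, prompt, reference_images=None):
    """Models the reported failure mode: references are dropped in transit."""
    return backend.generate(prompt)


def forwarding_dispatch(backend, prompt, reference_images=None):
    """Models the expected behavior: references reach the backend."""
    return backend.generate(prompt, reference_images)


def refs_reached_backend(dispatch, refs: Sequence[str]) -> bool:
    """Return True if the backend saw exactly the references that were passed."""
    backend = RecordingBackend()
    dispatch(backend, "replace the background with a beach", refs)
    return backend.calls[-1]["reference_images"] == list(refs)
```

In an actual regression test, the stand-in dispatch functions would be replaced by the real tool entry point, with the recording double registered as the backend.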

Likely fix areas

  • tools/image_generation_tool.py
  • plugin provider passthrough for image-capable backends
  • native image backend payload builders
  • edit-endpoint routing when refs are present
  • tests for passthrough + unsupported-model failure behavior
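
Edit-endpoint routing, one of the areas listed above, could look roughly like the following. This is a sketch under assumptions: the model ids and endpoint paths in `MODEL_ROUTES` are placeholders rather than real provider values, and `Route`, `select_endpoint`, and `UnsupportedReferenceImages` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Route:
    """Hypothetical per-model routing entry."""
    generate: str
    edit: Optional[str]  # None means the model has no edit-capable path


# Placeholder models and paths, for illustration only.
MODEL_ROUTES: Dict[str, Route] = {
    "example-model-a": Route(generate="/v1/generate", edit="/v1/edit"),
    "example-model-b": Route(generate="/v1/generate", edit=None),
}


class UnsupportedReferenceImages(Exception):
    """Raised when references are supplied to a model without an edit path."""


def select_endpoint(model: str, has_refs: bool) -> str:
    """Pick the edit endpoint when references are present, else the generate one."""
    route = MODEL_ROUTES[model]
    if not has_refs:
        return route.generate
    if route.edit is None:
        raise UnsupportedReferenceImages(
            f"model {model!r} has no edit endpoint for reference images"
        )
    return route.edit
```

A model that accepts references on its ordinary generation endpoint could be represented by setting `edit` to the same path as `generate`, so the routing decision stays in one table.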

Suggested acceptance criteria

  • reference_images is forwarded end-to-end for every backend that claims to support image-conditioned editing
  • unsupported backends return an explicit error
  • tests cover URL, data URL, and local file inputs
  • backends that require edit endpoints switch to them automatically when refs are present
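
The three input kinds named in the criteria (URL, data URL, local file) suggest a single normalization step before dispatch. The sketch below is one possible shape, not the project's implementation: `normalize_reference` and its `{"kind", "value"}` result are hypothetical, and converting local files to base64 data URLs is an assumed design choice.

```python
import base64
import mimetypes
from pathlib import Path
from urllib.parse import urlparse


def normalize_reference(ref: str) -> dict:
    """Classify a reference as a URL or data URL; inline local files as data URLs."""
    if ref.startswith("data:"):
        header, sep, _ = ref.partition(",")
        if not sep or not header.startswith("data:image/"):
            raise ValueError("data URL must carry an image/* payload")
        return {"kind": "data_url", "value": ref}

    if urlparse(ref).scheme in ("http", "https"):
        return {"kind": "url", "value": ref}

    # Anything else is treated as a local path (assumption of this sketch).
    path = Path(ref).expanduser()
    if not path.is_file():
        raise ValueError(f"reference image not found: {ref}")
    mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return {"kind": "data_url", "value": f"data:{mime};base64,{encoded}"}
```

Tests for the acceptance criteria could then exercise each branch once: a remote URL passed through unchanged, a data URL validated and passed through, and a temporary file read and inlined.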
