hermes - 💡(How to fix) Fix Pre-resize inbound images in gateway before they enter context window

hermes 2026-05-17 15:24:24

Error Message

The existing auto-resize logic in tools/vision_tools.py (_resize_image_for_vision) only triggers reactively after an API call fails with a size error. By that point, the oversized image has already been encoded into the conversation context.

Fix Action

Fix / Workaround

Local Patch

I have a working local patch (adds _downscale_image_bytes() helper called from cache_image_from_bytes()) that I'm happy to submit as a PR if this approach looks right.

Problem

When users send screenshots (especially retina/HiDPI) via Telegram or other gateway platforms, the full-resolution image bytes are cached as-is and then base64-encoded into the context window. A typical CleanShot @2x screenshot is 3-5 MB and 3000+ pixels wide, which is far more resolution than a vision model needs and costs a large number of tokens.
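To put a rough number on the encoding overhead alone: standard base64 emits 4 characters for every 3 input bytes, so the payload grows by about a third before any model-side accounting. The sketch below only computes encoded size; it does not estimate tokens, since that depends on the provider. The helper name base64_len is hypothetical and used purely for illustration.

```python
import base64
import math


def base64_len(n_bytes: int) -> int:
    """Length of standard (padded) base64 output for n_bytes of input."""
    return 4 * math.ceil(n_bytes / 3)


# Sanity check against the standard library on a small payload.
sample = b"\x00" * 1000
assert base64_len(len(sample)) == len(base64.b64encode(sample))

# Encoded size for the 3-5 MB screenshots described above.
for mb in (3, 5):
    raw = mb * 1024 * 1024
    encoded_mb = base64_len(raw) / (1024 * 1024)
    print(f"{mb} MB raw -> {encoded_mb:.2f} MB of base64 text")
```

A 3-5 MB screenshot therefore becomes roughly 4 to 6.7 MB of text before it reaches the API.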

Proposed Solution

Downscale images proactively in cache_image_from_bytes() in gateway/platforms/base.py before they're saved to the cache. This is the single funnel point where all platform adapters (Telegram, Discord, Signal, Matrix, etc.) pass through, so one change covers all platforms.

Suggested behavior:

If either dimension exceeds a configurable threshold (default: 1600px), downscale proportionally using Lanczos resampling
Write JPEG output (quality 85) for opaque images and PNG for images with alpha
Treat Pillow as a soft dependency: if it is not installed, pass images through unchanged
Add a config key like gateway.max_image_dimension: 1600 so users can tune the threshold

Impact

Significantly reduces token consumption for image-heavy workflows (recipe screenshots, UI reviews, photo logging)
Eliminates the reactive resize-on-failure path for most cases
No loss of useful information — vision models don't benefit from >1600px resolution
