hermes - feat(vision): add agent.max_vision_images_in_context config to limit base64 image accumulation [1 comment, 1 participant]

bluuewhale · 2026-05-19T04:15:12Z


NousResearch/hermes-agent#28446 • Fetched 2026-05-20 04:03:42

Root Cause

The native vision fast path stores images as `{"type": "image_url", "image_url": {"url": "data:image/..."}}` in the `tool` role message. These blobs remain in the context history on every subsequent turn. The context compressor (`_prune_old_tool_results`) only strips them when `compression.enabled: true` is set and the context exceeds the compression threshold.

Problem

When using `vision_analyze` (native vision fast path) in a multi-image session, base64 image payloads accumulate in the conversation context indefinitely. Each image can be 1–10 MB of base64, and 7+ images in a single session easily reach **14M chars (~4M tokens)**, which consumes the entire context window.
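As a rough check on those figures, the sketch below estimates how many characters a set of images adds once base64-encoded. The chars-per-token ratio is simply the one implied by the numbers in this issue (14.5M chars for about 4M tokens), not a tokenizer measurement, and the function names are illustrative only.

```python
def base64_length(raw_bytes: int) -> int:
    """Length of the padded base64 encoding of `raw_bytes` bytes (4 chars per 3 bytes)."""
    return 4 * ((raw_bytes + 2) // 3)


# Assumption: ratio implied by the issue's figures (14.5M chars ~ 4M tokens).
CHARS_PER_TOKEN = 14_500_000 / 4_000_000


def estimated_tokens(chars: int) -> int:
    """Very rough token estimate for `chars` characters of base64 text."""
    return round(chars / CHARS_PER_TOKEN)


def session_cost(image_sizes_bytes):
    """Return (base64 chars, estimated tokens) for a list of raw image sizes."""
    chars = sum(base64_length(size) for size in image_sizes_bytes)
    return chars, estimated_tokens(chars)


# Seven screenshots of 1.5 MB each (a hypothetical size within the range above).
chars, tokens = session_cost([1_500_000] * 7)
print(f"{chars:,} chars, roughly {tokens:,} tokens")
```

With seven 1.5 MB screenshots this lands at 14,000,000 characters, the same order of magnitude as the observed session.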

Observed session

- 7 screenshots in one session: 14.5M chars / ~4M tokens
- Image ratio: ~100% of context
- `compression.enabled: false` was the default → zero pruning
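A ratio like the one above could be measured with a sketch along these lines. It assumes messages follow the shape described under Root Cause (a `content` list containing `image_url` parts) and that text lives either in a plain string or in `text` parts; the helper names are hypothetical and not part of hermes.

```python
def image_chars(message) -> int:
    """Characters held in image_url parts of one message."""
    content = message.get("content")
    if not isinstance(content, list):
        return 0
    total = 0
    for part in content:
        if isinstance(part, dict) and part.get("type") == "image_url":
            total += len(part.get("image_url", {}).get("url", ""))
    return total


def text_chars(message) -> int:
    """Characters held in plain-string content or in text parts."""
    content = message.get("content")
    if isinstance(content, str):
        return len(content)
    if isinstance(content, list):
        return sum(
            len(part.get("text", ""))
            for part in content
            if isinstance(part, dict) and part.get("type") == "text"
        )
    return 0


def image_ratio(messages) -> float:
    """Share of context characters that are image payloads (0.0 for an empty context)."""
    images = sum(image_chars(m) for m in messages)
    total = images + sum(text_chars(m) for m in messages)
    return images / total if total else 0.0
```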

Proposed Solution

Add a config knob: `agent.max_vision_images_in_context: N` (default 3–5).

When the agent loop appends a new native vision tool result, proactively replace older image payloads beyond the N most recent with `[screenshot removed — kept N most recent to save context]`. This should happen **before** the API call, not after context overflow.

Implementation hint: after appending a new `_multimodal` tool result to `messages`, scan backward for `image_url` type parts and strip those beyond position N, similar to `_strip_image_parts_from_parts`.
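The backward scan described above might look roughly like the following. This is a minimal sketch under stated assumptions, not the hermes implementation: `prune_old_images` and `PLACEHOLDER` are hypothetical names, the message shape is taken from the Root Cause description, and the existing `_strip_image_parts_from_parts` helper is not used because its signature is not shown in this issue. The placeholder wording is adapted from the proposal.

```python
import copy

# Assumption: placeholder text adapted from the wording proposed in the issue.
PLACEHOLDER = "[screenshot removed - kept {n} most recent to save context]"


def prune_old_images(messages, max_images: int):
    """Return a copy of `messages` in which only the newest `max_images`
    image_url parts survive; older ones become short text placeholders."""
    pruned = copy.deepcopy(messages)
    kept = 0
    # Walk newest-to-oldest so the most recent images are the ones kept.
    for message in reversed(pruned):
        content = message.get("content")
        if not isinstance(content, list):
            continue
        for index in range(len(content) - 1, -1, -1):
            part = content[index]
            if not (isinstance(part, dict) and part.get("type") == "image_url"):
                continue
            if kept < max_images:
                kept += 1
            else:
                content[index] = {
                    "type": "text",
                    "text": PLACEHOLDER.format(n=max_images),
                }
    return pruned
```

Calling this right after a new tool result is appended, and before the request is built, keeps the payload bounded regardless of whether compression is enabled. Returning a copy rather than mutating in place is a design choice for the sketch; an in-place variant would avoid the copy.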

Workarounds (current)

1. `compression.enabled: true` + `threshold: 0.30`: strips old images at 30% context usage
2. `agent.image_input_mode: text`: converts images to text descriptions upfront (quality tradeoff)
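To illustrate why the first workaround helps and why the default setting prunes nothing, here is a sketch of the trigger condition as the issue describes it. The function name and the `window_tokens` parameter are assumptions; the actual compressor logic is not shown in this issue.

```python
def compression_would_trigger(
    context_tokens: int,
    window_tokens: int,
    compression_enabled: bool,
    threshold: float,
) -> bool:
    """True when old tool results (including images) would be pruned,
    per the behaviour described in this issue."""
    if not compression_enabled:
        # With compression.enabled: false, nothing is ever pruned.
        return False
    return context_tokens > threshold * window_tokens


# Hypothetical 200k-token window with threshold 0.30: pruning starts above 60k tokens.
print(compression_would_trigger(70_000, 200_000, True, 0.30))
print(compression_would_trigger(70_000, 200_000, False, 0.30))
```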

Impact

- Sessions with repeated screenshots (UI QA, slide review) hit the context limit silently
- No user-visible warning until the API returns 400/413
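One way to surface the problem earlier would be a size check before each API call. The sketch below is an assumption-laden illustration: `check_context_budget` and `max_chars` are hypothetical, and the serialized JSON length is used only as a rough proxy for payload size.

```python
import json
import warnings


def check_context_budget(messages, max_chars: int) -> int:
    """Warn when the serialized conversation exceeds `max_chars`.

    Returns the measured size so callers can log it or act on it.
    """
    size = len(json.dumps(messages))
    if size > max_chars:
        warnings.warn(
            f"conversation payload is {size:,} chars, above the {max_chars:,} budget; "
            "older image payloads may need pruning",
            stacklevel=2,
        )
    return size
```

A warning at this point gives the user a chance to react before the provider rejects the request.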
