hermes - [Bug]: LCM/compression issues with native image replay + auxiliary fallback to depleted OpenRouter



Bug Description

I hit a compression/context failure in a long-running Discord gateway session using Hermes + hermes-lcm.

The issue appeared after sending screenshots/images in Discord. Hermes routed the image natively, and later replay/history contained an invalid persisted image payload:

HTTP 400: Invalid 'input[56].content[1].image_url'. Expected a valid URL, but got a value with an invalid format.
Context: 191 msgs, ~170,664 tokens

This then triggered model fallback and compression:

Non-retryable error (HTTP 400) — trying fallback...
Primary model failed — switching to fallback: claude-sonnet-4.6 via copilot
Context too large (~170,664 tokens) — compressing (1/3)...
Compressed 190 → 33 messages, retrying...

I think there are two related problems:

  1. Native image payloads can get persisted/replayed in session history.

data:image/...base64 or otherwise invalid image payloads appear to be replayed into future model calls. This can massively inflate context and cause preflight compression / forced overflow recovery.

  2. Auxiliary fallback can route to OpenRouter unexpectedly.

My active provider was openai-codex / gpt-5.5. MiniMax was also available as a direct fallback. But auxiliary compression/session-search still fell back to OpenRouter because old OpenRouter credentials existed. OpenRouter was depleted, causing repeated 402 errors:

Error code: 402
This request requires more credits, or fewer max_tokens.
You requested up to 10000 tokens, but can only afford 44.

After local mitigation, things stabilized:

agent:
  image_input_mode: text
auxiliary:
  compression:
    provider: openai-codex
    model: gpt-5.5
    base_url: https://chatgpt.com/backend-api/codex
    timeout: 120

I also changed auxiliary fallback ordering locally so direct/local providers are tried before OpenRouter:

local/custom → api-key → nous → openrouter

With that, fallback resolved to MiniMax instead of OpenRouter:

api-key => MiniMax-M2.7 https://api.minimax.io/anthropic

Steps to Reproduce

  1. Run Hermes gateway with context.engine: lcm in a long-running Discord session.
  2. Use a vision-capable model/provider so image routing defaults to native multimodal input.
  3. Send one or more screenshots/images in Discord.
  4. Allow the session to grow large enough to trigger preflight compression or restart/auto-continue.
  5. Observe replay/compression behavior.

In my case, the problematic sequence was:

  • Discord image was routed natively.
  • Later replay contained an invalid image_url payload.
  • The main model returned HTTP 400 for the replayed image field.
  • Hermes switched fallback providers.
  • Context was still large, so compression ran.
  • Auxiliary fallback attempted OpenRouter and hit 402 because the OpenRouter account was depleted.

Expected Behavior

Native image payloads should not be persisted/replayed into future model calls in a way that can break request formatting or massively inflate context.

Expected behavior:

  1. Raw data:image/...base64 payloads should be stripped, externalized, or replaced with compact references before session replay.
  2. Gateway image handling should avoid replaying invalid native image blocks.
  3. LCM compression should operate on text/history summaries and compact image metadata, not huge base64 payloads.
  4. Auxiliary fallback should respect active profile intent and prefer configured/direct providers before stale/depleted OpenRouter credentials.
  5. If OpenRouter returns a 402 credit error, it should be marked unhealthy for auxiliary fallback instead of being retried repeatedly.

Actual Behavior

Native image payload/replay appears to have produced an invalid image URL in model input:

HTTP 400: Invalid 'input[56].content[1].image_url'. Expected a valid URL, but got a value with an invalid format.

The session then hit high context pressure:

Context: 191 msgs, ~170,664 tokens
Context too large (~170,664 tokens) — compressing (1/3)...
Compressed 190 → 33 messages, retrying...

Auxiliary fallback also routed to OpenRouter despite the active provider being Codex/GPT-5.5 and MiniMax being available. Since OpenRouter was depleted, this created repeated 402 errors:

Error code: 402
This request requires more credits, or fewer max_tokens.
You requested up to 10000 tokens, but can only afford 44.

LCM eventually recovered via forced overflow recovery, but the behavior was noisy and confusing from the user side.

Affected Component

Gateway (Telegram/Discord/Slack/WhatsApp)

Messaging Platform (if gateway-related)

Discord

Debug Report

Report       https://paste.rs/UUJwX
agent.log    https://dpaste.com/EUPELZW36
gateway.log  https://dpaste.com/DQDAHRR7A

Operating System

macOS / Darwin 24.6.0 arm64

Python Version

Python 3.11.14

Hermes Version

0.13.0 (2026.5.7) [7d66d30d]

Additional Logs / Traceback (optional)

The debug report links above include the relevant logs.

Key visible errors:

HTTP 400: Invalid 'input[56].content[1].image_url'. Expected a valid URL, but got a value with an invalid format.

Context too large (~170,664 tokens) — compressing (1/3)...
Compressed 190 → 33 messages, retrying...

Error code: 402
This request requires more credits, or fewer max_tokens.
You requested up to 10000 tokens, but can only afford 44.

Root Cause Analysis (optional)

My read is that there are two interacting root causes:

  1. Native image replay/persistence

When Discord images are routed as native multimodal input, structured content can include image blocks such as:

{
  "type": "image_url",
  "image_url": {
    "url": "data:image/png;base64,..."
  }
}

If these are persisted and replayed as normal session history, they can either:

  • produce invalid image URL errors on later model calls
  • massively inflate token/context estimates
  • trigger repeated preflight compression or forced overflow recovery

A local mitigation was to strip nested image_url.url values beginning with data:image/ before writing/replaying structured content, replacing them with a compact marker.
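A minimal sketch of that kind of stripping, not the actual Hermes or hermes-lcm code. The helper name `strip_inline_images` and the marker format are hypothetical. It assumes structured content is a nested list/dict in the shape shown above, and it swaps the whole image block for a text block, on the assumption that a marker left inside `image_url.url` would still fail URL validation.

```python
from typing import Any

DATA_IMAGE_PREFIX = "data:image/"


def strip_inline_images(content: Any) -> Any:
    """Return a copy of `content` with inline base64 image blocks replaced by a short text marker."""
    if isinstance(content, list):
        return [strip_inline_images(item) for item in content]
    if isinstance(content, dict):
        cleaned = {key: strip_inline_images(value) for key, value in content.items()}
        image_url = cleaned.get("image_url")
        if isinstance(image_url, dict):
            url = image_url.get("url")
            if isinstance(url, str) and url.startswith(DATA_IMAGE_PREFIX):
                # Keep only the MIME type and the original length, never the payload.
                mime = url[len("data:"):].split(";", 1)[0].split(",", 1)[0]
                return {"type": "text", "text": f"[image omitted: {mime}, {len(url)} chars]"}
        return cleaned
    return content
```

Blocks whose `url` is a regular `https://` link pass through unchanged, and the input is not mutated, so the same function could in principle run both before persistence and before replay.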

  2. Auxiliary fallback ordering

Auxiliary fallback appeared to prefer OpenRouter before direct/local providers. Since I had stale/depleted OpenRouter credentials, fallback routed compression/session-search to OpenRouter even though the active provider was Codex/GPT-5.5 and MiniMax was available.

A local mitigation was changing fallback order from effectively:

openrouter → nous → local/custom → api-key

to:

local/custom → api-key → nous → openrouter

After that, fallback resolved to MiniMax instead of OpenRouter:

api-key => MiniMax-M2.7 https://api.minimax.io/anthropic
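The reordering can be pictured as a stable sort over provider tiers. This is an illustrative sketch only: `Candidate`, `TIER_PRIORITY`, and `order_fallbacks` are hypothetical names, and the real Hermes resolver may be structured quite differently.

```python
from dataclasses import dataclass

# Lower number = tried earlier. Tier names mirror the order described in the report.
TIER_PRIORITY = {"local/custom": 0, "api-key": 1, "nous": 2, "openrouter": 3}


@dataclass(frozen=True)
class Candidate:
    tier: str   # one of the TIER_PRIORITY keys
    model: str


def order_fallbacks(candidates: list[Candidate]) -> list[Candidate]:
    """Sort candidates by tier; unknown tiers go last; ties keep their input order."""
    return sorted(candidates, key=lambda c: TIER_PRIORITY.get(c.tier, len(TIER_PRIORITY)))
```

With an OpenRouter candidate and a MiniMax `api-key` candidate both present, the `api-key` entry sorts first, which matches the behavior observed after the local change.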

Proposed Fix (optional)

Suggested fixes:

  1. Do not persist/replay raw native image payloads.

Strip or externalize data:image/...base64 before session persistence/replay. Store a compact marker, local file path, attachment ID, content hash, MIME type, or similar metadata instead.

  2. Consider safer gateway defaults/docs.

For long-running Discord/Telegram sessions with LCM enabled, agent.image_input_mode: text may be safer than native image routing because it pre-analyzes images and passes text summaries into the main model.

  3. Revisit auxiliary fallback ordering.

Prefer:

active provider/model
→ explicit auxiliary provider
→ local/custom
→ direct API-key providers
→ nous
→ openrouter

OpenRouter should probably be a last resort if it is not the active provider for the profile.

  4. Mark depleted OpenRouter temporarily unhealthy.

If OpenRouter returns 402 / “can only afford N tokens,” auxiliary fallback should skip it for a while instead of repeatedly retrying it.
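One way to express "skip it for a while" is a small cooldown tracker. The sketch below is an assumption about how this could look, not existing Hermes behavior. The class name `ProviderHealth`, the 10-minute default, and the choice to key only on HTTP status 402 are all hypothetical.

```python
import time


class ProviderHealth:
    """Remembers recent credit errors so a depleted provider is skipped for a while."""

    def __init__(self, cooldown_seconds: float = 600.0, clock=time.monotonic):
        self._cooldown = cooldown_seconds
        self._clock = clock  # injectable so the cooldown can be tested without sleeping
        self._blocked_until: dict[str, float] = {}

    def record_error(self, provider: str, status_code: int) -> None:
        # Only HTTP 402 (Payment Required) is treated as a credit problem here.
        if status_code == 402:
            self._blocked_until[provider] = self._clock() + self._cooldown

    def is_healthy(self, provider: str) -> bool:
        return self._clock() >= self._blocked_until.get(provider, float("-inf"))
```

A fallback loop would call `is_healthy("openrouter")` before each attempt and `record_error("openrouter", 402)` after a credit failure, so later auxiliary tasks move straight to the next candidate until the cooldown expires.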

  5. Improve fallback diagnostics.

It would help if logs showed the actual auxiliary fallback chain for each task, for example:

Auxiliary fallback chain for task=session_search:
primary=openai-codex/gpt-5.5
fallbacks=[api-key:minimax/MiniMax-M2.7, nous, openrouter/google/gemini-3-flash-preview]
skipping openrouter: recent 402 credit error
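A log message in that shape could be assembled with a small formatter like the following. `describe_chain` and its parameters are hypothetical, intended only to show that the needed inputs (task name, primary, ordered fallbacks, skip reasons) are few.

```python
def describe_chain(task: str, primary: str, fallbacks: list[str], skipped: dict[str, str]) -> str:
    """Build a multi-line log message describing the auxiliary fallback chain for one task."""
    lines = [
        f"Auxiliary fallback chain for task={task}:",
        f"primary={primary}",
        f"fallbacks=[{', '.join(fallbacks)}]",
    ]
    # One line per skipped provider, in insertion order.
    lines += [f"skipping {name}: {reason}" for name, reason in skipped.items()]
    return "\n".join(lines)
```

The result is a plain string, so it can be passed to whatever logger the gateway already uses.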

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR
