hermes - 💡(How to fix) Fix [Bug]: Regression in #16506: Anthropic 400 invalid_request_error on Discord image attachments — media_type mismatched with actual bytes [2 pull requests]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Error Message

The 400 isn't visible in normal logs — only HERMES_DUMP_REQUESTS=1 surfaces the underlying error. After capturing the outgoing request body and replaying ⚠️ Non-retryable error (HTTP 400) — trying fallback...

Additional Logs / Traceback (optional)

Root Cause

Root Cause Analysis (optional)

Fix Action

Fixed

Code Example

Report       https://paste.rs/vRqfP
  agent.log    https://paste.rs/ckpFE
  gateway.log  https://paste.rs/hR3Xy

---

2026-05-06 20:23:41,282 INFO gateway.run: inbound message: platform=discord user=... msg='can you see images?'
  2026-05-06 20:23:41,858 INFO gateway.run: Image routing: native (model supports vision). 1 image(s) will be attached inline.
  2026-05-06 20:23:45,744 INFO [...] root: Fallback activated: claude-haiku-4-5-20251001 → anthropic/claude-sonnet-4 (openrouter)
RAW_BUFFERClick to expand / collapse

Bug Description

Since v0.12.0, every Discord image attachment forwarded to a vision-capable Anthropic model is rejected by Anthropic with HTTP 400. Hermes labels the image part with the Content-Type Discord reports for the attachment, but Discord sometimes reports a MIME (e.g. image/webp) that doesn't match the actual file bytes (e.g. PNG). Anthropic now validates media_type against the file's magic bytes and rejects mismatches.

Steps to Reproduce

  1. Configure Hermes with a native Anthropic provider (raw sk-ant-api03-… key) on a vision model — e.g. claude-haiku-4-5-20251001.
  2. Send a message + image attachment to the bot via Discord.
  3. The Anthropic call 400s immediately; Hermes falls back to the configured fallback provider.

Expected Behavior

Hermes should send the Anthropic image part with media_type matching the actual image bytes, so the request succeeds (HTTP 200) on the configured native Anthropic provider instead of silently failing over to a fallback provider.

Concretely: when a Discord attachment's reported Content-Type disagrees with the file's magic bytes, Hermes should trust the bytes and set source.media_type accordingly — Anthropic accepts the request, the call is billed against the user's prepaid Anthropic balance, and no fallback is needed.

Actual Behavior

The Anthropic call failed with HTTP 400 and Hermes silently rolled over to the configured fallback provider (OpenRouter, anthropic/claude-sonnet-4). The bot still answered, so from the Discord side it looked successful, but every image request was bypassing the native Anthropic provider.

The 400 isn't visible in normal logs — only HERMES_DUMP_REQUESTS=1 surfaces the underlying error. After capturing the outgoing request body and replaying it directly against https://api.anthropic.com/v1/messages, Anthropic returned:

HTTP 400 invalid_request_error messages.0.content.1.image.source.base64: The image was specified using the image/webp media type, but the image appears to be a image/png image

The same body returns HTTP 200 OK once source.media_type is corrected to match the actual image bytes (image/png in this reproduction). This was confirmed by sending the corrected payload back through curl against the same endpoint with the same API key.

User-visible status messages emitted by Hermes during the failure:

⚠️ Non-retryable error (HTTP 400) — trying fallback... 🔄 Primary model failed — switching to fallback: anthropic/claude-sonnet-4 via openrouter

Affected Component

Gateway (Telegram/Discord/Slack/WhatsApp)

Messaging Platform (if gateway-related)

Discord

Debug Report

Report       https://paste.rs/vRqfP
  agent.log    https://paste.rs/ckpFE
  gateway.log  https://paste.rs/hR3Xy

Operating System

ubuntu 24.04.4 LTS

Python Version

3.11.15

Hermes Version

v0.12.0

Additional Logs / Traceback (optional)

2026-05-06 20:23:41,282 INFO gateway.run: inbound message: platform=discord user=... msg='can you see images?'
  2026-05-06 20:23:41,858 INFO gateway.run: Image routing: native (model supports vision). 1 image(s) will be attached inline.
  2026-05-06 20:23:45,744 INFO [...] root: Fallback activated: claude-haiku-4-5-20251001 → anthropic/claude-sonnet-4 (openrouter)

Root Cause Analysis (optional)

Root cause

Introduced in #16506 (feat(image-input): native multimodal routing based on model vision capability, shipped in v0.12.0).

gateway/platforms/discord.py ~lines 4100–4112:

content_type = att.content_type or "unknown" if content_type.startswith("image/"): try: ext = "." + content_type.split("/")[-1].split(";")[0] if ext not in (".jpg", ".jpeg", ".png", ".gif", ".webp"): ext = ".jpg" cached_path = await self._cache_discord_image(att, ext) media_urls.append(cached_path) media_types.append(content_type) # ← Discord-reported MIME, never verified

media_types[i] flows into the Anthropic image part as source.media_type. There is no verification step against the actual bytes. The previous (pre-#16506) routing path didn't expose this code path the same way, which is why the bug didn't surface until v0.12.0.

Why it surfaces now

Anthropic appears to have tightened media_type-vs-magic-byte validation around 2026-05-05; the same payloads were accepted previously. The combination of (a) #16506's new native-multimodal routing and (b) Anthropic's new strictness is what makes this 100% reproducible now.

Proposed Fix (optional)

Suggested fix

Sniff magic bytes once and use that as the canonical MIME, falling back to att.content_type only when the bytes are unrecognized. A 5-line helper covers all four formats Hermes already declares as supported:

def _sniff_image_mime(head: bytes) -> str | None: if head.startswith(b"\x89PNG\r\n\x1a\n"): return "image/png" if head.startswith(b"\xff\xd8\xff"): return "image/jpeg" if head[:6] in (b"GIF87a", b"GIF89a"): return "image/gif" if len(head) >= 12 and head[:4] == b"RIFF" and head[8:12] == b"WEBP": return "image/webp" return None

Cleanest integration: do the sniff inside _cache_discord_image (it already reads bytes via _read_attachment_bytes) and return (cached_path, sniffed_mime); the caller uses sniffed_mime instead of att.content_type whenever available. Single download, single source of truth, fixes every downstream consumer of media_types[].

Impact

  • Total breakage on affected attachments — silent fallback to a non-Anthropic provider for every Discord image.
  • Affects anyone running Hermes Discord platform with a native Anthropic provider on a vision-capable model (claude-haiku-4-5-, claude-sonnet-4-, claude-opus-4-*).
  • Wasted fallback cost: requests succeed via the fallback (e.g. OpenRouter Sonnet 4) but bypass any prepaid Anthropic balance.

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING