hermes - ✅(Solved) Fix QQBot: image attachments with content_type=file are treated as generic files and never reach the vision pipeline [1 pull request, 2 comments, 2 participants]

GitHub stats
NousResearch/hermes-agent#14324 (fetched 2026-04-24 06:17:49)
Comments: 2 · Participants: 2 · Timeline events: 8 · Reactions: 0
Timeline (top): labeled ×5, commented ×2, cross-referenced ×1

On Hermes 563ed0e6, QQ C2C attachments whose payload comes through as content_type=file are never treated as images, even when the filename is clearly an image (for example .png).

As a result, image-bearing QQ messages can enter the agent as plain text plus [Attachment: ...] notes instead of media_urls/media_types, so the normal vision enrichment path never runs.

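Root Cause

The QQ adapter classifies attachments by content_type alone: only values that start with image/ are handled as images. An image that QQ delivers with content_type=file therefore takes the generic file branch, which produces these symptoms:

  • images=0, voice=0 after processing
  • the session gets [Attachment: ...png] style text instead of an image path in media_urls
  • the design image never enters _prepare_inbound_message_text() as an image for vision analysis, so the agent can analyze the wrong local files or proceed without the intended visual context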

Fix Action

Fixed in PR #14330.

PR fix notes

PR #14330: fix(qqbot): infer image mime for generic file attachments

Description (problem / solution / changelog)

Summary

  • infer a concrete MIME type from QQ attachment filenames when QQ reports generic file content types
  • route inferred image attachments through the existing image cache path so they populate media_urls/media_types
  • add regression coverage for generic file image attachments and non-image file attachments

Testing

  • python3 -m pytest -o addopts='' tests/gateway/test_qqbot.py

Closes #14324

Changed files

  • gateway/platforms/qqbot/adapter.py (modified, +13/-2)
  • tests/gateway/test_qqbot.py (modified, +72/-0)
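
The regression coverage described in the PR summary could look roughly like the sketch below. It is not the PR's actual test code: route_attachment() is a hypothetical stand-in for the adapter's routing, the /cache/ path is a placeholder, and the use of the standard library's mimetypes module to infer the type from the filename is an assumption.

```python
import mimetypes
from typing import Dict, List


def route_attachment(content_type: str, filename: str) -> Dict[str, List[str]]:
    """Hypothetical stand-in for the adapter's routing, not the real method."""
    result: Dict[str, List[str]] = {
        "media_urls": [],
        "media_types": [],
        "other_attachments": [],
    }
    ct = content_type
    if not ct.startswith("image/"):
        # Generic type such as "file": try to infer a concrete MIME type.
        guessed, _ = mimetypes.guess_type(filename)
        if guessed and guessed.startswith("image/"):
            ct = guessed
    if ct.startswith("image/"):
        result["media_urls"].append(f"/cache/{filename}")  # placeholder path
        result["media_types"].append(ct)
    else:
        result["other_attachments"].append(f"[Attachment: {filename or ct}]")
    return result


def test_generic_file_with_png_name_is_routed_as_image():
    routed = route_attachment("file", "mockup.png")
    assert routed["media_types"] == ["image/png"]
    assert routed["other_attachments"] == []


def test_generic_file_with_zip_name_stays_a_document():
    routed = route_attachment("file", "icons_512_whitefeather.zip")
    assert routed["media_types"] == []
    assert routed["other_attachments"] == ["[Attachment: icons_512_whitefeather.zip]"]
```

The two test functions mirror the two cases named in the summary: a generic file that is really an image, and a generic file that is not.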


Summary

On Hermes 563ed0e6, QQ C2C attachments whose payload comes through as content_type=file are never treated as images, even when the filename is clearly an image (for example .png).

As a result, image-bearing QQ messages can enter the agent as plain text plus [Attachment: ...] notes instead of media_urls/media_types, so the normal vision enrichment path never runs.

Environment

  • Hermes commit: 563ed0e6
  • Platform: qqbot
  • Observed on macOS

Expected behavior

If a QQ attachment has a filename like *.png, *.jpg, *.jpeg, *.webp, or *.gif, Hermes should treat it as an image even when QQ reports content_type=file, so the attachment is cached as an image and passed into the normal vision pipeline.
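
A minimal sketch of the expected classification, assuming the extension list above and the generic content types mentioned in the issue (file, application/octet-stream, empty). The function name infer_image_mime and both constants are hypothetical and do not come from the Hermes code base.

```python
import os
from typing import Optional

# Hypothetical constants; extensions and generic types are taken from the issue text.
IMAGE_EXT_TO_MIME = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
    ".gif": "image/gif",
}
GENERIC_CONTENT_TYPES = {"", "file", "application/octet-stream"}


def infer_image_mime(content_type: Optional[str], filename: Optional[str]) -> Optional[str]:
    """Return an image MIME type for the attachment, or None if it is not an image."""
    ct = (content_type or "").strip().lower()
    if ct.startswith("image/"):
        return ct  # QQ already reported a concrete image type
    if ct not in GENERIC_CONTENT_TYPES:
        return None  # e.g. voice/audio: do not override a specific type
    ext = os.path.splitext(filename or "")[1].lower()
    return IMAGE_EXT_TO_MIME.get(ext)
```

With this helper, the failing PNG from the logs would resolve to image/png while the zip attachment would stay a generic file.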

Actual behavior

The attachment is handled as a generic document/file instead of an image:

  • images=0, voice=0 after processing
  • the session gets [Attachment: ...png] style text instead of an image path in media_urls
  • downstream agent behavior can go badly off course because the design image never enters _prepare_inbound_message_text() as an image for vision analysis

Reproduction

  1. Start Hermes gateway with qqbot enabled.
  2. In a QQ C2C chat, send:
    • a text instruction that refers to a design image
    • a zip attachment
    • a PNG attachment
  3. Observe the QQ payload/logs for the PNG attachment:
    • content_type=file
    • filename=...png
  4. Hermes logs show After processing: images=0, voice=0 for that PNG message.

Evidence

Observed failing case:

2026-04-23 11:22:21,783 attachment[0]: content_type=file ... filename=icons_512_whitefeather.zip
2026-04-23 11:22:22,186 After processing: images=0, voice=0
2026-04-23 11:22:22,306 attachment[0]: content_type=file ... filename=ChatGPT Image 2026年4月22日 16_20_44.png
2026-04-23 11:22:22,760 After processing: images=0, voice=0

Code path only treats content_type.startswith("image/") as image input:

  • gateway/platforms/qqbot/adapter.py:1146-1194
  • gateway/platforms/qqbot/adapter.py:1228-1237

Relevant code excerpt:

elif ct.startswith("image/"):
    cached_path = await self._download_and_cache(url, ct)
    ...
else:
    cached_path = await self._download_and_cache(url, ct)
    other_attachments.append(f"[Attachment: {filename or ct}]")

And _download_and_cache() also relies on content_type.startswith("image/"):

if content_type.startswith("image/"):
    return cache_image_from_bytes(data, ext)
...
else:
    return cache_document_from_bytes(data, filename)

Useful contrast

The same adapter works correctly when QQ sends a real image MIME type:

2026-04-22 11:26:07,916 attachment[0]: content_type=image/png ... filename=F8CE9C97E15824BE3AC4D4933857C7C3.png
2026-04-22 11:26:08,172 After processing: images=1, voice=0

So this looks specifically like a QQ payload-shape gap for attachments that are actually images but arrive as content_type=file.
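
The contrast can be replayed with a simplified model of the pre-fix branch. This is only an illustration: count_images_pre_fix is a hypothetical function that keeps the content_type check from the excerpt and drops downloading, caching, and the voice branch.

```python
from typing import Iterable, List, Tuple


def count_images_pre_fix(attachments: Iterable[Tuple[str, str]]) -> Tuple[int, List[str]]:
    """Mimic the pre-fix branch: only content_type decides what counts as an image."""
    images = 0
    other_attachments: List[str] = []
    for ct, filename in attachments:
        if ct.startswith("image/"):
            images += 1
        else:
            # Generic branch: the attachment is reduced to a text note.
            other_attachments.append(f"[Attachment: {filename or ct}]")
    return images, other_attachments


# (content_type, filename) pairs taken from the log excerpts above.
failing = [("file", "ChatGPT Image 2026年4月22日 16_20_44.png")]
working = [("image/png", "F8CE9C97E15824BE3AC4D4933857C7C3.png")]

print(count_images_pre_fix(failing))
print(count_images_pre_fix(working))
```

The failing input yields zero images and one [Attachment: ...] note, while the working input yields one image and no notes, which matches the images=0 and images=1 log lines.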

Suspected fix

In the QQ adapter, add filename/extension-based fallback classification for attachments whose content_type is generic (file, application/octet-stream, empty, etc.).

Likely places:

  • gateway/platforms/qqbot/adapter.py::_process_attachments()
  • gateway/platforms/qqbot/adapter.py::_download_and_cache()

Something along the lines of:

  • if content_type is not image/audio/voice but filename ends with an image extension, treat it as image
  • optionally sniff the downloaded bytes as a secondary fallback
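
The optional byte-sniffing fallback could be sketched as follows, using the well-known file signatures of the four formats. The function name sniff_image_mime is hypothetical, and nothing in the issue or PR notes says this check was actually implemented.

```python
from typing import Optional


def sniff_image_mime(data: bytes) -> Optional[str]:
    """Guess an image MIME type from leading magic bytes, or return None."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None
```

A check like this would only run after the download, when both content_type and the filename are inconclusive.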

Impact

This breaks a common real workflow for QQ-based Hermes usage: sending a design screenshot / mockup / reference image to drive frontend work. The agent receives the task text, but not the actual image input, so it can analyze the wrong local files or proceed without the intended visual context.

Extent analysis

TL;DR

The issue can be fixed by adding a filename/extension-based fallback classification for attachments in the QQ adapter to treat files with image extensions as images even when the content_type is generic.

Guidance

  • Modify the gateway/platforms/qqbot/adapter.py file to add a check for image extensions (e.g., .png, .jpg, .jpeg, .webp, .gif) in the filename when the content_type is not explicitly an image type.
  • Update the _process_attachments() and _download_and_cache() methods to handle this fallback classification.
  • Consider adding a secondary fallback to sniff the downloaded bytes to determine the file type if the filename and content_type are inconclusive.

Example

elif ct.startswith("image/"):
    cached_path = await self._download_and_cache(url, ct)
    ...
elif filename and filename.lower().endswith(('.png', '.jpg', '.jpeg', '.webp', '.gif')):
    # Generic content type: derive the MIME type from the lowercased extension.
    # "jpg" is not a registered subtype, so map it to "jpeg".
    ext = filename.lower().rsplit('.', 1)[-1]
    cached_path = await self._download_and_cache(url, 'image/' + ('jpeg' if ext == 'jpg' else ext))
    ...
else:
    cached_path = await self._download_and_cache(url, ct)
    other_attachments.append(f"[Attachment: {filename or ct}]")

Notes

This fix assumes that the filename extension is a reliable indicator of the file type. A file can carry a missing or misleading extension, so a check of the downloaded bytes may be needed as a fallback.

Recommendation

Apply the fix by adding filename/extension-based fallback classification to the QQ adapter, so that the agent handles image attachments with generic content_type values correctly.
