hermes: Fix QQ Bot: Audio files sent via file transfer are incorrectly routed to STT pipeline [1 pull request]

hermes · 2026-05-31 05:22:05

Root Cause

In gateway/platforms/qqbot/adapter.py, the _is_voice_content_type() method has a filename-extension fallback that matches common audio extensions (.silk, .amr, .mp3, .wav, .ogg, .m4a, .aac, .speex, .flac) even when the QQ Bot API explicitly reports content_type="file".

# Current buggy code (lines 1790-1809)
@staticmethod
def _is_voice_content_type(content_type: str, filename: str) -> bool:
    ct = content_type.strip().lower()
    fn = filename.strip().lower()
    if ct == "voice" or ct.startswith("audio/"):
        return True
    _VOICE_EXTENSIONS = (".silk", ".amr", ".mp3", ".wav", ".ogg", ".m4a", ".aac", ".speex", ".flac")
    if any(fn.endswith(ext) for ext in _VOICE_EXTENSIONS):
        return True  # ← Bug: matches even when content_type="file"
    return False
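
To make the misclassification concrete, the following is a minimal standalone sketch that copies the logic quoted above into a free function and runs it against the two attachment payloads quoted in this report. The name `is_voice_buggy` and the trimmed payload dicts are illustrative assumptions, not part of the adapter.

```python
# Standalone copy of the quoted logic, for demonstration only.
# `is_voice_buggy` is a hypothetical name; the real code is a static method
# on the QQ Bot adapter.
def is_voice_buggy(content_type: str, filename: str) -> bool:
    ct = content_type.strip().lower()
    fn = filename.strip().lower()
    if ct == "voice" or ct.startswith("audio/"):
        return True
    voice_extensions = (".silk", ".amr", ".mp3", ".wav", ".ogg",
                        ".m4a", ".aac", ".speex", ".flac")
    # The extension fallback runs regardless of what content_type said.
    return any(fn.endswith(ext) for ext in voice_extensions)


# Payloads trimmed to the two fields the check actually reads.
voice_msg = {"content_type": "voice", "filename": "xxx.amr"}
file_upload = {"content_type": "file", "filename": "audio-30251.instrumental..wav"}

voice_result = is_voice_buggy(voice_msg["content_type"], voice_msg["filename"])
file_result = is_voice_buggy(file_upload["content_type"], file_upload["filename"])

print(voice_result, file_result)  # both True: the file upload is treated as voice
```

The second result is the bug: the API reported `content_type="file"`, yet the `.wav` extension alone is enough to send the attachment to STT.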

Fix Action

Fixed

Fixed by PR: fix(qqbot): stop routing file uploads through STT pipeline (https://github.com/NousResearch/hermes-agent/pull/35705)

Bug Description

When a user sends an audio file (e.g. .wav, .mp3, .ogg) through QQ's file transfer feature rather than as a voice message, the attachment is incorrectly routed through the speech-to-text (STT) pipeline instead of being saved as a regular file attachment.

QQ Bot API Attachment Structure

The QQ Bot API already distinguishes voice messages from file uploads via the content_type field:

| Attachment Type | content_type |
| --- | --- |
| Voice message | `"voice"` |
| File upload | `"file"` |
| Image | `"image/jpeg"`, `"image/png"`, `"image/gif"` |
| Video | `"video/mp4"` |

Voice message example:

{"content_type": "voice", "url": "https://multimedia.nt.qq.com.cn/...", "filename": "xxx.amr"}

File upload example:

{"content_type": "file", "url": "https://grouptalk.c2c.qq.com/...", "filename": "audio-30251.instrumental..wav"}

Impact

- Audio files sent via QQ file transfer are never saved; they are consumed by the STT pipeline.
- STT fails on non-speech audio (music, sound effects, etc.), returning [Voice] [语音识别失败] ("speech recognition failed").
- The original file is lost: it is neither cached nor forwarded to the agent.
- This bug is QQ-specific: the Telegram, Discord, and WeChat adapters use platform-native message types and are not affected.

Steps to Reproduce

1. Connect a QQ Bot.
2. Send an audio file (e.g. .wav, .mp3) via QQ's file transfer feature (not as a voice message).
3. The bot receives [Voice] [语音识别失败] ("speech recognition failed") instead of the file.

Proposed Fix

Remove the _VOICE_EXTENSIONS fallback. Since the QQ Bot API explicitly sets content_type="voice" for voice messages, filename-extension sniffing is unnecessary and harmful.
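
A minimal sketch of what the proposed change could look like, written as a free function so it runs on its own. It assumes the signature stays the same and that `filename` is kept only for call-site compatibility; the name `is_voice_content_type` and the `None` handling are assumptions here, and the merged PR may differ in detail.

```python
# Sketch of the check with the filename-extension fallback removed.
# Only the API-reported content_type decides whether STT is used.
def is_voice_content_type(content_type: str, filename: str = "") -> bool:
    # `filename` is intentionally unused; it is kept so existing callers
    # that pass two arguments continue to work (an assumption of this sketch).
    ct = (content_type or "").strip().lower()
    return ct == "voice" or ct.startswith("audio/")


# Voice message: still routed to STT.
print(is_voice_content_type("voice", "xxx.amr"))                       # True
# File upload with an audio extension: no longer treated as voice.
print(is_voice_content_type("file", "audio-30251.instrumental..wav"))  # False
```

With this shape, the reproduction case from this report (a `.wav` sent via file transfer) falls through to normal file handling.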
