hermes - 💡(How to fix) Fix QQ Bot: Audio files sent via file transfer are incorrectly routed to STT pipeline [1 pull requests]

Fix Action

Fixed

Bug Description

When a user sends an audio file (e.g. .wav, .mp3, .ogg) through QQ's file transfer feature (not as a voice message), the attachment is incorrectly routed through the speech-to-text (STT) pipeline instead of being saved as a regular file attachment.

Root Cause

In gateway/platforms/qqbot/adapter.py, the _is_voice_content_type() method has a filename-extension fallback that matches common audio extensions (.silk, .amr, .mp3, .wav, .ogg, .m4a, .aac, .speex, .flac) even when the QQ Bot API explicitly reports content_type="file".

# Current buggy code (line 1790-1809)
@staticmethod
def _is_voice_content_type(content_type: str, filename: str) -> bool:
    ct = content_type.strip().lower()
    fn = filename.strip().lower()
    if ct == "voice" or ct.startswith("audio/"):
        return True
    _VOICE_EXTENSIONS = (".silk", ".amr", ".mp3", ".wav", ".ogg", ".m4a", ".aac", ".speex", ".flac")
    if any(fn.endswith(ext) for ext in _VOICE_EXTENSIONS):
        return True  # ← Bug: matches even when content_type="file"
    return False
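
To make the misclassification concrete, here is a minimal sketch that copies the predicate quoted above into a standalone function and runs it against the two attachment payloads reported in this issue. The function name is_voice_buggy and the variable names are hypothetical; only the logic and the payload values come from the issue.

```python
def is_voice_buggy(content_type: str, filename: str) -> bool:
    """Standalone copy of the quoted predicate, for demonstration only."""
    ct = content_type.strip().lower()
    fn = filename.strip().lower()
    if ct == "voice" or ct.startswith("audio/"):
        return True
    voice_extensions = (
        ".silk", ".amr", ".mp3", ".wav", ".ogg",
        ".m4a", ".aac", ".speex", ".flac",
    )
    # The fallback never looks at content_type, so "file" is not excluded.
    return fn.endswith(voice_extensions)


# Payloads as reported in the issue (URLs are elided there as well).
voice_msg = {
    "content_type": "voice",
    "url": "https://multimedia.nt.qq.com.cn/...",
    "filename": "xxx.amr",
}
file_upload = {
    "content_type": "file",
    "url": "https://grouptalk.c2c.qq.com/...",
    "filename": "audio-30251.instrumental..wav",
}

voice_result = is_voice_buggy(voice_msg["content_type"], voice_msg["filename"])
file_result = is_voice_buggy(file_upload["content_type"], file_upload["filename"])
print(voice_result, file_result)  # True True
```

Both attachments are classified as voice, even though the second one is explicitly labelled "file" by the API.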

QQ Bot API Attachment Structure

The QQ Bot API already distinguishes voice messages from file uploads via the content_type field:

| Attachment Type | content_type |
| --- | --- |
| Voice message | "voice" |
| File upload | "file" |
| Image | "image/jpeg", "image/png", "image/gif" |
| Video | "video/mp4" |

Voice message example:

{"content_type": "voice", "url": "https://multimedia.nt.qq.com.cn/...", "filename": "xxx.amr"}

File upload example:

{"content_type": "file", "url": "https://grouptalk.c2c.qq.com/...", "filename": "audio-30251.instrumental..wav"}

Impact

  • Audio files sent via QQ file transfer are never saved; the STT pipeline consumes them instead
  • STT fails on non-speech audio (music, sound effects, etc.) and returns [Voice] [语音识别失败] ("speech recognition failed")
  • The original file is lost: it is neither cached nor forwarded to the agent
  • The bug is QQ-specific: the Telegram, Discord, and WeChat adapters use platform-native message types and are not affected

Steps to Reproduce

  1. Connect a QQ Bot
  2. Send an audio file (e.g. .wav, .mp3) via QQ's file transfer feature (not as a voice message)
  3. The bot receives [Voice] [语音识别失败] instead of the file

Proposed Fix

Remove the _VOICE_EXTENSIONS fallback. Since the QQ Bot API explicitly sets content_type="voice" for voice messages, filename-extension sniffing is unnecessary and harmful.
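
A minimal sketch of what the predicate might look like with the fallback removed. It is written as a standalone function rather than the adapter's static method, and it assumes the filename parameter is kept (unused) so existing call sites do not change; the merged PR may differ in both respects.

```python
def is_voice_content_type(content_type: str, filename: str = "") -> bool:
    """Return True only when content_type itself marks the attachment as voice.

    filename is accepted for call-site compatibility (an assumption) but is
    deliberately ignored: no extension sniffing takes place.
    """
    ct = (content_type or "").strip().lower()
    return ct == "voice" or ct.startswith("audio/")


# The two payloads from this issue now take different paths.
print(is_voice_content_type("voice", "xxx.amr"))                       # True
print(is_voice_content_type("file", "audio-30251.instrumental..wav"))  # False
```

The audio/ prefix check from the original predicate is retained here, since the proposal only targets the extension fallback.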
