openclaw - ✅ (Solved) Fix Discord voice messages not recognized as audio (content_type missing or incorrect) [2 pull requests, 1 participant]

GitHub stats
openclaw/openclaw#64803 · Fetched 2026-04-12 13:26:41
Comments: 0
Participants: 1
Timeline: 3
Reactions: 0
Timeline (top): cross-referenced ×2, referenced ×1

Fix Action

Fix / Workaround

Workaround: Set requireMention: true and speak the bot name in the voice message. This defeats the purpose of requireMention: false channels.

PR fix notes

PR #64848: fix(discord): recognize voice messages as audio by filename extension

Description (problem / solution / changelog)

Summary

Discord voice messages (OGG/Opus) sometimes arrive with a missing or generic content_type, causing inferPlaceholder to classify them as <media:document> instead of <media:audio>. This prevents the audio transcription pipeline (tools.media.audio) from triggering.

Changes

Add a filename extension fallback in inferPlaceholder, matching the same pattern already used by isImageAttachment. When content_type does not match a known media prefix, check the attachment filename for common audio extensions (.ogg, .opus, .mp3, .wav, etc.) and video extensions before falling through to <media:document>.
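The fallback described above could look roughly like the following sketch. The attachment shape, the extensionOf helper, the exact extension lists, and the <media:video> placeholder name are assumptions for illustration; this page does not show the PR's actual diff or the existing isImageAttachment code.

```typescript
// Hypothetical minimal attachment shape; the real Discord type has more fields.
interface AttachmentLike {
  content_type?: string | null;
  filename?: string | null;
}

// Illustrative lists (assumed). The PR text only names .ogg, .opus, .mp3, .wav "etc.".
const AUDIO_EXTENSIONS = new Set([".ogg", ".opus", ".mp3", ".wav"]);
const VIDEO_EXTENSIONS = new Set([".mp4", ".mov", ".webm"]);

// Returns the lowercased extension including the dot, or "" if there is none.
function extensionOf(filename: string): string {
  const dot = filename.lastIndexOf(".");
  return dot >= 0 ? filename.slice(dot).toLowerCase() : "";
}

function inferPlaceholder(attachment: AttachmentLike): string {
  const mime = attachment.content_type ?? "";
  // 1. Trust the MIME type when it carries a known media prefix.
  if (mime.startsWith("audio/")) return "<media:audio>";
  if (mime.startsWith("video/")) return "<media:video>"; // placeholder name assumed

  // 2. Otherwise fall back to the filename extension.
  const ext = extensionOf(attachment.filename ?? "");
  if (AUDIO_EXTENSIONS.has(ext)) return "<media:audio>";
  if (VIDEO_EXTENSIONS.has(ext)) return "<media:video>"; // placeholder name assumed

  // 3. Anything else stays a document.
  return "<media:document>";
}
```

Because the MIME check runs first, attachments that already carry audio/ogg behave as before. The extension lookup only matters when content_type is empty or generic, for example application/octet-stream.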

Testing

  • Voice message with content_type: audio/ogg → still works (MIME check)
  • Voice message with empty content_type but filename voice-message.ogg → now recognized as <media:audio>
  • Regular document with .pdf extension → still <media:document>

Fixes #64803

Changed files

  • extensions/discord/src/monitor/message-utils.ts (modified, +11/-0)

PR #64863: fix(discord): recognize voice messages as audio by filename extension

Description (problem / solution / changelog)

Summary

Discord voice messages (OGG/Opus) sometimes arrive with missing content_type, causing inferPlaceholder to classify them as <media:document> instead of <media:audio>, preventing audio transcription.

Changes

Add filename extension fallback in inferPlaceholder (same pattern as isImageAttachment). Checks for audio extensions (.ogg, .opus, .mp3, .wav, etc.) and video extensions when content_type is missing.

Fixes #64803

Changed files

  • extensions/discord/src/monitor/message-utils.ts (modified, +11/-0)

Code Example

function inferPlaceholder(attachment) {
    const mime = attachment.content_type ?? "";
    if (mime.startsWith("audio/")) return "<media:audio>";
    return "<media:document>";
}

Bug Report

Discord voice messages are received as <media:document> instead of <media:audio>, preventing audio transcription from triggering.

Environment:

  • OpenClaw 2026.4.10
  • Discord channel with requireMention: false
  • tools.media.audio configured with OpenAI Whisper

Problem: In extensions/discord/src/monitor/message-utils.ts, the inferPlaceholder function checks attachment.content_type:

function inferPlaceholder(attachment) {
    const mime = attachment.content_type ?? "";
    if (mime.startsWith("audio/")) return "<media:audio>";
    return "<media:document>";
}

Discord voice messages should carry content_type: "audio/ogg", but they are classified as <media:document>.
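The behaviour can be reproduced in isolation with the quoted function. In this sketch the attachment shape and the sample payloads are assumptions; the missing or generic content_type values follow what the PR summaries on this page describe, not a captured Discord payload.

```typescript
// Hypothetical minimal attachment shape for the reproduction.
interface ReportedAttachment {
  content_type?: string | null;
  filename?: string | null;
}

// The MIME-only check as quoted in the report, with types added.
function inferPlaceholderMimeOnly(attachment: ReportedAttachment): string {
  const mime = attachment.content_type ?? "";
  if (mime.startsWith("audio/")) return "<media:audio>";
  return "<media:document>";
}

// A voice message whose content_type is present takes the audio branch.
const withMime = inferPlaceholderMimeOnly({
  content_type: "audio/ogg",
  filename: "voice-message.ogg",
});

// The same file without a content_type falls through to the document branch.
const withoutMime = inferPlaceholderMimeOnly({ filename: "voice-message.ogg" });

// A generic content_type has the same effect.
const genericMime = inferPlaceholderMimeOnly({
  content_type: "application/octet-stream",
  filename: "voice-message.ogg",
});
```

Only the first call yields <media:audio>. The other two yield <media:document>, which is why tools.media.audio never sees the attachment.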

Related:

  • PR #32136 fixed the preflight path for requireMention: true
  • Issue #30034 was closed as resolved, but only the preflight path was fixed
  • The main transcription pipeline still doesn't recognize Discord voice messages as audio

Expected: Discord voice messages (OGG/Opus with IS_VOICE_MESSAGE flag) should be recognized as <media:audio> and trigger tools.media.audio transcription regardless of requireMention setting.

Workaround: Set requireMention: true and speak the bot name in the voice message. This defeats the purpose of requireMention: false channels.

extent analysis

TL;DR

Update the inferPlaceholder function to correctly identify Discord voice messages as <media:audio> based on the presence of the IS_VOICE_MESSAGE flag.

Guidance

  • Review the inferPlaceholder function in extensions/discord/src/monitor/message-utils.ts to ensure it checks for the IS_VOICE_MESSAGE flag in addition to the content_type.
  • Verify that inferPlaceholder has access to the information it needs: content_type is a field on the attachment, whereas Discord sets IS_VOICE_MESSAGE in the flags bitfield of the parent message.
  • Consider adding a check for the IS_VOICE_MESSAGE flag in the inferPlaceholder function, similar to the existing mime.startsWith("audio/") check.
  • Test the updated inferPlaceholder function with various types of Discord messages, including voice messages with the IS_VOICE_MESSAGE flag.
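The testing step in the guidance above can be sketched as a small table of cases. The classify function, its messageFlags parameter, and the case shapes are hypothetical; the only external fact relied on is that Discord documents IS_VOICE_MESSAGE as bit 13 of the message flags bitfield.

```typescript
// Discord documents IS_VOICE_MESSAGE as bit 13 of the message-level flags bitfield.
const IS_VOICE_MESSAGE = 1 << 13;

// Hypothetical shapes for illustration only.
interface CaseAttachment {
  content_type?: string | null;
  filename?: string | null;
}
interface Case {
  name: string;
  attachment: CaseAttachment;
  messageFlags: number;
  expected: string;
}

// Assumed variant of the classifier that also receives the parent message's flags.
function classify(attachment: CaseAttachment, messageFlags: number): string {
  const mime = attachment.content_type ?? "";
  const isVoiceMessage = (messageFlags & IS_VOICE_MESSAGE) !== 0;
  if (mime.startsWith("audio/") || isVoiceMessage) return "<media:audio>";
  return "<media:document>";
}

const cases: Case[] = [
  { name: "voice message with MIME", attachment: { content_type: "audio/ogg" }, messageFlags: IS_VOICE_MESSAGE, expected: "<media:audio>" },
  { name: "voice message without MIME", attachment: { filename: "voice-message.ogg" }, messageFlags: IS_VOICE_MESSAGE, expected: "<media:audio>" },
  { name: "plain audio upload", attachment: { content_type: "audio/mpeg" }, messageFlags: 0, expected: "<media:audio>" },
  { name: "PDF document", attachment: { content_type: "application/pdf" }, messageFlags: 0, expected: "<media:document>" },
];

// Collect the names of failing cases; an empty array means every case passed.
const failures = cases
  .filter((c) => classify(c.attachment, c.messageFlags) !== c.expected)
  .map((c) => c.name);
```

Keeping the cases in a table makes it cheap to add further message types later, such as a voice message that carries other flag bits alongside IS_VOICE_MESSAGE.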

Example

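// Assumption: the caller passes the parent message's flags bitfield;
// the original signature takes only the attachment.
const IS_VOICE_MESSAGE = 1 << 13; // message-level flag in the Discord API

function inferPlaceholder(attachment, messageFlags = 0) {
    const mime = attachment.content_type ?? "";
    const isVoiceMessage = (messageFlags & IS_VOICE_MESSAGE) !== 0;
    if (mime.startsWith("audio/") || isVoiceMessage) return "<media:audio>";
    return "<media:document>";
}

Notes

Discord exposes IS_VOICE_MESSAGE as a bit (1 << 13) in the message's flags bitfield, not as a named property on the attachment, so the snippet assumes the message flags can be passed into inferPlaceholder. If the call site does not have them available, additional modifications will be necessary.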

Recommendation

Apply workaround: Update the inferPlaceholder function to check for the IS_VOICE_MESSAGE flag so that Discord voice messages are identified as <media:audio> and trigger transcription. The two linked PRs (#64848 and #64863) take a different route and fall back to the filename extension instead.
