openclaw - 💡(How to fix) Fix [Feishu] TTS voice bubble: MiniMax provider does not support voice-compatible output [1 participants]

GitHub stats: openclaw/openclaw#70445, fetched 2026-04-23 07:24:43. Comments: 0. Participants: 1. Timeline: 0. Reactions: 0.


Problem

When using the MiniMax TTS provider with Feishu as the target channel, voice messages are received as .mp3 file attachments instead of native Feishu voice bubbles.

Expected behavior

Feishu should receive the TTS audio as a playable voice bubble (msg_type: "audio"), not as a generic file attachment.

Root cause analysis

  1. Feishu only renders msg_type: "audio" as a voice bubble.
    In openclaw-lark/src/messaging/outbound/media.js, the uploadAndSendMediaLark function routes .opus / .ogg files to sendAudioLark (which uses msg_type: "audio"), while all other extensions fall through to sendFileLark (msg_type: "file").

  2. The MiniMax TTS provider always outputs mp3.
    In dist/speech-provider-CNa1803L.js, the MiniMax speech provider's synthesize() method hardcodes:

    outputFormat: "mp3",
    fileExtension: ".mp3",
    voiceCompatible: false

    This means it never produces .opus audio and never signals that the result is voice-compatible, so Feishu never receives it as a native voice bubble.

  3. OpenClaw core already supports this path for Feishu.
    The speech-core runtime in dist/extensions/speech-core/runtime-api.js already has supportsNativeVoiceNoteTts("feishu") === true, and the Feishu extension already has the correct routing logic — it just needs .opus input.

  4. MiniMax API itself is flexible on format.
    The MiniMax T2A HTTP API accepts an audio_setting.format parameter. The default is mp3, and the API also supports pcm, flac, and wav. Opus is not among the formats listed, so Feishu-compatible output needs either a transcode step or direct Opus support from the API.
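The routing described in point 1 can be sketched as a small pure function. This is an illustration only, not the code in media.js: pickLarkMsgType is a hypothetical name, and the case-insensitive extension match is an assumption.

```javascript
const path = require("node:path");

// Extensions the issue says are routed to sendAudioLark (msg_type: "audio").
const VOICE_EXTENSIONS = new Set([".opus", ".ogg"]);

// Hypothetical helper: returns the Feishu msg_type a file would be sent with.
// Lowercasing the extension is an assumption, not confirmed by the issue.
function pickLarkMsgType(fileName) {
  const ext = path.extname(fileName).toLowerCase();
  return VOICE_EXTENSIONS.has(ext) ? "audio" : "file";
}

console.log(pickLarkMsgType("reply.mp3"));  // file
console.log(pickLarkMsgType("reply.opus")); // audio
```

With this rule, a MiniMax result carrying fileExtension ".mp3" can only ever take the "file" branch, which matches the behavior reported in the issue.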

Proposed solution

Option A — Fix in MiniMax speech provider (recommended)

In speech-provider-CNa1803L.js, when req.target === "voice-note" and the channel supports native voice notes (e.g. Feishu):

  1. Ask MiniMax API for mp3 as normal
  2. Transcode to .opus using ffmpeg (already available in the environment)
  3. Return:
    outputFormat: "opus",
    fileExtension: ".opus",
    voiceCompatible: true
  4. Fall back to mp3 output if the transcode fails

This keeps MiniMax working for non-voice-note targets while enabling native Feishu voice bubbles without requiring a provider swap.
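Steps 2 to 4 could look roughly like the sketch below. It assumes ffmpeg is on PATH and built with libopus. The helper names (ffmpegOpusArgs, transcodeToOpus, toVoiceNote) are hypothetical, and the mono 16 kHz settings are an assumption suited to speech rather than something the issue specifies.

```javascript
const { execFile } = require("node:child_process");
const { promisify } = require("node:util");
const fs = require("node:fs/promises");
const os = require("node:os");
const path = require("node:path");

const execFileAsync = promisify(execFile);

// ffmpeg arguments for mp3 -> Opus. Mono and 16 kHz are assumed speech settings.
function ffmpegOpusArgs(inputPath, outputPath) {
  return ["-y", "-i", inputPath, "-c:a", "libopus", "-ac", "1", "-ar", "16000", outputPath];
}

// Hypothetical helper: transcodes an mp3 Buffer to an Opus Buffer via temp files.
async function transcodeToOpus(mp3Buffer) {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), "tts-"));
  const input = path.join(dir, "in.mp3");
  const output = path.join(dir, "out.opus");
  try {
    await fs.writeFile(input, mp3Buffer);
    await execFileAsync("ffmpeg", ffmpegOpusArgs(input, output));
    return await fs.readFile(output);
  } finally {
    // Always clean up the temp directory, even when ffmpeg fails.
    await fs.rm(dir, { recursive: true, force: true });
  }
}

// Steps 3 and 4: return Opus metadata on success, the original mp3 on failure.
async function toVoiceNote(mp3Buffer, transcode = transcodeToOpus) {
  try {
    const audio = await transcode(mp3Buffer);
    return { audio, outputFormat: "opus", fileExtension: ".opus", voiceCompatible: true };
  } catch {
    return { audio: mp3Buffer, outputFormat: "mp3", fileExtension: ".mp3", voiceCompatible: false };
  }
}
```

The transcoder is passed in as a parameter so the fallback path can be exercised without ffmpeg installed, for example by handing toVoiceNote a function that throws.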

Option B — Add configurable responseFormat / outputFormat to MiniMax provider

Allow users to configure messages.tts.providers.minimax.responseFormat to opus (or another supported format). The provider would pass this through to the MiniMax API and set voiceCompatible: true accordingly.
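A minimal sketch of how Option B might derive the result metadata from configuration. The key path messages.tts.providers.minimax.responseFormat comes from the issue; resolveOutput, the object layout, and the rule that only "opus" counts as voice-compatible are assumptions for illustration.

```javascript
// Hypothetical mapping from a configured responseFormat to the provider's result metadata.
// Treating only "opus" as voice-compatible is an assumption based on the Feishu routing
// described in the issue.
function resolveOutput(responseFormat = "mp3") {
  return {
    outputFormat: responseFormat,
    fileExtension: `.${responseFormat}`,
    voiceCompatible: responseFormat === "opus",
  };
}

// Example config shape; the nesting mirrors the key path named in the issue.
const config = { messages: { tts: { providers: { minimax: { responseFormat: "opus" } } } } };
const result = resolveOutput(config.messages.tts.providers.minimax.responseFormat);
console.log(result.fileExtension);   // .opus
console.log(result.voiceCompatible); // true
```

Leaving responseFormat unset keeps today's behavior (mp3, not voice-compatible), so existing setups would not change.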

Workaround

None that is user-friendly: the only options today are hot-patching the dist file or switching TTS providers.

Environment

  • OpenClaw: 2026.4.21
  • Feishu plugin: 2026.4.8
  • MiniMax TTS: speech-2.8-hd
  • Platform: macOS

extent analysis

TL;DR

To fix the issue of Feishu receiving voice messages as .mp3 file attachments instead of native voice bubbles, modify the MiniMax TTS provider to output .opus files or add a configurable responseFormat to allow users to select the output format.

Guidance

  • Modify the synthesize() method in speech-provider-CNa1803L.js to output .opus files when the target channel is Feishu and supports native voice notes.
  • Use ffmpeg to transcode the output from mp3 to .opus if the MiniMax API does not support Opus output directly.
  • Consider adding a configurable responseFormat to the MiniMax TTS provider to allow users to select the output format.
  • Verify that the modified provider correctly sends voice messages as native Feishu voice bubbles by checking the msg_type field in the sent messages.

Example

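// Modified synthesize() method (sketch; miniMaxApi, transcodeToOpus, and req.channel are assumed names)
if (req.target === "voice-note" && supportsNativeVoiceNoteTts(req.channel)) {
  // Ask the MiniMax API for mp3 as normal
  const mp3Output = await miniMaxApi.synthesize(text, "mp3");

  try {
    // Transcode to .opus using ffmpeg
    const opusOutput = await transcodeToOpus(mp3Output);

    // Return .opus output
    return {
      outputFormat: "opus",
      fileExtension: ".opus",
      voiceCompatible: true,
      audio: opusOutput
    };
  } catch (err) {
    // Fall back to mp3 output if the transcode fails
    return {
      outputFormat: "mp3",
      fileExtension: ".mp3",
      voiceCompatible: false,
      audio: mp3Output
    };
  }
}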

Notes

The proposed solution requires modifying the MiniMax TTS provider, which may not be feasible or desirable in all cases. Adding a configurable responseFormat can provide more flexibility for users.

Recommendation

Apply Option A: modify the MiniMax TTS provider to return .opus output for voice-note targets. This is the most direct fix, because the Feishu extension already routes .opus files to native voice bubbles and no other part of the pipeline needs to change.
