openclaw - 💡(How to fix) Fix [Feature]: Webchat: TTS MEDIA: audio attachments render as download cards — no inline audio player [1 comments, 2 participants]

GitHub stats
openclaw/openclaw#73826 · Fetched 2026-04-29 06:14:36
Comments: 1 · Participants: 2 · Timeline events: 9 · Reactions: 0
Timeline (top): mentioned ×3, subscribed ×3, closed ×1, commented ×1

TTS is configured and working (messages.tts.auto: "always"). Audio files are generated correctly by any TTS provider (ElevenLabs, Chatterbox, OpenAI-compatible, etc.) and delivered as MEDIA: attachments. In webchat, they arrive as generic download cards — no inline audio player, no auto-play. This breaks auto: "always" for the primary web-based Control UI.

Root Cause

Two gaps in the Control UI bundle combine to produce this behavior. First, MC has no audio/* mapping, so Vw falls back to kind "document" for audio attachments even though Bw resolves the MIME type correctly (for example .mp3 → "audio/mpeg"). Second, the attachment card component has no audio player: even when kind is "audio", it renders a download link, only with a mic icon instead of a paperclip.

Fix / Workaround

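Fix the kind mapping in Vw with MC(t) ?? (t?.startsWith("audio/") ? "audio" : "document"). That alone is not enough: with the mapping fixed (verified via local bundle patch), the attachment card React component still renders all non-image attachments as identical download cards. The icon dispatch correctly shows a mic for kind "audio", but the card content is still a bare download link. There is no HTMLAudioElement, no <audio controls>, and no audio.play() anywhere in the attachment card render path, so the card also needs an audio player for kind "audio". Until both changes ship, the only way to hear TTS output in webchat is to download the file and open it manually.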




Problem to solve

Gap 1 — MIME classification falls back to "document" for audio.

In the bundle (dist/control-ui/assets/index-*.js), function Vw resolves attachment kind from MIME type:

function Vw(e) {
  let t = Bw(e); // correctly resolves .mp3 → "audio/mpeg" via Rw map
  return {
    kind: MC(t) ?? "document", // MC has no audio/* entry → always "document"
    mimeType: t,
    label: filename
  }
}

The Rw map correctly includes audio types (mp3: "audio/mpeg", wav: "audio/wav", opus: "audio/opus", etc.), and Bw resolves them. But MC has no audio/* case. Audio falls through to the "document" default, rendering as a generic download card with a paperclip icon.

Fix: MC(t) ?? (t?.startsWith("audio/") ? "audio" : "document") — one line.
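The one-line fix can be tried outside the minified bundle with a small stand-alone sketch. Everything named here is hypothetical: classifyKnownMime stands in for MC (assumed, for illustration only, to recognize images and nothing else), EXTENSION_TO_MIME stands in for Rw (the audio entries come from the issue text, the png entry is an assumption), and resolveAttachment stands in for Vw.

```javascript
// Stand-in for the bundle's Rw map. The audio entries are quoted in the issue;
// the png entry is an assumption added so a non-audio case can be shown.
const EXTENSION_TO_MIME = new Map([
  ["mp3", "audio/mpeg"],
  ["wav", "audio/wav"],
  ["opus", "audio/opus"],
  ["png", "image/png"],
]);

// Stand-in for Bw: derive the MIME type from the file extension.
function resolveMime(filename) {
  const ext = filename.split(".").pop().toLowerCase();
  return EXTENSION_TO_MIME.get(ext);
}

// Stand-in for MC. The real mapping is not shown in the issue; this version
// only knows images and returns undefined for everything else.
function classifyKnownMime(mimeType) {
  if (mimeType?.startsWith("image/")) return "image";
  return undefined;
}

// Stand-in for Vw with the proposed fallback: audio/* becomes "audio"
// instead of falling through to "document".
function resolveAttachment(filename) {
  const mimeType = resolveMime(filename);
  const kind =
    classifyKnownMime(mimeType) ??
    (mimeType?.startsWith("audio/") ? "audio" : "document");
  return { kind, mimeType, label: filename };
}
```

Because the audio check sits behind the ?? operator, any kind that MC already returns is left alone; only values that previously fell through to "document" are affected.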

Gap 2 — No audio player component exists for kind "audio".

Even after fixing the kind mapping (verified via local bundle patch), the attachment card React component renders all non-image attachments as identical download cards. The icon dispatch correctly shows a mic for kind "audio", but the card content is still a bare download link. There is no HTMLAudioElement, no <audio controls>, and no audio.play() anywhere in the attachment card render path.

The renderer has audio awareness elsewhere (Talk mode WebRTC, audioAsVoice flags for channels) — it just needs an audio player component wired into the chat bubble attachment card.

Proposed solution

  • In Vw: before the MC(t) ?? "document" fallback, return "audio" when t?.startsWith("audio/")
  • In the attachment card component: when kind === "audio", render <audio controls autoplay> with fallback to controls-only if browser autoplay policy blocks
  • Gateway may need to expose a media serving route for local TTS audio paths
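The "autoplay with fallback to controls-only" behavior could look roughly like the sketch below. It relies on the standard behavior that HTMLMediaElement.play() returns a Promise which rejects with a NotAllowedError when the browser's autoplay policy blocks playback. The function name playWithFallback and its return values are hypothetical, not part of the OpenClaw code base.

```javascript
// Try to start playback on an <audio>-like element. Controls are enabled
// first, so a manual play button remains visible if autoplay is blocked.
// Resolves to "playing" or "blocked"; other errors (e.g. a broken source)
// are rethrown so they are not silently swallowed.
async function playWithFallback(audio) {
  audio.controls = true;
  try {
    await audio.play();
    return "playing";
  } catch (err) {
    if (err?.name === "NotAllowedError") return "blocked";
    throw err;
  }
}
```

In a browser this would be called on the element the attachment card creates for kind "audio", for example after setting its src to the attachment URL. In the "blocked" case nothing more is required: the user sees a normal player and can press play.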

Alternatives considered

No response

Impact

Affected users/systems/channels: All webchat Control UI users with TTS enabled. Affects every TTS provider — ElevenLabs, native OpenAI, Chatterbox, OpenRouter, Microsoft Edge, self-hosted OpenAI-compatible endpoints, and any locally configured speech provider. Does NOT affect channel-based TTS delivery (Telegram, Discord, WhatsApp) where native voice notes work correctly.

Severity: Blocks a documented workflow. messages.tts.auto: "always" is an advertised config option, and the docs state TTS "works anywhere OpenClaw can send audio." In practice, it fails silently on the most common local surface (webchat at localhost). Users who primarily interact through the Control UI have no way to hear TTS output without manually opening downloaded files. For accessibility use cases (assistive audio output), this is a hard blocker.

Frequency: Always. 100% reproducible across all TTS providers on webchat. Not intermittent, not provider-specific, not config-dependent. Every auto-TTS reply on webchat produces a download card instead of an audio player.

Consequence:

  • Users must manually locate, download, and open audio files to hear TTS, defeating the purpose of auto: "always"
  • Self-hosted TTS setups (Chatterbox, Piper, Coqui, XTTS) have no viable audio surface on the default Control UI, forcing users to add external channels solely for voice output
  • Four duplicate issues have been filed across multiple OpenClaw releases with no resolution, indicating this is a recurring pain point that new users consistently hit
  • The auto: "always" setting appears broken to end users, creating the perception that TTS is non-functional when in reality the audio pipeline is healthy and the gap is purely frontend
  • Extra manual work: every TTS interaction requires file-system navigation and manual playback instead of a single click/press

Evidence/examples

Steps to Reproduce

  1. Configure any TTS provider with messages.tts.auto: "always"
  2. Send a message in webchat
  3. Reply comes: audio file is generated correctly (verified on disk, valid WAV/MP3)
  4. In webchat: no audio player, no auto-play, no play button, just a download card

Evidence

  • TTS endpoint returns valid audio: curl POST /v1/audio/speech → 200, RIFF WAV 16-bit mono 24000Hz
  • Rw map includes audio MIME types (verified in source)
  • Vw function returns kind: "document" for .mp3 attachments (verified in source)
  • MC function has no audio/* mapping (verified in source)
  • Attachment card icon dispatch: kind === "audio" ? micIcon : paperclip (proves audio kind is recognized when set)
  • Local patch changing kind to "audio" confirmed: icon changes to mic, but no player appears, proving Gap 2
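The "RIFF WAV 16-bit mono 24000Hz" check above can be repeated on a generated file with a short Node.js sketch. It is a minimal header reader, not a full WAV parser: it assumes the "fmt " chunk immediately follows the RIFF header, which holds for simple PCM files but not for every WAV. The function name parseWavHeader is hypothetical.

```javascript
// Read the fixed-position fields of a canonical RIFF/WAVE header from a Buffer.
// Returns null when the buffer does not look like a simple WAV file.
function parseWavHeader(buf) {
  if (buf.length < 36) return null;
  if (buf.toString("ascii", 0, 4) !== "RIFF") return null;
  if (buf.toString("ascii", 8, 12) !== "WAVE") return null;
  if (buf.toString("ascii", 12, 16) !== "fmt ") return null; // assumption: fmt chunk comes first
  return {
    audioFormat: buf.readUInt16LE(20), // 1 means uncompressed PCM
    channels: buf.readUInt16LE(22),
    sampleRate: buf.readUInt32LE(24),
    bitsPerSample: buf.readUInt16LE(34),
  };
}
```

Passing it the contents of a TTS output file (read with fs.readFileSync, for instance) and getting back one channel, a 24000 sample rate, and 16 bits per sample would match the evidence line, confirming that the audio pipeline is healthy and the problem is on the rendering side.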

Additional information

  • #65859: "Add audio upload and playback support in webchat" (open)
  • #57296: "WebChat TTS reports success but no audible playback (Telegram works)" (open)
  • #57297: "WebChat TTS: add explicit routing/playback logs" (open)
  • #45508: "Self-hosted STT/TTS provider support in webchat" (open)
  • #10325: Same issue, auto-closed as stale

extent analysis

TL;DR

The most likely fix is to update the Vw function to correctly map audio MIME types to the "audio" kind and add an audio player component to the attachment card.

Guidance

  • Update the Vw function to return "audio" when the MIME type starts with "audio/" to fix the kind mapping issue.
  • Add an audio player component to the attachment card that renders an <audio> element with controls and autoplay when the kind is "audio".
  • Verify that the audio player component is correctly rendered and functional by checking the attachment card in the webchat UI.
  • Consider exposing a media serving route for local TTS audio paths to ensure seamless playback.
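If a media serving route turns out to be necessary, its core is mapping a requested file name to a path under the TTS output directory and a Content-Type, while refusing anything outside that directory. The sketch below shows only that resolution step for Node.js. The function name, the media root, and the allow-list are assumptions for illustration; the issue does not describe how the gateway actually stores or serves TTS files.

```javascript
const path = require("node:path");

// Hypothetical allow-list, using the MIME types quoted in the issue.
const AUDIO_CONTENT_TYPES = new Map([
  [".mp3", "audio/mpeg"],
  [".wav", "audio/wav"],
  [".opus", "audio/opus"],
]);

// Resolve a requested name against the media root. Returns null when the
// request escapes the root or is not an allowed audio type.
function resolveMediaRequest(mediaRoot, requested) {
  const root = path.resolve(mediaRoot);
  const filePath = path.resolve(root, requested);
  const rel = path.relative(root, filePath);
  const escapesRoot =
    rel === "" || rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapesRoot) return null;
  const contentType = AUDIO_CONTENT_TYPES.get(path.extname(filePath).toLowerCase());
  if (!contentType) return null;
  return { filePath, contentType };
}
```

An HTTP handler built on this would stream filePath with the returned Content-Type so that the <audio> element's src can point at the route, and would answer 404 whenever the function returns null.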

Example

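function Vw(e) {
  let t = Bw(e); // resolves the MIME type, e.g. .mp3 → "audio/mpeg"
  return {
    // Keep MC's existing mappings; only the fallback changes for audio/*
    kind: MC(t) ?? (t?.startsWith("audio/") ? "audio" : "document"),
    mimeType: t,
    label: filename
  }
}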

Notes

The provided solution focuses on fixing the kind mapping issue and adding an audio player component. However, additional work may be required to ensure seamless playback, such as exposing a media serving route for local TTS audio paths.

Recommendation

Apply the proposed solution to update the Vw function and add an audio player component to the attachment card, as it addresses the identified gaps and provides a clear path to resolving the issue.
