openclaw - WebChat: Restore browser-based voice input (MediaRecorder / getUserMedia) [1 comment, 2 participants]

GitHub stats
openclaw/openclaw#73634 (fetched 2026-04-29 06:17:10)
Comments: 1 · Participants: 2 · Timeline events: 2 · Reactions: 0
Timeline (top): closed ×1, commented ×1

The WebChat UI previously supported browser-based voice input using the MediaRecorder / getUserMedia APIs. This functionality appears to have been removed and is no longer available in the WebChat interface.

Type

Feature Request

Summary

The WebChat UI previously supported browser-based voice input using the MediaRecorder / getUserMedia APIs. This functionality appears to have been removed and is no longer available in the WebChat interface.

Current Behavior

  • The WebChat UI (accessed via browser) has no microphone/voice input button
  • Audio transcription is fully configured on the server side (media.audio.enabled: true with Whisper CLI transcription)
  • However, there is no way for browser-based users to send voice input from the WebChat UI

Expected Behavior

  • A microphone button in the WebChat input area
  • Press-and-hold or click-to-record voice input
  • Audio sent to the Gateway for transcription via the configured media.audio provider
  • Transcribed text inserted into the chat (or sent as a message)
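The two interaction styles listed above (press-and-hold and click-to-record) can be modelled as a small state transition function. This is only a sketch: `nextRecordingState`, the state names, and the mode names are hypothetical and not taken from the WebChat code.

```javascript
// Hypothetical state transition for a microphone button.
// state: 'idle' | 'recording'
// mode:  'hold' (press-and-hold) | 'toggle' (click-to-record)
function nextRecordingState(state, event, mode) {
  if (mode === 'hold') {
    // Recording lasts only while the pointer is held down.
    if (event === 'pointerdown' && state === 'idle') return 'recording';
    if ((event === 'pointerup' || event === 'pointercancel') && state === 'recording') {
      return 'idle';
    }
    return state;
  }
  // Toggle mode: each click flips between idle and recording.
  if (event === 'click') return state === 'idle' ? 'recording' : 'idle';
  return state;
}
```

A UI would start the recorder on a transition into 'recording' and stop it on a transition back to 'idle'.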

Evidence

  • speech-provider modules exist in dist: dist/speech-provider-D4zT9eKR.js, dist/speech-provider-DLDO13bv.js
  • realtime-voice-provider also present: dist/realtime-voice-provider-CHJJ_VX5.js
  • Server-side audio transcription config works (media.audio with Whisper CLI)
  • No MediaRecorder or getUserMedia references found in any current dist files
  • Checked versions: 2026.4.24, 2026.4.25-beta.11, 2026.4.26 — none have browser voice input
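The "no references found" check above can be reproduced with a plain substring search, since bundlers generally do not rename built-in browser API names. The sketch below assumes the caller has already read the dist files into memory; `findCaptureReferences` and the `sources` shape are hypothetical.

```javascript
// Browser capture APIs named in the issue.
const CAPTURE_APIS = ['MediaRecorder', 'getUserMedia'];

// `sources` maps a file path to its text content; reading the files is left to the caller.
// Returns one entry per (file, API name) pair that was found.
function findCaptureReferences(sources, apis = CAPTURE_APIS) {
  const hits = [];
  for (const [path, text] of Object.entries(sources)) {
    for (const api of apis) {
      if (text.includes(api)) hits.push({ path, api });
    }
  }
  return hits;
}
```

An empty result across every file in dist would match the observation reported in the issue.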

Use Case

Users running OpenClaw on a local server (e.g. Raspberry Pi) who access the WebChat via browser would benefit from hands-free voice input. The transcription pipeline already exists — only the browser-side capture is missing.
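One caveat for the local-server case: browsers expose navigator.mediaDevices only in secure contexts (HTTPS or localhost), so a WebChat opened over plain HTTP at a LAN address would likely be unable to capture audio even once the UI exists. A minimal capability check might look like the following; `canCaptureAudio`, its `env` parameter, and the reason strings are hypothetical.

```javascript
// Hypothetical helper: reports whether microphone capture looks possible.
// `env` defaults to the global object so the logic can also be exercised outside a browser.
function canCaptureAudio(env = globalThis) {
  // getUserMedia is only available in secure contexts (HTTPS or localhost).
  if (env.isSecureContext === false) return { ok: false, reason: 'insecure-context' };

  const nav = env.navigator;
  const hasGetUserMedia =
    !!nav && !!nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === 'function';
  if (!hasGetUserMedia) return { ok: false, reason: 'no-getusermedia' };

  if (typeof env.MediaRecorder !== 'function') return { ok: false, reason: 'no-mediarecorder' };
  return { ok: true, reason: null };
}
```

The UI could hide or disable the microphone button, and explain why, whenever `ok` is false.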

Suggested Implementation

  1. Add a microphone button to the WebChat input UI
  2. Use navigator.mediaDevices.getUserMedia({ audio: true }) to capture audio
  3. Use MediaRecorder API to record audio chunks
  4. Send recorded audio via the existing chat.send WebSocket method as an audio attachment
  5. The Gateway's media.audio transcription pipeline handles the rest
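Step 4 could be sketched as follows. The issue does not show the chat.send schema, so the message envelope here (`method`, `params`, `attachments`, and the attachment field names) is purely illustrative, and `bytesToBase64` and `buildAudioMessage` are hypothetical helpers.

```javascript
// Encode raw bytes as base64 using btoa, which is available in browsers and recent Node versions.
function bytesToBase64(bytes) {
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

// ASSUMPTION: this envelope is illustrative only; the real chat.send
// payload format is not described in the issue.
async function buildAudioMessage(blob, text = '') {
  const bytes = new Uint8Array(await blob.arrayBuffer());
  return {
    method: 'chat.send',
    params: {
      text,
      attachments: [
        { kind: 'audio', mimeType: blob.type, size: blob.size, data: bytesToBase64(bytes) },
      ],
    },
  };
}
```

With an open WebSocket, the result could be sent as `socket.send(JSON.stringify(message))`. Base64 increases the payload by roughly a third, so a binary frame may be preferable if the Gateway accepts one.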

Environment

  • OpenClaw version: 2026.4.26
  • OS: Linux (Ubuntu/Raspberry Pi)
  • Browser: Chromium/Firefox (any modern browser with MediaRecorder support)
  • Server-side transcription: Whisper CLI

Extent Analysis

TL;DR

The most direct fix is likely to add browser-side audio capture to the WebChat UI with getUserMedia and MediaRecorder, as the issue suggests, and route the recording into the existing transcription pipeline.

Guidance

  • Review the speech-provider and realtime-voice-provider modules in the dist directory to understand how they interact with the existing transcription pipeline.
  • Verify that the media.audio.enabled: true configuration is correctly set and functional on the server side with Whisper CLI transcription.
  • Consider adding a feature flag or toggle to enable the microphone button and voice input functionality in the WebChat UI, allowing for easier testing and rollout.
  • Test the implementation across different browsers (e.g., Chromium, Firefox) to ensure compatibility.
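For the cross-browser point above, recording formats differ between browsers, so the container is usually negotiated with MediaRecorder.isTypeSupported rather than hard-coded. The sketch below injects the support check so the selection logic can run outside a browser; `pickRecordingType` and the candidate order are assumptions, not project requirements.

```javascript
// Candidate MIME types in order of preference (an assumption for illustration).
const CANDIDATE_TYPES = ['audio/webm;codecs=opus', 'audio/ogg;codecs=opus', 'audio/mp4'];

// `isSupported` is injected; in a page you would pass
// (type) => MediaRecorder.isTypeSupported(type).
// Returns the first supported type, or null so the caller can omit the
// mimeType option and let the browser choose its default.
function pickRecordingType(isSupported, candidates = CANDIDATE_TYPES) {
  const match = candidates.find((type) => isSupported(type));
  return match === undefined ? null : match;
}
```

Whichever type is chosen, the server-side transcription step needs to be able to decode it, which is worth confirming against the configured Whisper CLI setup.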

Example

// Example: record a short audio clip with the MediaRecorder API
navigator.mediaDevices.getUserMedia({ audio: true })
  .then(stream => {
    const mediaRecorder = new MediaRecorder(stream);
    const audioChunks = [];

    // Collect encoded chunks as the recorder emits them
    mediaRecorder.ondataavailable = event => {
      if (event.data.size > 0) audioChunks.push(event.data);
    };

    mediaRecorder.onstop = () => {
      // Release the microphone so the browser's recording indicator turns off
      stream.getTracks().forEach(track => track.stop());
      // Label the blob with the recorder's actual format (typically WebM or Ogg, not WAV)
      const audioBlob = new Blob(audioChunks, { type: mediaRecorder.mimeType });
      // Send audioBlob via chat.send WebSocket method as an audio attachment
    };

    mediaRecorder.start();
    // Stop recording after a certain duration or on user input
    setTimeout(() => mediaRecorder.stop(), 5000);
  })
  .catch(error => console.error('Error recording audio:', error));

Notes

The provided example is a basic illustration and may require modifications to fit the specific requirements of the WebChat UI and existing transcription pipeline.

Recommendation

Apply the suggested implementation using MediaRecorder and getUserMedia APIs, as it directly addresses the missing voice input functionality in the WebChat UI.
