openclaw - 💡(How to fix) Fix [Feature]: Signal channel — pluggable audio preflight (whisper auto-transcribe for voice notes) [1 participants]

openclaw/openclaw#70439 · Fetched 2026-04-23 07:24:48


Ask

Expose a pluggable audio-preflight hook for the Signal channel (and ideally a channel-agnostic base) so voice notes can be auto-transcribed before they reach the agent — parity with Telegram and Discord.

Two flavors would work:

  1. Built-in whisper integration, mirroring what Telegram/Discord ship — a config key like channels.signal.audio.transcribe: whisper with provider auth pulled from the same place other channels use.
  2. Config hook for an external converter, e.g.
    channels:
      signal:
        audio:
          preflight_cmd: "/usr/local/bin/whisper-any.sh {{ MediaPath }}"
    The gateway invokes the command with the media path, reads stdout as transcript, and substitutes it for the <media:audio> placeholder before the agent sees the message.
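The flow described for option 2 can be sketched roughly as follows. This is a minimal illustration in Python, not the gateway's actual code: the function names (render_command, run_preflight, apply_transcript), the timeout value, and the choice to shell-quote the media path are all assumptions made for the example. Only the {{ MediaPath }} variable, the stdout-as-transcript convention, and the <media:audio> placeholder come from the request above.

```python
import re
import shlex
import subprocess

# Placeholder text the agent currently sees for an untranscribed voice note.
PLACEHOLDER = "<media:audio>"
# Matches the Handlebars-style variable, with or without inner spaces.
MEDIA_PATH_VAR = re.compile(r"\{\{\s*MediaPath\s*\}\}")


def render_command(template: str, media_path: str) -> list[str]:
    """Substitute the media path into the template and split it into argv."""
    quoted = shlex.quote(media_path)  # keeps paths with spaces as one argument
    rendered = MEDIA_PATH_VAR.sub(lambda _match: quoted, template)
    return shlex.split(rendered)


def run_preflight(template: str, media_path: str, timeout_s: float = 120.0) -> str:
    """Run the configured command and return its stdout as the transcript."""
    result = subprocess.run(
        render_command(template, media_path),
        capture_output=True,
        text=True,
        timeout=timeout_s,  # hypothetical limit so a hung converter cannot stall delivery
        check=True,         # a non-zero exit raises CalledProcessError
    )
    return result.stdout.strip()


def apply_transcript(body: str, transcript: str) -> str:
    """Replace the audio placeholder; leave the message alone if nothing came back."""
    if not transcript:
        return body
    return body.replace(PLACEHOLDER, transcript)
```

Running the command as an argv list rather than through a shell is a deliberate choice in this sketch: the media path comes from inbound messages, so it should never be interpreted by a shell. A caller would also need to decide what to do when the command fails or times out; falling back to the unmodified placeholder is one option.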

Why

Today Signal inbound voice notes arrive at the agent as <media:audio> placeholder text with no transcript. The agent has no transcribe-tool to call, so it responds blindly. Telegram and Discord auto-transcribe via built-in whisper; Signal doesn't.

Gotchas to document

If upstream ships option (2), two issues we hit with our local workaround:

  • The Signal monitor exposes {{ MediaPath }} (Handlebars-style), not {input} or {path}. Document which template variables are available to the preflight hook.
  • whisper-cli only reads WAV. Inbound Signal audio is typically OGG/Opus or MP4, so the hook needs access to ffmpeg or a similar tool to convert first. A small bundled wrapper (or docs pointing at one) would save users from reinventing it.
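A wrapper along the lines of the second bullet might look like the sketch below. It is written in Python as a stand-in for the whisper-any.sh script named in the config, whose contents are not shown in this issue. The model path is hypothetical, and the whisper-cli flags (-m, -f, -nt, -np) are assumed from whisper.cpp's command-line tool, so check them against your build. The ffmpeg options convert any input to 16 kHz mono 16-bit PCM WAV.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical model location; point this at your own whisper.cpp model file.
MODEL_PATH = "/usr/local/share/whisper/ggml-base.en.bin"


def ffmpeg_argv(src: str, wav: str) -> list[str]:
    """Build an ffmpeg command that converts src to 16 kHz mono 16-bit PCM WAV."""
    return [
        "ffmpeg", "-nostdin", "-loglevel", "error", "-y",
        "-i", src,
        "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le",
        wav,
    ]


def whisper_argv(wav: str, model: str = MODEL_PATH) -> list[str]:
    """Build a whisper-cli command (flags assumed: no timestamps, no extra prints)."""
    return ["whisper-cli", "-m", model, "-f", wav, "-nt", "-np"]


def transcribe(src: str) -> str:
    """Convert the input to WAV in a temp directory, then return the transcript text."""
    with tempfile.TemporaryDirectory() as tmp:
        wav = str(Path(tmp) / "input.wav")
        subprocess.run(ffmpeg_argv(src, wav), check=True)
        result = subprocess.run(
            whisper_argv(wav), check=True, capture_output=True, text=True
        )
    # Collapse whitespace so the transcript comes out as a single line on stdout.
    return " ".join(result.stdout.split())


# Only run when invoked as a script with exactly one argument (the media path).
if __name__ == "__main__" and len(sys.argv) == 2:
    print(transcribe(sys.argv[1]))
```

Because the wrapper writes only the transcript to stdout and sends ffmpeg errors to stderr, it fits the stdout-as-transcript convention proposed for the hook.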

Related

  • #48614 [Bug]: Signal voice notes — MediaPath not populated, audio transcription pipeline never triggers
  • #46326 Audio transcription stopped working after update to 2026.3.12/3.13
  • #54695 [Bug] STT transcription not working on Telegram (Groq)
  • #51645 Voice transcription not working in group/topic chats

Context

We were carrying a local patch against dist/monitor-*.js that called transcribeFirstAudio() from the plugin-sdk on inbound Signal audio. Retiring it because minified targets drift across releases (2026.4.21 changed the monitor bundle structure). Prefer an officially supported hook.

Env

extent analysis

TL;DR

Implement a pluggable audio-preflight hook for the Signal channel to enable auto-transcription of voice notes using either built-in whisper integration or an external converter.

Guidance

  • Consider implementing a config hook for an external converter, such as whisper-cli, to transcribe audio files before they reach the agent.
  • Ensure the external converter can handle audio file formats used by Signal, such as OGG/Opus or MP4, which may require additional conversion steps using tools like ffmpeg.
  • Document the available template variables for the preflight hook, such as {{ MediaPath }}, to facilitate customization.
  • Investigate using a bundled wrapper or providing documentation for a wrapper to simplify the conversion process for users.

Example

channels:
  signal:
    audio:
      preflight_cmd: "/usr/local/bin/whisper-any.sh {{ MediaPath }}"

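This example shows the preflight_cmd key proposed in option 2 of the request. It is the requested config shape for an external converter, not a setting the gateway is confirmed to support.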

Notes

The implementation may need to account for differences in audio file formats and the requirements of the external converter. Additionally, the solution should be designed to work with the Signal channel via signal-cli in a multi-tenant self-hosted environment.

Recommendation

Apply a workaround using an external converter, such as whisper-cli, to enable audio transcription for Signal voice notes, as this approach provides a flexible and customizable solution.
