openclaw: feat(typing): show typing indicator immediately on voice message receipt, before STT transcription [1 comment, 1 participant]

GitHub stats
openclaw/openclaw#58887 (fetched 2026-04-08 02:31:27)
Comments: 1 · Participants: 1 · Timeline: 0 · Reactions: 0

Root Cause

  • The typing indicator appears to be tied to the start of the agent loop, and for voice messages the agent loop doesn't start until after transcription completes.
  • As a result, typingMode: "instant" is already configured but has no effect on voice messages.

Fix / Workaround

In the inbound voice message handler, call sendChatAction with the typing action as soon as the message is received, before dispatching to the STT pipeline. This is purely a UX change with no functional impact.


Problem

When a user sends a voice message, the typing indicator only appears after STT transcription completes — not when the message is received. This creates a silent wait of 3–6 seconds before any feedback is shown:

| Step | Time |
| --- | --- |
| Telegram polling + message received | ~1s |
| Audio file download from Telegram CDN | ~2–3s |
| STT transcription (e.g. OpenAI gpt-4o-mini-transcribe) | ~1–2s |
| Typing indicator finally appears | ~4–6s after send |

For text messages, typing shows in ~1s. The voice UX feels broken by comparison.
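The step timings listed above can be added up to see where the delay comes from. The snippet below is only a worked sum of the reported estimates, not a measurement; `totalDelay` and `formatRange` are illustrative helper names.

```javascript
// Step estimates from the issue, in seconds.
const steps = [
  { name: 'Telegram polling + message received', min: 1, max: 1 },
  { name: 'Audio file download from Telegram CDN', min: 2, max: 3 },
  { name: 'STT transcription', min: 1, max: 2 },
];

// Sum the min and max estimates of every step that runs before typing is sent.
function totalDelay(stepsBeforeTyping) {
  return stepsBeforeTyping.reduce(
    (acc, s) => ({ min: acc.min + s.min, max: acc.max + s.max }),
    { min: 0, max: 0 },
  );
}

function formatRange({ min, max }) {
  return min === max ? `~${min}s` : `~${min}-${max}s`;
}

const current = totalDelay(steps);              // typing waits for all three steps
const proposed = totalDelay(steps.slice(0, 1)); // typing waits only for receipt

console.log(`current:  ${formatRange(current)}`);  // current:  ~4-6s
console.log(`proposed: ${formatRange(proposed)}`); // proposed: ~1s
```

With the indicator sent on receipt, voice messages would get feedback in roughly the same ~1s as text messages.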

Expected Behavior

Typing indicator should fire as soon as the voice message is received, before the transcription pipeline starts — similar to how typingMode: "instant" works for text messages.

Suggested Implementation

In the inbound voice message handler, send sendChatAction(typing) immediately upon receiving the message, before dispatching to the STT pipeline. This is purely a UX change with no functional impact.
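For the Telegram channel, the call itself could look roughly like the sketch below. It uses the Telegram Bot API `sendChatAction` method over plain HTTP. The `sendTyping` wrapper and its injectable `fetchImpl` parameter are assumptions made for illustration; the project may already wrap this call differently.

```javascript
// Minimal sketch: call the Telegram Bot API sendChatAction method directly.
// `fetchImpl` is injectable so the function can be exercised without network access.
async function sendTyping(botToken, chatId, fetchImpl = globalThis.fetch) {
  const url = `https://api.telegram.org/bot${botToken}/sendChatAction`;
  const res = await fetchImpl(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ chat_id: chatId, action: 'typing' }),
  });
  // Resolves to true when Telegram accepted the request.
  return res.ok;
}

// Hypothetical call site inside the voice handler (fire and forget, so a
// failed indicator never blocks transcription):
//   sendTyping(botToken, message.chat.id).catch(() => {});
```

Not awaiting the call keeps the indicator off the critical path: the audio download and STT steps start right away even if the chat-action request is slow.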

Context

  • Channel: Telegram (likely affects WhatsApp, WeChat, and other channels with voice support)
  • typingMode: "instant" is already configured but has no effect on voice messages because the agent loop doesn't start until after transcription
  • Related: #39052 (parallelize audio preflight), #39075 (optimize Telegram pipeline)

extent analysis

TL;DR

Send the typing indicator immediately upon receiving the voice message, before starting the STT transcription pipeline.

Guidance

  • Modify the inbound voice message handler to send sendChatAction(typing) as soon as the message is received, without waiting for transcription to complete.
  • Verify that the typing indicator appears promptly after sending a voice message, ideally within 1-2 seconds.
  • Consider reviewing related issues #39052 and #39075 for potential optimizations to the audio preflight and Telegram pipeline.
  • Test the change on different channels, such as WhatsApp and WeChat, to ensure the fix applies broadly.

Example

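// Inbound voice message handler (sketch; sendChatAction and
// transcribeVoiceMessage stand in for the project's own helpers and are
// assumed to return promises)
async function handleVoiceMessage(message) {
  // Send the typing indicator immediately. It is not awaited, so a slow or
  // failed chat-action call never delays or blocks transcription.
  sendChatAction(message.chat.id, 'typing').catch(() => {});

  // Dispatch to the STT pipeline
  return transcribeVoiceMessage(message);
}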
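
To check the ordering that the Guidance asks for, the handler can be run against recording fakes. This is a sketch under assumptions: the handler here takes its collaborators as an injected `deps` object, which is a hypothetical shape chosen to make the check self-contained, and `recordCallOrder` is an illustrative name.

```javascript
// Hypothetical handler under test: collaborators are injected.
async function handleVoiceMessage(message, { sendChatAction, transcribeVoiceMessage }) {
  sendChatAction(message.chat.id, 'typing').catch(() => {});
  return transcribeVoiceMessage(message);
}

// Runs a handler with recording fakes and returns the observed call order.
async function recordCallOrder(handler) {
  const calls = [];
  const deps = {
    sendChatAction: async (chatId, action) => {
      calls.push(`chatAction:${action}`);
    },
    transcribeVoiceMessage: async () => {
      calls.push('transcribe');
      return 'hello';
    },
  };
  const message = { chat: { id: 1 }, voice: { file_id: 'abc' } };
  const text = await handler(message, deps);
  return { calls, text };
}

recordCallOrder(handleVoiceMessage).then(({ calls }) => {
  console.log(calls.join(' -> ')); // chatAction:typing -> transcribe
});
```

A handler that only sends the chat action after transcription would produce the reverse order, which is the behavior the issue describes.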

Notes

This fix assumes that the sendChatAction function is available and correctly implemented for the Telegram channel. Additionally, the effectiveness of this change may depend on the specific STT transcription pipeline and its performance characteristics.
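
One performance-related detail: the Telegram Bot API documents that a chat action status is set for 5 seconds or less, while the download and STT steps reported in the issue can take 3–5 seconds together. A single call may therefore expire before the reply is ready. The sketch below re-sends the indicator on a timer while a task runs; `withTyping`, its parameters, and the 4-second default interval are assumptions for illustration, not part of the proposed fix.

```javascript
// Keep a typing indicator alive while a slow task (download + STT) runs.
// `sendTyping` is any function that sends the chat action; `task` returns a promise.
async function withTyping(sendTyping, task, intervalMs = 4000) {
  const tick = () => {
    try {
      // Swallow async failures so the indicator never breaks the task.
      Promise.resolve(sendTyping()).catch(() => {});
    } catch {
      // Ignore synchronous failures as well.
    }
  };

  tick(); // first indicator goes out before the task starts
  const timer = setInterval(tick, intervalMs);
  try {
    return await task();
  } finally {
    clearInterval(timer); // stop refreshing once the task settles
  }
}

// Hypothetical usage:
//   const text = await withTyping(
//     () => sendChatAction(message.chat.id, 'typing'),
//     () => transcribeVoiceMessage(message),
//   );
```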

Recommendation

Apply the suggested implementation: it addresses the UX issue without requiring functional changes to the STT pipeline or other components, and it should make the typing indicator for voice messages as responsive as it already is for text messages.
