openclaw - ✅(Solved) Fix Silent/empty voice notes should inject a clear message instead of appearing as an API error [3 pull requests, 2 comments, 3 participants]

GitHub stats

  • Issue: openclaw/openclaw#48944 (fetched 2026-04-08 00:50:42)
  • Comments: 2
  • Participants: 3
  • Timeline events: 6 (cross-referenced ×3, commented ×2, referenced ×1)
  • Reactions: 0

Error Message

After a silent voice note is skipped, the agent receives unclear context and responds with a misleading API error message, typically claiming an API quota issue, which is inaccurate.

Fix Action

Fixed

PR fix notes

PR #49131: fix: add placeholder transcript for silent voice notes

Description (problem / solution / changelog)

Summary

  • inject a synthetic placeholder transcript when tiny audio is skipped as tooSmall
  • preserve the existing transcript/body formatting flow by emitting a synthetic audio.transcription output
  • add coverage for URL-only tiny audio so the agent gets deterministic context instead of a blank transcript gap

Problem

Tiny or silent voice notes below MIN_AUDIO_FILE_BYTES are intentionally skipped before provider transcription. That leaves the downstream agent context without a transcript signal, which can lead to misleading explanations instead of a simple “that voice note was silent/empty” response.

What changed

When audio is skipped with the tooSmall reason, applyMediaUnderstanding() now backfills a synthetic transcript:

[Voice note was empty or contained only silence — no speech detected]

This keeps the existing transcript formatting and body update behavior intact without calling the provider for tiny files.
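
The backfill described above can be sketched roughly as follows. This is a Python illustration rather than the project's TypeScript code: `AudioDecision`, `backfill_outputs`, the dict shape, and the `"unsupported"` skip reason are hypothetical names invented for the sketch. Only the `tooSmall` reason, the `audio.transcription` output kind, and the placeholder text are taken from the PR notes.

```python
from dataclasses import dataclass
from typing import List, Optional

# Placeholder text quoted verbatim from the PR notes.
PLACEHOLDER = "[Voice note was empty or contained only silence — no speech detected]"


@dataclass
class AudioDecision:
    """Hypothetical record of what the runner decided for one attachment."""
    attachment_index: int
    skip_reason: Optional[str] = None  # e.g. "tooSmall"
    transcript: Optional[str] = None   # set only when a provider actually ran


def backfill_outputs(decisions: List[AudioDecision]) -> List[dict]:
    """Build audio.transcription outputs, adding a synthetic one for tiny audio."""
    outputs = []
    for decision in decisions:
        if decision.transcript is not None:
            text = decision.transcript
        elif decision.skip_reason == "tooSmall":
            # No provider call was made; emit the placeholder instead of nothing.
            text = PLACEHOLDER
        else:
            continue  # other skip reasons are left untouched in this sketch
        outputs.append({
            "kind": "audio.transcription",
            "attachment_index": decision.attachment_index,
            "text": text,
        })
    return outputs


outputs = backfill_outputs([
    AudioDecision(0, skip_reason="tooSmall"),
    AudioDecision(1, transcript="hello there"),
    AudioDecision(2, skip_reason="unsupported"),  # hypothetical reason
])
for out in outputs:
    print(out["attachment_index"], out["text"])
```

Because the synthetic entry has the same shape as a real transcription output, whatever formats transcripts downstream does not need a special case for silent audio.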

Validation

  • pnpm build
  • pnpm check
  • Targeted tests:
    • src/media-understanding/runner.skip-tiny-audio.test.ts
    • src/media-understanding/apply.test.ts
    • src/media-understanding/format.test.ts

Notes

  • I could not run codex review --base origin/main in this environment because the codex CLI is not installed here.
  • pnpm test is not fully green on clean upstream main in this environment. I isolated the existing failures to:
    • src/media-understanding/apply.echo-transcript.test.ts
    • src/memory/embeddings-gemini.test.ts
    These reproduced unchanged against a clean upstream/main worktree before/independent of this patch.

Closes #48944

Changed files

  • src/media-understanding/apply.test.ts (modified, +61/-3)
  • src/media-understanding/apply.ts (modified, +87/-1)
  • src/media-understanding/format.test.ts (modified, +25/-0)

Issue report

Problem

When a user sends a silent or near-empty voice note, the audio file is under 1024 bytes and gets skipped by the transcription pipeline (as documented: "Tiny/empty audio files below 1024 bytes are skipped before provider/CLI transcription").

However, the agent doesn't receive a clear indication that the voice note was empty/silent. Instead, the missing or broken transcript context causes the agent to hallucinate an explanation — typically telling the user there's an API quota issue, which is inaccurate and confusing.

Expected behavior

When a voice note is skipped due to being under the 1024-byte threshold, OpenClaw should inject a clear message to the agent, e.g.:

  • [Voice note was empty or contained only silence]
  • Or set {{Transcript}} to a descriptive placeholder

This way the agent can respond appropriately ("Looks like the voice note was silent — want to try again?") instead of fabricating an error explanation.
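
To illustrate why an unset {{Transcript}} leaves the agent guessing, here is a minimal sketch of token substitution. The `render` helper and the template string are hypothetical and do not reproduce OpenClaw's templating; only the {{Transcript}} token and the placeholder text come from the issue.

```python
import re

PLACEHOLDER = "[Voice note was empty or contained only silence]"


def render(template: str, values: dict) -> str:
    """Replace {{Name}} tokens with values; unknown tokens become empty strings."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: values.get(m.group(1), ""), template)


# Hypothetical prompt template, invented for this example.
template = "User sent a voice note. Transcript: {{Transcript}}"

before = render(template, {})                           # transcript missing
after = render(template, {"Transcript": PLACEHOLDER})   # placeholder injected

print(repr(before))
print(repr(after))
```

In the first rendering the agent sees a label followed by nothing, which it has to explain somehow. In the second it sees an explicit statement that the audio was silent.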

Steps to reproduce

  1. Send a voice note with no speech (just silence or very short accidental tap)
  2. The audio file will be under 1024 bytes
  3. Transcription is skipped
  4. Agent receives unclear context and responds with a misleading API error message
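
For reproducing this without a phone, a file under the threshold can be generated locally. The sketch below writes roughly 50 ms of silence as a WAV file using Python's standard `wave` module. WAV is an assumption made for convenience: a real messaging client may produce a different container, and `write_silent_wav` is a hypothetical helper name.

```python
import os
import tempfile
import wave

MIN_AUDIO_FILE_BYTES = 1024  # threshold quoted in the issue


def write_silent_wav(path: str, frames: int = 400, rate: int = 8000) -> int:
    """Write `frames` samples of 16-bit mono silence and return the file size."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * frames)
    return os.path.getsize(path)


path = os.path.join(tempfile.mkdtemp(), "silent.wav")
size = write_silent_wav(path)
print(size, size < MIN_AUDIO_FILE_BYTES)
```

Raising `frames` until the size crosses 1024 bytes gives a matching control case that should go through transcription as usual.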

Environment

  • OpenClaw on Raspberry Pi 5
  • Local Whisper model for transcription
  • Affects all agent configurations

extent analysis

Fix Plan

To address this issue, we need to modify the transcription pipeline to inject a clear message when a voice note is skipped due to being under the 1024-byte threshold.

Step-by-Step Solution

  1. Check audio file size: Before the audio file reaches the transcription provider, compare its size against the 1024-byte threshold.
  2. Inject clear message: If the file is under 1024 bytes, set the transcript the agent sees to a placeholder, e.g., [Voice note was empty or contained only silence].
  3. Update agent response: Confirm that the agent, on receiving the placeholder, tells the user the voice note was silent rather than reporting an API error.

Example Code

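import os
import tempfile

# Threshold quoted in the issue; the PR notes call it MIN_AUDIO_FILE_BYTES.
MIN_AUDIO_FILE_BYTES = 1024
SILENT_PLACEHOLDER = "[Voice note was empty or contained only silence]"


def check_audio_file_size(audio_file_path):
    """Return the placeholder if the file is below the threshold, else None."""
    if os.path.getsize(audio_file_path) < MIN_AUDIO_FILE_BYTES:
        return SILENT_PLACEHOLDER
    return None


def inject_clear_message(transcript_context, audio_file_path):
    """Set the Transcript field to the placeholder when the audio is too small."""
    clear_message = check_audio_file_size(audio_file_path)
    if clear_message:
        transcript_context["Transcript"] = clear_message
    return transcript_context


# Example usage: write a 100-byte stand-in file so the example runs as-is.
with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
    tmp.write(b"\x00" * 100)
    audio_file_path = tmp.name

try:
    print(inject_clear_message({}, audio_file_path))
finally:
    os.remove(audio_file_path)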

Verification

To verify that the fix worked, send a voice note with no speech and check the agent's response. The agent should respond with a message indicating that the voice note was silent, instead of fabricating an error explanation.

Extra Tips

  • Make sure to update the agent's response logic to handle the new clear message.
  • Consider adding additional logging or monitoring to track instances where the clear message is injected.
