openclaw - 💡(How to fix) Fix [Bug] media-understanding CLI audio transcription fails: {input} placeholder not replaced [2 comments, 2 participants]

GitHub stats

  • Issue: openclaw/openclaw#72760 (fetched 2026-04-28 06:32:26)
  • Comments: 2
  • Participants: 2
  • Timeline: 5 events (top: commented ×2, closed ×1, mentioned ×1, subscribed ×1)
  • Reactions: 0


Bug Report: media-understanding audio transcription fails with "Command failed"

Summary

When receiving voice messages (audio/ogg) via Telegram, OpenClaw's media-understanding module attempts automatic transcription but consistently fails with reason=Command failed. The CLI command itself works fine when executed manually, suggesting the issue is in how OpenClaw invokes or passes arguments to the configured transcription command.

Environment

  • OpenClaw Version: 2026.4.24 (cbcfdf6)
  • OS: macOS (Darwin 25.3.0, arm64)
  • Channel: Telegram
  • Python: /opt/homebrew/bin/python3 (Homebrew)
  • Transcription Tool: mlx_whisper via custom CLI script

Configuration

~/.openclaw/openclaw.json:

{
  "audio": {
    "enabled": true,
    "timeoutSeconds": 60,
    "language": "zh",
    "models": [
      {
        "type": "cli",
        "command": "/Users/kk/.local/bin/transcribe",
        "args": ["{input}"]
      }
    ],
    "echoTranscript": true,
    "echoFormat": "🎙️ \"{transcript}\""
  }
}

Custom transcribe script (/Users/kk/.local/bin/transcribe):

#!/bin/bash
/opt/homebrew/bin/python3 -c "
import sys, mlx_whisper
result = mlx_whisper.transcribe(sys.argv[1], path_or_hf_repo='mlx-community/whisper-tiny')
print(result['text'].strip())
" "$1"

Steps to Reproduce

  1. Configure a CLI-based audio transcription model in OpenClaw
  2. Send a voice message via Telegram
  3. Observe that transcription fails

Expected Behavior

Voice message should be automatically transcribed and the text appended to the message context.

Actual Behavior

Gateway log shows:

[media-understanding] audio: failed (0/1) reason=Command failed

The transcription script works perfectly when executed manually with the exact same OGG file:

/Users/kk/.local/bin/transcribe /path/to/file.ogg
# => outputs transcribed text successfully

Additional Context

  • This failure has been occurring consistently since at least 2026-04-19
  • Manual execution of the same CLI command with the same input file works
  • No additional error details (stderr/exit code) are logged by OpenClaw
  • The {input} placeholder should resolve to the full path of the downloaded audio file

Root Cause Identified

The {input} placeholder is NOT being substituted with the actual file path.

When OpenClaw invokes the configured CLI command, it passes the literal string {input} as $1 instead of the downloaded audio file path.

Debug log from the transcription script:

Error opening input file {input}.
Error opening input files: No such file or directory

This confirms that sys.argv[1] received the string "{input}" rather than the actual file path.

Suspected Cause

OpenClaw's CLI invocation logic for audio transcription is not performing placeholder substitution for {input} before spawning the subprocess.
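
The substitution the report expects can be sketched as follows. This is a minimal illustration in Python, not OpenClaw's actual implementation: the helper name `substitute_placeholders` and the example path `/tmp/voice-message.ogg` are hypothetical.

```python
import re

# Matches tokens such as {input} or {language}.
_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def substitute_placeholders(args, values):
    """Replace each {name} token in args with values[name].

    Hypothetical helper for illustration only; not OpenClaw's API.
    Unknown placeholders are left untouched so they stay visible in logs.
    """
    def resolve(match):
        return values.get(match.group(1), match.group(0))

    return [_PLACEHOLDER.sub(resolve, arg) for arg in args]


# "args" as configured in openclaw.json, plus an assumed download location.
configured_args = ["{input}"]
argv = substitute_placeholders(configured_args, {"input": "/tmp/voice-message.ogg"})
print(argv)  # ['/tmp/voice-message.ogg']
```

If this step is skipped, the subprocess receives the literal `{input}` as its first argument, which matches the error shown in the debug log.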

Feature Request: Better Error Reporting

It would be helpful if OpenClaw logged:

  • The exact command being executed (after placeholder substitution)
  • The exit code of the failed command
  • stderr output from the failed command
  • The working directory when the command was executed
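
The reporting requested above could look roughly like this. It is a sketch under stated assumptions: `run_and_report` and its return shape are invented for illustration, and the child process is a stand-in for the transcribe script rather than the real one.

```python
import os
import subprocess
import sys


def run_and_report(argv, cwd=None, timeout=60):
    """Run argv and collect the details the feature request lists.

    Illustrative helper only; the name and return shape are assumptions.
    """
    completed = subprocess.run(
        argv, cwd=cwd, capture_output=True, text=True, timeout=timeout
    )
    return {
        "command": list(argv),
        "cwd": os.path.abspath(cwd) if cwd else os.getcwd(),
        "exit_code": completed.returncode,
        "stderr": completed.stderr.strip(),
    }


# Stand-in for the transcribe script: fails when handed the literal "{input}".
child = "import sys; sys.exit('Error opening input file ' + sys.argv[1])"
report = run_and_report([sys.executable, "-c", child, "{input}"])

if report["exit_code"] != 0:
    for key, value in report.items():
        print(f"{key}: {value}")
```

With output like this in the gateway log, the unsubstituted `{input}` would be visible in the command line itself, without needing a debug log inside the script.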

Reported by: @kris-fan (via OpenClaw agent)

extent analysis

TL;DR

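OpenClaw passes the literal string {input} to the transcription command instead of the path of the downloaded audio file. The fix most likely belongs in OpenClaw's CLI invocation code, which should substitute {input} before spawning the subprocess; the issue does not show that any configuration change resolves it.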

Guidance

  • Verify that OpenClaw has downloaded the audio file and stored it at a known path before transcription starts.
  • Check the OpenClaw documentation or source code for any configuration option or environment variable that controls substitution of the {input} placeholder.
  • Log the value of sys.argv[1] in the custom transcription script to confirm which argument OpenClaw actually passes; the report's debug log already shows the literal {input}.
  • If OpenClaw offers more verbose logging, enable it to capture the exact command executed and its exit code; the report notes that these details are not currently logged.

Example

No code snippet is provided as the issue seems to be related to the configuration or invocation of the transcription command rather than the script itself.

Notes

The root cause appears to be that the {input} placeholder is never substituted. Without access to the OpenClaw source code or documentation, a more specific fix cannot be given here.

Recommendation

As a temporary workaround, modify the custom transcription script to receive the file path some other way, such as an environment variable or a differently formatted command-line argument. This only helps if OpenClaw actually exposes the path through that channel, which the issue does not confirm. The placeholder substitution itself still needs to be fixed in OpenClaw.
