openclaw - 💡(How to fix) Fix [Bug] media-understanding CLI audio transcription fails: {input} placeholder not replaced [2 comments, 2 participants]

krisfanue3-hash · 2026-04-27T10:50:01Z

# Bug Report: media-understanding audio transcription fails with "Command failed"

## Summary

When receiving voice messages (audio/ogg) via Telegram, OpenClaw's `media-understanding` module attempts automatic transcription but consistently fails with `reason=Command failed`. The CLI command itself works fine when executed manually, suggesting the issue is in how OpenClaw invokes or passes arguments to the configured transcription command.

## Environment

- **OpenClaw Version**: 2026.4.24 (cbcfdf6)
- **OS**: macOS (Darwin 25.3.0, arm64)
- **Channel**: Telegram
- **Python**: /opt/homebrew/bin/python3 (Homebrew)
- **Transcription Tool**: mlx_whisper via custom CLI script

## Configuration

`~/.openclaw/openclaw.json`:

```json
{
  "audio": {
    "enabled": true,
    "timeoutSeconds": 60,
    "language": "zh",
    "models": [
      {
        "type": "cli",
        "command": "/Users/kk/.local/bin/transcribe",
        "args": ["{input}"]
      }
    ],
    "echoTranscript": true,
    "echoFormat": "🎙️ \"{transcript}\""
  }
}
```

Custom transcribe script (`/Users/kk/.local/bin/transcribe`):

```bash
#!/bin/bash
/opt/homebrew/bin/python3 -c "
import sys, mlx_whisper
result = mlx_whisper.transcribe(sys.argv[1], path_or_hf_repo='mlx-community/whisper-tiny')
print(result['text'].strip())
" "$1"
```

## Steps to Reproduce

1. Configure a CLI-based audio transcription model in OpenClaw
2. Send a voice message via Telegram
3. Observe that transcription fails

## Expected Behavior

Voice message should be automatically transcribed and the text appended to the message context.

## Actual Behavior

Gateway log shows:

```
[media-understanding] audio: failed (0/1) reason=Command failed
```

The transcription script works perfectly when executed **manually** with the exact same OGG file:

```bash
/Users/kk/.local/bin/transcribe /path/to/file.ogg
# => outputs transcribed text successfully
```

## Additional Context

- This failure has been occurring consistently since at least 2026-04-19
- Manual execution of the same CLI command with the same input file works
- No additional error details (stderr/exit code) are logged by OpenClaw
- The `{input}` placeholder should resolve to the full path of the downloaded audio file

## Root Cause Identified

**The `{input}` placeholder is NOT being substituted with the actual file path.**

When OpenClaw invokes the configured CLI command, it passes the literal string `{input}` as `$1` instead of the downloaded audio file path.

Debug log from the transcription script:

```
Error opening input file {input}.
Error opening input files: No such file or directory
```

This confirms that `sys.argv[1]` received the string `"{input}"` rather than the actual file path.

## Suspected Cause

OpenClaw's CLI invocation logic for audio transcription is not performing placeholder substitution for `{input}` before spawning the subprocess.

## Feature Request: Better Error Reporting

It would be helpful if OpenClaw logged:

- The exact command being executed (after placeholder substitution)
- The exit code of the failed command
- stderr output from the failed command
- The working directory when the command was executed

---

*Reported by: @kris-fan (via OpenClaw agent)*
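The suspected cause and the error-reporting request can be illustrated together. The following is a minimal Python sketch, not OpenClaw's actual implementation: `render_args` and `run_cli` are hypothetical names, and the only assumption taken from the report is that `{input}` should become the downloaded file's path before the subprocess is spawned.

```python
import os
import subprocess


def render_args(args, values):
    """Replace every {name} placeholder in each argument with its value."""
    rendered = []
    for arg in args:
        for name, value in values.items():
            arg = arg.replace("{" + name + "}", value)
        rendered.append(arg)
    return rendered


def run_cli(command, args, input_path, timeout_seconds=60):
    """Run a transcription command; raise with full details if it fails."""
    # Hypothetical helper: substitute first, then spawn without a shell.
    argv = [command] + render_args(args, {"input": input_path})
    proc = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout_seconds
    )
    if proc.returncode != 0:
        # Report everything the issue lists as missing from the gateway log.
        raise RuntimeError(
            f"Command failed: command={argv!r} exit_code={proc.returncode} "
            f"cwd={os.getcwd()!r} stderr={proc.stderr.strip()!r}"
        )
    return proc.stdout.strip()
```

With a message like this, a literal `{input}` in the logged command would have pointed at the missing substitution immediately, instead of requiring a debug log inside the script.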


GitHub stats

openclaw/openclaw#72760 • Fetched 2026-04-28 06:32:26

Author: krisfanue3-hash

Participants: krisfanue3-hash, steipete

Timeline (top): commented ×2, closed ×1, mentioned ×1, subscribed ×1


extent analysis

TL;DR

The most likely fix is in how OpenClaw invokes the command: the {input} placeholder must be replaced with the actual file path before the transcription command is spawned, whether that requires a configuration change or a code change in OpenClaw.

Guidance

Verify that the input file path is being correctly downloaded and stored by OpenClaw before attempting transcription.
Check the OpenClaw documentation or source code to see if there are any configuration options or environment variables that can be used to enable placeholder substitution for the {input} variable.
Consider modifying the custom transcription script to log the value of sys.argv[1] to confirm that the correct file path is being passed.
If possible, update the OpenClaw configuration to log more detailed error information, such as the exact command being executed and the exit code of the failed command.
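The logging step suggested above could look like the following Python sketch, meant to run at the top of the Python portion of the transcription script. The names `diagnose` and `main` and the exit code 2 are illustrative choices, not part of OpenClaw or mlx_whisper.

```python
import os
import re
import sys

# Matches an argument that is nothing but an unreplaced placeholder, e.g. "{input}".
PLACEHOLDER = re.compile(r"^\{\w+\}$")


def diagnose(argv):
    """Return a description of what is wrong with the input argument, or None."""
    if len(argv) < 2:
        return "no input path argument was passed"
    path = argv[1]
    if PLACEHOLDER.match(path):
        return f"placeholder {path} was passed literally (not substituted)"
    if not os.path.isfile(path):
        return f"input file does not exist: {path}"
    return None


def main(argv):
    """Log how the script was invoked and return a process exit code."""
    print(f"transcribe: argv={argv!r} cwd={os.getcwd()!r}", file=sys.stderr)
    problem = diagnose(argv)
    if problem is not None:
        print(f"transcribe: {problem}", file=sys.stderr)
        return 2
    return 0
```

Calling `main(sys.argv)` before `mlx_whisper.transcribe` and exiting on a non-zero result separates the three cases (missing argument, literal placeholder, missing file) that otherwise all surface as `reason=Command failed`.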

Example

No code snippet is provided as the issue seems to be related to the configuration or invocation of the transcription command rather than the script itself.

Notes

The root cause appears to be that the {input} placeholder is not substituted at all, so the script receives the literal string. Without access to the OpenClaw source code or documentation, it is difficult to provide a more specific solution.

Recommendation

Apply a workaround by modifying the custom transcription script to accept the file path as an environment variable or a command-line argument in a different format, if possible. This would allow for a temporary fix until the underlying issue with OpenClaw's placeholder substitution can be resolved.
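One way to sketch that workaround in Python is shown below. It assumes a hypothetical environment variable, `TRANSCRIBE_INPUT`; nothing in the report indicates that OpenClaw sets such a variable, so something else would have to export it for the fallback to help.

```python
import os


def resolve_input(argv, environ, env_name="TRANSCRIBE_INPUT"):
    """Pick the audio path from argv, falling back to an environment variable."""
    candidate = argv[1] if len(argv) > 1 else ""
    # Treat a bare "{...}" argument as an unreplaced placeholder, not a path.
    is_placeholder = candidate.startswith("{") and candidate.endswith("}")
    if candidate and not is_placeholder:
        return candidate
    fallback = environ.get(env_name, "")  # env_name is a hypothetical variable
    if fallback:
        return fallback
    raise SystemExit(f"no usable input path: argv={argv!r}, {env_name} is unset")
```

In the script, `resolve_input(sys.argv, os.environ)` would replace the direct use of `sys.argv[1]`. When neither source yields a path, the script exits with a message that names both, which is more informative than the original "Error opening input file {input}".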
