openclaw - ✅(Solved) Fix: Audio auto transcription can prefer local Whisper over API provider and break Groq multipart uploads [1 pull request, 1 participant]

openclaw · 2026-04-18 23:47:21

GitHub stats

openclaw/openclaw#68727 • Fetched 2026-04-19 15:08:15

Author

dyz2102

Participants

dyz2102

Timeline (top)

cross-referenced ×1

Telegram/audio transcription can unexpectedly use a local Whisper CLI even when an API provider is configured/available, and Groq OpenAI-compatible audio transcription can fail when OpenClaw passes a proxy-wrapped fetch into the audio provider.

Error Message

Audio transcription failed (HTTP 400): {"error":{"message":"request Content-Type isn't multipart/form-data","type":"invalid_request_error"}}

Fix / Workaround

I have a small patch ready with focused Vitest coverage.

PR fix notes

PR #68733: Prefer API audio providers before local fallback

Repository: openclaw/openclaw
Author: dyz2102
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/68733

Description (problem / solution / changelog)

Summary

Fixes #68727.

This changes audio media-understanding fallback behavior so API-provider audio transcription wins before local CLI auto-detection, and avoids passing a pre-wrapped proxy fetch into audio providers. Audio providers already route requests through the shared provider HTTP helpers, which handle proxy env vars and NO_PROXY at the request URL boundary.

Why

A gateway with GROQ_API_KEY/Groq audio configured and a local /opt/homebrew/bin/whisper could still take the local audio path in auto mode. When the Groq provider path was selected, passing resolveProxyFetchFromEnv() into OpenAI-compatible audio transcription could break multipart uploads, with Groq returning "request Content-Type isn't multipart/form-data".

Changes

Prefer resolveKeyEntry() before local audio CLI fallback in audio auto mode.
Do not pass a direct proxy-wrapped fetchFn into audio providers; leave audio proxy handling to the shared provider HTTP layer.
Add regression coverage for provider-key priority when a local whisper executable exists.
Update proxy passthrough coverage for the audio path.
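
The priority change in the list above can be sketched as follows. This is a minimal illustration, not the code from the PR: the `AudioEntry` shape, the `AutoResolveDeps` interface, and the function name `resolveAutoAudioEntries` are hypothetical, and the two injected callbacks only stand in for resolveKeyEntry() and the local CLI detection. The real functions are likely asynchronous; the sketch is synchronous to keep it short.

```typescript
// Hypothetical shapes for illustration; the real OpenClaw types are not shown in the source.
type AudioEntry =
  | { kind: "provider"; provider: string; model?: string }
  | { kind: "cli"; command: string };

interface AutoResolveDeps {
  // Stand-in for resolveKeyEntry(): yields a provider entry when an API key is available.
  resolveKeyEntry: () => AudioEntry | null;
  // Stand-in for local CLI auto-detection (for example a whisper binary on PATH).
  resolveLocalCliEntry: () => AudioEntry | null;
}

function resolveAutoAudioEntries(deps: AutoResolveDeps): AudioEntry[] {
  // 1. An API provider that can be resolved from a key wins.
  const keyEntry = deps.resolveKeyEntry();
  if (keyEntry) {
    return [keyEntry];
  }
  // 2. Only when no provider is available is the local CLI probed.
  const localEntry = deps.resolveLocalCliEntry();
  return localEntry ? [localEntry] : [];
}
```

With both a Groq key and a local whisper binary present, this ordering returns the provider entry and never probes the local CLI, which is the behavior the regression test in the PR is described as covering.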

Validation

pnpm exec vitest run src/media-understanding/runner.auto-audio.test.ts src/media-understanding/runner.proxy.test.ts
The pre-commit pnpm check passed locally during commit.

Changed files

src/media-understanding/runner.auto-audio.test.ts (modified, +33/-1)
src/media-understanding/runner.entries.ts (modified, +5/-3)
src/media-understanding/runner.proxy.test.ts (modified, +3/-2)
src/media-understanding/runner.ts (modified, +4/-0)

Summary

Local reproduction

Environment:

macOS gateway
tools.media.audio.models configured with provider: "groq", model: "whisper-large-v3"
GROQ_API_KEY set
HTTP_PROXY / HTTPS_PROXY set
/opt/homebrew/bin/whisper present

Observed behavior:

In the audio auto path, resolveAutoEntries() checks local audio CLIs before resolveKeyEntry(), so an installed local whisper/whisper-cli can win over API-provider transcription.
After forcing the provider path to Groq, the request failed with:

Audio transcription failed (HTTP 400): {"error":{"message":"request Content-Type isn't multipart/form-data","type":"invalid_request_error"}}

Raw curl to https://api.groq.com/openai/v1/audio/transcriptions with the same key/audio file succeeded, and calling the OpenAI-compatible audio provider directly without the proxy-wrapped fetchFn also succeeded.
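
For comparison with the raw curl check, a direct request can be sketched in TypeScript using the runtime's built-in `FormData` and `Request` (available globally in Node 18+). The helper name `buildTranscriptionRequest` is hypothetical, and the `file` and `model` form fields follow the OpenAI-compatible transcription API. The sketch only builds the request and does not send it. The point it illustrates is that Content-Type is never set by hand: the runtime adds `multipart/form-data` with a boundary when the body is a `FormData`.

```typescript
// Build (but do not send) a multipart transcription request.
// Hypothetical helper for illustration; not part of OpenClaw.
function buildTranscriptionRequest(
  apiKey: string,
  audio: Blob,
  fileName: string,
  model: string,
): Request {
  const form = new FormData();
  form.append("file", audio, fileName);
  form.append("model", model);

  return new Request("https://api.groq.com/openai/v1/audio/transcriptions", {
    method: "POST",
    // Only Authorization is set explicitly. Content-Type is left to the runtime,
    // which generates "multipart/form-data; boundary=..." from the FormData body.
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
}
```

Passing the result to the plain global `fetch` would send the upload. If a direct request like this succeeds while the gateway path fails, that matches the observation that the failure comes from the fetch function handed to the provider and not from the key or the audio file.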

Expected behavior

If an audio API provider is configured or available via provider key, OpenClaw should prefer that provider before auto-detecting local CLIs. For OpenAI-compatible audio uploads, multipart form requests should not be broken by an EnvHttpProxyAgent fetch wrapper; the shared provider HTTP helper already has URL-aware proxy/NO_PROXY handling.

Proposed fix

In audio resolveAutoEntries(), try resolveKeyEntry() before local CLI fallback.
For audio provider entries, do not pass resolveProxyFetchFromEnv() directly; let provider HTTP helpers handle proxy env and NO_PROXY at request time.
Add regression coverage for provider-key priority when a local whisper binary exists, and for audio proxy fetch behavior.
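
The kind of check that the proxy-fetch regression coverage needs can be sketched without any network access. Everything here is an assumption made for illustration: the issue does not say how the proxy-wrapped fetch loses the multipart Content-Type, so `withPinnedContentType` is a hypothetical faulty wrapper standing in for any wrapper that discards the runtime-generated boundary, and `echoFetch` is a stub transport that reports the Content-Type a server would see.

```typescript
type FetchLike = (input: string, init?: RequestInit) => Promise<Response>;

// Stub transport: never touches the network, just echoes the Content-Type
// that the outgoing request would carry.
const echoFetch: FetchLike = async (input, init) => {
  const request = new Request(input, init);
  return new Response(request.headers.get("content-type") ?? "");
};

// Hypothetical faulty wrapper: rebuilds the headers and pins its own Content-Type,
// so the multipart boundary generated for the FormData body is never applied.
function withPinnedContentType(inner: FetchLike): FetchLike {
  return (input, init) =>
    inner(input, {
      ...init,
      headers: {
        ...(init?.headers as Record<string, string> | undefined),
        "content-type": "application/octet-stream",
      },
    });
}

// Send a small multipart upload through the given fetch function and
// return the Content-Type observed at the transport.
async function seenContentType(fetchFn: FetchLike): Promise<string> {
  const form = new FormData();
  form.append("file", new Blob(["fake audio"]), "clip.ogg");
  form.append("model", "whisper-large-v3");
  const response = await fetchFn("https://example.invalid/audio/transcriptions", {
    method: "POST",
    body: form,
  });
  return response.text();
}
```

A test can then assert that `seenContentType(fetchFn)` starts with `multipart/form-data; boundary=` for whatever fetch function the audio path ends up using. The unwrapped stub passes that check, while the pinned-header wrapper fails it in the same way a server would reject the upload.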


extent analysis

TL;DR

Change resolveAutoEntries() so that resolveKeyEntry() runs before the local CLI fallback, and stop passing resolveProxyFetchFromEnv() into audio providers so that the shared provider HTTP helpers handle proxy env vars and NO_PROXY at request time.

Guidance

Update resolveAutoEntries() to try resolveKeyEntry() before local CLI detection, so that a configured or key-available API provider is preferred.
Stop passing resolveProxyFetchFromEnv() directly into audio provider entries; let the shared provider HTTP helpers handle proxy env vars and NO_PROXY at request time.
Verify the priority fix with a local whisper binary installed and an API provider configured; the API provider should be used instead of the local CLI.
Verify the upload fix with HTTP_PROXY / HTTPS_PROXY set; the multipart request to Groq should no longer fail with HTTP 400.

Notes

The proposed fix assumes that the resolveKeyEntry() function is correctly implemented and that the provider HTTP helpers can handle proxy env and NO_PROXY correctly.

Recommendation

Apply the proposed fix to update the resolveAutoEntries() function and audio provider entries, as it addresses the root cause of the issue and ensures API providers are preferred over local CLIs.
