openclaw - ✅(Solved) Fix Audio transcription fails on 2026.4.5: SSRF guard corrupts multipart FormData for Whisper API [1 pull request, 1 comment, 2 participants]

openclaw · 2026-04-06 23:05:46

GitHub stats

openclaw/openclaw#62173 • Fetched 2026-04-08 03:08:04

Author

ketani23

Participants

ketani23

nonlxyzsg-dev

Timeline (top)

subscribed ×3, mentioned ×2, commented ×1, cross-referenced ×1

Root Cause

The postTranscriptionRequest call in media-understanding-*.js uses the SSRF-guarded fetch with DNS pinning enabled. The pinned DNS dispatcher corrupts the multipart boundary in FormData requests, preventing the API from parsing the file and model fields.

Fix Action

Workaround

Patch media-understanding-*.js to add pinDns: false to the postTranscriptionRequest fetch options, matching the existing Google image generation pattern.

PR fix notes

PR #62174: fix(audio): disable DNS pinning for multipart audio transcription requests

Repository: openclaw/openclaw
Author: ketani23
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/62174

Description (problem / solution / changelog)

The SSRF guard's pinned DNS dispatcher corrupts multipart FormData boundaries when posting to OpenAI's /audio/transcriptions endpoint, causing the API to reject requests with "you must provide a model parameter" even though the model field is present in the form data.

This is the same class of issue already acknowledged for other multipart provider paths. Adding pinDns: false to the postTranscriptionRequest call in the OpenAI-compatible audio transcription path bypasses the broken dispatcher for these requests.
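To see why a damaged boundary can surface as a "missing model" error, here is a toy sketch in TypeScript. It assumes the corruption amounts to a mismatch between the boundary announced in the Content-Type header and the one used in the body; the issue does not say exactly how the boundary is corrupted, and this is not OpenAI's parser. The helper names (extractBoundary, parseTextFields, buildBody) are hypothetical.

```typescript
// Toy multipart/form-data handling, for illustration only.
// Assumption: "corruption" means header boundary != body boundary.

// Pull the boundary token out of a Content-Type header value.
function extractBoundary(contentType: string): string | null {
  const match = /boundary=(?:"([^"]+)"|([^;]+))/i.exec(contentType);
  return match ? (match[1] ?? match[2]).trim() : null;
}

// Collect simple text fields from a multipart body.
function parseTextFields(contentType: string, body: string): Map<string, string> {
  const fields = new Map<string, string>();
  const boundary = extractBoundary(contentType);
  if (boundary === null) return fields;
  // Everything before the first delimiter is preamble and is discarded,
  // so a boundary that never appears in the body yields zero parts.
  const parts = body.split(`--${boundary}`).slice(1);
  for (const part of parts) {
    const headerEnd = part.indexOf("\r\n\r\n");
    if (headerEnd === -1) continue; // closing delimiter or malformed part
    const name = /name="([^"]+)"/.exec(part.slice(0, headerEnd));
    if (!name) continue;
    fields.set(name[1], part.slice(headerEnd + 4).replace(/\r\n$/, ""));
  }
  return fields;
}

// Serialize text fields into a multipart body using the given boundary.
function buildBody(boundary: string, fields: Record<string, string>): string {
  let out = "";
  for (const [name, value] of Object.entries(fields)) {
    out += `--${boundary}\r\nContent-Disposition: form-data; name="${name}"\r\n\r\n${value}\r\n`;
  }
  return out + `--${boundary}--\r\n`;
}

const body = buildBody("abc123", { model: "whisper-1" });
// Header and body agree: the model field is found.
const intact = parseTextFields("multipart/form-data; boundary=abc123", body);
// Header announces a different boundary: no parts, so no model field.
const mismatched = parseTextFields("multipart/form-data; boundary=zzz999", body);
```

In the mismatched case the field is present in the bytes on the wire but invisible to the parser, which is consistent with the API reporting that no model parameter was provided.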

Reproduction

Configure tools.media.audio.enabled: true with OpenAI whisper-1
Send a voice note via Telegram
Transcription silently fails — multipart boundary is corrupted by the pinned DNS dispatcher
Direct curl to the same endpoint with the same file works fine

Fix

One-line addition: pass pinDns: false to postTranscriptionRequest in openai-compatible-audio.ts, matching the pattern used for other multipart provider paths.
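The change might look roughly like the sketch below. The real signature of postTranscriptionRequest is not shown in the issue or PR, so the TranscriptionRequestOptions interface and the buildTranscriptionOptions and resolvePinDns helpers are hypothetical stand-ins; only the pinDns option name, the /audio/transcriptions path, and the intent of the change come from the source.

```typescript
// Hypothetical option shape: the actual postTranscriptionRequest
// signature is not given in the issue, so this is an assumed stand-in.
interface TranscriptionRequestOptions {
  url: string;
  headers: Record<string, string>;
  body: unknown; // a multipart FormData instance in the real code path
  pinDns?: boolean; // option name from the issue; default assumed to be true
}

// Assumed default: pinning stays on unless a caller opts out.
function resolvePinDns(options: TranscriptionRequestOptions): boolean {
  return options.pinDns ?? true;
}

// Build the options for the transcription request.
function buildTranscriptionOptions(
  baseUrl: string,
  apiKey: string,
  form: unknown,
): TranscriptionRequestOptions {
  return {
    url: `${baseUrl}/audio/transcriptions`,
    // No Content-Type here: the HTTP client derives it, including the
    // multipart boundary, from the form body.
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
    pinDns: false, // the one-line addition described in PR #62174
  };
}

const patched = buildTranscriptionOptions("https://api.openai.com/v1", "sk-example", {});
```

Keeping the opt-out at the call site, rather than changing the guard's default, leaves DNS pinning in force for every other request that goes through the SSRF guard.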

Fixes #62173

Changed files

src/media-understanding/openai-compatible-audio.ts (modified, +1/-0)

RAW_BUFFER

Voice note transcription via OpenAI Whisper API fails on 2026.4.5. The fetchWithSsrFGuard function uses undici's fetch with a pinned DNS dispatcher, which corrupts multipart FormData boundaries. OpenAI rejects the request with "you must provide a model parameter" even though it's present in the form data.

This is the same class of bug already fixed for Google image generation (pinDns: false), just not applied to the audio transcription code path in postTranscriptionRequest.

Steps to Reproduce

Configure tools.media.audio.enabled: true with models: [{provider: "openai", model: "whisper-1"}]
Set OPENAI_API_KEY in environment
Send a voice note via Telegram
Transcription silently fails — no transcript is injected into the message
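For orientation, the configuration from the first step can be pictured as the object below. The key names come from the issue; the on-disk file format and the exact nesting (in particular, that models sits beside enabled under tools.media.audio) are assumptions.

```typescript
// Assumed shape of the relevant configuration. Key names are from the
// issue text; the nesting of `models` next to `enabled` is an assumption.
const audioConfig = {
  tools: {
    media: {
      audio: {
        enabled: true,
        models: [{ provider: "openai", model: "whisper-1" }],
      },
    },
  },
};
```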

Root Cause

Workaround

Patch media-understanding-*.js to add pinDns: false to the postTranscriptionRequest fetch options, matching the existing Google image generation pattern.

Environment

OpenClaw: 2026.4.5 (3e72c03)
OS: Ubuntu 24.04 / Linux 6.8.0-90-generic (x64)
Node: v22.22.0
Provider: OpenAI (whisper-1)
Channel: Telegram

Note: Direct curl calls to the same OpenAI /audio/transcriptions endpoint with the same audio file and API key work perfectly — confirming the issue is in the fetch/SSRF layer, not the API or audio file.
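The same isolation check can be sketched in TypeScript with the runtime's built-in fetch, bypassing the SSRF-guarded wrapper entirely. This assumes Node 18 or later, where fetch, FormData, and Blob are globals; the function names are hypothetical, and the request is only sent if transcribeDirect is actually called.

```typescript
// Build the multipart form with the two fields the endpoint needs.
function buildTranscriptionForm(audio: Blob, filename: string, model: string): FormData {
  const form = new FormData();
  form.append("file", audio, filename);
  form.append("model", model);
  return form;
}

// Post straight to the OpenAI endpoint with the global fetch
// (no SSRF guard, no pinned DNS dispatcher), much like the curl check.
async function transcribeDirect(apiKey: string, audio: Blob, filename: string): Promise<string> {
  const response = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    // Content-Type is left unset so fetch can generate the boundary itself.
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(audio, filename, "whisper-1"),
  });
  if (!response.ok) {
    throw new Error(`HTTP ${response.status}: ${await response.text()}`);
  }
  const payload = (await response.json()) as { text: string };
  return payload.text;
}
```

If this direct call succeeds while the guarded path fails with the same file and key, that points at the fetch/SSRF layer rather than the API or the audio file, matching the curl observation above.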

extent analysis

TL;DR

Patch media-understanding-*.js to add pinDns: false to the postTranscriptionRequest fetch options. This restores voice note transcription via the OpenAI Whisper API.

Guidance

Verify the issue by checking if the transcription silently fails when sending a voice note via Telegram with the current configuration.
Confirm the workaround by applying the patch and checking if the transcription works as expected.
Test the postTranscriptionRequest call with pinDns: false to ensure it resolves the multipart boundary corruption issue.
Review the existing Google image generation code path to ensure consistency in handling DNS pinning.

Example

The issue itself includes no code example. The patch adds pinDns: false to the postTranscriptionRequest fetch options, following the existing pattern in the Google image generation code.

Notes

This workaround targets the OpenAI-compatible audio transcription path (tested with whisper-1) and may not apply to other providers or endpoints. The issue is isolated to the fetch/SSRF layer: direct curl calls to the same endpoint with the same file and API key succeed.

Recommendation

Apply the workaround by patching media-understanding-*.js to add pinDns: false to the postTranscriptionRequest fetch options, as this has been shown to resolve the issue in a similar code path (Google image generation).
