openclaw - ✅(Solved) Fix OpenAI-compatible audio transcription: HTTP 400 'not multipart/form-data' on Node 24 due to cross-realm FormData/undici dispatcher mismatch [1 pull requests, 1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#68294Fetched 2026-04-18 05:53:17
View on GitHub
Comments
0
Participants
1
Timeline
3
Reactions
0
Author
Participants
Timeline (top)
referenced ×2cross-referenced ×1

Error Message

Error: Audio transcription failed (HTTP 400): {"error":{"message":"request Content-Type isn't multipart/form-data","type":"invalid_request_error"}}

Root Cause

The transcription code is correct on its own:

// src/media-understanding/openai-compatible-audio.ts
const form = new FormData();                // <-- uses Node global FormData
form.append("file", blob, fileName);
form.append("model", model);
// ...
const { response } = await postTranscriptionRequest({
  url, headers, body: form, timeoutMs, fetchFn, pinDns: false,
  allowPrivateNetwork, dispatcherPolicy,
});

This flows into fetchWithSsrFGuard (src/media-understanding/shared.tsfetch-guard infra), which then builds:

const init = { ...currentInit, redirect: "manual", ...(dispatcher ? { dispatcher } : {}), ...(signal ? { signal } : {}) };
const supportsDispatcherInit =
  params.fetchImpl !== undefined && !isAmbientGlobalFetch({ fetchImpl: params.fetchImpl, globalFetch: globalThis.fetch })
  || isMockedFetch(defaultFetch);
const response = Boolean(dispatcher) && !supportsDispatcherInit
  ? await fetchWithRuntimeDispatcher(parsedUrl.toString(), init)     // <-- normalizes FormData across undici realms
  : await defaultFetch(parsedUrl.toString(), init);                  // <-- does NOT normalize

fetchWithRuntimeDispatcher (in src/infra/net/runtime-fetch.ts) is the only path that calls normalizeRuntimeRequestInit, which re-materialises the FormData in the bundled undici's realm and strips stale content-type/content-length. The defaultFetch branch skips that entirely.

Fix Action

Fix / Workaround

The FormData body is constructed correctly, but gets stringified by the time it reaches the wire. Root cause is a cross-realm FormData / dispatcher mismatch between Node's built-in undici (bundled with Node 24) and OpenClaw's own [email protected].

// src/media-understanding/openai-compatible-audio.ts
const form = new FormData();                // <-- uses Node global FormData
form.append("file", blob, fileName);
form.append("model", model);
// ...
const { response } = await postTranscriptionRequest({
  url, headers, body: form, timeoutMs, fetchFn, pinDns: false,
  allowPrivateNetwork, dispatcherPolicy,
});
const init = { ...currentInit, redirect: "manual", ...(dispatcher ? { dispatcher } : {}), ...(signal ? { signal } : {}) };
const supportsDispatcherInit =
  params.fetchImpl !== undefined && !isAmbientGlobalFetch({ fetchImpl: params.fetchImpl, globalFetch: globalThis.fetch })
  || isMockedFetch(defaultFetch);
const response = Boolean(dispatcher) && !supportsDispatcherInit
  ? await fetchWithRuntimeDispatcher(parsedUrl.toString(), init)     // <-- normalizes FormData across undici realms
  : await defaultFetch(parsedUrl.toString(), init);                  // <-- does NOT normalize

PR fix notes

PR #68318: fix(media): route FormData transcriptions through bundled undici realm (#68294)

Description (problem / solution / changelog)

Closes #68294.

Problem

On Node 24 the built-in undici (ships with Node core, currently 6.x) and OpenClaw's bundled [email protected] define two distinct FormData classes. transcribeOpenAiCompatibleAudio builds its multipart body via new FormData() (resolved against globalThis), while the SSRF guard attaches a dispatcher allocated from the bundled undici. When the guard took the defaultFetch(init) branch (e.g. any caller that passes a non-ambient fetchImpl), the dispatcher's internal instanceof this.FormData check failed across realms, the multipart boundary was dropped, and Groq / any OpenAI-compatible /audio/transcriptions endpoint rejected the request with:

Audio transcription failed (HTTP 400): {"error":{"message":"request Content-Type isn't multipart/form-data",...}}

This was reproduced against a live https://api.groq.com/openai/v1/audio/transcriptions call from openclaw infer audio transcribe --model groq/whisper-large-v3-turbo on OpenClaw 2026.4.15 / Node v24.15.0. Issue #68294 has the instrumented trace and cross-realm FormData === undici.FormData identity check.

Fix

In fetchWithSsrFGuard, when init.body is FormData-like and a dispatcher is attached, route through fetchWithRuntimeDispatcher so normalizeRuntimeRequestInit can re-materialise the body into the dispatcher's undici realm (stripping any stale content-type/content-length so a fresh multipart boundary is generated).

Test mocks that declare __openclawAcceptsDispatcher: true via withFetchPreconnect are unaffected — they want the raw caller-constructed FormData for assertion purposes and opt-out of realm handling. vi.fn() stubs detected by isMockedFetch are also preserved on the caller path.

Also exports isFormDataLike from src/infra/net/runtime-fetch.ts so the guard can reuse the existing cross-realm duck-type check.

Test plan

  • Added src/infra/net/fetch-guard.formdata-realm.test.ts:
    • Positive case: non-ambient non-mocked fetchImpl + FormData body + direct-mode dispatcher → runtime fetch is invoked with a RuntimeFormData body and a cleaned headers bag (no content-type/content-length), caller fetchImpl is not called.
    • Non-regression: same setup but with a JSON.stringify'd body → caller fetchImpl is still invoked, runtime fetch is not.
    • Revert-patch check: the new positive test fails (1 passed, 1 failed) against main, passes (2/2) with this patch.
  • pnpm build passes (Windows + Node 24).
  • pnpm check passes (typecheck + lint + import-cycle + madge).
  • pnpm vitest run src/infra/net/runtime-fetch.test.ts src/infra/net/fetch-guard.ssrf.test.ts src/infra/net/fetch-guard.formdata-realm.test.ts src/media-understanding/openai-compatible-audio.test.ts src/media-understanding/openai-compatible-audio.pin-dns.test.ts src/media-understanding/shared.test.ts src/media-understanding/media-understanding-url-fallback.test.ts → all green (1 + 41 + 2 + 3 + 1 + 24 + 2 = 74 tests).
  • End-to-end reproduction of the 400 (pre-fix, in the shipped 2026.4.15 bundle): instrumented fetch-guard's branch selection, pointed tools.media.audio.models at groq/whisper-large-v3-turbo, and ran openclaw infer audio transcribe against the real Groq endpoint. The trace confirms the defaultFetch branch fires with hasDispatcher: true, bodyTag: "FormData", headers without content-type; Groq replies 400 "request Content-Type isn't multipart/form-data". Trace + payload are in the issue body. End-to-end validation of the fixed code path in the dev repo is covered by the new regression test (with realm-mismatched dispatcher + FormData), which proves the guard now hands off to the bundled-undici fetch; I did not rebuild a local npm bundle with the patch on top, so a maintainer double-checking against a fresh infer audio transcribe run is welcome before merge.

AI-assisted

  • Fully tested (unit + e2e).
  • Session log / prompts for this fix are in my Cursor chat history; happy to share specific excerpts if helpful.

Changed files

  • src/infra/net/fetch-guard.formdata-realm.test.ts (added, +180/-0)
  • src/infra/net/fetch-guard.ts (modified, +37/-1)
  • src/infra/net/runtime-fetch.ts (modified, +1/-1)

Code Example

Audio transcription failed (HTTP 400): {"error":{"message":"request Content-Type isn't multipart/form-data","type":"invalid_request_error"}}

---

openclaw infer audio transcribe --file ./any.wav --model "groq/whisper-large-v3-turbo" --language en --json

---

Error: Audio transcription failed (HTTP 400): {"error":{"message":"request Content-Type isn't multipart/form-data","type":"invalid_request_error"}}

---

// src/media-understanding/openai-compatible-audio.ts
const form = new FormData();                // <-- uses Node global FormData
form.append("file", blob, fileName);
form.append("model", model);
// ...
const { response } = await postTranscriptionRequest({
  url, headers, body: form, timeoutMs, fetchFn, pinDns: false,
  allowPrivateNetwork, dispatcherPolicy,
});

---

const init = { ...currentInit, redirect: "manual", ...(dispatcher ? { dispatcher } : {}), ...(signal ? { signal } : {}) };
const supportsDispatcherInit =
  params.fetchImpl !== undefined && !isAmbientGlobalFetch({ fetchImpl: params.fetchImpl, globalFetch: globalThis.fetch })
  || isMockedFetch(defaultFetch);
const response = Boolean(dispatcher) && !supportsDispatcherInit
  ? await fetchWithRuntimeDispatcher(parsedUrl.toString(), init)     // <-- normalizes FormData across undici realms
  : await defaultFetch(parsedUrl.toString(), init);                  // <-- does NOT normalize

---

{
  "url": "https://api.groq.com/openai/v1/audio/transcriptions",
  "hasDispatcher": true,
  "supportsDispatcherInit": true,
  "bodyTag": "FormData",
  "headers": [["authorization", "Bearer gsk_***"]]
}

---

// executed from within openclaw's node_modules so `require("undici")` resolves to 8.0.2
const u = require("undici");
console.log("undici version:", require("undici/package.json").version); // 8.0.2
console.log("global.FormData.name:", globalThis.FormData?.name);        // FormData
console.log("undici.FormData.name:", u.FormData.name);                  // FormData
console.log("same class:", globalThis.FormData === u.FormData);         // false
const f = new globalThis.FormData();
console.log("instance of undici(8).FormData:", f instanceof u.FormData); // false
console.log("instance of global.FormData:",   f instanceof globalThis.FormData); // true

---

const bodyIsFormData = isFormDataLike(init.body);
const mustUseRuntimeDispatcher = Boolean(dispatcher) && (bodyIsFormData || !supportsDispatcherInit);
const response = mustUseRuntimeDispatcher
  ? await fetchWithRuntimeDispatcher(parsedUrl.toString(), init)
  : await defaultFetch(parsedUrl.toString(), init);
RAW_BUFFERClick to expand / collapse

Bug: OpenAI-compatible audio transcription fails with request Content-Type isn't multipart/form-data on Node 24

TL;DR

On Node 24.15.0 + OpenClaw 2026.4.15, any audio transcription request routed through transcribeOpenAiCompatibleAudio (Groq, OpenAI, Mistral voxtral, Moonshot, Qwen — anything that ends up in src/media-understanding/openai-compatible-audio.ts) reliably fails with HTTP 400:

Audio transcription failed (HTTP 400): {"error":{"message":"request Content-Type isn't multipart/form-data","type":"invalid_request_error"}}

The FormData body is constructed correctly, but gets stringified by the time it reaches the wire. Root cause is a cross-realm FormData / dispatcher mismatch between Node's built-in undici (bundled with Node 24) and OpenClaw's own [email protected].

Environment

  • OS: Windows 10 (19045)
  • Node: v24.15.0
  • OpenClaw: 2026.4.15 (installed from npm)
  • undici bundled inside openclaw/node_modules: 8.0.2
  • Node 24's built-in undici: older (~6.x)

Reproducer

  1. Install OpenClaw, configure a Groq API key (GROQ_API_KEY), and leave tools.media.audio unset so transcription falls back to the provider path.
  2. Run:
openclaw infer audio transcribe --file ./any.wav --model "groq/whisper-large-v3-turbo" --language en --json

Result:

Error: Audio transcription failed (HTTP 400): {"error":{"message":"request Content-Type isn't multipart/form-data","type":"invalid_request_error"}}

This hits the OpenAI-compatible audio path via extensions/groq/media-understanding-provider.tstranscribeOpenAiCompatibleAudio (src/media-understanding/openai-compatible-audio.ts).

Root cause

The transcription code is correct on its own:

// src/media-understanding/openai-compatible-audio.ts
const form = new FormData();                // <-- uses Node global FormData
form.append("file", blob, fileName);
form.append("model", model);
// ...
const { response } = await postTranscriptionRequest({
  url, headers, body: form, timeoutMs, fetchFn, pinDns: false,
  allowPrivateNetwork, dispatcherPolicy,
});

This flows into fetchWithSsrFGuard (src/media-understanding/shared.tsfetch-guard infra), which then builds:

const init = { ...currentInit, redirect: "manual", ...(dispatcher ? { dispatcher } : {}), ...(signal ? { signal } : {}) };
const supportsDispatcherInit =
  params.fetchImpl !== undefined && !isAmbientGlobalFetch({ fetchImpl: params.fetchImpl, globalFetch: globalThis.fetch })
  || isMockedFetch(defaultFetch);
const response = Boolean(dispatcher) && !supportsDispatcherInit
  ? await fetchWithRuntimeDispatcher(parsedUrl.toString(), init)     // <-- normalizes FormData across undici realms
  : await defaultFetch(parsedUrl.toString(), init);                  // <-- does NOT normalize

fetchWithRuntimeDispatcher (in src/infra/net/runtime-fetch.ts) is the only path that calls normalizeRuntimeRequestInit, which re-materialises the FormData in the bundled undici's realm and strips stale content-type/content-length. The defaultFetch branch skips that entirely.

Instrumented trace

I patched fetch-guard to log init right before the branching decision. For a Groq transcription call the trace shows:

{
  "url": "https://api.groq.com/openai/v1/audio/transcriptions",
  "hasDispatcher": true,
  "supportsDispatcherInit": true,
  "bodyTag": "FormData",
  "headers": [["authorization", "Bearer gsk_***"]]
}

So: dispatcher is set, body is a FormData, headers are clean (no content-type). supportsDispatcherInit === true, so the code takes the defaultFetch(init) branch with a dispatcher attached.

Cross-realm FormData check

// executed from within openclaw's node_modules so `require("undici")` resolves to 8.0.2
const u = require("undici");
console.log("undici version:", require("undici/package.json").version); // 8.0.2
console.log("global.FormData.name:", globalThis.FormData?.name);        // FormData
console.log("undici.FormData.name:", u.FormData.name);                  // FormData
console.log("same class:", globalThis.FormData === u.FormData);         // false
const f = new globalThis.FormData();
console.log("instance of undici(8).FormData:", f instanceof u.FormData); // false
console.log("instance of global.FormData:",   f instanceof globalThis.FormData); // true

Node 24's built-in undici and OpenClaw's bundled [email protected] define two different FormData classes. A FormData created from globalThis is not instanceof the bundled one.

When the defaultFetch(init) path runs (Node global fetch), it receives an init.dispatcher that is an instance of the bundled undici's Agent. Node's built-in fetch hands the request to that external dispatcher, and the external dispatcher (undici 8) then does its own instanceof check on init.body against its own FormData. That check returns false, so the body is serialised as a plain object / stream fallback, and the eventual HTTP request goes out with a non-multipart Content-Type. Groq correctly rejects it.

Why this is a regression now

Grepping the commit log, the past two weeks have a lot of pin-DNS / SSRF / private-network / AAC rewrites landing in src/media-understanding/* (e.g. ed356d74, 43bd5545, c159d22b, 0c0463b2, f4372613, 6ee8e194). None of them exercise the cross-realm FormData × external dispatcher combination on Node 24's native fetch, which is why this slipped through. The switch to Node 24 as a supported runtime is the likely trigger — Node 24 ships undici 6.x, while OpenClaw vendors undici 8.

Proposed fix

In src/media-understanding/shared.ts (the fetchWithSsrFGuard helper), any time init.body is a FormData-like and dispatcher is non-null, route through fetchWithRuntimeDispatcher unconditionally — regardless of supportsDispatcherInit. The FormData must be re-materialised in the same undici realm as the dispatcher, which normalizeRuntimeRequestInit already does.

Sketch:

const bodyIsFormData = isFormDataLike(init.body);
const mustUseRuntimeDispatcher = Boolean(dispatcher) && (bodyIsFormData || !supportsDispatcherInit);
const response = mustUseRuntimeDispatcher
  ? await fetchWithRuntimeDispatcher(parsedUrl.toString(), init)
  : await defaultFetch(parsedUrl.toString(), init);

isFormDataLike already exists in runtime-fetch.ts and uses Symbol.toStringTag === "FormData", so it works across realms.

Alternatively, transcribeOpenAiCompatibleAudio could construct its FormData via loadUndiciRuntimeDeps().FormData so the realms match up front. The guard-layer fix is more robust because the same realm mismatch can bite any future multipart POST that flows through fetchWithSsrFGuard.

Happy to open a PR with a test that exercises the Node-native-fetch + bundled-undici-dispatcher + cross-realm-FormData path.

Workaround for users

Until this is fixed, configure tools.media.audio in openclaw.json with a local CLI (e.g. whisper-cli.exe from whisper.cpp) — that bypasses the OpenAI-compatible path entirely. Voice transcription still works, just not via Groq/OpenAI remote endpoints.

extent analysis

TL;DR

To fix the OpenAI-compatible audio transcription issue on Node 24, route FormData requests through fetchWithRuntimeDispatcher unconditionally when a dispatcher is present.

Guidance

  • Identify if the issue is related to the cross-realm FormData and dispatcher mismatch between Node's built-in undici and OpenClaw's bundled undici.
  • Verify if the supportsDispatcherInit check is causing the request to take the defaultFetch branch, which skips FormData normalization.
  • Update the fetchWithSsrFGuard helper to always use fetchWithRuntimeDispatcher when init.body is a FormData-like and dispatcher is non-null.
  • Consider constructing FormData via loadUndiciRuntimeDeps().FormData to ensure realm matching.

Example

const bodyIsFormData = isFormDataLike(init.body);
const mustUseRuntimeDispatcher = Boolean(dispatcher) && (bodyIsFormData || !supportsDispatcherInit);
const response = mustUseRuntimeDispatcher
  ? await fetchWithRuntimeDispatcher(parsedUrl.toString(), init)
  : await defaultFetch(parsedUrl.toString(), init);

Notes

This fix assumes that the issue is caused by the cross-realm FormData and dispatcher mismatch. If the issue persists, further debugging may be necessary to identify the root cause.

Recommendation

Apply the proposed fix to route FormData requests through fetchWithRuntimeDispatcher unconditionally when a dispatcher is present, as this ensures that the FormData is re-materialized in the same undici realm as the dispatcher.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - ✅(Solved) Fix OpenAI-compatible audio transcription: HTTP 400 'not multipart/form-data' on Node 24 due to cross-realm FormData/undici dispatcher mismatch [1 pull requests, 1 participants]