[Bug]: OpenClaw STT multipart FormData fails with 422 against OpenAI-compatible /audio/transcriptions endpoint, while manual multipart works

GitHub stats
openclaw/openclaw#63299 · Fetched 2026-04-09 07:55:37
Comments: 0 · Participants: 1 · Timeline: 2 · Reactions: 0
Timeline (top): labeled ×2

OpenClaw STT breaks multipart.

When using an OpenAI-compatible /v1/audio/transcriptions endpoint (self-hosted FastWhisper), OpenClaw consistently returns:

HTTP 422 failed parsing request body

At the same time:
• same audio file
• same endpoint
• same model
• same headers

→ curl returns 200 OK
→ OpenClaw (FormData) returns 422

Important

The issue is NOT related to:
• auth
• model
• endpoint
• audio

The issue is how OpenClaw serializes multipart.

Evidence

Works:
• curl
• manual multipart (Node, raw body)

Fails:
• OpenClaw runtime FormData

Minimal example

Fails:

const form = new FormData();
form.append("model", "whisper");
form.append("file", audioBlob, "audio.ogg");

fetch(url, { method: "POST", body: form });

Works:

const boundary = "----x";

const body =
  `--${boundary}\r\n` +
  `Content-Disposition: form-data; name="model"\r\n\r\n` +
  `whisper\r\n` +
  `--${boundary}\r\n` +
  `Content-Disposition: form-data; name="file"; filename="audio.ogg"\r\n` +
  `Content-Type: audio/ogg\r\n\r\n` +
  "<FILE BYTES>" +
  `\r\n--${boundary}--\r\n`;

fetch(url, {
  method: "POST",
  headers: { "Content-Type": `multipart/form-data; boundary=${boundary}` },
  body
});

Conclusion

OpenClaw generates multipart requests that some servers (FastAPI / faster-whisper) cannot parse.

Workaround

Replace FormData with manually constructed multipart → works correctly.
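
The workaround could be wrapped in a small helper so that the file bytes are never converted to a string. This is a minimal sketch, assuming a Node runtime with the global Buffer; `buildMultipart` and its parameter names are hypothetical and are not part of OpenClaw.

```javascript
// Sketch of the manual-multipart workaround as a reusable helper.
// `buildMultipart`, `fields` and `file` are hypothetical names, not OpenClaw APIs.
function buildMultipart(fields, file) {
  // A pseudo-random boundary makes a collision with the file bytes unlikely.
  const boundary =
    "----manual" + Date.now().toString(16) + Math.random().toString(16).slice(2);
  const parts = [];

  // One part per plain text field, e.g. { model: "whisper" }.
  for (const [name, value] of Object.entries(fields)) {
    parts.push(Buffer.from(
      `--${boundary}\r\n` +
      `Content-Disposition: form-data; name="${name}"\r\n\r\n` +
      `${value}\r\n`
    ));
  }

  // File part: headers as text, payload as raw bytes.
  parts.push(Buffer.from(
    `--${boundary}\r\n` +
    `Content-Disposition: form-data; name="file"; filename="${file.filename}"\r\n` +
    `Content-Type: ${file.contentType}\r\n\r\n`
  ));
  parts.push(Buffer.from(file.bytes));
  parts.push(Buffer.from(`\r\n--${boundary}--\r\n`));

  return {
    contentType: `multipart/form-data; boundary=${boundary}`,
    body: Buffer.concat(parts),
  };
}

// Usage with placeholder bytes standing in for real audio:
const example = buildMultipart(
  { model: "whisper" },
  { filename: "audio.ogg", contentType: "audio/ogg", bytes: Buffer.from([1, 2, 3]) }
);
console.log(example.contentType, example.body.length);
// fetch(url, { method: "POST", headers: { "Content-Type": example.contentType }, body: example.body });
```

Because the body is a single Buffer, its length is known up front, which may matter if the server is sensitive to how the request body is framed.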

One line

OpenClaw FormData multipart is broken, manual multipart works.


Bug type

Regression (worked before, now fails)

Beta release blocker

No


Steps to reproduce

1. Start a self-hosted OpenAI-compatible STT server (e.g. FastAPI + faster-whisper) that supports:

POST /v1/audio/transcriptions

2. Verify the endpoint works with curl:

curl -X POST http://<server>/v1/audio/transcriptions \
  -H "Authorization: Bearer <token>" \
  -F "model=whisper-1" \
  -F "file=@<audio-file>"

→ returns 200 OK with valid transcription

3. Configure OpenClaw to use this endpoint:

{
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          {
            "provider": "openai",
            "model": "whisper-1",
            "baseUrl": "http://<server>/v1"
          }
        ]
      }
    }
  }
}

4. Send a voice message (e.g. via Telegram integration).
5. Observe the request in logs:
• OpenClaw sends a multipart request using FormData
6. Observe the server response:

HTTP 422 failed parsing request body

7. Replace the multipart request with a manually constructed body (same file, same endpoint).
8. Observe:

→ request succeeds with 200 OK and valid transcription

Expected behavior

OpenClaw should send a valid multipart request that is compatible with OpenAI /v1/audio/transcriptions endpoints.

The request should be correctly parsed by the server and return:
• HTTP 200 OK
• valid transcription text

Behavior should be equivalent to:
• curl multipart request
• manually constructed multipart body

There should be no difference in server response between FormData-generated multipart and manually encoded multipart.

Actual behavior

OpenClaw sends the STT request using runtime FormData, which results in an invalid multipart payload for some servers.

The server responds with:

HTTP 422 failed parsing request body

The request is rejected during multipart parsing, and no transcription is produced.

The same request parameters (file, model, headers, endpoint) work correctly when:
• sent via curl
• sent using a manually constructed multipart body

This indicates that the failure is specific to how multipart is serialized when using FormData in the OpenClaw runtime.
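
Since the server rejects the request while parsing the body, a few structural checks on a captured request can help narrow down what differs. The following is a rough sketch, assuming the raw body and the Content-Type header have already been captured somehow; `checkMultipart` is a hypothetical helper that covers only a handful of common mistakes and is not a full multipart parser.

```javascript
// Sketch: structural sanity checks on a captured multipart/form-data body.
// `checkMultipart` is a hypothetical helper, not an OpenClaw or server API.
function checkMultipart(contentType, body) {
  const problems = [];

  // The boundary parameter may be quoted or unquoted.
  const m = /boundary=(?:"([^"]+)"|([^;\s]+))/i.exec(contentType || "");
  if (!m) return ["Content-Type has no boundary parameter"];
  const boundary = m[1] || m[2];

  // latin1 maps one byte to one character, so binary data cannot shift offsets.
  const text = Buffer.from(body).toString("latin1");

  if (!text.startsWith(`--${boundary}\r\n`)) {
    problems.push("body does not start with the opening delimiter");
  }
  if (!text.trimEnd().endsWith(`--${boundary}--`)) {
    problems.push("closing delimiter is missing");
  }
  // Part headers must be separated by CRLF, not a bare LF.
  if (/[^\r]\n/.test(text.split("\r\n\r\n")[0])) {
    problems.push("bare LF in the first part's headers");
  }
  if (!/name="file"/.test(text)) {
    problems.push('no part named "file"');
  }
  return problems;
}

// Usage: an empty array means none of these particular problems were found.
const sample =
  '--x\r\nContent-Disposition: form-data; name="file"; filename="a.ogg"\r\n\r\nDATA\r\n--x--\r\n';
console.log(checkMultipart("multipart/form-data; boundary=x", sample));
```

An empty result does not prove the request is acceptable to the server; it only rules out the specific problems listed in the function.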

OpenClaw version

2026.4.8

Operating system

Ubuntu 24.04

Install method

No response

Model

gpt-5.4

Provider / routing chain

openclaw->codex

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

No response

extent analysis

TL;DR

The issue can be worked around by replacing OpenClaw's FormData with a manually constructed multipart request body. The issue does not identify which part of the FormData serialization the server rejects.

Guidance

  1. Verify the issue: Confirm that the problem is indeed with OpenClaw's multipart serialization by comparing the requests sent by OpenClaw and curl.
  2. Manually construct the multipart body: Use the working example provided in the issue to construct the multipart body manually, ensuring that the boundary and content disposition are correctly set.
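  3. Update the request path: Change the code that sends the STT request so that it uses the manually constructed multipart body instead of FormData. The issue does not mention a configuration setting for this, so it is likely a code-level change.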
  4. Test the fix: Send a test request using the updated OpenClaw configuration and verify that the server responds with a 200 OK and a valid transcription.
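
For step 1 above, one way to see what the runtime's FormData turns into is to serialize it without sending it. This is a sketch, assuming Node 18+ where FormData, Blob and Request are globals; `dumpFormData` is a hypothetical helper name, and the output reflects the local Node runtime, which may or may not match what OpenClaw produces.

```javascript
// Sketch: serialize a FormData the way fetch would, without sending it.
// Assumes Node 18+ (global FormData, Blob, Request).
// `dumpFormData` is a hypothetical helper, not part of OpenClaw.
async function dumpFormData(form) {
  // Constructing a Request runs the same body extraction that fetch uses.
  const req = new Request("http://localhost/", { method: "POST", body: form });
  const contentType = req.headers.get("content-type");
  const bytes = Buffer.from(await req.arrayBuffer());
  return { contentType, bytes };
}

async function main() {
  const form = new FormData();
  form.append("model", "whisper");
  // Placeholder bytes stand in for real audio.
  const audioBlob = new Blob([new Uint8Array([1, 2, 3])], { type: "audio/ogg" });
  form.append("file", audioBlob, "audio.ogg");

  const { contentType, bytes } = await dumpFormData(form);
  console.log(contentType);
  // latin1 keeps one character per byte; JSON.stringify makes CRLFs visible.
  console.log(JSON.stringify(bytes.toString("latin1")));
}

main();
```

Comparing this output with a working curl request (for example via `curl --trace-ascii`) shows differences in the boundary string and part headers. It does not show how the body is framed on the wire, so header-level differences such as Content-Length still need to be checked on the server or with a proxy.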

Example

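// `url` and `audioBytes` (a Buffer or Uint8Array holding the file contents)
// are assumed to be defined by the caller.
const boundary = "----x";
const head =
  `--${boundary}\r\n` +
  `Content-Disposition: form-data; name="model"\r\n\r\n` +
  `whisper\r\n` +
  `--${boundary}\r\n` +
  `Content-Disposition: form-data; name="file"; filename="audio.ogg"\r\n` +
  `Content-Type: audio/ogg\r\n\r\n`;
const tail = `\r\n--${boundary}--\r\n`;

// Concatenate as bytes: appending binary audio to a string would corrupt it.
const body = Buffer.concat([Buffer.from(head), audioBytes, Buffer.from(tail)]);

fetch(url, {
  method: "POST",
  headers: {
    "Content-Type": `multipart/form-data; boundary=${boundary}`
  },
  body
});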

Notes

This fix assumes that the issue is indeed with OpenClaw's FormData serialization and that the manually constructed multipart body is correctly formatted. If the issue persists, further debugging may be necessary to identify the root cause.

Recommendation

Apply the workaround by replacing OpenClaw's FormData with a manually constructed multipart request body, as this has been shown to work correctly in the provided example.
