litellm - ✅(Solved) Fix [Bug]: /v1/audio/speech with stream_format=sse returns raw audio for OpenAI-compatible TTS backend instead of text/event-stream [1 pull requests, 1 comments, 1 participants]

litellm - 2026-03-21 17:26:32


GitHub stats

BerriAI/litellm#24301 • Fetched 2026-04-08 01:13:29


Author

tg-bomze

Participants

tg-bomze

Timeline (top)

labeled ×3, cross-referenced ×1, referenced ×1

Error Message

Observe that the response does not behave like OpenAI speech SSE. In my setup, LiteLLM returns HTTP 500 Internal Server Error for this request instead of an SSE stream.

Fix Action

Fix proposed in PR: fix: forward stream_format param in audio speech endpoint (https://github.com/BerriAI/litellm/pull/24353). The PR was listed as open and not merged when this page was fetched.

PR fix notes

PR #24353: fix: forward stream_format param in audio speech endpoint

Repository: BerriAI/litellm
Author: themavik
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/24353

Description (problem / solution / changelog)

Summary

Fixes #24301.

Root cause: stream_format from the request body never reached optional_params, so the OpenAI speech call ignored it. The proxy always returned audio/mpeg regardless.

Fix: Forward stream_format into optional_params after provider mapping, and set Content-Type to text/event-stream when stream_format is sse.
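The forwarding step described above can be sketched in plain Python. This is a minimal illustration, not the code from the PR: the helper name forward_stream_format and the sample dictionaries are hypothetical, and only the names stream_format, kwargs, and optional_params come from the PR description.

```python
from typing import Any, Dict


def forward_stream_format(
    kwargs: Dict[str, Any], optional_params: Dict[str, Any]
) -> Dict[str, Any]:
    """Return a copy of optional_params that also carries stream_format.

    Hypothetical helper: it mirrors the idea of copying stream_format from
    the caller's kwargs after provider mapping has produced optional_params.
    """
    merged = dict(optional_params)  # do not mutate the caller's dict
    stream_format = kwargs.get("stream_format")
    if stream_format is not None:
        merged["stream_format"] = stream_format
    return merged


# Sample values standing in for a real request and the result of provider mapping.
request_kwargs = {"model": "gpt-4o-mini-tts", "voice": "alloy", "stream_format": "sse"}
mapped_params = {"response_format": "pcm"}
print(forward_stream_format(request_kwargs, mapped_params))
```

Copying only when the caller supplied a value keeps non-streaming requests unchanged, which matches the report that plain TTS requests already worked.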

Changes

litellm/main.py: pass stream_format from kwargs into optional_params
litellm/proxy/proxy_server.py: override media_type to text/event-stream for SSE requests
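The media type override in the second change might look roughly like the sketch below. The function name select_media_type and the constants are assumptions made for illustration; the source only states that the proxy should answer with text/event-stream when stream_format is sse and that it previously always answered with audio/mpeg.

```python
from typing import Any, Mapping

DEFAULT_MEDIA_TYPE = "audio/mpeg"       # what the proxy returned before the fix
SSE_MEDIA_TYPE = "text/event-stream"    # what an SSE client expects


def select_media_type(request_data: Mapping[str, Any]) -> str:
    """Pick the response media type from the parsed request body (hypothetical helper)."""
    if request_data.get("stream_format") == "sse":
        return SSE_MEDIA_TYPE
    return DEFAULT_MEDIA_TYPE


print(select_media_type({"model": "gpt-4o-mini-tts", "stream_format": "sse"}))
print(select_media_type({"model": "gpt-4o-mini-tts"}))
```

Keeping the decision in one small function makes it easy to unit test without starting the proxy.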

Testing

Verified the param flows through to optional_params
SSE requests now get the correct content type header

Changed files

litellm/main.py (modified, +18/-13)
litellm/proxy/proxy_server.py (modified, +215/-4)

Code Example

curl -i -sS "https://<litellm-host>/audio/speech" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <litellm-key>" \
  -d '{
    "input":"Hello from gpt-4o-mini-tts.",
    "voice":"alloy",
    "model":"gpt-4o-mini-tts",
    "response_format":"pcm"
  }'

---

curl -i -sS "https://<litellm-host>/audio/speech" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <litellm-key>" \
  -d '{
    "input":"Hello from gpt-4o-mini-tts.",
    "voice":"alloy",
    "model":"gpt-4o-mini-tts",
    "response_format":"pcm",
    "stream_format":"sse"
  }'

---

Working non-streaming request through LiteLLM:

HTTP/2 200
content-type: audio/mpeg
x-litellm-model-api-base: https://api.openai.com
x-litellm-response-cost: 2.75e-05
x-litellm-version: 1.81.0


Unexpected response for stream_format="sse" through LiteLLM:

HTTP/2 200
content-type: audio/mpeg
x-litellm-version: 1.81.0
<raw audio bytes...>


Expected SSE shape:

HTTP/1.1 200 OK
content-type: text/event-stream; charset=utf-8

data: {"type":"speech.audio.delta","audio":"..."}
data: {"type":"speech.audio.done","usage":{...}}


Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

LiteLLM proxy handles normal /v1/audio/speech requests correctly, but stream_format="sse" does not behave correctly.

I verified that a simple non-streaming TTS request through LiteLLM works:

model="gpt-4o-mini-tts"
no explicit stream_format
response: 200 OK
x-litellm-model-api-base: https://api.openai.com

However, when testing /v1/audio/speech with stream_format="sse" against an OpenAI-compatible TTS backend behind LiteLLM, the proxy does not preserve SSE behavior.

Instead of returning Content-Type: text/event-stream and data: {...} events, the proxy returns a binary audio response.

Expected behavior:
stream_format="sse" should return an SSE stream of audio events.

Actual behavior:
LiteLLM returns a normal binary audio response instead of SSE.

Steps to Reproduce

Verify that normal TTS works through LiteLLM:

curl -i -sS "https://<litellm-host>/audio/speech" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <litellm-key>" \
  -d '{
    "input":"Hello from gpt-4o-mini-tts.",
    "voice":"alloy",
    "model":"gpt-4o-mini-tts",
    "response_format":"pcm"
  }'

Verify that the same endpoint behaves incorrectly when SSE is requested:

curl -i -sS "https://<litellm-host>/audio/speech" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <litellm-key>" \
  -d '{
    "input":"Hello from gpt-4o-mini-tts.",
    "voice":"alloy",
    "model":"gpt-4o-mini-tts",
    "response_format":"pcm",
    "stream_format":"sse"
  }'

Observe that the response does not behave like OpenAI speech SSE. In my setup, LiteLLM returns HTTP 500 Internal Server Error for this request instead of an SSE stream.
Compare this with an OpenAI-compatible upstream TTS backend that supports SSE directly: when called without LiteLLM, the same /audio/speech request shape returns Content-Type: text/event-stream and data: {"type":"speech.audio.delta", ...} events.
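The two curl calls above can also be scripted so that the outcomes are easy to compare. The sketch below uses only the Python standard library; the helper names build_speech_request, classify_response, and probe are hypothetical, and the host and key are placeholders exactly as in the curl commands.

```python
import json
import urllib.error
import urllib.request
from typing import Optional


def build_speech_request(
    base_url: str, api_key: str, stream_format: Optional[str] = None
) -> urllib.request.Request:
    """Build the same POST /audio/speech request as the curl commands."""
    payload = {
        "input": "Hello from gpt-4o-mini-tts.",
        "voice": "alloy",
        "model": "gpt-4o-mini-tts",
        "response_format": "pcm",
    }
    if stream_format is not None:
        payload["stream_format"] = stream_format
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def classify_response(status: int, content_type: str) -> str:
    """Label a response as 'error', 'sse', 'binary-audio', or 'other'."""
    if status >= 400:
        return "error"
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/event-stream":
        return "sse"
    if media_type.startswith("audio/"):
        return "binary-audio"
    return "other"


def probe(base_url: str, api_key: str, stream_format: Optional[str]) -> str:
    """Send one request and classify the result (requires network access)."""
    request = build_speech_request(base_url, api_key, stream_format)
    try:
        with urllib.request.urlopen(request) as response:
            return classify_response(
                response.status, response.headers.get("Content-Type", "")
            )
    except urllib.error.HTTPError as err:  # urlopen raises on 4xx/5xx
        return classify_response(err.code, err.headers.get("Content-Type", ""))
```

Calling probe(base_url, api_key, None) and probe(base_url, api_key, "sse") against a deployment covers the cases in this report: "binary-audio" for the working request, and either "binary-audio" or "error" for the SSE request on an affected version.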

Relevant log output

Working non-streaming request through LiteLLM:

HTTP/2 200
content-type: audio/mpeg
x-litellm-model-api-base: https://api.openai.com
x-litellm-response-cost: 2.75e-05
x-litellm-version: 1.81.0


Unexpected response for stream_format="sse" through LiteLLM:

HTTP/2 200
content-type: audio/mpeg
x-litellm-version: 1.81.0
<raw audio bytes...>


Expected SSE shape:

HTTP/1.1 200 OK
content-type: text/event-stream; charset=utf-8

data: {"type":"speech.audio.delta","audio":"..."}
data: {"type":"speech.audio.done","usage":{...}}
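A client consuming the expected SSE shape could parse it along the following lines. This sketch rests on two assumptions that the report does not spell out: that events are separated by a blank line, as in standard server-sent events, and that the audio field of each speech.audio.delta event holds base64-encoded audio bytes. The function names and the sample payload are illustrative only.

```python
import base64
import json
from typing import Any, Dict, Iterable, Iterator, List


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one decoded JSON object per SSE event.

    Assumes each event consists of "data:" lines followed by a blank line.
    """
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            if value.startswith(" "):
                value = value[1:]  # SSE drops one leading space after the colon
            data_lines.append(value)
        elif line == "" and data_lines:
            yield json.loads("\n".join(data_lines))
            data_lines = []
    if data_lines:  # stream ended without a trailing blank line
        yield json.loads("\n".join(data_lines))


def collect_audio(events: Iterable[Dict[str, Any]]) -> bytes:
    """Concatenate the decoded audio of all delta events until the done event."""
    chunks: List[bytes] = []
    for event in events:
        if event.get("type") == "speech.audio.delta":
            # Assumption: the "audio" field is base64-encoded.
            chunks.append(base64.b64decode(event["audio"]))
        elif event.get("type") == "speech.audio.done":
            break
    return b"".join(chunks)


# Illustrative payload: "aGVsbG8=" is the base64 encoding of b"hello".
sample = [
    'data: {"type":"speech.audio.delta","audio":"aGVsbG8="}',
    "",
    'data: {"type":"speech.audio.done","usage":{}}',
    "",
]
print(collect_audio(iter_sse_events(sample)))
```

Against the affected LiteLLM version this parser would find no events at all, because the body is raw audio rather than data: lines.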

What part of LiteLLM is this about?

SDK (litellm Python package)

What LiteLLM version are you on?

v1.81.0

Twitter / LinkedIn details

https://x.com/tg_bomze

extent analysis

Fix Plan

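LiteLLM does not preserve SSE behavior when stream_format="sse" because, per the PR notes, the parameter is dropped before the upstream speech call and the proxy always responds with audio/mpeg. The fix therefore touches both the SDK and the proxy.

Here are the steps:

Forward stream_format from the request into the parameters sent to the upstream speech call.
In the proxy, check for the stream_format parameter in the request.
If stream_format="sse", set the Content-Type header to text/event-stream and relay the audio events as SSE events.
The LiteLLM proxy is already a FastAPI application, so the SSE body can be returned with FastAPI's StreamingResponse; no additional web framework is needed.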

Example code using fastapi (a standalone illustration with placeholder events, not LiteLLM's actual handler):

from fastapi import FastAPI, Request, Response
from fastapi.responses import StreamingResponse
import json

app = FastAPI()

@app.post("/audio/speech")
async def speech(request: Request):
    # request.json() is a coroutine, so it must be awaited
    body = await request.json()

    # Check for stream_format parameter
    if body.get("stream_format") == "sse":
        # Generate SSE events; each event ends with a blank line
        async def event_generator():
            # Placeholder events; a real handler would relay the upstream events
            yield "data: {}\n\n".format(json.dumps({"type": "speech.audio.delta", "audio": "..."}))
            yield "data: {}\n\n".format(json.dumps({"type": "speech.audio.done", "usage": {}}))

        # media_type sets the Content-Type header to text/event-stream
        return StreamingResponse(event_generator(), media_type="text/event-stream")
    else:
        # Handle non-SSE requests with a binary audio body (placeholder bytes)
        return Response(content=b"...", media_type="audio/mpeg")

Verification

To verify that the fix worked, test the /audio/speech endpoint with stream_format="sse" and check that the response is an SSE stream with the correct events.

Example test using curl:

curl -i -sS "https://<litellm-host>/audio/speech" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <litellm-key>" \
  -d '{
    "input":"Hello from gpt-4o-mini-tts.",
    "voice":"alloy",
    "model":"gpt-4o-mini-tts",
    "response_format":"pcm",
    "stream_format":"sse"
  }'

This should return an SSE stream with the correct events:

HTTP/1.1 200 OK
content-type: text/event-stream; charset=utf-8

data: {"type":"speech.audio.delta","audio":"..."}
data: {"type":"speech.audio.done","usage":{...}}
