litellm - [Feature]: add support for vLLM realtime endpoint

GitHub stats
BerriAI/litellm#23102 · Fetched 2026-04-08 00:38:37
Comments: 0 · Participants: 1 · Timeline: 3 · Reactions: 0
Timeline (top): labeled ×3

Fix Action

Fix / Workaround

📝 The workaround so far is to set up the OpenAI provider with a custom model and a custom endpoint.

Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

The Feature

vLLM supports a /v1/realtime endpoint: https://docs.vllm.ai/en/latest/serving/openai_compatible_server/?h=realtime#realtime-api
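As a rough illustration of what a client has to target, the sketch below derives the realtime URL from an OpenAI-style base URL. It assumes the realtime endpoint is served over WebSocket on the same host as the HTTP API; the helper name `realtime_ws_url` and the `localhost:8000` address are hypothetical and not part of vLLM or LiteLLM.

```python
from urllib.parse import urlsplit, urlunsplit


def realtime_ws_url(api_base: str) -> str:
    """Map an OpenAI-style base URL (ending in /v1) to a /v1/realtime WebSocket URL."""
    parts = urlsplit(api_base)
    # http -> ws, https -> wss; any other scheme is rejected rather than guessed.
    scheme = {"http": "ws", "https": "wss"}.get(parts.scheme)
    if scheme is None:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # Tolerate a trailing slash on the base URL before appending the endpoint.
    path = parts.path.rstrip("/") + "/realtime"
    return urlunsplit((scheme, parts.netloc, path, "", ""))


# Assumed local vLLM server address, for illustration only.
print(realtime_ws_url("http://localhost:8000/v1"))
```

Any query parameters or headers the server expects on connection are not covered here; check the vLLM documentation linked above for those.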

📝 The workaround so far is to set up the OpenAI provider with a custom model and a custom endpoint.

Motivation, pitch

The goal is to set up the Voxtral Realtime model: https://huggingface.co/mistralai/Voxtral-Mini-4B-Realtime-2602

What part of LiteLLM is this about?

SDK (litellm Python package)


extent analysis

Fix Plan

Until LiteLLM supports the vLLM realtime endpoint natively, the Voxtral Realtime model can be reached through the workaround from the issue:

  • Set up the OpenAI provider with the Voxtral Realtime model as a custom model
  • Point the provider at the custom endpoint of the vLLM server
  • Pass these settings in your LiteLLM SDK calls
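The steps above can be sketched as a single `model_list` entry, the structure LiteLLM uses to describe deployments for its Router and proxy. This is a minimal sketch under assumptions: the helper `vllm_realtime_entry`, the alias `voxtral-realtime`, the `localhost:8000` address, and the `EMPTY` key are all hypothetical placeholders, and whether realtime sessions actually route through such an entry depends on the LiteLLM version in use.

```python
def vllm_realtime_entry(alias: str, model_id: str, api_base: str,
                        api_key: str = "EMPTY") -> dict:
    """Build one model_list entry that routes `alias` via the OpenAI-compatible provider."""
    return {
        "model_name": alias,  # name that callers will request
        "litellm_params": {
            # The "openai/" prefix selects LiteLLM's OpenAI-compatible provider.
            "model": f"openai/{model_id}",
            "api_base": api_base,  # base URL of the vLLM server (assumed)
            "api_key": api_key,    # placeholder; use a real key if required
        },
    }


model_list = [
    vllm_realtime_entry(
        alias="voxtral-realtime",
        model_id="mistralai/Voxtral-Mini-4B-Realtime-2602",
        api_base="http://localhost:8000/v1",
    )
]
print(model_list[0]["litellm_params"]["model"])
```

A list like this can be passed to `litellm.Router(model_list=model_list)` or mirrored under `model_list` in the proxy configuration file.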

Code Changes

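Example snippet that collects the workaround settings. LiteLLM has no Provider class or setup_provider() function; the provider is selected by the model prefix and the endpoint by api_base:

# Settings for the OpenAI-compatible provider; the URL and key are placeholders
vllm_params = {
    "model": "openai/mistralai/Voxtral-Mini-4B-Realtime-2602",  # "openai/" selects the OpenAI provider
    "api_base": "https://your-custom-endpoint.com/v1",  # base URL of the vLLM server
    "api_key": "EMPTY",  # placeholder; use a real key if the server requires one
}

# LiteLLM SDK calls such as litellm.completion() accept these as keyword arguments.

Replace "https://your-custom-endpoint.com/v1" with the base URL of your vLLM server.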

Verification

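To verify the setup, first confirm that the vLLM server behind the custom endpoint is reachable and lists the model. vLLM's OpenAI-compatible server exposes /v1/models:

# litellm.query() does not exist, so check the endpoint directly with the standard library
import json
import urllib.request
with urllib.request.urlopen("https://your-custom-endpoint.com/v1/models") as resp:
    print([m["id"] for m in json.load(resp)["data"]])

If the Voxtral model ID appears in the output, the endpoint is serving the model. This check assumes the server does not require an API key. It does not exercise /v1/realtime, so a realtime session through LiteLLM still has to be tested separately.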

Extra Tips

  • Ensure the vLLM server behind the custom endpoint is actually serving the Voxtral Realtime model and exposes /v1/realtime.
  • Re-test the custom model and endpoint after upgrading LiteLLM or vLLM, since this is a workaround rather than supported behavior.
