litellm - 💡(How to fix) Fix [Bug]: router.aspeech() bypasses async_function_with_fallbacks — TTS requests have no retry or failover

What happened

Router.aspeech() bypasses async_function_with_fallbacks and calls litellm.aspeech directly. As a result:

  • No retry on failure — if the selected deployment returns an error, the request fails immediately with no retry attempt.
  • No failover to other deployments — if you have multiple TTS deployments under the same model_name, only the first deployment selected by async_get_available_deployment is ever tried. Any error surfaces to the caller.

This is inconsistent with every other router method (acompletion, aembedding, arerank, atext_completion, etc.), all of which route through async_function_with_fallbacks.
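The failure mode can be illustrated with a small self-contained model. This is a sketch only: ToyRouter, DeploymentDown, and the deployment names are hypothetical stand-ins, not LiteLLM code. It mirrors the reported flow of picking one deployment and calling it directly.

```python
import asyncio


class DeploymentDown(Exception):
    """Raised by a stand-in deployment that is unavailable."""


class ToyRouter:
    """Hypothetical stand-in for the router; not LiteLLM code."""

    def __init__(self, deployments):
        # Mapping of deployment name -> healthy flag.
        self.deployments = deployments
        self.attempted = []

    async def _call_deployment(self, name, text):
        self.attempted.append(name)
        if not self.deployments[name]:
            raise DeploymentDown(name)
        return f"audio from {name}: {text}"

    async def aspeech(self, text):
        # Mirrors the reported flow: select one deployment, call it directly.
        name = next(iter(self.deployments))
        return await self._call_deployment(name, text)


async def main():
    # The first deployment is down; a healthy backup is configured.
    router = ToyRouter({"openai-tts": False, "azure-tts": True})
    try:
        await router.aspeech("hello")
    except DeploymentDown as exc:
        return f"failed on {exc}", router.attempted
    return "ok", router.attempted


result = asyncio.run(main())
print(result)
```

In this model the error reaches the caller after a single attempt, and the healthy backup is never tried.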

Root cause

litellm/router.py, aspeech() (around line 3542):

async def aspeech(self, model, input, voice, **kwargs):
    ...
    deployment = await self.async_get_available_deployment(...)
    self._update_kwargs_before_fallbacks(model=model, kwargs=kwargs)
    ...
    response = await litellm.aspeech(...)   # ← direct call, no fallback loop
    return response

Compare to aembedding or arerank, which set kwargs["original_function"] and call await self.async_function_with_fallbacks(**kwargs). aspeech has no such wrapper.

Impact

Any operator running multiple TTS providers (e.g. OpenAI TTS + Azure TTS) for redundancy gets no automatic failover. A single deployment outage causes 100% of TTS requests to fail even if a healthy backup is configured.
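For context, a redundant setup of this kind is typically expressed as several entries sharing one model_name in the router's model_list. The fragment below is a sketch: the group name "tts", the Azure deployment name, and all credential placeholders are assumptions, not values from the issue.

```python
# Two hypothetical TTS deployments grouped under one model_name.
# All names and credentials are placeholders for illustration.
model_list = [
    {
        "model_name": "tts",
        "litellm_params": {
            "model": "openai/tts-1",
            "api_key": "<OPENAI_API_KEY>",
        },
    },
    {
        "model_name": "tts",
        "litellm_params": {
            "model": "azure/<your-tts-deployment>",
            "api_base": "<AZURE_API_BASE>",
            "api_key": "<AZURE_API_KEY>",
        },
    },
]

# Both entries belong to the same routing group.
group_names = {entry["model_name"] for entry in model_list}
print(group_names)
```

With a list like this, the operator expects a request for "tts" to reach the second entry when the first fails, which is the behavior the issue reports as missing for speech requests.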

Expected behavior

aspeech() should route through the same retry/failover mechanism as other router methods, either by:

  1. Adding an _aspeech inner method that implements the actual litellm.aspeech call, setting kwargs["original_function"] = self._aspeech, and then calling await self.async_function_with_fallbacks(**kwargs), or
  2. Wrapping the direct call in retry logic consistent with the rest of the router.
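Option 1 can be sketched in a toy model. The method names below follow the issue text, but everything else is an assumption: the body of async_function_with_fallbacks here is a simplified stand-in that walks the deployments in order, not LiteLLM's actual implementation, and the num_retries parameter and deployment names are hypothetical.

```python
import asyncio


class DeploymentDown(Exception):
    """Raised by a stand-in deployment that is unavailable."""


class ToyRouter:
    """Hypothetical stand-in for the router; not LiteLLM code."""

    def __init__(self, deployments, num_retries=1):
        self.deployments = deployments  # name -> healthy flag
        self.num_retries = num_retries
        self.attempted = []

    async def async_function_with_fallbacks(self, **kwargs):
        # Simplified stand-in: retry each deployment, then move to the next.
        original_function = kwargs.pop("original_function")
        last_error = None
        for name in self.deployments:
            for _ in range(1 + self.num_retries):
                try:
                    return await original_function(deployment=name, **kwargs)
                except DeploymentDown as exc:
                    last_error = exc
        if last_error is None:
            raise RuntimeError("no deployments configured")
        raise last_error

    async def _aspeech(self, deployment, input, voice):
        # Inner method that performs the actual (here simulated) call.
        self.attempted.append(deployment)
        if not self.deployments[deployment]:
            raise DeploymentDown(deployment)
        return f"{deployment}:{voice}:{input}"

    async def aspeech(self, input, voice, **kwargs):
        # Public method only registers the inner call and delegates.
        kwargs["original_function"] = self._aspeech
        kwargs["input"] = input
        kwargs["voice"] = voice
        return await self.async_function_with_fallbacks(**kwargs)


router = ToyRouter({"openai-tts": False, "azure-tts": True}, num_retries=1)
result = asyncio.run(router.aspeech("hello", "alloy"))
print(result, router.attempted)
```

The point of the split is that the public method no longer calls the provider itself, so retry and failover policy lives in one shared wrapper rather than being reimplemented per endpoint.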

Related

This is a separate issue from #27390, which covers the missing _update_kwargs_with_deployment call (TTS spend tracking bug). Both need fixing independently.

LiteLLM version

Reproduced against main branch.
