litellm - ✅(Solved) Fix [Bug]: `MidStreamFallbackError` crashes with `ValueError: invalid literal for int() with base 10: 'litellm_error'` [1 pull requests, 1 participants]

litellm · 2026-04-30 19:18:56

BerriAI/litellm#26913•Fetched 2026-05-01 05:34:26

Author

yogeek

Participants

yogeek

Timeline (top)

labeled ×3, cross-referenced ×1

When a CustomLLM handler raises RateLimitError during a streaming request, LiteLLM's MidStreamFallbackError.__init__ crashes because it receives the string 'litellm_error' as original_status and tries to cast it to int.

The crash is in litellm/exceptions.py:957:

self.status_code = int(original_status) if original_status is not None else 503

As a result, the client receives HTTP 500 and the fallback chain is never triggered.

Error Message

File "litellm/litellm_core_utils/streaming_handler.py", line 2181, in __anext__
    self._handle_stream_fallback_error(e)
File "litellm/litellm_core_utils/streaming_handler.py", line 2249, in _handle_stream_fallback_error
    raise MidStreamFallbackError(
        message=str(mapped_exception),
        ...
        is_pre_first_chunk=not self.sent_first_chunk,
    )
File "litellm/exceptions.py", line 957, in __init__
    self.status_code = int(original_status) if original_status is not None else 503
ValueError: invalid literal for int() with base 10: 'litellm_error'

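Root Cause

MidStreamFallbackError.__init__ passes original_status to int() without validating it, so a mapped exception whose status_code is the string 'litellm_error' raises ValueError while the fallback error is being constructed.

Fix Action

PR #26925 (open, not merged at fetch time) makes MidStreamFallbackError tolerate malformed status values and default to 503.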

Workaround

In the custom handler's streaming/astreaming methods, catch RateLimitError and re-raise it as litellm.APIError(status_code=429, ...), which carries a proper integer status code.

PR fix notes

PR #26925: Fix mid-stream fallback status handling

Repository: BerriAI/litellm
Author: Genmin
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/26925

Description (problem / solution / changelog)

Summary

- make MidStreamFallbackError tolerate malformed exception status_code values such as "litellm_error"
- fall back to original_exception.response.status_code when it is valid
- otherwise default to 503 without raising during fallback construction
- add regression coverage for malformed status values and response-status fallback

Why

Streaming fallback handling can receive mapped exceptions whose status_code is not an HTTP integer. MidStreamFallbackError cast that value directly in two places, which could turn a retriable mid-stream provider error into a client-facing 500 before the router fallback chain gets a chance to run.

Fixes #26913.
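The resolution order described in the Summary can be sketched roughly as follows. This is a minimal illustration, not the code from PR #26925: `resolve_status_code` and `_as_http_status` are hypothetical names, and treating "valid" as "an integer in the range 100-599" is an assumption made for the sketch.

```python
from types import SimpleNamespace
from typing import Any, Optional


def _as_http_status(value: Any) -> Optional[int]:
    """Return value as an int in the HTTP range 100-599, else None.

    Hypothetical helper; the 100-599 range is an assumption of this sketch.
    """
    try:
        code = int(value)
    except (ValueError, TypeError):
        # Covers None, 'litellm_error', and other non-numeric values.
        return None
    return code if 100 <= code <= 599 else None


def resolve_status_code(exc: Any, default: int = 503) -> int:
    """Prefer exc.status_code, then exc.response.status_code, then default."""
    code = _as_http_status(getattr(exc, "status_code", None))
    if code is None:
        response = getattr(exc, "response", None)
        code = _as_http_status(getattr(response, "status_code", None))
    return default if code is None else code


if __name__ == "__main__":
    # Stand-in exception objects; real mapped exceptions would be passed instead.
    malformed = SimpleNamespace(
        status_code="litellm_error",
        response=SimpleNamespace(status_code=429),
    )
    print(resolve_status_code(malformed))
```

The point of the ordering is that constructing the fallback error never raises, so the router still gets a chance to run the fallback chain.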

Validation

/tmp/litellm-26913-venv/bin/python -m pytest tests/test_litellm/test_exception_header_preservation.py -q
- 17 passed, 2 warnings
UV_NO_PROJECT=1 uvx black litellm/exceptions.py tests/test_litellm/test_exception_header_preservation.py --check --diff
- 2 files would be left unchanged

Note: project uv run is blocked locally because the installed uv is 0.10.4 and this branch requires >=0.10.9, so I used a Python 3.13 venv for the targeted test.

Changed files

litellm/exceptions.py (modified, +21/-3)
litellm/proxy/_lazy_openapi_snapshot.py (modified, +30/-2)
tests/test_litellm/proxy/test_lazy_openapi_snapshot.py (added, +53/-0)
tests/test_litellm/test_exception_header_preservation.py (modified, +45/-0)

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

Description

The crash is in litellm/exceptions.py:957:

self.status_code = int(original_status) if original_status is not None else 503

As a result, the client receives HTTP 500 and the fallback chain is never triggered.

Steps to Reproduce

1. Register a custom provider with a CustomLLM handler
2. In the handler's astreaming method, raise RateLimitError
3. Configure fallbacks for the model group
4. Send a streaming request

Minimal custom handler:

from litellm import CustomLLM, RateLimitError

class MyHandler(CustomLLM):
    async def acompletion(self, *args, **kwargs):
        raise RateLimitError(
            message="Key exhausted",
            model="my-model",
            llm_provider="openai",
        )

    async def astreaming(self, *args, **kwargs):
        # This triggers the bug: RateLimitError propagates through
        # the streaming pipeline and hits MidStreamFallbackError
        await self.acompletion(*args, **kwargs)
        yield  # never reached

config.yaml:

model_list:
  - model_name: my-model
    litellm_params:
      model: my-provider/my-model
  - model_name: fallback-model
    litellm_params:
      model: openai/gpt-4

litellm_settings:
  custom_provider_map:
    - {"provider": "my-provider", "custom_handler": my_handler}
  fallbacks:
    - my-model:
        - fallback-model

Expected Behavior

RateLimitError from a custom handler during streaming should trigger the fallback chain (same as non-streaming).

Actual Behavior

File "litellm/exceptions.py", line 957, in __init__
    self.status_code = int(original_status) if original_status is not None else 503
ValueError: invalid literal for int() with base 10: 'litellm_error'

The client receives HTTP 500 Internal Server Error.

Stack Trace

File "litellm/litellm_core_utils/streaming_handler.py", line 2181, in __anext__
    self._handle_stream_fallback_error(e)
File "litellm/litellm_core_utils/streaming_handler.py", line 2249, in _handle_stream_fallback_error
    raise MidStreamFallbackError(
        message=str(mapped_exception),
        ...
        is_pre_first_chunk=not self.sent_first_chunk,
    )
File "litellm/exceptions.py", line 957, in __init__
    self.status_code = int(original_status) if original_status is not None else 503
ValueError: invalid literal for int() with base 10: 'litellm_error'

Suggested Fix

In litellm/exceptions.py, line 957, handle non-integer original_status gracefully:

# Before (crashes):
self.status_code = int(original_status) if original_status is not None else 503

# After (safe):
try:
    self.status_code = int(original_status) if original_status is not None else 503
except (ValueError, TypeError):
    self.status_code = 503
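The same guard can be pulled into a small helper so that the default is explicit and the behavior can be tested in isolation. This is a minimal sketch rather than litellm code: `coerce_status_code` and its `default` parameter are hypothetical names introduced for illustration.

```python
from typing import Any


def coerce_status_code(original_status: Any, default: int = 503) -> int:
    """Return original_status as an int, or default if it cannot be cast.

    Hypothetical helper for illustration; it is not part of litellm.
    """
    if original_status is None:
        return default
    try:
        return int(original_status)
    except (ValueError, TypeError):
        # ValueError: strings such as 'litellm_error'.
        # TypeError: objects that int() cannot convert at all.
        return default


if __name__ == "__main__":
    for value in ("litellm_error", None, "429", 429):
        print(repr(value), "->", coerce_status_code(value))
```

Inside MidStreamFallbackError.__init__, the assignment would then reduce to a single call such as `self.status_code = coerce_status_code(original_status)`, assuming a helper like this were added.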

Workaround

In the custom handler's streaming/astreaming methods, catch RateLimitError and re-raise it as litellm.APIError(status_code=429, ...), which carries a proper integer status code.
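The wrapping pattern for an async streaming method might look like the sketch below. To keep it runnable without litellm installed, `RateLimitError`, `APIError`, and `MyHandler` here are local stand-ins, and `_upstream_stream` is a hypothetical placeholder for the real provider call. In a real handler you would subclass CustomLLM and raise litellm.APIError instead; check its constructor arguments against your installed version, since the issue only shows `status_code=429, ...`.

```python
import asyncio
from typing import Any, AsyncIterator


class RateLimitError(Exception):
    """Stand-in for litellm.RateLimitError so the sketch runs on its own."""


class APIError(Exception):
    """Stand-in for litellm.APIError; carries an integer status_code."""

    def __init__(self, status_code: int, message: str) -> None:
        super().__init__(message)
        self.status_code = status_code


class MyHandler:
    """Stand-in for a CustomLLM subclass."""

    async def _upstream_stream(self, *args: Any, **kwargs: Any) -> AsyncIterator[str]:
        # Hypothetical provider call that fails before the first chunk.
        raise RateLimitError("Key exhausted")
        yield  # never reached; makes this function an async generator

    async def astreaming(self, *args: Any, **kwargs: Any) -> AsyncIterator[str]:
        try:
            async for chunk in self._upstream_stream(*args, **kwargs):
                yield chunk
        except RateLimitError as err:
            # Re-raise with an explicit integer status code, per the workaround.
            raise APIError(status_code=429, message=str(err)) from err


async def _demo() -> None:
    try:
        async for _ in MyHandler().astreaming():
            pass
    except APIError as err:
        print("re-raised with status_code", err.status_code)


if __name__ == "__main__":
    asyncio.run(_demo())
```

The try/except wraps the whole iteration, so a rate limit raised after some chunks have already been yielded is converted the same way.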

Environment

LiteLLM version: 1.82.3
Python: 3.13
Docker image: ghcr.io/berriai/litellm:main-stable

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

v1.82.3

extent analysis

TL;DR

Modify the MidStreamFallbackError.__init__ method in litellm/exceptions.py to handle non-integer original_status values.

Guidance

Identify the source of the non-integer original_status value, which in this case is the string 'litellm_error'.
Apply the suggested fix by wrapping the int(original_status) conversion in a try-except block to catch ValueError and TypeError exceptions.
As a workaround, consider catching RateLimitError in the custom handler's astreaming method and re-raising it as litellm.APIError with a proper integer status code.
Verify the fix by reproducing the steps to trigger the RateLimitError and checking that the fallback chain is triggered correctly.

Example

try:
    self.status_code = int(original_status) if original_status is not None else 503
except (ValueError, TypeError):
    self.status_code = 503

Notes

The provided fix assumes that a non-integer original_status value should result in a default status code of 503. Depending on the specific requirements of the application, this default value may need to be adjusted.

Recommendation

Apply the suggested fix to handle non-integer original_status values, as it provides a more robust solution than the workaround.
