litellm: Bedrock passthrough requests appear to hang then fail with "Internal server error" when Anthropic is overloaded; the overloaded error is never surfaced to the client (e.g. Claude Code)

Fix Action

Fixed

When using the Bedrock passthrough endpoint (/bedrock/model/.../invoke-with-response-stream) and Anthropic returns an overloaded error mid-stream, the LiteLLM gateway logs a full traceback as an unhandled asyncio task exception, and the client sees a generic "Internal server error" with no indication the request should be retried.

As a result, users believe something is wrong with LiteLLM instead of seeing that the Bedrock model is overloaded.

Note: analysis done with AI.

What is happening

Bedrock streaming always opens with HTTP 200. If Anthropic becomes overloaded mid-stream, the error arrives inside the binary event stream, not as a non-200 HTTP status. After the stream closes, LiteLLM schedules a background task (asyncio.create_task) to flush the buffered chunks for spend tracking. That task reaches the overloaded error event in the buffered bytes and raises AnthropicError('Overloaded'). Because async_flush_passthrough_collected_chunks has no try/except and the task's exception is never retrieved, asyncio reports it as "Task exception was never retrieved".
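
The failure mode can be reproduced outside LiteLLM. The sketch below is a minimal, self-contained illustration and not LiteLLM code: OverloadedError, flush_collected_chunks and demo are hypothetical stand-ins. It schedules a failing coroutine with asyncio.create_task, never awaits it, and captures what asyncio reports once the task is discarded.

```python
import asyncio
import gc


class OverloadedError(Exception):
    """Hypothetical stand-in for AnthropicError('Overloaded')."""


async def flush_collected_chunks() -> None:
    # Stand-in for the post-stream flush: re-parsing the buffered
    # chunks reaches the in-stream error event and raises.
    raise OverloadedError("Overloaded")


async def demo() -> list:
    reports = []
    loop = asyncio.get_running_loop()
    # Capture what asyncio would otherwise write to the log.
    loop.set_exception_handler(lambda _loop, ctx: reports.append(ctx["message"]))

    task = asyncio.create_task(flush_collected_chunks())  # fire and forget
    await asyncio.sleep(0)  # let the background task run and fail
    await asyncio.sleep(0)

    # Nothing awaited the task or read task.exception(); once the last
    # reference goes away, asyncio reports the unretrieved exception.
    del task
    gc.collect()
    return reports


reports = asyncio.run(demo())
print(reports)
```

The captured message is the same "Task exception was never retrieved" line that opens the traceback in the gateway log. The request handler itself never sees the exception, because it is raised in a task that runs after the response has been sent.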

A secondary issue: chunk_parser hardcodes status_code=500 for all in-stream error events regardless of error.type — so even if the exception propagated correctly, the client would receive 500 instead of 529 (overloaded_error).
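
One way to derive the status code from the event rather than hardcoding it is sketched below. This is an illustration under stated assumptions, not the actual chunk_parser: the helper name status_code_for_error_event and the ERROR_TYPE_TO_STATUS table are hypothetical, and the event shape is assumed to follow Anthropic's streaming error event (a top-level "error" object carrying "type" and "message").

```python
import json

# Assumed mapping from error.type to HTTP status. Only the
# overloaded_error -> 529 pairing comes from the issue text; every
# other type falls back to 500, as the issue proposes.
ERROR_TYPE_TO_STATUS = {"overloaded_error": 529}


def status_code_for_error_event(chunk: dict) -> int:
    """Return the HTTP status to attach to an in-stream error event."""
    error = chunk.get("error") or {}
    return ERROR_TYPE_TO_STATUS.get(error.get("type"), 500)


overloaded = json.loads(
    '{"type": "error", "error": {"type": "overloaded_error", "message": "Overloaded"}}'
)
other = {"type": "error", "error": {"type": "api_error", "message": "boom"}}
malformed = {"type": "error"}  # no error object at all

print(status_code_for_error_event(overloaded))  # 529
print(status_code_for_error_event(other))       # 500
print(status_code_for_error_event(malformed))   # 500
```

Keeping the mapping in a table means further error types could be added later without touching the parsing logic; which types deserve their own status is left open here.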

Observed log output

INFO: 10.0.4.94:... - "POST /bedrock/model/bedrock-claude-opus-4-7-eu/invoke-with-response-stream HTTP/1.1" 200 OK

Task exception was never retrieved
future: <Task finished name='Task-3813534' coro=<Logging.async_flush_passthrough_collected_chunks() done, defined at .../litellm_logging.py:1891> exception=AnthropicError('Overloaded')>
Traceback (most recent call last):
  File ".../litellm/litellm_core_utils/litellm_logging.py", line 1896, in async_flush_passthrough_collected_chunks
    complete_streaming_response = self._flush_passthrough_collected_chunks_helper(...)
  File ".../litellm/litellm_core_utils/litellm_logging.py", line 1859, in _flush_passthrough_collected_chunks_helper
    complete_streaming_response = provider_config.handle_logging_collected_chunks(...)
  File ".../litellm/llms/bedrock/passthrough/transformation.py", line 221, in handle_logging_collected_chunks
    translated_chunk = obj._chunk_parser(chunk_data=message)
  File ".../litellm/llms/bedrock/chat/invoke_handler.py", line 1772, in _chunk_parser
    return self.anthropic_model_response_iterator.chunk_parser(chunk=chunk_data)
  File ".../litellm/llms/anthropic/chat/handler.py", line 851, in chunk_parser
    raise AnthropicError(
litellm.llms.anthropic.common_utils.AnthropicError: Overloaded

Proposed fix

Two small changes:

  1. litellm_core_utils/litellm_logging.py — wrap _flush_passthrough_collected_chunks_helper in a try/except in both async_flush_passthrough_collected_chunks and flush_passthrough_collected_chunks. The flush is a logging/spend-tracking path; an error event in the stream should be caught and logged as a warning, not silently dropped as an unhandled exception.

  2. litellm/llms/anthropic/chat/handler.py — in chunk_parser, use error.type to set the correct status code (529 for overloaded_error, 500 for everything else) rather than hardcoding 500 for all in-stream error events.
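
The first change can be sketched as follows. This is a simplified illustration, not the patch itself: flush_helper stands in for _flush_passthrough_collected_chunks_helper, StreamErrorEvent stands in for AnthropicError, and guarded_async_flush and the logger name are hypothetical.

```python
import asyncio
import logging
from typing import List, Optional

logger = logging.getLogger("passthrough_flush_sketch")


class StreamErrorEvent(Exception):
    """Hypothetical stand-in for AnthropicError('Overloaded')."""


def flush_helper(chunks: List[bytes]) -> str:
    # Stand-in for _flush_passthrough_collected_chunks_helper: rebuilding
    # the response fails when a buffered chunk is an error event.
    for chunk in chunks:
        if b'"overloaded_error"' in chunk:
            raise StreamErrorEvent("Overloaded")
    return b"".join(chunks).decode()


async def guarded_async_flush(chunks: List[bytes]) -> Optional[str]:
    try:
        return flush_helper(chunks)
    except Exception as exc:
        # The flush only feeds logging and spend tracking, so record the
        # failure as a warning instead of letting the task die with it.
        logger.warning("passthrough flush skipped: %s", exc)
        return None


async def main():
    bad = [b'{"type": "error", "error": {"type": "overloaded_error"}}']
    task = asyncio.create_task(guarded_async_flush(bad))  # fire and forget
    await asyncio.sleep(0)  # let the background task finish
    # The task completed normally, so no exception is stored on it.
    return task.done(), task.exception(), task.result()


outcome = asyncio.run(main())
ok_text = asyncio.run(guarded_async_flush([b"hello ", b"world"]))
print(outcome, ok_text)
```

With the guard in place the background task finishes without a stored exception, so asyncio has nothing to report as unretrieved. The guard only quiets the logging path; surfacing the overloaded error to the client still depends on the second change.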

Environment

  • LiteLLM version: v1.82.0-stable (confirmed from traceback line numbers; code review confirms issue is present in current main)
  • Model: bedrock-claude-opus-4-7-eu via Bedrock passthrough
  • Endpoint: /bedrock/model/bedrock-claude-opus-4-7-eu/invoke-with-response-stream
  • Called from Claude Code configured for Bedrock access (through the LiteLLM proxy)

Related

  • #24004 (mid-stream fallback not supported for anthropic_messages route — same class of problem)
  • #24609 (no error handling in /v1/messages async_sse_wrapper)
  • PR #26719 introduced the finally block that schedules the flush task (v1.84.0) — this is what made the issue visible
