litellm - [Bug]: Vertex Gemini web search streaming crashes on 3/3.1 Flash / Flash Lite models with empty choices chunk

### Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

### What happened?

#### What happened

When using Vertex AI Gemini with `stream=True` and `web_search_options={}`, LiteLLM fails mid-stream with:

    litellm.MidStreamFallbackError: litellm.APIConnectionError: list index out of range

Confirmed failing models:

- `vertex_ai/gemini-3.1-flash-lite-preview`
- `vertex_ai/gemini-3.1-flash-lite`
- `vertex_ai/gemini-3-flash-preview`

The same request pattern worked with:

- `vertex_ai/gemini-3.1-pro-preview`

So this appears to affect Gemini Flash / Flash Lite streaming response shapes with web search, rather than all Gemini 3.x models.

#### Root cause hypothesis

The failure appears to happen inside LiteLLM's streaming pipeline.

Call flow:

    litellm.main.acompletion()
      -> litellm.main.completion()
      -> litellm.utils.get_optional_params()
      -> VertexGeminiConfig.map_openai_params()
      -> VertexGeminiConfig._map_web_search_options()
      -> VertexLLM.completion()
      -> VertexLLM.async_streaming()
      -> make_call()
      -> ModelResponseIterator.chunk_parser()
      -> CustomStreamWrapper.chunk_creator()
      -> CustomStreamWrapper.return_processed_chunk_logic()
      -> CustomStreamWrapper.raise_on_model_repetition()

`web_search_options={}` is converted by `VertexGeminiConfig._map_web_search_options()` into a Gemini `googleSearch` tool.
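The mapping step can be pictured with a small stand-in. This is a sketch under assumptions, not LiteLLM's code: `map_web_search_options` and `build_gemini_tools` are hypothetical names, and the only behavior taken from this report is that `web_search_options={}` ends up as a Gemini `googleSearch` tool.

```python
from typing import Any


def map_web_search_options(web_search_options: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for VertexGeminiConfig._map_web_search_options().

    Assumption: any value, including the empty dict, yields a Gemini tool
    entry keyed by "googleSearch". The options themselves are ignored here.
    """
    return {"googleSearch": {}}


def build_gemini_tools(optional_params: dict[str, Any]) -> list[dict[str, Any]]:
    """Collect Gemini tool entries from OpenAI-style optional params."""
    tools: list[dict[str, Any]] = []
    web_search_options = optional_params.get("web_search_options")
    # Compare against None: an empty dict is falsy but still requests search.
    if web_search_options is not None:
        tools.append(map_web_search_options(web_search_options))
    return tools
```

In this sketch, passing `{"web_search_options": {}}` produces one `googleSearch` tool entry, which is the request shape that the rest of the report is about.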

During streaming, `ModelResponseIterator.chunk_parser()` first creates a stream response object with empty `choices`:

    model_response = ModelResponseStream(choices=[], id=response_id)

and only populates `choices` when candidate content is present.

For Gemini Flash / Flash Lite models with web search streaming, some metadata-only chunks appear to leave that `ModelResponseStream` with empty `choices`.
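The parsing pattern described above can be sketched with stand-in types. Everything here is an assumption made for illustration: `Delta`, `Choice`, `StreamChunk`, and `parse_chunk` are hypothetical, and the raw payloads only loosely follow the Gemini response shape (`candidates`, `usageMetadata`). It is not the body of `ModelResponseIterator.chunk_parser()`.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Delta:
    content: Optional[str] = None


@dataclass
class Choice:
    index: int
    delta: Delta


@dataclass
class StreamChunk:
    id: str
    # Starts empty, mirroring ModelResponseStream(choices=[], id=response_id).
    choices: list[Choice] = field(default_factory=list)


def parse_chunk(raw: dict[str, Any], response_id: str) -> StreamChunk:
    """Hypothetical parser: choices are added only when candidate content exists."""
    chunk = StreamChunk(id=response_id)
    for index, candidate in enumerate(raw.get("candidates", [])):
        parts = candidate.get("content", {}).get("parts", [])
        if not parts:
            continue  # candidate without content: no choice is appended
        text = "".join(part.get("text", "") for part in parts)
        chunk.choices.append(Choice(index=index, delta=Delta(content=text)))
    return chunk


# A content chunk yields one choice; a metadata-only chunk yields none.
text_chunk = parse_chunk(
    {"candidates": [{"content": {"parts": [{"text": "Hello"}]}}]}, "resp-1"
)
metadata_chunk = parse_chunk({"usageMetadata": {"totalTokenCount": 12}}, "resp-1")
```

Under these assumptions `metadata_chunk.choices` is an empty list, which is the object shape that later code has to tolerate.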

Later, `CustomStreamWrapper.raise_on_model_repetition()` assumes `choices` is non-empty:

    last_content = self.chunks[-1].choices[0].delta.content
    second_to_last_content = self.chunks[-2].choices[0].delta.content

When either of the last two chunks has empty `choices`, indexing `choices[0]` raises:

    IndexError: list index out of range

LiteLLM then wraps it as `APIConnectionError`, then `MidStreamFallbackError`.
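The failure chain can be reproduced without LiteLLM or Vertex AI. The sketch below is an illustration under assumptions: the two exception classes are local stand-ins (not the real `litellm` classes), `make_chunk` and `process_stream` are hypothetical helpers, and the wrapping logic only imitates the order reported here, not LiteLLM's actual error handling.

```python
from types import SimpleNamespace


class APIConnectionError(Exception):
    """Local stand-in for litellm.APIConnectionError (not the real class)."""


class MidStreamFallbackError(Exception):
    """Local stand-in for litellm.MidStreamFallbackError (not the real class)."""


def make_chunk(content=None, with_choice=True):
    """Build a minimal chunk; with_choice=False mimics a metadata-only chunk."""
    if with_choice:
        choices = [SimpleNamespace(delta=SimpleNamespace(content=content))]
    else:
        choices = []
    return SimpleNamespace(choices=choices)


def unguarded_repetition_check(chunks):
    # Same indexing pattern as the two lines quoted from the report.
    last_content = chunks[-1].choices[0].delta.content
    second_to_last_content = chunks[-2].choices[0].delta.content
    return last_content == second_to_last_content


def process_stream(chunks):
    """Imitate the reported chain: IndexError -> APIConnectionError -> MidStreamFallbackError."""
    try:
        try:
            return unguarded_repetition_check(chunks)
        except Exception as exc:
            raise APIConnectionError(str(exc)) from exc
    except APIConnectionError as exc:
        raise MidStreamFallbackError(f"APIConnectionError: {exc}") from exc
```

Calling `process_stream([make_chunk("Hello"), make_chunk(with_choice=False)])` raises the stand-in `MidStreamFallbackError`, and following `__cause__` twice leads back to the original `IndexError`.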

#### Expected behavior

LiteLLM should not fail when a Vertex Gemini streaming chunk has no choices.

Possible expected behavior:

- skip empty-choice metadata-only chunks, or
- guard `raise_on_model_repetition()` against chunks with empty `choices`, or
- ensure `ModelResponseIterator.chunk_parser()` returns `None` instead of `ModelResponseStream(choices=[])` when the chunk has no streamable delta.
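The three options above could look roughly like this. It is a minimal sketch with hypothetical helper names (`first_delta_content`, `last_two_contents_repeat`, `chunks_with_choices`, `parse_or_none`) and `SimpleNamespace` stand-ins for the chunk objects. It does not reflect the full logic of `raise_on_model_repetition()`, only the two indexing lines quoted in this report.

```python
from types import SimpleNamespace
from typing import Any, Optional


def make_chunk(content: Optional[str] = None, with_choice: bool = True) -> SimpleNamespace:
    """Minimal stand-in chunk; with_choice=False mimics a metadata-only chunk."""
    choices = [SimpleNamespace(delta=SimpleNamespace(content=content))] if with_choice else []
    return SimpleNamespace(choices=choices)


def first_delta_content(chunk: Any) -> Optional[str]:
    """Return choices[0].delta.content, or None when the chunk has no choices."""
    choices = getattr(chunk, "choices", None) or []
    if not choices:
        return None
    delta = getattr(choices[0], "delta", None)
    return getattr(delta, "content", None)


def chunks_with_choices(chunks: list) -> list:
    """Option 1: keep only chunks that carry at least one choice."""
    return [chunk for chunk in chunks if getattr(chunk, "choices", None)]


def last_two_contents_repeat(chunks: list) -> bool:
    """Option 2: compare the last two deltas without assuming choices exist."""
    if len(chunks) < 2:
        return False
    last_content = first_delta_content(chunks[-1])
    second_to_last_content = first_delta_content(chunks[-2])
    # Missing content counts as "no repetition" instead of raising IndexError.
    return last_content is not None and last_content == second_to_last_content


def parse_or_none(raw: dict, response_id: str) -> Optional[SimpleNamespace]:
    """Option 3: return None when the raw chunk has no streamable delta.

    Simplification: text from all candidates is merged into a single choice.
    """
    parts = [
        part
        for candidate in raw.get("candidates", [])
        for part in candidate.get("content", {}).get("parts", [])
    ]
    if not parts:
        return None
    chunk = make_chunk("".join(part.get("text", "") for part in parts))
    chunk.id = response_id
    return chunk
```

Option 2 is the narrowest change, since it touches only the comparison. Options 1 and 3 drop chunks earlier in the pipeline, so a real fix along those lines would presumably need to check that usage data requested through `stream_options={"include_usage": True}` is not discarded along with them.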

#### Actual behavior

LiteLLM raises mid-stream:

    MidStreamFallbackError -> APIConnectionError -> IndexError: list index out of range

#### Environment

    LiteLLM version: 1.83.14
    Python: 3.12
    Provider: Vertex AI
    Endpoint style: Vertex AI streaming / streamGenerateContent

    Failing models:
      - vertex_ai/gemini-3.1-flash-lite-preview
      - vertex_ai/gemini-3.1-flash-lite
      - vertex_ai/gemini-3-flash-preview

    Working comparison model:
      - vertex_ai/gemini-3.1-pro-preview

I also checked current LiteLLM main around version 1.85.0, and the relevant code still appears to assume non-empty `choices` in `raise_on_model_repetition()`:

    last_content = self.chunks[-1].choices[0].delta.content
    second_to_last_content = self.chunks[-2].choices[0].delta.content

#### Related issues

Possibly related, but not the same failure:


### Steps to Reproduce

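```python
import asyncio
import os

from litellm import acompletion

# Assumed from the model lists in this report; the original snippet did not
# show how MODELS was defined. The first three fail, the last one works.
MODELS = [
    "vertex_ai/gemini-3.1-flash-lite-preview",
    "vertex_ai/gemini-3.1-flash-lite",
    "vertex_ai/gemini-3-flash-preview",
    "vertex_ai/gemini-3.1-pro-preview",
]


async def main() -> None:
    # Credentials and project are read from the environment.
    vertex_credentials = os.environ.get("GOOGLE_VERTEX_AI_CREDENTIALS")
    vertex_project = os.environ.get("GOOGLE_VERTEX_AI_PROJECT")

    for model in MODELS:
        print(f"\nTesting {model}")
        response = await acompletion(
            model=model,
            messages=[
                {
                    "role": "user",
                    "content": "Search the web and summarize the latest news about OpenAI.",
                }
            ],
            stream=True,
            web_search_options={},  # mapped to a Gemini googleSearch tool
            stream_options={"include_usage": True},
            vertex_credentials=vertex_credentials,
            vertex_project=vertex_project,
            vertex_location="global",
        )

        # On the failing models, iteration raises MidStreamFallbackError.
        async for chunk in response:
            print(chunk)


if __name__ == "__main__":
    asyncio.run(main())
```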

### Relevant log output

    File ".../litellm/litellm_core_utils/streaming_handler.py", line 1587, in chunk_creator
      return self.return_processed_chunk_logic(...)

    File ".../litellm/litellm_core_utils/streaming_handler.py", line 960, in return_processed_chunk_logic
      self.raise_on_model_repetition()

    File ".../litellm/litellm_core_utils/streaming_handler.py", line 272, in raise_on_model_repetition
      last_content = self.chunks[-1].choices[0].delta.content
                     ~~~~~~~~~~~~~~~~~~~~~~~^^^

    IndexError: list index out of range

### What part of LiteLLM is this about?

SDK (litellm Python package)

### What LiteLLM version are you on?

1.83.14

### Twitter / LinkedIn details

@secchinman
