litellm - [Bug]: `stream_chunk_builder` corrupts Gemini `server_side_tool_invocations` and `thought_signatures` in streaming, causing "Corrupted tool call context" on follow-up turns

GitHub stats
BerriAI/litellm#25869 · Fetched 2026-04-17 08:28:27
Comments: 0 · Participants: 1 · Timeline: 3 (labeled ×3) · Reactions: 0

Error Message

BadRequestError: "contents[1].parts[1]: Corrupted tool call context."

Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When using Gemini models with streaming + web search + tool calls, follow-up turns fail with 400 Bad Request: "Corrupted tool call context". The root cause is in stream_chunk_builder — it incorrectly merges provider_specific_fields across streaming chunks.

Root Cause

Gemini streams server_side_tool_invocations across multiple chunks:

  • Chunk 1 contains toolCall with args and a correct thought_signature (~600 bytes)
  • Chunk 2 contains toolResponse with response and a bloated thought_signature (~110KB, appears to contain the search HTML response)

In the non-streaming path, _extract_server_side_tool_invocations correctly handles this by merging toolCall and toolResponse by id, keeping the toolCall's thought_signature and only using the toolResponse's if the call doesn't already have one (lines 1411-1419 of vertex_and_google_ai_studio_gemini.py).

However, stream_chunk_builder (line ~7604 of main.py) simply takes the last list value for all list-type provider_specific_fields:

# For lists like web_search_results, take the last (most complete) one
combined_provider_fields[key] = value

This overwrites the correct server_side_tool_invocations entry (with args + correct signature) with the later one (missing args, bloated signature). When this corrupted data is sent back to Gemini on the next turn, it rejects it.

The same issue affects thought_signatures — they arrive across multiple chunks but only the last chunk's value is kept, losing earlier signatures.
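
The loss described above can be reproduced without calling Gemini. The sketch below is a minimal illustration, not LiteLLM's actual code: the chunk payloads are invented, and the dictionary keys (`id`, `args`, `response`, `thought_signature`) are assumptions based on the description in this issue.

```python
# Hypothetical provider_specific_fields from two streamed chunks.
# Key names and values are assumptions for illustration only.
chunk_fields = [
    {
        "server_side_tool_invocations": [
            {"id": "srv_1", "args": {"query": "Brisbane population"},
             "thought_signature": "sig-call"}
        ],
        "thought_signatures": ["sig-a"],
    },
    {
        "server_side_tool_invocations": [
            {"id": "srv_1", "response": {"results": "..."},
             "thought_signature": "sig-response-bloated"}
        ],
        "thought_signatures": ["sig-b"],
    },
]


def last_wins(all_fields):
    """Naive merge: a later chunk's value replaces any earlier one."""
    combined = {}
    for fields in all_fields:
        for key, value in fields.items():
            combined[key] = value
    return combined


combined = last_wins(chunk_fields)
invocation = combined["server_side_tool_invocations"][0]

print("args" in invocation)             # False: the toolCall args are gone
print(invocation["thought_signature"])  # sig-response-bloated
print(combined["thought_signatures"])   # ['sig-b']: the first signature is lost
```

Only the second chunk's entry survives, which matches the "missing args, bloated signature" state that Gemini rejects on the next turn.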

Suggested Fix

In stream_chunk_builder (around line 7604 of main.py), replace the naive last-wins merge with proper handling for server_side_tool_invocations and thought_signatures:

for key, value in fields.items():
    if key not in combined_provider_fields:
        combined_provider_fields[key] = value
    elif key == 'server_side_tool_invocations' and isinstance(value, list):
        # Merge by id, first-wins per key (matches _extract_server_side_tool_invocations behavior)
        existing = {e.get('id'): e for e in combined_provider_fields[key]}
        for inv in value:
            inv_id = inv.get('id', '')
            if inv_id in existing:
                for k, v in inv.items():
                    if v is not None and k not in existing[inv_id]:
                        existing[inv_id][k] = v
            else:
                existing[inv_id] = dict(inv)
        combined_provider_fields[key] = list(existing.values())
    elif key == 'thought_signatures' and isinstance(value, list):
        # Accumulate all thought signatures across chunks
        combined_provider_fields[key] = combined_provider_fields[key] + value
    else:
        # All other fields (scalars, or lists such as web_search_results): the latest chunk's value wins
        combined_provider_fields[key] = value
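
To try the suggested merge outside of `main.py`, the same logic can be wrapped in a standalone function and run against fake chunks. This is a sketch under stated assumptions: `merge_provider_fields` is a hypothetical helper name, the sample payloads are invented, and the first value is deep-copied so the input chunks are not mutated (the snippet above does not do this). It also uses `.get("id", "")` on both sides so entries without an id are matched consistently.

```python
from copy import deepcopy


def merge_provider_fields(all_fields):
    """Merge per-chunk provider fields (hypothetical helper, not a LiteLLM API)."""
    combined = {}
    for fields in all_fields:
        for key, value in fields.items():
            if key not in combined:
                # Copy so later in-place merges do not touch the source chunk
                combined[key] = deepcopy(value)
            elif key == "server_side_tool_invocations" and isinstance(value, list):
                # Merge by id; keys already present keep their first value
                existing = {e.get("id", ""): e for e in combined[key]}
                for inv in value:
                    inv_id = inv.get("id", "")
                    if inv_id in existing:
                        for k, v in inv.items():
                            if v is not None and k not in existing[inv_id]:
                                existing[inv_id][k] = v
                    else:
                        existing[inv_id] = dict(inv)
                combined[key] = list(existing.values())
            elif key == "thought_signatures" and isinstance(value, list):
                # Accumulate signatures from every chunk
                combined[key] = combined[key] + value
            else:
                combined[key] = value
    return combined


# Invented sample data; key names are assumptions based on the issue text
sample = [
    {"server_side_tool_invocations": [
        {"id": "srv_1", "args": {"query": "q"}, "thought_signature": "sig-call"}],
     "thought_signatures": ["sig-a"]},
    {"server_side_tool_invocations": [
        {"id": "srv_1", "response": {"results": "..."}, "thought_signature": "sig-bloated"}],
     "thought_signatures": ["sig-b"]},
]

merged = merge_provider_fields(sample)
merged_inv = merged["server_side_tool_invocations"][0]
print(merged_inv["thought_signature"])  # sig-call
print(sorted(merged_inv))               # ['args', 'id', 'response', 'thought_signature']
print(merged["thought_signatures"])     # ['sig-a', 'sig-b']
```

The merged entry keeps the `args` and signature from the first chunk and gains the `response` from the second, which is the behavior the issue attributes to the non-streaming path.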

Environment

  • litellm version: 1.83.0 (but affects all recent versions with Gemini 3 streaming support)
  • Python 3.12
  • Model: gemini/gemini-3-flash-preview (also affects gemini-3-pro-preview)

Steps to Reproduce

import asyncio

import litellm
from litellm import acompletion

model = "gemini/gemini-3-flash-preview"

def simple_add(a: int, b: int) -> int:
    "Add two numbers together"
    return a + b

tool_schema = {'type': 'function', 'function': {
    'name': 'simple_add', 'description': 'Add two numbers',
    'parameters': {'type': 'object', 'properties': {
        'a': {'type': 'integer'}, 'b': {'type': 'integer'}
    }, 'required': ['a', 'b']}
}}

msgs = [{'role': 'user', 'content': "Search the web for Brisbane's population and also use simple_add to add 1+1."}]

async def main():
    # First turn - streaming with search + tools
    chunks = []
    r = await acompletion(model=model, messages=msgs, stream=True,
        tools=[tool_schema], web_search_options={"search_context_size": "low"},
        include_server_side_tool_invocations=True, stream_options={"include_usage": True})
    async for chunk in r:
        chunks.append(chunk)

    # Rebuild response from chunks
    result = litellm.stream_chunk_builder(chunks)
    assistant_msg = result.choices[0].message

    # Add tool result
    msgs.append(assistant_msg)
    # First client-side tool call; ids starting with 'srvtoolu_' are server-side and are skipped
    tc = [t for t in assistant_msg.tool_calls if not t.id.startswith('srvtoolu_')][0]
    msgs.append({'role': 'tool', 'tool_call_id': tc.id, 'name': 'simple_add', 'content': '2'})

    # Second turn fails
    r2 = await acompletion(model=model, messages=msgs, tools=[tool_schema],
        web_search_options={"search_context_size": "low"},
        include_server_side_tool_invocations=True)
    # => BadRequestError: "contents[1].parts[1]: Corrupted tool call context."

# As a script, run the coroutine with asyncio.run; in a notebook, use `await main()` instead
asyncio.run(main())

The same sequence works with stream=False on the first turn.
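
Until the merge is fixed, a pre-flight check on the rebuilt message can flag the corrupted state before the second request is sent. The sketch below is only an illustration: `find_suspect_invocations` and its size threshold are hypothetical, and it assumes the invocations are available as a list of dicts with `id`, `args`, and `thought_signature` keys. The issue does not confirm where the rebuilt message exposes them.

```python
def find_suspect_invocations(invocations, max_signature_chars=10_000):
    """Flag entries that look like the corrupted state described in this issue.

    Hypothetical helper: the key names and the size threshold are assumptions.
    """
    suspects = []
    for inv in invocations:
        reasons = []
        if "args" not in inv:
            reasons.append("missing args")
        signature = inv.get("thought_signature") or ""
        if len(signature) > max_signature_chars:
            reasons.append("oversized thought_signature")
        if reasons:
            suspects.append((inv.get("id"), reasons))
    return suspects


# Invented examples sized after the figures in the issue (~600 bytes vs ~110KB)
healthy = {"id": "srv_1", "args": {"query": "q"}, "response": {},
           "thought_signature": "s" * 600}
corrupted = {"id": "srv_2", "response": {}, "thought_signature": "s" * 110_000}

report = find_suspect_invocations([healthy, corrupted])
print(report)  # [('srv_2', ['missing args', 'oversized thought_signature'])]
```

A non-empty report on the streamed result, and an empty one with stream=False, would be consistent with the root cause described above.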

Relevant log output

What part of LiteLLM is this about?

SDK (litellm Python package)

What LiteLLM version are you on?

1.83.0


extent analysis

TL;DR

The most likely fix is to update the stream_chunk_builder function in main.py to properly merge server_side_tool_invocations and thought_signatures across streaming chunks.

Guidance

  • Review the provided code snippet for stream_chunk_builder and apply the suggested fix to handle server_side_tool_invocations and thought_signatures correctly.
  • Verify that the fix resolves the 400 Bad Request: "Corrupted tool call context" error by re-running the steps to reproduce with the updated code.
  • Ensure that the stream_chunk_builder function is correctly merging the provider_specific_fields across chunks, especially for server_side_tool_invocations and thought_signatures.
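  • Test the fix beyond the reproduction case, for example with several server-side tool invocations in one turn, with streaming but no web search, and with gemini-3-pro-preview, which the issue reports as also affected.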

Example

The following excerpt from the issue's suggested fix for stream_chunk_builder shows how server_side_tool_invocations are merged by id. The thought_signatures branch is elided here:

for key, value in fields.items():
    if key not in combined_provider_fields:
        combined_provider_fields[key] = value
    elif key == 'server_side_tool_invocations' and isinstance(value, list):
        # Merge by id, first-wins per key (matches _extract_server_side_tool_invocations behavior)
        existing = {e.get('id'): e for e in combined_provider_fields[key]}
        for inv in value:
            inv_id = inv.get('id', '')
            if inv_id in existing:
                for k, v in inv.items():
                    if v is not None and k not in existing[inv_id]:
                        existing[inv_id][k] = v
            else:
                existing[inv_id] = dict(inv)
        combined_provider_fields[key] = list(existing.values())
    # ...

Notes

The fix assumes that the stream_chunk_builder function is the root cause of the issue. If the problem persists after applying the fix, further investigation may be necessary to identify the underlying cause.

Recommendation

Apply the suggested fix to stream_chunk_builder, then re-run the steps to reproduce and confirm that the second turn no longer fails with 400 Bad Request: "Corrupted tool call context".
