litellm - ✅(Solved) Fix [Bug]: Snowflake streaming responses with tool calls not parsed correctly [1 pull requests, 1 participants]

litellm · 2026-03-10 18:41:15

GitHub stats

BerriAI/litellm#23286 • Fetched 2026-04-08 00:37:39

Author

stevejaker

Participants

stevejaker

Timeline (top)

labeled ×3, cross-referenced ×1

Fix Action

Fixed

Fixed by PR: fix(snowflake): add streaming handler for tool calls (https://github.com/BerriAI/litellm/pull/23207)

PR fix notes

PR #23207: fix(snowflake): add streaming handler for tool calls

Repository: BerriAI/litellm
Author: stevejaker
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/23207

Description (problem / solution / changelog)

Snowflake uses an Anthropic-style streaming format with content_block events for tool calls, but was inheriting OpenAI's streaming handler which expects a different format. This caused streaming tool calls to fail.

This commit adds:

SnowflakeStreamingHandler class that transforms content_block events to OpenAI-compatible streaming chunks
get_model_response_iterator() override in SnowflakeConfig to use the custom handler
Comprehensive unit tests for streaming scenarios

The handler supports:

content_block_start/delta/stop events for text and tool_use
message_start/delta events for usage and finish_reason
Passthrough for already OpenAI-compatible chunks
Multiple sequential tool calls with proper indexing
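
The indexing of multiple sequential tool calls can be illustrated with a minimal sketch. It is not the PR's code: the function name index_tool_calls is hypothetical, and the event shapes are assumed to match the log output quoted in the issue. Each tool_use block that starts receives the next index, and its input_json_delta fragments are appended to that call until the block stops.

```python
def index_tool_calls(events):
    """Group Snowflake-style events into tool calls, numbered from 0 (hypothetical helper)."""
    calls = []
    current = None  # the tool_use block currently being streamed, if any
    for event in events:
        kind = event.get("type")
        if kind == "content_block_start":
            block = event.get("content_block", {})
            if block.get("type") == "tool_use":
                # A new tool call gets the next free index.
                current = {
                    "index": len(calls),
                    "id": block["id"],
                    "name": block["name"],
                    "arguments": "",
                }
                calls.append(current)
            else:
                current = None
        elif kind == "content_block_delta" and current is not None:
            delta = event.get("delta", {})
            if delta.get("type") == "input_json_delta":
                # Fragments are kept as text and only concatenated here.
                current["arguments"] += delta.get("partial_json", "")
        elif kind == "content_block_stop":
            current = None
    return calls


# Two tool calls streamed one after the other (made-up ids).
events = [
    {"type": "content_block_start", "content_block": {"type": "tool_use", "id": "tooluse_a", "name": "get_weather"}},
    {"type": "content_block_delta", "delta": {"type": "input_json_delta", "partial_json": '{"location": "Paris"}'}},
    {"type": "content_block_stop"},
    {"type": "content_block_start", "content_block": {"type": "tool_use", "id": "tooluse_b", "name": "get_weather"}},
    {"type": "content_block_delta", "delta": {"type": "input_json_delta", "partial_json": '{"location": "Rome"}'}},
    {"type": "content_block_stop"},
]
calls = index_tool_calls(events)
print([(c["index"], c["id"]) for c in calls])
```

Deriving the index from the number of tool calls seen so far keeps the numbering stable even when text blocks are interleaved between tool_use blocks.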

Relevant issues

Fixes https://github.com/BerriAI/litellm/issues/23286

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

Added SnowflakeStreamingHandler class in litellm/llms/snowflake/chat/transformation.py
Added get_model_response_iterator() method to SnowflakeConfig
Added 10 new unit tests in tests/test_litellm/llms/snowflake/chat/test_snowflake_chat_transformation.py

Tests

All 22 Snowflake tests pass (poetry run pytest tests/test_litellm/llms/snowflake/ -v)
Linting passes on changed files (poetry run black --check and poetry run ruff check)
MyPy passes on changed files

Changed files

litellm/llms/snowflake/chat/transformation.py (modified, +255/-3)
tests/test_litellm/llms/snowflake/chat/test_snowflake_chat_transformation.py (modified, +341/-3)

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When using Snowflake Cortex LLM API with stream=True and tools, streaming tool call responses are not correctly parsed. Snowflake streams tool calls using content_block_start, content_block_delta, and content_block_stop events, but LiteLLM's default OpenAI streaming handler expects delta.tool_calls format.

This causes streaming tool calls to either fail or return malformed data.

Expected: Streaming responses with tool calls should be correctly transformed from Snowflake's content_block format to OpenAI's delta.tool_calls format.

Steps to Reproduce

import litellm

tools = [{
	"type": "function",
	"function": {
		"name": "get_weather",
		"description": "Get the current weather",
		"parameters": {
			"type": "object",
			"properties": {
				"location": {"type": "string"}
			},
			"required": ["location"]
		}
	}
}]

response = litellm.completion(
	model="snowflake/claude-3-5-sonnet",
	messages=[{"role": "user", "content": "What's the weather in Paris?"}],
	tools=tools,
	tool_choice={"type": "auto"},
	stream=True,
	api_key="pat/your-token",
	account_id="your-account-id",
)

# Iterate through streaming chunks
for chunk in response:
	print(chunk)
	# Tool calls in delta are not properly parsed
	if hasattr(chunk.choices[0].delta, 'tool_calls'):
		print(f"Tool calls: {chunk.choices[0].delta.tool_calls}")

Relevant log output

Snowflake sends streaming events like:
{"type": "content_block_start", "content_block": {"type": "tool_use", "id": "tooluse_abc123", "name": "get_weather"}}
{"type": "content_block_delta", "delta": {"type": "input_json_delta", "partial_json": "{\"location\": \""}}
{"type": "content_block_delta", "delta": {"type": "input_json_delta", "partial_json": "Paris\"}"}}
{"type": "content_block_stop"}

But the default handler expects OpenAI format:
{"choices": [{"delta": {"tool_calls": [{"index": 0, "id": "...", "function": {"name": "...", "arguments": "..."}}]}}]}
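
One detail of the events above is worth making explicit: the tool arguments arrive split across two input_json_delta events, and neither fragment is valid JSON by itself. The short sketch below uses the two partial_json values from the log output and only the standard library; the helper name parses is made up for the illustration.

```python
import json

# The two partial_json fragments from the log output above.
fragments = ['{"location": "', 'Paris"}']


def parses(text):
    """Return True if text is a complete JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Each fragment fails to parse alone; the concatenation succeeds.
fragment_ok = [parses(fragment) for fragment in fragments]
arguments = json.loads("".join(fragments))
print(fragment_ok, arguments)
```

This is why a translating handler should forward or accumulate the fragments as plain strings and leave JSON parsing until the tool call is complete.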

What part of LiteLLM is this about?

SDK (litellm Python package)

What LiteLLM version are you on ?

v1.63.2

extent analysis

Fix Plan

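The default OpenAI streaming handler cannot parse Snowflake's content_block events, so the stream needs a handler that translates them into OpenAI-style delta.tool_calls chunks. Per PR #23207, this handler lives inside LiteLLM (litellm/llms/snowflake/chat/transformation.py) and is selected by SnowflakeConfig.get_model_response_iterator(); the caller does not pass it to litellm.completion.

Step 1: Translate content_block Events

The class below is a standalone, simplified sketch of that translation, not the code from the PR. It handles only tool_use blocks, forwards partial_json fragments as strings (a single fragment is usually not valid JSON on its own), and emits only the new piece of a tool call in each chunk.

class SnowflakeStreamingHandler:
    """Simplified sketch: Snowflake content_block events -> OpenAI-style deltas."""

    def __init__(self):
        self.tool_index = -1  # index of the tool call currently being streamed

    def handle_chunk(self, chunk):
        # Pass through chunks that are already OpenAI-compatible.
        if "choices" in chunk:
            return chunk
        event = chunk.get("type")
        if event == "content_block_start":
            block = chunk.get("content_block", {})
            if block.get("type") != "tool_use":
                return None
            # Each new tool_use block gets the next index.
            self.tool_index += 1
            tool_call = {
                "index": self.tool_index,
                "id": block["id"],
                "type": "function",
                "function": {"name": block["name"], "arguments": ""},
            }
        elif event == "content_block_delta":
            delta = chunk.get("delta", {})
            if delta.get("type") != "input_json_delta":
                return None
            # Forward the raw fragment; the consumer concatenates fragments.
            tool_call = {
                "index": self.tool_index,
                "function": {"arguments": delta.get("partial_json", "")},
            }
        else:
            # content_block_stop and other events carry no tool-call data.
            return None
        return {"choices": [{"index": 0, "delta": {"tool_calls": [tool_call]}}]}

Step 2: Consume the Translated Stream

With the PR applied, no extra argument is needed: litellm.completion is called exactly as in Steps to Reproduce (reusing the tools list defined there). On the consumer side, concatenate the arguments fragments per tool-call index and parse the JSON only after the stream ends.

import json
import litellm

response = litellm.completion(
    model="snowflake/claude-3-5-sonnet",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice={"type": "auto"},
    stream=True,
    api_key="pat/your-token",
    account_id="your-account-id",
)

# Concatenate argument fragments per tool-call index.
calls = {}
for chunk in response:
    for tc in getattr(chunk.choices[0].delta, "tool_calls", None) or []:
        call = calls.setdefault(tc.index, {"name": None, "arguments": ""})
        if tc.function.name:
            call["name"] = tc.function.name
        call["arguments"] += tc.function.arguments or ""

# Parse each call's arguments once the stream has ended.
for index, call in sorted(calls.items()):
    print(index, call["name"], json.loads(call["arguments"] or "{}"))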

Verification

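Verify the fix by streaming the request above: delta.tool_calls should carry each tool call's id and name once, followed by argument fragments that concatenate into valid JSON. The PR's own check is the Snowflake test suite (poetry run pytest tests/test_litellm/llms/snowflake/ -v).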

Extra Tips

Catch json.JSONDecodeError when parsing accumulated tool arguments; a truncated stream leaves the JSON incomplete.
Log the type of any event the handler does not recognize, so unexpected stream formats are easy to diagnose.
The issue was reported against LiteLLM v1.63.2; on other versions the handler may need adjusting to match API changes.
