litellm - ✅ (Solved) [Bug]: Streaming final chunk drops non-OpenAI attributes (preserve_upstream_non_openai_attributes not called) [1 pull request, 1 participant]

GitHub stats
BerriAI/litellm#23444, fetched 2026-04-08 00:44:17
Comments: 0, Participants: 1, Timeline events: 5, Reactions: 0
Timeline (top): cross-referenced ×2, closed ×1, labeled ×1, referenced ×1

Root Cause

In litellm/litellm_core_utils/streaming_handler.py, method return_processed_chunk_logic():

  • Content-bearing chunks (~line 917): preserve_upstream_non_openai_attributes() is called — custom attributes are preserved ✅
  • Final chunk (elif self.received_finish_reason is not None, ~line 948): this code path returns model_response without calling preserve_upstream_non_openai_attributes() — custom attributes are dropped ❌

The fix should be a single call to preserve_upstream_non_openai_attributes(model_response, original_chunk) in the received_finish_reason branch, before return model_response at line 981.

Fix Action

Fixed

PR fix notes

PR #23450: fix: preserve non-OpenAI attributes in final streaming chunk

Description (problem / solution / changelog)

Summary

When an upstream server or proxy callback injects custom attributes into SSE streaming chunks, preserve_upstream_non_openai_attributes() correctly copies them to the ModelResponseStream for content-bearing chunks — but not for the final chunk (the one with finish_reason set and empty/null content).

This means any custom metadata attached to the last SSE chunk by server-side callbacks or middleware is silently dropped by the SDK.

Root Cause

In litellm/litellm_core_utils/streaming_handler.py, method return_processed_chunk_logic():

  • Content-bearing chunks (line ~963): preserve_upstream_non_openai_attributes() is called — custom attributes are preserved ✅
  • Final chunk (elif self.received_finish_reason is not None, line ~996): this code path returns model_response without calling preserve_upstream_non_openai_attributes() — custom attributes are dropped ❌

Fix

Added a call to preserve_upstream_non_openai_attributes(model_response, original_chunk) in the received_finish_reason branch, before returning model_response.
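
To make the two code paths concrete, here is a minimal, self-contained sketch of the pattern. It is a toy model, not LiteLLM's actual implementation: `OPENAI_CHUNK_FIELDS`, `copy_extra_attributes`, `process_chunk`, and the `patched` flag are all hypothetical names introduced for illustration, and chunks are modelled as plain dicts rather than `ModelResponseStream` objects.

```python
from typing import Any, Dict

# Hypothetical stand-in for the fields defined by the OpenAI chunk schema.
OPENAI_CHUNK_FIELDS = {"id", "object", "created", "model", "choices", "usage"}


def copy_extra_attributes(response: Dict[str, Any], original_chunk: Dict[str, Any]) -> None:
    """Copy keys that are not part of the standard schema onto the response."""
    for key, value in original_chunk.items():
        if key not in OPENAI_CHUNK_FIELDS and key not in response:
            response[key] = value


def process_chunk(original_chunk: Dict[str, Any], patched: bool) -> Dict[str, Any]:
    """Toy two-branch handler: content-bearing chunks vs. the final chunk."""
    choice = original_chunk["choices"][0]
    response: Dict[str, Any] = {"choices": original_chunk["choices"]}
    if choice.get("delta", {}).get("content"):
        # Content-bearing branch: extra attributes were always preserved.
        copy_extra_attributes(response, original_chunk)
    elif choice.get("finish_reason") is not None:
        # Final-chunk branch: the call below models what the fix adds.
        if patched:
            copy_extra_attributes(response, original_chunk)
    return response


# Made-up final chunk carrying a custom attribute.
final_chunk = {
    "choices": [{"delta": {}, "finish_reason": "stop"}],
    "custom_field": {"key": "value"},
}
before_fix = process_chunk(final_chunk, patched=False)
after_fix = process_chunk(final_chunk, patched=True)
```

In this model, `before_fix` lacks `custom_field` while `after_fix` carries it, which mirrors the behaviour the issue describes before and after the added call.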

Testing

The fix follows the same pattern as the existing implementation for content-bearing chunks, ensuring consistency.

Closes #23444

Changed files

  • litellm/litellm_core_utils/streaming_handler.py (modified, +8/-0)
  • litellm/llms/anthropic/chat/transformation.py (modified, +5/-0)

Code Example

async def async_post_call_streaming_iterator_hook(self, user_api_key_dict, response, request_data):
    async for chunk in response:
        if chunk.choices and chunk.choices[0].finish_reason:
            chunk.custom_field = {"key": "value"}
        yield chunk

---

response = litellm.completion(model="gpt-4", messages=[...], stream=True)
for chunk in response:
    data = chunk.model_dump()
    if "custom_field" in data:
        print("Found custom_field")  # Never prints

Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When an upstream server or proxy callback injects custom attributes into SSE streaming chunks, preserve_upstream_non_openai_attributes() correctly copies them to the ModelResponseStream for content-bearing chunks — but not for the final chunk (the one with finish_reason set and empty/null content).

This means any custom metadata attached to the last SSE chunk by server-side callbacks or middleware is silently dropped by the SDK. The client never sees it.

Root cause

In litellm/litellm_core_utils/streaming_handler.py, method return_processed_chunk_logic():

  • Content-bearing chunks (~line 917): preserve_upstream_non_openai_attributes() is called — custom attributes are preserved ✅
  • Final chunk (elif self.received_finish_reason is not None, ~line 948): this code path returns model_response without calling preserve_upstream_non_openai_attributes() — custom attributes are dropped ❌

The fix should be a single call to preserve_upstream_non_openai_attributes(model_response, original_chunk) in the received_finish_reason branch, before return model_response at line 981.

Steps to Reproduce

  1. Set up a proxy callback that injects a custom field into the last SSE chunk:

     async def async_post_call_streaming_iterator_hook(self, user_api_key_dict, response, request_data):
         async for chunk in response:
             if chunk.choices and chunk.choices[0].finish_reason:
                 chunk.custom_field = {"key": "value"}
             yield chunk

  2. On the client, consume the stream and check for the custom field:

     response = litellm.completion(model="gpt-4", messages=[...], stream=True)
     for chunk in response:
         data = chunk.model_dump()
         if "custom_field" in data:
             print("Found custom_field")  # Never prints

  3. The same request via raw HTTP (parsing SSE lines directly) does show custom_field in the last chunk's JSON.
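
The raw-HTTP comparison in the last step can be sketched as follows. This is only an illustration under stated assumptions: `iter_sse_json` is a hypothetical helper, the stream is assumed to use OpenAI-style `data:` lines ending with a `[DONE]` sentinel, and `sample_lines` is made-up data standing in for a real HTTP response body.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield the decoded JSON payload of each `data:` line in an SSE stream."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # OpenAI-style end-of-stream sentinel
        yield json.loads(payload)


# Made-up sample stream; a real one would come from the HTTP response body.
sample_lines = [
    'data: {"choices": [{"delta": {"content": "Hi"}, "finish_reason": null}]}',
    "",
    'data: {"choices": [{"delta": {}, "finish_reason": "stop"}], "custom_field": {"key": "value"}}',
    "",
    "data: [DONE]",
]

chunks = list(iter_sse_json(sample_lines))
has_custom_field = ["custom_field" in chunk for chunk in chunks]
print(has_custom_field)  # [False, True]
```

Parsing the wire format directly bypasses the SDK's chunk processing, so whatever the server attached to the final chunk is visible as-is.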

Expected behavior

preserve_upstream_non_openai_attributes() should be called for all chunks that are returned to the client, including the final chunk with finish_reason. The function already exists and works correctly — it just needs to be called in the received_finish_reason branch.

What part of LiteLLM is this about?

SDK, Proxy

What LiteLLM version are you on?

v1.80.10 (also verified on v1.82.x — same code path)

extent analysis

Fix Plan

To fix the issue, we need to add a call to preserve_upstream_non_openai_attributes() in the received_finish_reason branch. Here are the steps:

  • Open litellm/litellm_core_utils/streaming_handler.py
  • Locate the return_processed_chunk_logic() method
  • In the elif self.received_finish_reason is not None branch, add the following line before return model_response:
preserve_upstream_non_openai_attributes(model_response, original_chunk)

The corrected code should look like this:

elif self.received_finish_reason is not None:
    # ...
    preserve_upstream_non_openai_attributes(model_response, original_chunk)
    return model_response

Verification

To verify the fix, repeat the steps to reproduce the issue:

  1. Set up a proxy callback that injects a custom field into the last SSE chunk:

     async def async_post_call_streaming_iterator_hook(self, user_api_key_dict, response, request_data):
         async for chunk in response:
             if chunk.choices and chunk.choices[0].finish_reason:
                 chunk.custom_field = {"key": "value"}
             yield chunk

  2. On the client, consume the stream and check for the custom field:

     response = litellm.completion(model="gpt-4", messages=[...], stream=True)
     for chunk in response:
         data = chunk.model_dump()
         if "custom_field" in data:
             print("Found custom_field")  # Should print now

The custom field should now be present in the last chunk's JSON.

Extra Tips

  • Test the fix with several kinds of custom attributes (for example strings, lists, and nested dicts) and with final chunks whose content is empty or null.
  • Consider adding temporary logging to confirm that preserve_upstream_non_openai_attributes() is actually reached for the final chunk.
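
One way to act on both tips is sketched below. It is a hedged illustration, not LiteLLM code: `log_calls`, `copy_extras`, the `stream_debug` logger name, and the sample attributes are all hypothetical. The decorator counts and logs each call so it is easy to see whether the helper ran, and the loop exercises it with several attribute types.

```python
import functools
import logging
from typing import Any, Callable

logger = logging.getLogger("stream_debug")  # hypothetical logger name


def log_calls(func: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap `func` so each call is counted and logged at DEBUG level."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        wrapper.call_count += 1
        logger.debug("%s call #%d", func.__name__, wrapper.call_count)
        return func(*args, **kwargs)

    wrapper.call_count = 0
    return wrapper


@log_calls
def copy_extras(response: dict, original_chunk: dict) -> None:
    """Hypothetical stand-in for the attribute-preserving helper."""
    for key, value in original_chunk.items():
        response.setdefault(key, value)


# Exercise the helper with several attribute types.
samples = [
    {"custom_field": {"key": "value"}},  # nested dict
    {"trace_id": "abc"},                 # string
    {"scores": [0.1, 0.9]},              # list
    {"cached": True},                    # boolean
]
results = []
for extra in samples:
    response: dict = {}
    copy_extras(response, extra)
    results.append(response)
```

After the loop, `copy_extras.call_count` shows how many times the helper ran, which is the kind of signal that would have exposed the missing call in the final-chunk branch.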
