litellm - ✅(Solved) Fix [Bug]: Streaming final chunk drops non-OpenAI attributes (preserve_upstream_non_openai_attributes not called) [1 pull requests, 1 participants]

Q: Expected behavior

`preserve_upstream_non_openai_attributes()` should be called for **all** chunks that are returned to the client, including the final chunk with `finish_reason`. The function already exists and works correctly — it just needs to be called in the `received_finish_reason` branch.

litellm2026-03-12 12:18:31

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

BerriAI/litellm#23444•Fetched 2026-04-08 00:44:17

View on GitHub

Comments

Participants

Timeline

Reactions

Author

voginskis-clgx

Participants

voginskis-clgx

Timeline (top)

cross-referenced ×2closed ×1labeled ×1referenced ×1

Root Cause

In litellm/litellm_core_utils/streaming_handler.py, method return_processed_chunk_logic():

Content-bearing chunks (~line 917): preserve_upstream_non_openai_attributes() is called — custom attributes are preserved ✅
Final chunk (elif self.received_finish_reason is not None, ~line 948): this code path returns model_response without calling preserve_upstream_non_openai_attributes() — custom attributes are dropped ❌

The fix should be a single call to preserve_upstream_non_openai_attributes(model_response, original_chunk) in the received_finish_reason branch, before return model_response at line 981.

Fix Action

Fixed

Fixed by PR: fix: preserve non-OpenAI attributes in final streaming chunk (https://github.com/BerriAI/litellm/pull/23450)

PR fix notes

PR #23450: fix: preserve non-OpenAI attributes in final streaming chunk

Repository: BerriAI/litellm
Author: Jah-yee
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/23450

Description (problem / solution / changelog)

Summary

When an upstream server or proxy callback injects custom attributes into SSE streaming chunks, preserve_upstream_non_openai_attributes() correctly copies them to the ModelResponseStream for content-bearing chunks — but not for the final chunk (the one with finish_reason set and empty/null content).

This means any custom metadata attached to the last SSE chunk by server-side callbacks or middleware is silently dropped by the SDK.

Root Cause

In litellm/litellm_core_utils/streaming_handler.py, method return_processed_chunk_logic():

Content-bearing chunks (line ~963): preserve_upstream_non_openai_attributes() is called — custom attributes are preserved ✅
Final chunk (elif self.received_finish_reason is not None, line ~996): this code path returns model_response without calling preserve_upstream_non_openai_attributes() — custom attributes are dropped ❌

Fix

Added a call to preserve_upstream_non_openai_attributes(model_response, original_chunk) in the received_finish_reason branch, before returning model_response.

Testing

The fix follows the same pattern as the existing implementation for content-bearing chunks, ensuring consistency.

Closes #23444

Changed files

litellm/litellm_core_utils/streaming_handler.py (modified, +8/-0)
litellm/llms/anthropic/chat/transformation.py (modified, +5/-0)

Code Example

async def async_post_call_streaming_iterator_hook(self, user_api_key_dict, response, request_data):
    async for chunk in response:
        if chunk.choices and chunk.choices[0].finish_reason:
            chunk.custom_field = {"key": "value"}
        yield chunk

---

response = litellm.completion(model="gpt-4", messages=[...], stream=True)
for chunk in response:
    data = chunk.model_dump()
    if "custom_field" in data:
        print("Found custom_field")  # Never prints

RAW_BUFFERClick to expand / collapse

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

This means any custom metadata attached to the last SSE chunk by server-side callbacks or middleware is silently dropped by the SDK. The client never sees it.

Root cause

In litellm/litellm_core_utils/streaming_handler.py, method return_processed_chunk_logic():

Content-bearing chunks (~line 917): preserve_upstream_non_openai_attributes() is called — custom attributes are preserved ✅
Final chunk (elif self.received_finish_reason is not None, ~line 948): this code path returns model_response without calling preserve_upstream_non_openai_attributes() — custom attributes are dropped ❌

The fix should be a single call to preserve_upstream_non_openai_attributes(model_response, original_chunk) in the received_finish_reason branch, before return model_response at line 981.

Steps to Reproduce

Set up a proxy callback that injects a custom field into the last SSE chunk:

async def async_post_call_streaming_iterator_hook(self, user_api_key_dict, response, request_data):
    async for chunk in response:
        if chunk.choices and chunk.choices[0].finish_reason:
            chunk.custom_field = {"key": "value"}
        yield chunk

On the client, consume the stream and check for the custom field:

response = litellm.completion(model="gpt-4", messages=[...], stream=True)
for chunk in response:
    data = chunk.model_dump()
    if "custom_field" in data:
        print("Found custom_field")  # Never prints

The same request via raw HTTP (parsing SSE lines directly) does show custom_field in the last chunk's JSON.

Expected behavior

preserve_upstream_non_openai_attributes() should be called for all chunks that are returned to the client, including the final chunk with finish_reason. The function already exists and works correctly — it just needs to be called in the received_finish_reason branch.

What part of LiteLLM is this about?

SDK, Proxy

What LiteLLM version are you on?

v1.80.10 (also verified on v1.82.x — same code path)

extent analysis

Fix Plan

To fix the issue, we need to add a call to preserve_upstream_non_openai_attributes() in the received_finish_reason branch. Here are the steps:

Open litellm/litellm_core_utils/streaming_handler.py
Locate the return_processed_chunk_logic() method
In the elif self.received_finish_reason is not None branch, add the following line before return model_response:

preserve_upstream_non_openai_attributes(model_response, original_chunk)

The corrected code should look like this:

elif self.received_finish_reason is not None:
    # ...
    preserve_upstream_non_openai_attributes(model_response, original_chunk)
    return model_response

Verification

To verify the fix, repeat the steps to reproduce the issue:

Set up a proxy callback that injects a custom field into the last SSE chunk:

async def async_post_call_streaming_iterator_hook(self, user_api_key_dict, response, request_data):
    async for chunk in response:
        if chunk.choices and chunk.choices[0].finish_reason:
            chunk.custom_field = {"key": "value"}
        yield chunk

On the client, consume the stream and check for the custom field:

response = litellm.completion(model="gpt-4", messages=[...], stream=True)
for chunk in response:
    data = chunk.model_dump()
    if "custom_field" in data:
        print("Found custom_field")  # Should print now

The custom field should now be present in the last chunk's JSON.

Extra Tips

Make sure to test the fix with different scenarios, including various types of custom attributes and chunk contents.
Consider adding additional logging or debugging statements to ensure the preserve_upstream_non_openai_attributes() function is being called correctly.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

FAQ

Expected behavior

#api #ssr #installation #tensor shape #autograd error #agent execution #callback error #memory management #API rate limit

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

litellm - ✅(Solved) Fix [Bug]: Streaming final chunk drops non-OpenAI attributes (preserve_upstream_non_openai_attributes not called) [1 pull requests, 1 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Fixed

PR fix notes

PR #23450: fix: preserve non-OpenAI attributes in final streaming chunk

Description (problem / solution / changelog)

Summary

Root Cause

Fix

Testing

Changed files

Code Example

Check for existing issues

What happened?

Root cause

Steps to Reproduce

Expected behavior

What part of LiteLLM is this about?

What LiteLLM version are you on?

extent analysis

Fix Plan

Verification

Extra Tips

FAQ

Expected behavior

Still need to ship something?

RELATED_DISCOVERY

TRENDING