litellm: [Bug] aresponses streaming on OpenAI emits Pydantic serializer warning: chat-completion Usage assigned to ResponseAPIUsage field

GitHub stats
BerriAI/litellm#26784 (fetched 2026-04-30 06:19:55)
Comments: 0
Participants: 1
Timeline events: 1 (labeled ×1)
Reactions: 0

Error Message

LiteLLM:ERROR: litellm_logging.py:5553 - Error creating standard logging object - Pydantic serializer warnings: ...


What's going wrong

Calling litellm.aresponses(model="gpt-4o-mini", stream=True, ...) emits a Pydantic serializer warning during the standard logging callback:

UserWarning: Pydantic serializer warnings:
  PydanticSerializationUnexpectedValue(Expected `ResponseAPIUsage` -
  serialized value may not be as expected
  [field_name='usage',
   input_value={'completion_tokens': 8, ..., 'video_tokens': None}}, input_type=dict])

LiteLLM also logs its own line:

LiteLLM:ERROR: litellm_logging.py:5553 - Error creating standard logging object - Pydantic serializer warnings: ...

The non-streaming call (stream=False) on the same model is clean. The same streaming call on claude-haiku-4-5 and gemini/gemini-2.5-flash is also clean — so the issue is specific to OpenAI's streaming Responses path.

Script to Reproduce

import asyncio
import os
import warnings

import litellm

os.environ["OPENAI_API_KEY"] = "sk-..."  # replace

caught = []
def _capture(message, category, filename, lineno, file=None, line=None):
    caught.append((category.__name__, str(message), filename.split("/")[-1], lineno))
warnings.showwarning = _capture

INPUT_ITEMS = [
    {"role": "system", "content": "You answer in one short sentence."},
    {"role": "user", "content": "What is the capital of France?"},
]

async def non_stream():
    caught.clear()
    resp = await litellm.aresponses(
        model="gpt-4o-mini", input=INPUT_ITEMS, max_output_tokens=32,
    )
    print("== NON-STREAM ==")
    print(f"  usage type: {type(resp.usage).__name__}")
    for w in caught: print(f"  WARNING: {w[0]} at {w[2]}:{w[3]}")
    if not caught: print("  (no warnings)")

async def stream():
    caught.clear()
    response = await litellm.aresponses(
        model="gpt-4o-mini", input=INPUT_ITEMS, max_output_tokens=32, stream=True,
    )
    async for event in response:
        pass
    print("== STREAM ==")
    for w in caught:
        print(f"  WARNING: {w[0]} at {w[2]}:{w[3]}\n           {str(w[1])[:300]}")
    if not caught: print("  (no warnings)")

async def main():
    await non_stream()
    await stream()
    await asyncio.sleep(0.3)  # let background logging callback fire

asyncio.run(main())

Output:

== NON-STREAM ==
  usage type: ResponseAPIUsage
  (no warnings)
== STREAM ==
  WARNING: UserWarning at main.py:464
           Pydantic serializer warnings:
  PydanticSerializationUnexpectedValue(Expected `ResponseAPIUsage` -
  serialized value may not be as expected [field_name='usage',
  input_value={'completion_tokens': 8, ..., 'video_tokens': None}}, input_type=dict])
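
The script captures warnings by replacing warnings.showwarning. A minimal alternative sketch, using only the standard library, records them with warnings.catch_warnings(record=True) and keeps only the Pydantic serializer ones. The helper name collect_serializer_warnings is hypothetical and not part of litellm. It assumes the warning is raised while fn() is running, so an async reproduction would be wrapped as lambda: asyncio.run(main()).

```python
import warnings
from typing import Callable, List


def collect_serializer_warnings(fn: Callable[[], object]) -> List[str]:
    """Run fn() and return the text of each Pydantic serializer warning it emitted.

    Hypothetical helper for illustration; not a litellm or Pydantic API.
    """
    with warnings.catch_warnings(record=True) as records:
        # "always" disables the once-per-location dedup so repeats are recorded too.
        warnings.simplefilter("always")
        fn()
    return [
        str(record.message)
        for record in records
        if issubclass(record.category, UserWarning)
        and "Pydantic serializer warnings" in str(record.message)
    ]
```

An empty list from this helper corresponds to the "(no warnings)" line in the output above.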

Environment

litellm: 1.83.2
python: 3.10.13
pydantic: 2.12.5
platform: macOS (darwin 25.3.0)
model: gpt-4o-mini

extent analysis

TL;DR

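On OpenAI's streaming Responses path, a chat-completion-style usage dict (completion_tokens and so on) ends up in a field typed as ResponseAPIUsage, and Pydantic warns while LiteLLM builds its standard logging object. The proper fix belongs in litellm: convert the usage to ResponseAPIUsage before it is assigned.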

Guidance

  • The usage value on the streamed response is a plain dict with chat-completion keys (completion_tokens and so on), but the field is typed as ResponseAPIUsage, so Pydantic warns when it serializes the object.
  • Check whether a litellm release newer than 1.83.2 resolves this. If not, the usage has to be converted to the Responses shape before it is stored, which is a change inside litellm.
  • Filtering the Pydantic serializer warning hides the symptom but leaves the wrong usage shape in place, and it may not remove LiteLLM's own ERROR log line, so treat it as a stopgap.
  • To verify a fix, rerun the reproduction script and confirm that the STREAM section prints (no warnings).
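
The conversion mentioned in the guidance can be sketched as follows. This is an illustration under assumptions, not litellm's code: ResponsesStyleUsage is a hypothetical stand-in dataclass rather than litellm's ResponseAPIUsage, and to_responses_usage is a made-up helper name. The only facts relied on are the two naming schemes: Chat Completions usage reports prompt_tokens and completion_tokens, while Responses usage reports input_tokens and output_tokens.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass
class ResponsesStyleUsage:
    """Hypothetical stand-in for a Responses-style usage object (not litellm's class)."""
    input_tokens: int
    output_tokens: int
    total_tokens: int


def _first_int(usage: Mapping[str, Any], *keys: str) -> int:
    """Return the first non-None value among keys as an int, or 0 if none is set."""
    for key in keys:
        value = usage.get(key)
        if value is not None:
            return int(value)
    return 0


def to_responses_usage(usage: Mapping[str, Any]) -> ResponsesStyleUsage:
    """Map either naming scheme onto the Responses-style field names."""
    input_tokens = _first_int(usage, "input_tokens", "prompt_tokens")
    output_tokens = _first_int(usage, "output_tokens", "completion_tokens")
    total = usage.get("total_tokens")
    if total is None:
        # Fall back to the sum when the provider did not report a total.
        total = input_tokens + output_tokens
    return ResponsesStyleUsage(input_tokens, output_tokens, int(total))


# Illustrative values only; the real payload in the issue is partly elided.
chat_usage = {"prompt_tokens": 12, "completion_tokens": 8, "total_tokens": 20, "video_tokens": None}
converted = to_responses_usage(chat_usage)
```

Extra chat-completion keys such as video_tokens are dropped here. Whether litellm should preserve them is a design decision for the maintainers.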

Example

The root cause sits in litellm's internal streaming and logging code, so a proper fix has to land in the library. User code can only work around the warning.

Notes

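Only gpt-4o-mini was tested on the OpenAI side. The same streaming call was clean on claude-haiku-4-5 and gemini/gemini-2.5-flash, and the non-streaming call was clean on gpt-4o-mini, so the problem appears to lie in OpenAI's streaming Responses path rather than in one model.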

Recommendation

Until a fixed litellm release is available, either filter the Pydantic serializer warning as a stopgap or patch the usage conversion locally. Prefer an upstream fix, since the filter only hides the warning.
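
A minimal sketch of the warning-filter stopgap, using only the standard library. The function name silence_pydantic_serializer_warnings is hypothetical. It assumes the warning reaches Python's warnings machinery as a UserWarning whose text starts with "Pydantic serializer warnings", as in the output quoted in the issue. It does not change the usage value itself, and it may not affect LiteLLM's own ERROR log line.

```python
import warnings


def silence_pydantic_serializer_warnings() -> None:
    """Ignore UserWarnings whose message starts with 'Pydantic serializer warnings'.

    Hypothetical helper for illustration; call it once at startup, before any
    litellm call that may trigger the warning.
    """
    warnings.filterwarnings(
        "ignore",
        message=r"Pydantic serializer warnings",  # regex matched at the start of the message
        category=UserWarning,
    )
```

The filter is deliberately narrow: other UserWarnings, including unrelated Pydantic ones, still surface.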
