litellm: [Bug]: OTel gen_ai.output.messages is never set for Responses API (/v1/responses)

Q: Expected behavior

`gen_ai.output.messages` should be populated with the response content for Responses API calls, the same way it is for `/chat/completions`.

litellm · 2026-04-16 05:49:51

BerriAI/litellm#25840•Fetched 2026-04-17 08:28:46

Author

robin-fiddler

Participants

robin-fiddler

Timeline (top)

labeled ×2

Root Cause

File: litellm/integrations/opentelemetry.py, method set_attributes(), line 1705:

if response_obj.get("choices"):
    transformed_choices = self._transform_choices_to_otel_semantic_conventions(
        response_obj.get("choices")
    )
    self.safe_set_attribute(
        span=span,
        key=SpanAttributes.GEN_AI_OUTPUT_MESSAGES.value,
        value=safe_dumps(transformed_choices),
    )

ResponsesAPIResponse (defined in litellm/types/llms/openai.py:1247) has an output field, not choices. So response_obj.get("choices") returns None and the entire block is skipped — including gen_ai.output.messages, gen_ai.response.finish_reasons, and tool call extraction.

Token counts work because they use a separate path at line 1626 via response_obj.get("usage"), which ResponsesAPIResponse does have (its usage is already normalized to chat-completion format with prompt_tokens/completion_tokens by _transform_usage_objects() in litellm_logging.py:1778 before the callback fires).
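
The asymmetry described above can be reproduced with plain dictionaries. This is a minimal sketch, not LiteLLM code: the payloads are simplified, hypothetical stand-ins for the two response shapes, and `attributes_reached` is an illustrative helper that only mimics the two guards (`usage` and `choices`).

```python
# Hypothetical, simplified payloads; real response objects carry more fields.
chat_completion = {
    "choices": [
        {"message": {"role": "assistant", "content": "4"}, "finish_reason": "stop"}
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 1},
}
responses_api = {
    "status": "completed",
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "4"}],
        }
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 1},
}


def attributes_reached(response_obj):
    """Mimic the two guards described above and list the paths that run."""
    reached = []
    if response_obj.get("usage"):  # present on both response shapes
        reached.append("gen_ai.usage.*")
    if response_obj.get("choices"):  # absent on the Responses API shape
        reached.append("gen_ai.output.messages")
    return reached


print(attributes_reached(chat_completion))
print(attributes_reached(responses_api))
```

The second call never reaches the output branch, which matches the symptom: token counts are present, output messages are not.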

What happened?

When using the /v1/responses (Responses API) endpoint, the OTel span is missing gen_ai.output.messages entirely. The LLM response content is lost from traces. This affects both proxy mode and the SDK (litellm.responses() / litellm.aresponses()).

Token counts (gen_ai.usage.*), costs (gen_ai.cost.*), and input messages (gen_ai.input.messages) all work correctly — only the output content is missing.

Related issues

#25240 / PR #25309 — Fixes llm.None.* attribute keys and the empty gen_ai.system by adding custom_llm_provider to litellm_params. Once merged, that PR resolves the provider-identification side of the problem. However, gen_ai.output.messages will still be missing even after that fix lands.
#24057 — Separate OTel attribute type issue for gen_ai.prompt (not directly related but in the same integration file).

Environment

LiteLLM version: 1.74.4 (also verified against current source)
Python: 3.12
Proxy mode with callbacks: ["otel"]
Tested with gpt-4o-mini via OpenAI

Steps to Reproduce

Start the LiteLLM proxy with OTel enabled:
```
litellm_settings:
  callbacks: ["otel"]
```

Send a request to /v1/responses:

curl -X POST http://localhost:4000/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{"model": "gpt-4o-mini", "input": "What is 2+2?"}'

Inspect the OTel span attributes on the parent span.

Actual behavior

The parent span has no gen_ai.output.messages attribute. For comparison, /chat/completions correctly produces:

gen_ai.output.messages: [{"role":"assistant","parts":[{"type":"text","content":"..."}]}]

The raw_gen_ai_request child span (for non-streaming) does contain the response text under llm.<provider>.output, confirming the response exists — it just never gets written to gen_ai.output.messages on the parent span.
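
When comparing spans from the two endpoints, a small check over the exported attribute dictionary can make the gap explicit. The sketch below assumes the span attributes have already been exported as a plain `dict`; the helper name, the sample values, and the choice of keys to check are illustrative, not part of LiteLLM.

```python
# Keys this check looks for; extend the tuple as needed.
EXPECTED_KEYS = ("gen_ai.input.messages", "gen_ai.output.messages")


def missing_gen_ai_attributes(span_attributes, expected=EXPECTED_KEYS):
    """Return the expected keys that are absent or empty on a span."""
    return [key for key in expected if not span_attributes.get(key)]


# Hypothetical attribute dumps for the two endpoints.
chat_span = {
    "gen_ai.input.messages": '[{"role":"user","parts":[{"type":"text","content":"What is 2+2?"}]}]',
    "gen_ai.output.messages": '[{"role":"assistant","parts":[{"type":"text","content":"4"}]}]',
}
responses_span = {
    "gen_ai.input.messages": '[{"role":"user","parts":[{"type":"text","content":"What is 2+2?"}]}]',
}

print(missing_gen_ai_attributes(chat_span))
print(missing_gen_ai_attributes(responses_span))
```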

Expected behavior

gen_ai.output.messages should be populated with the response content for Responses API calls, the same way it is for /chat/completions.

Suggested Fix

Add an elif branch after the existing choices check to handle the Responses API output structure:

if response_obj.get("choices"):
    ...  # existing chat/completions output extraction (unchanged)
elif response_obj.get("output"):
    # Responses API: extract text from output items
    output_messages = []
    for item in response_obj.get("output", []):
        if isinstance(item, dict) and item.get("type") == "message":
            for content in item.get("content", []):
                if isinstance(content, dict) and content.get("type") == "output_text":
                    output_messages.append({
                        "role": "assistant",
                        "parts": [{"type": "text", "content": content.get("text", "")}]
                    })
    if output_messages:
        self.safe_set_attribute(
            span=span,
            key=SpanAttributes.GEN_AI_OUTPUT_MESSAGES.value,
            value=safe_dumps(output_messages),
        )
    # Also extract finish reason from ResponsesAPIResponse.status
    status = response_obj.get("status")
    if status:
        self.safe_set_attribute(
            span=span,
            key=SpanAttributes.GEN_AI_RESPONSE_FINISH_REASONS.value,
            value=safe_dumps([status]),
        )
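
The extraction logic in the snippet above can also be written as a standalone function, which makes it testable without a span. The following is a sketch under stated assumptions, not the patch proposed in the issue: `_field` and `responses_output_to_messages` are hypothetical names, the function accepts attribute-style objects as well as dicts, and it emits one message per `message` item with one part per `output_text` entry (the issue's snippet emits one message per text part instead).

```python
from types import SimpleNamespace


def _field(obj, name, default=None):
    """Read a field from either a dict or an attribute-style object."""
    if isinstance(obj, dict):
        return obj.get(name, default)
    return getattr(obj, name, default)


def responses_output_to_messages(output_items):
    """Convert Responses-style output items into role/parts messages."""
    messages = []
    for item in output_items or []:
        if _field(item, "type") != "message":
            continue  # reasoning items, tool calls, etc. are ignored here
        parts = [
            {"type": "text", "content": _field(content, "text", "") or ""}
            for content in _field(item, "content", []) or []
            if _field(content, "type") == "output_text"
        ]
        if parts:
            role = _field(item, "role", "assistant") or "assistant"
            messages.append({"role": role, "parts": parts})
    return messages


# Dict-shaped items, as in the issue's snippet.
dict_items = [
    {"type": "reasoning", "summary": []},
    {
        "type": "message",
        "role": "assistant",
        "content": [
            {"type": "output_text", "text": "2+2"},
            {"type": "output_text", "text": " is 4"},
        ],
    },
]
# Object-shaped items, in case the callback receives model instances.
object_items = [
    SimpleNamespace(
        type="message",
        role="assistant",
        content=[SimpleNamespace(type="output_text", text="4")],
    )
]

print(responses_output_to_messages(dict_items))
print(responses_output_to_messages(object_items))
```

Inside `set_attributes()`, the returned list would then be serialized with `safe_dumps` and written only when it is non-empty, as the snippet above does.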

Impact

Any downstream OTel consumer (Datadog, Honeycomb, Fiddler, custom pipelines, etc.) that relies on gen_ai.output.messages for response content will find the attribute missing on every /v1/responses span. This makes the Responses API appear to produce no output in observability dashboards, even though the LLM call succeeds and tokens are counted correctly.

What part of LiteLLM is this about?

Proxy, OpenTelemetry integration

What LiteLLM version are you on?

1.74.4+ (current source as of April 2026)

extent analysis

TL;DR

The issue can be fixed by adding an elif branch to handle the Responses API output structure in the set_attributes() method of opentelemetry.py.

Guidance

The root cause is identified as the response_obj.get("choices") check in opentelemetry.py, which returns None for Responses API calls because they have an output field instead of choices.
To fix this, add an elif branch to handle the output structure, as suggested in the issue.
The new branch should extract the text from the output items and set the gen_ai.output.messages attribute accordingly.
Additionally, extract the finish reason from the status field of the ResponsesAPIResponse object.
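
One caveat on the last point: the suggested fix writes `status` verbatim, so a consumer that expects chat-completion-style values such as "stop" or "length" would see "completed" instead. If normalization is wanted, a mapping along these lines is one option. This is a sketch under assumptions: the function name is hypothetical, the mapping itself is a design choice rather than anything LiteLLM or the issue specifies, and `incomplete_reason` is assumed to come from the response's incomplete-details field.

```python
def finish_reason_from_status(status, incomplete_reason=None):
    """Map a Responses API status to a chat-completion-style finish reason.

    Assumed mapping (not defined by the issue):
      completed                       -> "stop"
      incomplete + max_output_tokens  -> "length"
      incomplete + other reason       -> that reason, else "incomplete"
      anything else                   -> passed through unchanged
    """
    if status == "completed":
        return "stop"
    if status == "incomplete":
        if incomplete_reason == "max_output_tokens":
            return "length"
        return incomplete_reason or "incomplete"
    return status


print(finish_reason_from_status("completed"))
print(finish_reason_from_status("incomplete", "max_output_tokens"))
print(finish_reason_from_status("failed"))
```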

Notes

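The fix assumes that every relevant item in output is a dict with type "message" whose content parts are dicts with type "output_text". Anything else is skipped silently: items that are objects rather than dicts, and items of other types such as tool calls. If the callback can receive those shapes, additional handling is necessary.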
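
Tool calls are one such case, since the root cause notes that tool call extraction is skipped along with the output messages. The sketch below shows how function-call items might be turned into message parts. It rests on several assumptions that should be checked against the actual payloads: items are dicts with type "function_call" carrying `call_id`, `name`, and a JSON-encoded `arguments` string, and the "tool_call" part shape with `id`, `name`, and `arguments` is what the consumer expects. The function name is hypothetical.

```python
import json


def function_calls_to_parts(output_items):
    """Collect tool-call parts from dict-shaped function_call items."""
    parts = []
    for item in output_items or []:
        if not isinstance(item, dict) or item.get("type") != "function_call":
            continue
        raw_arguments = item.get("arguments") or "{}"
        try:
            arguments = json.loads(raw_arguments)
        except (TypeError, ValueError):
            # Keep the raw value if it is not valid JSON text.
            arguments = raw_arguments
        parts.append(
            {
                "type": "tool_call",
                "id": item.get("call_id"),
                "name": item.get("name"),
                "arguments": arguments,
            }
        )
    return parts


# Hypothetical output list containing one tool call and one text message.
items = [
    {
        "type": "function_call",
        "call_id": "call_1",
        "name": "get_weather",
        "arguments": '{"city": "Paris"}',
    },
    {"type": "message", "content": [{"type": "output_text", "text": "ok"}]},
]

print(function_calls_to_parts(items))
```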

Recommendation

Apply the suggested fix by adding the elif branch to set_attributes(); this should populate gen_ai.output.messages for /v1/responses calls. Confirm the result by repeating the reproduction steps and inspecting the parent span.
