litellm - ✅(Solved) Fix [Bug]: background Responses API stream resume on retrieve is not handled correctly via the proxy [1 pull requests, 1 participants]

litellm 2026-04-29 09:07:07


GitHub stats

BerriAI/litellm#26762•Fetched 2026-04-30 06:20:17


Author

chenyzan

Participants

chenyzan

Timeline (top)

labeled ×2, cross-referenced ×1

Fix Action

Fix proposed (the linked PR was open and not merged when this page was fetched)

Proposed fix: PR feat(responses): support cursor-based stream resume on retrieve (https://github.com/BerriAI/litellm/pull/26750)

PR fix notes

PR #26750: feat(responses): support cursor-based stream resume on retrieve

Repository: BerriAI/litellm
Author: chenyzan
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/26750

Description (problem / solution / changelog)

Relevant issues

Related bug report: #26762

Follow-up to #26671, retargeted to the public OSS staging branch and updated to fix the synchronous retrieve streaming gap identified during review.

This PR adds support for OpenAI-style client.responses.retrieve(response_id, stream=True, starting_after=N) across LiteLLM's retrieve path.
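A minimal sketch of how a caller might use that retrieve call to resume a stream and keep track of the cursor. It assumes an OpenAI-style client object whose responses.retrieve accepts stream and starting_after as described above, and that each streamed event exposes a sequence_number attribute. The helper name collect_resumed_events is hypothetical and is not part of LiteLLM or the OpenAI SDK.

```python
from typing import Any, List, Optional, Tuple


def collect_resumed_events(
    client: Any,
    response_id: str,
    starting_after: Optional[int] = None,
) -> Tuple[List[Any], Optional[int]]:
    """Resume a background response stream and return (events, last_cursor).

    `client` is assumed to be an OpenAI-style client (hypothetical here);
    only `client.responses.retrieve(...)` is used.
    """
    kwargs = {"stream": True}
    # Only send the cursor when the caller actually has one.
    if starting_after is not None:
        kwargs["starting_after"] = starting_after

    cursor = starting_after
    events: List[Any] = []
    for event in client.responses.retrieve(response_id, **kwargs):
        events.append(event)
        # Assumption: events carry a monotonically increasing sequence_number.
        seq = getattr(event, "sequence_number", None)
        if seq is not None:
            cursor = seq
    return events, cursor
```

The returned cursor is the value a caller would pass as starting_after on the next retrieve call if the connection drops before the response completes.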

Before this change:

the proxy retrieve endpoint forwarded stream / starting_after query params for provider-backed GET requests
the async retrieve path could open an SSE stream and return a ResponsesAPIStreamingIterator
the sync get_responses(stream=True) path still issued a normal GET and then treated the SSE body like a non-streaming JSON response

This PR closes that gap and adds regression coverage for the sync and proxy paths.
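To illustrate why treating an SSE body as a non-streaming JSON response breaks, here is a small sketch that decodes a response body according to its content type. It is an illustration only and not LiteLLM's code: the function name decode_retrieve_body is hypothetical, and it assumes each SSE event carries a single data: line holding a JSON object.

```python
import json
from typing import Any


def decode_retrieve_body(content_type: str, body: str) -> Any:
    """Decode a retrieve response body as SSE events or as plain JSON."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/event-stream":
        events = []
        for line in body.splitlines():
            # Simplification: one JSON payload per "data:" line.
            if line.startswith("data:"):
                payload = line[len("data:"):].strip()
                # Defensive: some SSE APIs end with a "[DONE]" sentinel.
                if payload and payload != "[DONE]":
                    events.append(json.loads(payload))
        return events
    # Non-streaming case: the whole body is one JSON document.
    return json.loads(body)
```

Calling json.loads directly on an SSE body raises json.JSONDecodeError, because the body starts with field lines such as event: or data: rather than a JSON value.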

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have added tests in the tests/test_litellm/ directory (adding at least 1 test is a hard requirement)
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)

CI status guideline:

50-55 passing tests: main is stable with minor issues.

45-49 passing tests: acceptable but needs attention

<= 40 passing tests: unstable; be careful with your merges and assess the risk.

Branch creation CI run
Link: after PR creation
CI run for the last commit
Link: after final push
Merge / cherry-pick CI run
Links: N/A (maintainer-owned)

Screenshots / Proof of Fix

Added regression coverage for:

async retrieve streaming returning a ResponsesAPIStreamingIterator
sync retrieve streaming returning a SyncResponsesAPIStreamingIterator
sync HTTPHandler.get(..., stream=True) opening an actual streaming GET request
proxy forwarding of stream / starting_after query params on GET /v1/responses/{response_id}
proxy validation for non-integer starting_after
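The last item, rejecting a non-integer starting_after, could look roughly like the sketch below. This is an assumption about the shape of such a check, not the code from the PR: the helper name parse_starting_after is hypothetical, and a real endpoint would translate the error into an HTTP 4xx response.

```python
from typing import Optional


def parse_starting_after(raw: Optional[str]) -> Optional[int]:
    """Convert the raw starting_after query value to an int, or None if absent."""
    if raw is None:
        return None
    try:
        return int(raw)
    except ValueError as exc:
        # Values such as "abc", "1.5", or "" are rejected here.
        raise ValueError(
            f"starting_after must be an integer, got {raw!r}"
        ) from exc
```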

Local targeted verification:

PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 .venv/bin/pytest -p pytest_asyncio.plugin \
  tests/test_litellm/responses/test_get_responses_stream_resume.py \
  tests/test_litellm/proxy/response_api_endpoints/test_endpoints.py \
  -x -vv

Result: 13 passed

Type

🆕 New Feature 🐛 Bug Fix

Changes

add sync streaming GET support to HTTPHandler.get()
return SyncResponsesAPIStreamingIterator from get_responses(..., stream=True) instead of treating the SSE body as a non-streaming response
preserve the existing async retrieve streaming behavior
add sync-path regression tests for responses retrieve stream resume
add proxy endpoint tests for query-param forwarding and invalid starting_after handling
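The difference between a buffered response and a sync streaming iterator can be sketched with a plain generator that yields each event as soon as its terminating blank line arrives. This is a simplified stand-in and not SyncResponsesAPIStreamingIterator itself. The function name iter_sse_events is hypothetical, and the input is assumed to be an iterable of text lines such as an HTTP client's line iterator.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Lazily yield JSON payloads from an iterable of SSE lines."""
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                payload = "\n".join(data_parts)
                data_parts = []
                # Defensive: skip a "[DONE]" sentinel if one is sent.
                if payload != "[DONE]":
                    yield json.loads(payload)
            continue
        if line.startswith(":"):
            continue  # SSE comment line, often used as a keep-alive
        if line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    # An event not terminated by a blank line is discarded at end of input.
```

Because this is a generator, the caller sees the first event without waiting for the rest of the stream, which is the behavior a streaming GET is meant to preserve.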

Changed files

litellm/llms/azure/responses/transformation.py (modified, +15/-4)
litellm/llms/base_llm/responses/transformation.py (modified, +11/-0)
litellm/llms/custom_httpx/http_handler.py (modified, +81/-8)
litellm/llms/custom_httpx/llm_http_handler.py (modified, +119/-5)
litellm/llms/manus/responses/transformation.py (modified, +6/-0)
litellm/llms/openai/responses/transformation.py (modified, +11/-1)
litellm/llms/volcengine/responses/transformation.py (modified, +5/-0)
litellm/proxy/response_api_endpoints/endpoints.py (modified, +22/-0)
litellm/responses/main.py (modified, +35/-7)
tests/test_litellm/llms/custom_httpx/test_credential_leak_prevention.py (modified, +64/-0)
tests/test_litellm/proxy/response_api_endpoints/test_endpoints.py (modified, +81/-10)
tests/test_litellm/responses/test_get_responses_stream_resume.py (added, +279/-0)


What happened?

I found this while using the OpenAI Responses API background mode through the LiteLLM proxy.

When resuming a background response via:

GET /v1/responses/{response_id}?stream=true&starting_after=<sequence_number>

LiteLLM did not correctly handle the retrieve streaming path.

I expected the proxy to support OpenAI-style cursor-based stream resume on the retrieve endpoint and return a valid SSE stream for background responses.

Steps to Reproduce

Send a Responses API request through the LiteLLM proxy using OpenAI background mode.
Wait until the response can be retrieved by response_id.
Call GET /v1/responses/{response_id}?stream=true&starting_after=<sequence_number>.
Observe that the retrieve streaming path is not handled correctly.
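For step 3, the request URL can be assembled as in the sketch below. The helper name build_retrieve_stream_url and the base URL used in the usage note are assumptions for illustration; only the path and the stream / starting_after query parameters come from the report.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_retrieve_stream_url(
    base_url: str,
    response_id: str,
    starting_after: Optional[int] = None,
) -> str:
    """Build GET /v1/responses/{response_id}?stream=true[&starting_after=N]."""
    query = {"stream": "true"}
    if starting_after is not None:
        query["starting_after"] = str(starting_after)
    # Percent-encode the id so it cannot alter the path.
    path = f"/v1/responses/{quote(response_id, safe='')}"
    return f"{base_url.rstrip('/')}{path}?{urlencode(query)}"
```

For example, build_retrieve_stream_url("http://localhost:4000", "resp_abc", 12) gives a URL that can be requested with any HTTP client that supports streaming, along with whatever Authorization header the proxy deployment expects.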

Relevant log output

No additional production logs attached.

I was able to reproduce this consistently through the proxy retrieve path and then narrowed it down to the retrieve streaming flow.

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

v1.83.14

Additional context

This was discovered while trying to use background mode together with streaming resume, which is a valid OpenAI Responses API workflow.

A fix has been proposed in PR #26750.

extent analysis

TL;DR

The issue can likely be resolved by applying the fix proposed in PR #26750 to handle the retrieve streaming path correctly in the LiteLLM proxy.

Guidance

Per the PR description, the proxy already forwarded the stream / starting_after query params for provider-backed GET requests, and the async retrieve path could open an SSE stream. The gap was in the sync get_responses(stream=True) path, which issued a normal GET and then treated the SSE body like a non-streaming JSON response.
To verify the issue, follow the steps to reproduce, focusing on the GET /v1/responses/{response_id}?stream=true&starting_after=<sequence_number> call.
Applying the fix from PR #26750 should address the issue by adding sync streaming GET support, so that the retrieve endpoint supports OpenAI-style cursor-based stream resume.
The issue was reported on LiteLLM v1.83.14. PR #26750 was open and unmerged when this page was fetched, so check whether your installed version already includes the change before patching.

Example

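The issue itself does not include a code example. According to the PR description, the change affects how the retrieve path handles stream and starting_after: the proxy forwards both as query params on GET /v1/responses/{response_id} and rejects a non-integer starting_after, and the sync get_responses(..., stream=True) path returns a SyncResponsesAPIStreamingIterator instead of parsing the SSE body as a non-streaming response.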

Notes

The fix proposed in PR #26750 is specific to handling the retrieve streaming path in the context of OpenAI background mode through the LiteLLM proxy. This solution assumes that the issue is isolated to this particular workflow and may not address similar issues in other parts of the system.

Recommendation

Apply the changes from PR #26750, which directly address the retrieve streaming path in the LiteLLM proxy. Because the PR was still open and unmerged when this page was fetched, this means patching locally or waiting for a release that includes it.


#api #tool integration #LLM response #prompt template #agent execution
