litellm - ✅(Solved) Fix [Bug]: Polling Adds Duplicate Costs to Key Budget [1 pull requests, 1 comments, 2 participants]

Q: Expected behavior

The spend should remain at $0.01. OpenAI does not charge for GET /v1/responses/{id} — it's a simple retrieval of an already-completed response. The key spend should only reflect the original POST request.

litellm · 2026-04-02 19:34:40

GitHub stats

BerriAI/litellm#25015 • Fetched 2026-04-08 02:35:06

Author

dgu1-godaddy

Participants

dgu1-godaddy

meutsabdahal

Timeline (top)

cross-referenced ×3, labeled ×3, commented ×1

Root Cause

Check key spend again

curl -X GET "http://localhost:4000/key/info?key=sk-test-key" \
  -H "Authorization: Bearer sk-master-1234"

The spend value is now $0.04: four times the original $0.01, because each of the 3 extra polls was billed as if it were a new request.

Fix Action

Fix proposed

Proposed fix in PR: fix(proxy): make responses polling retrieval non-billable (#25015) (https://github.com/BerriAI/litellm/pull/25061). The PR was still open and unmerged when this page was fetched.

PR fix notes

PR #25061: fix(proxy): make responses polling retrieval non-billable (#25015)

Repository: BerriAI/litellm
Author: meutsabdahal
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/25061

Description (problem / solution / changelog)

Relevant issues

Fixes #25015

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR.

I have added testing in the tests/test_litellm/ directory, adding at least 1 test is a hard requirement - see details
My PR passes all unit tests on make test-unit
My PR scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a confidence score of at least 4/5 before requesting a maintainer review

Type

Bug Fix
Test

Changes

Fixes duplicate spend tracking for OpenAI Responses polling retrieval calls.
Treats retrieval call types get_responses and aget_responses as non-billable in proxy cost tracking.
Forces response_cost to 0.0 for these retrieval calls before spend counters and database updates are applied.
Adds regression coverage in proxy hook tests to verify retrieval polling does not increase spend.
Keeps the PR scope focused on Issue #25015 only.
Local validation: ruff check passed for changed files and the targeted proxy hook test file passed.
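
The change described above can be sketched as a small guard applied before spend counters are updated. This is a minimal illustration, not the code from the PR: only the call type names `get_responses` and `aget_responses` come from the PR description, while the helper name `effective_response_cost` and the constant `NON_BILLABLE_CALL_TYPES` are hypothetical.

```python
from typing import Optional

# Call types the PR description lists as retrieval-only.
NON_BILLABLE_CALL_TYPES = frozenset({"get_responses", "aget_responses"})


def effective_response_cost(
    call_type: Optional[str], response_cost: Optional[float]
) -> float:
    """Return the cost that should be added to spend counters.

    Hypothetical helper: retrieval call types are forced to 0.0; any other
    call keeps the cost computed upstream (None is treated as 0.0).
    """
    if call_type in NON_BILLABLE_CALL_TYPES:
        return 0.0
    return response_cost if response_cost is not None else 0.0
```

Applying the guard at a single point before the counters and database writes means every downstream consumer sees the same zero cost for a retrieval poll.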

Changed files

litellm/proxy/hooks/proxy_track_cost_callback.py (modified, +27/-5)
tests/test_litellm/proxy/hooks/test_proxy_track_cost_callback.py (modified, +57/-5)

Raw issue text

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When you use the OpenAI Responses API through LiteLLM's proxy (e.g., POST /v1/responses) and then poll for results via GET /v1/responses/{resp_id}, every successful GET poll triggers the full cost-tracking pipeline. Each poll increments the key/user/team/org spend again, even though OpenAI doesn't charge for retrieval GETs.

Steps to Reproduce

Check initial key spend:

curl -X GET "http://localhost:4000/key/info?key=sk-test-key" \
  -H "Authorization: Bearer sk-master-1234"

Note the spend value, e.g. $0.0.

Create a response using the Responses API

curl -X POST http://localhost:4000/v1/responses \
  -H "Authorization: Bearer sk-test-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "o3-deep-research",
    "input": "What are the latest advances in quantum computing?"
  }'

Save the returned id (e.g. resp_abc123...).

Check key spend after the initial request

curl -X GET "http://localhost:4000/key/info?key=sk-test-key" \
  -H "Authorization: Bearer sk-master-1234"

Note the spend value (e.g. $0.01)

Poll the same response multiple times via GET

# Poll 1
curl -X GET http://localhost:4000/v1/responses/resp_abc123 \
  -H "Authorization: Bearer sk-test-key"

# Poll 2
curl -X GET http://localhost:4000/v1/responses/resp_abc123 \
  -H "Authorization: Bearer sk-test-key"

# Poll 3
curl -X GET http://localhost:4000/v1/responses/resp_abc123 \
  -H "Authorization: Bearer sk-test-key"

Check key spend again

curl -X GET "http://localhost:4000/key/info?key=sk-test-key" \
  -H "Authorization: Bearer sk-master-1234"

The spend value is now $0.04: four times the original $0.01, because each of the 3 extra polls was billed as if it were a new request.
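
The arithmetic behind the reproduction can be sketched as follows. The function `key_spend` and its parameters are hypothetical and only model the reported numbers ($0.01 per request, 3 polls). They do not call LiteLLM.

```python
from decimal import Decimal


def key_spend(post_cost: Decimal, polls: int, bill_polls: bool) -> Decimal:
    """Total key spend after one POST and `polls` GET retrievals."""
    # With the bug, every poll is billed like the original POST.
    billed_requests = 1 + (polls if bill_polls else 0)
    return post_cost * billed_requests


cost = Decimal("0.01")
print(key_spend(cost, polls=3, bill_polls=True))   # 0.04 (observed)
print(key_spend(cost, polls=3, bill_polls=False))  # 0.01 (expected)
```

The over-billing grows linearly with the number of polls, so a long-running response that is polled often is affected the most.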

Expected behavior

The spend should remain at $0.01. OpenAI does not charge for GET /v1/responses/{id} — it's a simple retrieval of an already-completed response. The key spend should only reflect the original POST request.

Relevant log output

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

v1.81.14-stable

Twitter / LinkedIn details

No response

extent analysis

TL;DR

Modify the cost-tracking pipeline to exclude GET requests for retrieving existing responses.

Guidance

Identify the specific part of the cost-tracking pipeline that increments the key/user/team/org spend for GET requests and modify it to only increment for POST requests or other chargeable actions.
Verify that the modification does not affect other parts of the system that rely on accurate spend tracking.
Consider adding a conditional check in the cost-tracking pipeline to exclude GET requests for retrieving existing responses, based on the API endpoint or request method.
Review the OpenAI API documentation to ensure that the cost-tracking pipeline aligns with their pricing model.
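
One way to check the guidance above is a small regression test that polls after a billed request and asserts the spend is unchanged. The sketch below uses a hypothetical in-memory `SpendTracker` as a stand-in for the proxy's per-key counter. The `"responses"` label for the original POST is also an assumption, and only the retrieval call type names come from the PR description.

```python
class SpendTracker:
    """Hypothetical in-memory stand-in for a per-key spend counter."""

    RETRIEVAL_CALL_TYPES = {"get_responses", "aget_responses"}

    def __init__(self) -> None:
        self.spend = 0.0

    def record(self, call_type: str, response_cost: float) -> None:
        # Retrieval polls add nothing; every other call type is billed.
        if call_type not in self.RETRIEVAL_CALL_TYPES:
            self.spend += response_cost


def test_polling_does_not_increase_spend() -> None:
    tracker = SpendTracker()
    tracker.record("responses", 0.01)  # original POST (assumed label)
    for _ in range(3):                 # three GET polls
        tracker.record("get_responses", 0.01)
    assert tracker.spend == 0.01


test_polling_does_not_increase_spend()
```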

Example

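# Sketch of a conditional check (hypothetical names, not LiteLLM's actual code)
import re

def is_billable(request_method: str, path: str) -> bool:
    # GET /v1/responses/{id} only retrieves an existing response, so it is not billed.
    # Call update_spend() only when this returns True.
    is_retrieval = request_method == "GET" and re.fullmatch(r"/v1/responses/[^/]+", path)
    return not is_retrieval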

Notes

The exact implementation details may vary depending on the underlying architecture and technology stack of the LiteLLM proxy.

Recommendation

Modify the cost-tracking pipeline so that GET requests that retrieve an existing response are excluded from spend. This targets the reported bug directly and leaves billing for other request types unchanged.
