litellm - ✅(Solved) Fix [Bug]: Redis spend counter reseed is non-idempotent across pods, inflating budget enforcement values (regression from #26459) [1 pull requests, 1 participants]

silencedoctor · 2026-05-19T11:31:07Z

GitHub stats

BerriAI/litellm#28247•Fetched 2026-05-20 03:40:32

Author

silencedoctor

Participants

silencedoctor

Timeline (top)

cross-referenced ×1, labeled ×1

Error Message

litellm.proxy._types.BudgetExceededError: Budget has been exceeded! ...
  current_spend ≈ N × db_spend  (N = number of pods that raced the reseed)

Fix Action

Fixed

Fixed by PR: fix(proxy): make spend counter reseed idempotent across pods via SET NX (https://github.com/BerriAI/litellm/pull/28248)

PR fix notes

PR #28248: fix(proxy): make spend counter reseed idempotent across pods via SET NX

Repository: BerriAI/litellm
Author: silencedoctor
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/28248

Description (problem / solution / changelog)

Relevant issues

Fixes #28247

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

[x] I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement
[x] My PR passes all unit tests on make test-unit
[x] My PR's scope is as isolated as possible, it only solves 1 specific problem
[ ] I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

Problem

SpendCounterReseed.coalesced() re-seeds the Redis spend counter from DB using async_increment() (Redis INCRBYFLOAT), which is additive, not idempotent. The per-process asyncio.Lock added in #26459 collapses duplicate reseeds within a single pod, but provides no cross-pod coordination. With N proxy pods racing the same cold or expired counter:

Pod A: Redis miss → read DB (spend = X) → INCRBYFLOAT +X → Redis = X
Pod B: Redis miss → read DB (spend = X) → INCRBYFLOAT +X → Redis = 2X
Pod C: Redis miss → read DB (spend = X) → INCRBYFLOAT +X → Redis = 3X
...

Subsequent get_current_spend() calls in user_api_key_auth → common_checks read the inflated Redis value and raise BudgetExceededError, falsely rejecting requests well under the configured limit.
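
The race can be illustrated with a minimal in-memory sketch. Everything here is hypothetical and simplified for illustration: `FakeRedis`, `additive_reseed`, and `simulate` are not LiteLLM code, and the lock handling only approximates what `coalesced()` does. A shared lock stands in for callers inside one pod; independent locks stand in for separate pods.

```python
import asyncio


class FakeRedis:
    """Hypothetical in-memory stand-in for one shared Redis instance."""

    def __init__(self):
        self.store = {}

    async def get(self, key):
        await asyncio.sleep(0)  # yield so concurrent callers can interleave
        return self.store.get(key)

    async def incrbyfloat(self, key, amount):
        await asyncio.sleep(0)
        # Additive like INCRBYFLOAT: an absent key is treated as 0.
        self.store[key] = self.store.get(key, 0.0) + amount
        return self.store[key]


async def additive_reseed(redis, lock, key, db_spend):
    # Simplified reseed: on a miss, add the DB spend to the counter.
    async with lock:
        if await redis.get(key) is None:
            await redis.incrbyfloat(key, db_spend)


async def simulate(n_callers, db_spend, shared_lock):
    redis = FakeRedis()
    one_lock = asyncio.Lock()
    # shared_lock=True models callers in one pod; False models one lock per pod.
    locks = [one_lock if shared_lock else asyncio.Lock() for _ in range(n_callers)]
    await asyncio.gather(
        *(additive_reseed(redis, lock, "spend:key:demo", db_spend) for lock in locks)
    )
    return redis.store["spend:key:demo"]


same_pod = asyncio.run(simulate(3, 100.0, shared_lock=True))
across_pods = asyncio.run(simulate(3, 100.0, shared_lock=False))
print(same_pod, across_pods)
```

With one shared lock the later callers see the seeded value and skip the write; with independent locks every caller observes the miss before any write lands, so each one adds the full DB spend.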

Fix

Switch the reseed write to async_set_cache(..., nx=True) and, on a lost race, read back the winner's value — mirroring the existing pattern already used by coalesced_window() in the very same file (litellm/proxy/db/spend_counter_reseed.py). After this PR, both reseed paths use consistent, idempotent semantics.
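
A minimal sketch of the set-if-absent plus read-back pattern, under stated assumptions: `FakeRedis.set_nx` is a hypothetical helper that mimics Redis `SET key value NX`, and `idempotent_reseed` is illustrative only, not the code in `spend_counter_reseed.py`.

```python
import asyncio


class FakeRedis:
    """Hypothetical in-memory stand-in for one shared Redis instance."""

    def __init__(self):
        self.store = {}

    async def get(self, key):
        await asyncio.sleep(0)  # yield so concurrent callers can interleave
        return self.store.get(key)

    async def set_nx(self, key, value):
        # Mimics SET ... NX: write only if the key is absent.
        await asyncio.sleep(0)
        if key in self.store:
            return False
        self.store[key] = value
        return True


async def idempotent_reseed(redis, key, db_spend):
    current = await redis.get(key)
    if current is not None:
        return current
    if await redis.set_nx(key, db_spend):
        return db_spend  # this caller won the race and seeded the counter
    # Lost the race: another caller seeded first, so read its value back.
    return await redis.get(key)


async def simulate(n_pods, db_spend):
    redis = FakeRedis()
    key = "spend:key:demo"
    observed = await asyncio.gather(
        *(idempotent_reseed(redis, key, db_spend) for _ in range(n_pods))
    )
    return redis.store[key], observed


final_value, observed = asyncio.run(simulate(5, 100.0))
print(final_value, observed)
```

Only the first writer changes the key, so the final value does not depend on how many callers raced, and every caller ends up observing the same number.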

Why this fix should be easy to merge

Minimal: ~15-line change in one function.
Reuses the existing pattern from the same file: RedisCache.async_set_cache(..., nx=True) already exists, and coalesced_window() already uses it. This PR brings coalesced() in line with its sibling.
No new public APIs, config flags, or dependencies.
Single-pod behavior is unchanged: INCRBYFLOAT from absent ≡ SET, so this PR has no observable effect on single-pod deployments. It only fixes the broken multi-pod path.
Scope is isolated: one function, one concern, no incidental refactors.

Tests

Added test_coalesced_reseed_idempotent_under_concurrent_multi_pod_reseed in tests/test_litellm/proxy/test_proxy_server.py:

Simulates N pods by replacing SpendCounterReseed._get_lock so each concurrent caller receives an independent asyncio.Lock (mirroring the real multi-pod condition where the in-process lock provides no cross-pod coordination).
Backs spend_counter_cache.redis_cache with a small in-memory fake that implements async_set_cache(nx=...), async_get_cache, and async_increment, and serializes operations with an asyncio.Lock to model a single shared Redis instance.
Asserts that after N = 5 concurrent coalesced() calls, redis.get(counter_key) == db_spend (not N × db_spend), and that all callers observe the same value.

The new test fails on main (final value 500.0) and passes after this patch (final value 100.0).

Existing reseed tests (test_init_and_increment_spend_counter_reseeds_from_db_on_counter_miss, test_init_spend_counter_redis_clean_miss_skips_stale_in_memory, test_get_current_spend_reseeds_from_db_when_counter_missing, test_get_current_spend_coalesces_concurrent_reseeds, test_concurrent_read_and_write_paths_share_one_db_query, test_reseed_warms_cache_even_on_zero_db_spend) had their fake-Redis stubs extended with the async_set_cache(nx=True) side-effect so they continue to model the new (idempotent) reseed path correctly. No assertion semantics were weakened.

Screenshots / Proof of Fix

$ uv run pytest tests/test_litellm/proxy/test_proxy_server.py -v \
    -k "reseed or spend_counter or current_spend"
...
27 passed

Including the new regression test:

tests/test_litellm/proxy/test_proxy_server.py::test_coalesced_reseed_idempotent_under_concurrent_multi_pod_reseed PASSED

Changed files

litellm/proxy/db/spend_counter_reseed.py (modified, +26/-6)
tests/test_litellm/proxy/test_proxy_server.py (modified, +203/-12)

Code Example

await spend_counter_cache.redis_cache.async_increment(
    key=counter_key,
    value=db_spend,
    refresh_ttl=True,
)

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

After #26459 ([Fix] Reseed enforcement read path from DB on counter miss), the Redis spend counter used by budget enforcement can be inflated to roughly N × db_spend when N proxy pods race on a cold or expired counter. Affected entities (keys / teams / team members / users / orgs) are then incorrectly rejected with BudgetExceededError in the auth layer even though their actual DB spend is well below the configured limit.

Root cause

litellm/proxy/db/spend_counter_reseed.py::SpendCounterReseed.coalesced() re-seeds Redis with:

await spend_counter_cache.redis_cache.async_increment(
    key=counter_key,
    value=db_spend,
    refresh_ttl=True,
)

which translates to Redis INCRBYFLOAT. INCRBYFLOAT is additive, not idempotent. The per-process asyncio.Lock added in #26459 collapses duplicate reseeds within a single pod, but it provides no coordination across pods. With multiple pods racing the same cold key:

Pod A: Redis miss → read DB (spend = X) → INCRBYFLOAT +X → Redis = X
Pod B: Redis miss → read DB (spend = X) → INCRBYFLOAT +X → Redis = 2X
Pod C: Redis miss → read DB (spend = X) → INCRBYFLOAT +X → Redis = 3X
...

Subsequent get_current_spend() calls in user_api_key_auth → common_checks read the inflated Redis value (Redis is preferred over the in-memory and DB paths) and raise BudgetExceededError.

Notably, the sibling function coalesced_window() in the same file already uses the correct idempotent pattern: async_set_cache(..., nx=True) and, on a lost race, read the winner's value. Only the non-window coalesced() path is affected.

Expected behavior

After any number of concurrent reseeds across any number of pods, the Redis spend counter must equal the authoritative DB spend (plus any genuinely concurrent response-cost increments) — and must never equal an integer multiple of the DB spend.

Symptoms an operator can check

GET <namespace>:spend:<scope>:<id> returns a value that is approximately an integer multiple (2×, 3×, 4×, …) of the entity's true DB spend.
BudgetExceededError raised at request auth time with a current_spend suspiciously close to a multiple of the entity's DB spend.
After the key's TTL elapses and a single pod wins the next rebuild, the value can drop back to the true DB spend (or to a single response cost), so the inflation looks intermittent.
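
The first two symptoms can be turned into a rough check. The helper below is a hypothetical heuristic, not part of LiteLLM: the name `inflation_multiple` and the tolerance defaults are assumptions, and genuine concurrent response-cost increments mean a match is a hint to investigate, not proof.

```python
def inflation_multiple(redis_value, db_spend, rel_tol=0.05, max_multiple=64):
    """Return k >= 2 if redis_value is roughly k * db_spend, else None.

    Hypothetical operator aid: rel_tol and max_multiple are arbitrary
    illustrative defaults, not values taken from LiteLLM.
    """
    if db_spend <= 0:
        return None  # a zero or negative baseline cannot be compared by ratio
    ratio = redis_value / db_spend
    k = round(ratio)
    # Accept only multiples of 2 or more that sit close to an integer.
    if 2 <= k <= max_multiple and abs(ratio - k) <= rel_tol * k:
        return k
    return None


suspect = inflation_multiple(500.0, 100.0)    # counter at five times DB spend
healthy = inflation_multiple(100.0, 100.0)    # counter matches DB spend
near_three = inflation_multiple(301.2, 100.0) # small drift from real traffic
print(suspect, healthy, near_three)
```

The inputs would come from a Redis `GET` on the counter key and from the entity's spend in the database; how those are fetched is left out of this sketch.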

Steps to Reproduce

Deploy at least 2 LiteLLM proxy pods sharing the same Redis and the same Postgres backend.
Pick an entity (key / team / team member / user / org) with non-zero spend in DB.
Delete the corresponding <namespace>:spend:<scope>:<id> Redis key (or wait for its TTL to expire — default 60s when default_redis_ttl is unset).
Concurrently send requests across pods that exercise the budget read path for that entity.
GET <namespace>:spend:<scope>:<id> afterwards returns approximately N × db_spend.

A self-contained pure-Python regression test (no Redis / Postgres required) is included in the accompanying PR. It simulates N pods by giving each concurrent caller its own asyncio.Lock (mirroring the real-world condition where the in-process singleflight lock provides no cross-pod coordination) and asserts the final Redis value equals db_spend, not N × db_spend.

Relevant log output

litellm.proxy._types.BudgetExceededError: Budget has been exceeded! ...
  current_spend ≈ N × db_spend  (N = number of pods that raced the reseed)

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

main (regression introduced by #26459).

Twitter / LinkedIn details

N/A
