litellm - ✅(Solved) Fix [Bug]: Redis spend counter reseed is non-idempotent across pods, inflating budget enforcement values (regression from #26459) [1 pull requests, 1 participants]

GitHub stats
BerriAI/litellm#28247 · Fetched 2026-05-20 03:40:32
Comments: 0 · Participants: 1 · Timeline: 2 · Reactions: 0
Timeline (top): cross-referenced ×1, labeled ×1

Error Message

litellm.proxy._types.BudgetExceededError: Budget has been exceeded! ... current_spend ≈ N × db_spend (N = number of pods that raced the reseed)

Root Cause

Root cause

Fix Action

Fixed

PR fix notes

PR #28248: fix(proxy): make spend counter reseed idempotent across pods via SET NX

Description (problem / solution / changelog)

Relevant issues

Fixes #28247

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

Problem

SpendCounterReseed.coalesced() re-seeds the Redis spend counter from DB using async_increment() (Redis INCRBYFLOAT), which is additive, not idempotent. The per-process asyncio.Lock added in #26459 collapses duplicate reseeds within a single pod, but provides no cross-pod coordination. With N proxy pods racing the same cold or expired counter:

Pod A: Redis miss → read DB (spend = X) → INCRBYFLOAT +X → Redis = X
Pod B: Redis miss → read DB (spend = X) → INCRBYFLOAT +X → Redis = 2X
Pod C: Redis miss → read DB (spend = X) → INCRBYFLOAT +X → Redis = 3X
...

Subsequent get_current_spend() calls in user_api_key_auth → common_checks read the inflated Redis value and raise BudgetExceededError, falsely rejecting requests well under the configured limit.
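
The race above can be reproduced without Redis. The following is a minimal sketch, not LiteLLM code: `FakeRedis`, `incrbyfloat`, `reseed_with_increment`, and the key name are hypothetical stand-ins, and each concurrent task plays the role of one pod that has no lock in common with the others.

```python
import asyncio


class FakeRedis:
    """Hypothetical in-memory stand-in for one Redis instance shared by all pods."""

    def __init__(self):
        self.store = {}

    async def get(self, key):
        await asyncio.sleep(0)  # yield, like a network round trip
        return self.store.get(key)

    async def incrbyfloat(self, key, amount):
        await asyncio.sleep(0)
        # Additive: an absent key is treated as 0, then incremented.
        self.store[key] = self.store.get(key, 0.0) + amount
        return self.store[key]


async def reseed_with_increment(redis, key, db_spend):
    # Each "pod" sees a miss and reseeds by incrementing with the DB value.
    if await redis.get(key) is None:
        await redis.incrbyfloat(key, db_spend)


async def simulate(n_pods, db_spend):
    redis = FakeRedis()
    key = "spend:key:example"  # hypothetical key name
    await asyncio.gather(
        *(reseed_with_increment(redis, key, db_spend) for _ in range(n_pods))
    )
    return redis.store[key]


inflated = asyncio.run(simulate(n_pods=5, db_spend=100.0))
print(inflated)
```

Every simulated pod observes the miss before any of them writes, so each one adds the full DB spend on top of the others.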

Fix

Switch the reseed write to async_set_cache(..., nx=True) and, on a lost race, read back the winner's value — mirroring the existing pattern already used by coalesced_window() in the very same file (litellm/proxy/db/spend_counter_reseed.py). After this PR, both reseed paths use consistent, idempotent semantics.
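
A minimal sketch of the set-if-absent pattern described above, again against an in-memory stand-in rather than the real `RedisCache`. `FakeRedis`, `set_nx`, and `reseed_with_set_nx` are hypothetical names chosen for illustration; the actual change lives in `coalesced()` and may differ in detail.

```python
import asyncio


class FakeRedis:
    """Hypothetical in-memory stand-in for one Redis instance shared by all pods."""

    def __init__(self):
        self.store = {}

    async def get(self, key):
        await asyncio.sleep(0)  # yield, like a network round trip
        return self.store.get(key)

    async def set_nx(self, key, value):
        await asyncio.sleep(0)
        # SET ... NX semantics: write only if the key does not exist yet.
        if key in self.store:
            return False
        self.store[key] = value
        return True


async def reseed_with_set_nx(redis, key, db_spend):
    current = await redis.get(key)
    if current is not None:
        return current
    if await redis.set_nx(key, db_spend):
        return db_spend  # this caller won the race
    return await redis.get(key)  # lost the race: read back the winner's value


async def simulate(n_pods, db_spend):
    redis = FakeRedis()
    key = "spend:key:example"  # hypothetical key name
    observed = await asyncio.gather(
        *(reseed_with_set_nx(redis, key, db_spend) for _ in range(n_pods))
    )
    return redis.store[key], observed


final_value, observed = asyncio.run(simulate(n_pods=5, db_spend=100.0))
print(final_value, observed)
```

Only the first writer's value lands in the store; every other caller reads it back, so repeating the reseed any number of times leaves the counter at the DB spend.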

Why this fix should be easy to merge

  • Minimal: ~15-line change in one function.
  • Reuses the existing pattern from the same file: RedisCache.async_set_cache(..., nx=True) already exists, and coalesced_window() already uses it. This PR brings coalesced() in line with its sibling.
  • No new public APIs, config flags, or dependencies.
  • Single-pod behavior is unchanged: INCRBYFLOAT from absent ≡ SET, so this PR has no observable effect on single-pod deployments. It only fixes the broken multi-pod path.
  • Scope is isolated: one function, one concern, no incidental refactors.

Tests

Added test_coalesced_reseed_idempotent_under_concurrent_multi_pod_reseed in tests/test_litellm/proxy/test_proxy_server.py:

  • Simulates N pods by replacing SpendCounterReseed._get_lock so each concurrent caller receives an independent asyncio.Lock (mirroring the real multi-pod condition where the in-process lock provides no cross-pod coordination).
  • Backs spend_counter_cache.redis_cache with a small in-memory fake that implements async_set_cache(nx=...), async_get_cache, async_increment, and serializes operations with an asyncio.Lock (single shared Redis instance).
  • Asserts that after N = 5 concurrent coalesced() calls, redis.get(counter_key) == db_spend (not N × db_spend), and that all callers observe the same value.

The new test fails on main (final value 500.0) and passes after this patch (final value 100.0).

Existing reseed tests (test_init_and_increment_spend_counter_reseeds_from_db_on_counter_miss, test_init_spend_counter_redis_clean_miss_skips_stale_in_memory, test_get_current_spend_reseeds_from_db_when_counter_missing, test_get_current_spend_coalesces_concurrent_reseeds, test_concurrent_read_and_write_paths_share_one_db_query, test_reseed_warms_cache_even_on_zero_db_spend) had their fake-Redis stubs extended with the async_set_cache(nx=True) side-effect so they continue to model the new (idempotent) reseed path correctly. No assertion semantics were weakened.

Screenshots / Proof of Fix

$ uv run pytest tests/test_litellm/proxy/test_proxy_server.py -v \
    -k "reseed or spend_counter or current_spend"
...
27 passed

Including the new regression test:

tests/test_litellm/proxy/test_proxy_server.py::test_coalesced_reseed_idempotent_under_concurrent_multi_pod_reseed PASSED

Changed files

  • litellm/proxy/db/spend_counter_reseed.py (modified, +26/-6)
  • tests/test_litellm/proxy/test_proxy_server.py (modified, +203/-12)

Code Example

await spend_counter_cache.redis_cache.async_increment(
    key=counter_key,
    value=db_spend,
    refresh_ttl=True,
)

---

Pod A: Redis miss → read DB (spend = X) → INCRBYFLOAT +X → Redis = X
Pod B: Redis miss → read DB (spend = X) → INCRBYFLOAT +X → Redis = 2X
Pod C: Redis miss → read DB (spend = X) → INCRBYFLOAT +X → Redis = 3X
...

---

litellm.proxy._types.BudgetExceededError: Budget has been exceeded! ...
  current_spend ≈ N × db_spend  (N = number of pods that raced the reseed)

Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

After #26459 ([Fix] Reseed enforcement read path from DB on counter miss), the Redis spend counter used by budget enforcement can be inflated to roughly N × db_spend when N proxy pods race on a cold or expired counter. Affected entities (keys / teams / team members / users / orgs) are then incorrectly rejected with BudgetExceededError in the auth layer even though their actual DB spend is well below the configured limit.

Root cause

litellm/proxy/db/spend_counter_reseed.py::SpendCounterReseed.coalesced() re-seeds Redis with:

await spend_counter_cache.redis_cache.async_increment(
    key=counter_key,
    value=db_spend,
    refresh_ttl=True,
)

which translates to Redis INCRBYFLOAT. INCRBYFLOAT is additive, not idempotent. The per-process asyncio.Lock added in #26459 collapses duplicate reseeds within a single pod, but it provides no coordination across pods. With multiple pods racing the same cold key:

Pod A: Redis miss → read DB (spend = X) → INCRBYFLOAT +X → Redis = X
Pod B: Redis miss → read DB (spend = X) → INCRBYFLOAT +X → Redis = 2X
Pod C: Redis miss → read DB (spend = X) → INCRBYFLOAT +X → Redis = 3X
...

Subsequent get_current_spend() calls in user_api_key_auth → common_checks read the inflated Redis value (Redis is preferred over the in-memory and DB paths) and raise BudgetExceededError.

Notably, the sibling function coalesced_window() in the same file already uses the correct idempotent pattern: async_set_cache(..., nx=True) and, on a lost race, read the winner's value. Only the non-window coalesced() path is affected.

Expected behavior

After any number of concurrent reseeds across any number of pods, the Redis spend counter must equal the authoritative DB spend (plus any genuinely concurrent response-cost increments) — and must never equal an integer multiple of the DB spend.

Symptoms an operator can check

  • GET <namespace>:spend:<scope>:<id> returns a value that is approximately an integer multiple (2×, 3×, 4×, …) of the entity's true DB spend.
  • BudgetExceededError raised at request auth time with a current_spend suspiciously close to a multiple of the entity's DB spend.
  • After the key's TTL elapses and a single pod wins the next rebuild, the value can drop back to the true DB spend (or to a single response cost), so the inflation looks intermittent.
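
The first two symptoms can be checked mechanically once the Redis value and the DB spend are in hand. The helper below is a hypothetical sketch, not part of LiteLLM; the 5% tolerance is an arbitrary assumption, and a counter slightly above the DB spend can be legitimate because of response-cost increments that have not yet been flushed to the DB.

```python
def inflation_factor(redis_spend, db_spend, rel_tol=0.05):
    """Return N if redis_spend is close to N x db_spend for an integer N >= 2, else None.

    rel_tol is an assumed tolerance, not a value taken from LiteLLM.
    """
    if db_spend <= 0:
        return None  # a zero or negative DB spend has no meaningful multiple
    ratio = redis_spend / db_spend
    nearest = round(ratio)
    if nearest >= 2 and abs(ratio - nearest) <= rel_tol * nearest:
        return nearest
    return None


print(inflation_factor(300.4, 100.0))  # counter near 3x the DB spend
print(inflation_factor(100.0, 100.0))  # counter equal to the DB spend
```

A non-`None` result is only a hint that the reseed race occurred; it does not prove it.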

Steps to Reproduce

  1. Deploy at least 2 LiteLLM proxy pods sharing the same Redis and the same Postgres backend.
  2. Pick an entity (key / team / team member / user / org) with non-zero spend in DB.
  3. Delete the corresponding <namespace>:spend:<scope>:<id> Redis key (or wait for its TTL to expire — default 60s when default_redis_ttl is unset).
  4. Concurrently send requests across pods that exercise the budget read path for that entity.
  5. GET <namespace>:spend:<scope>:<id> afterwards returns approximately N × db_spend.

A self-contained pure-Python regression test (no Redis / Postgres required) is included in the accompanying PR. It simulates N pods by giving each concurrent caller its own asyncio.Lock (mirroring the real-world condition where the in-process singleflight lock provides no cross-pod coordination) and asserts the final Redis value equals db_spend, not N × db_spend.
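
The lock-per-caller idea behind that test can be sketched as follows. This is an illustrative reduction under stated assumptions, not the test from the PR: the `reseed` and `run` functions and the dict used as a fake Redis are hypothetical, and the additive write stands in for `INCRBYFLOAT`.

```python
import asyncio


async def reseed(lock, store, key, db_spend):
    # The lock models the per-process singleflight lock.
    async with lock:
        if store.get(key) is None:
            await asyncio.sleep(0)  # simulated DB read; other tasks may run here
            store[key] = store.get(key, 0.0) + db_spend  # additive write


async def run(n_callers, db_spend, shared_lock):
    store = {}  # plays the role of the single shared Redis
    one_lock = asyncio.Lock()
    # Shared lock: all callers live in one pod. Independent locks: one pod each.
    locks = [one_lock if shared_lock else asyncio.Lock() for _ in range(n_callers)]
    await asyncio.gather(*(reseed(lock, store, "k", db_spend) for lock in locks))
    return store["k"]


same_pod = asyncio.run(run(n_callers=5, db_spend=100.0, shared_lock=True))
across_pods = asyncio.run(run(n_callers=5, db_spend=100.0, shared_lock=False))
print(same_pod, across_pods)
```

With one shared lock the duplicate reseeds collapse into a single write; with independent locks every caller passes the miss check and adds the DB spend again.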

Relevant log output

litellm.proxy._types.BudgetExceededError: Budget has been exceeded! ...
  current_spend ≈ N × db_spend  (N = number of pods that raced the reseed)

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on ?

main (regression introduced by #26459).

Twitter / LinkedIn details

N/A
