litellm: reserve_budget_for_request() leaks Redis spend counters, causing phantom BudgetExceededError after upgrade to LiteLLM v1.84.0 (despite the image tag being v1.83.10.dev.2)


What happened?

After upgrading from main-stable to v1.83.10.dev.2, end users are randomly getting BudgetExceededError (HTTP 429) despite their actual database spend being near $0. Some requests succeed while others randomly fail for different users, cycling approximately every 4 minutes. The Redis atomic counter spend:end_user:<id> accumulates phantom reservations that are never properly finalized/decremented.

Relevant LiteLLM version

v1.83.10.dev.2

Previous working version: main-stable (pre-v1.83.10, deployed ~18 days prior)

LiteLLM config

general_settings:
  database_url: os.environ/DATABASE_URL
  master_key: os.environ/LITELLM_MASTER_KEY
  max_end_user_budget_id: default-daily-user-budget
  use_redis_transaction_buffer: true
  cache: true
  cache_params:
    type: redis
    host: os.environ/REDIS_HOST
    port: os.environ/REDIS_PORT
    password: os.environ/REDIS_PASSWORD
    ttl: 60

litellm_settings:
  callbacks:
    - prometheus
    - alloy_loki
  cache: true
  cache_params:
    type: redis
    host: os.environ/REDIS_HOST
    port: os.environ/REDIS_PORT
    password: os.environ/REDIS_PASSWORD
    ttl: 14400

Budget configuration (default-daily-user-budget):

budget_id: default-daily-user-budget
max_budget: 50.0
budget_duration: 24h

Deployment: 4 replicas on Kubernetes (EKS), Redis cache enabled.

Steps to reproduce

1. Configure max_end_user_budget_id pointing to a budget with max_budget=50.0 and budget_duration=24h.
2. Enable use_redis_transaction_buffer: true with Redis cache.
3. Deploy with multiple replicas (we use 4).
4. Send requests with the x-litellm-end-user-id header for various end users.
5. After ~4 minutes of normal traffic, random end users start failing with BudgetExceededError.

Example request (step 4):

curl -X POST "https://<proxy>/v1/chat/completions" \
  -H "Authorization: Bearer sk-<key>" \
  -H "x-litellm-end-user-id: 347727" \
  -H "Content-Type: application/json" \
  -d '{"model": "vertex_ai/gemini-3.1-flash-lite", "messages": [{"role": "user", "content": "hi"}], "max_tokens": 5}'

Response after ~4 min of traffic:

{"error": {"message": "Budget has been exceeded! EndUser=347727 Current cost: 52.195456, Max budget: 50.0", "code": "429"}}
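The reproduction steps above can also be driven from a script. The following is a minimal sketch using only the Python standard library; `build_request` and `send` are hypothetical helper names, and the proxy URL and key are placeholders you would supply yourself. The request body and headers mirror the curl command in this report.

```python
import json
import urllib.error
import urllib.request


def build_request(proxy_url: str, api_key: str, end_user_id: str) -> urllib.request.Request:
    """Build the same chat-completions call as the curl command in this report."""
    body = {
        "model": "vertex_ai/gemini-3.1-flash-lite",
        "messages": [{"role": "user", "content": "hi"}],
        "max_tokens": 5,
    }
    return urllib.request.Request(
        url=f"{proxy_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "x-litellm-end-user-id": end_user_id,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> int:
    """Send one request and return the HTTP status; 429 is the failure of interest."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code
```

Calling `send(build_request(...))` in a loop over several end-user IDs for a few minutes, and logging the first 429 per ID, should be enough to observe the behavior described here, assuming the deployment matches the configuration above.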

Expected behavior

End users should only be blocked when their actual spend reaches the $50 budget limit. Budget reservations should be properly finalized after each request.

Actual behavior

- Redis counter spend:end_user:<id> accumulates phantom reservations without finalization.
- DB spend shows $0.001 while the Redis counter reports $50+.
- Random users are blocked while others with similar usage continue working.
- Failures cycle approximately every 4 minutes.
- Flushing Redis keys temporarily fixes the problem, but counters re-accumulate within minutes.

Error traceback

File "litellm/proxy/auth/user_api_key_auth.py", line 2081, in user_api_key_auth
    await _run_centralized_common_checks(...)
File "litellm/proxy/auth/user_api_key_auth.py", line 1958, in _run_centralized_common_checks
    await _reserve_budget_after_common_checks(...)
File "litellm/proxy/auth/user_api_key_auth.py", line 2002, in _reserve_budget_after_common_checks
    user_api_key_auth_obj.budget_reservation = await reserve_budget_for_request(...)
File "litellm/proxy/spend_tracking/budget_reservation.py", line 99, in reserve_budget_for_request
    reservation_cost = await _get_smallest_remaining_budget(...)
File "litellm/proxy/spend_tracking/budget_reservation.py", line 566, in _get_smallest_remaining_budget
    raise litellm.BudgetExceededError(
        "Budget has been exceeded! EndUser=347727 Current cost: 50.0, Max budget: 50.0"
    )

Workarounds attempted

- Set use_redis_transaction_buffer: false. No effect; budget_reservation.py writes to Redis independently.
- Flushed all spend:end_user:* Redis keys. Temporarily fixes the problem; counters re-accumulate in ~4 minutes.
- Removed custom_prometheus_tags (was causing ValueError: Incorrect label count since PR #19717 added new labels). Fixed the Prometheus crash, but the budget leak persists.
- All three combined. Counters still re-accumulate.

Evidence

Checking end user info via the API shows spend near $0:

curl -s -X GET "https://<proxy>/end_user/info?end_user_id=347727" \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY"

Returns spend: 0.001, yet the budget reservation system reports $52.19.
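The same comparison can be made for every end user at once. The sketch below assumes the counters live under the spend:end_user:<id> key pattern named in this report and hold plain numeric strings; the storage format is an assumption, not something confirmed here. `read_counters` and `find_drift` are hypothetical helpers, and `store` is any object exposing `scan_iter(match=...)` and `get(name)`, such as a redis-py client.

```python
from typing import Dict


def read_counters(store, pattern: str = "spend:end_user:*") -> Dict[str, float]:
    """Map end-user ID -> counter value for every key matching the pattern."""
    counters: Dict[str, float] = {}
    for raw_key in store.scan_iter(match=pattern):
        raw_value = store.get(raw_key)
        if raw_value is None:  # key expired or was deleted between SCAN and GET
            continue
        key = raw_key.decode() if isinstance(raw_key, bytes) else raw_key
        value = raw_value.decode() if isinstance(raw_value, bytes) else raw_value
        # Assumption: the value is a plain numeric string.
        counters[key.rsplit(":", 1)[-1]] = float(value)
    return counters


def find_drift(
    counters: Dict[str, float],
    db_spend: Dict[str, float],
    tolerance: float = 0.01,
) -> Dict[str, float]:
    """Return end users whose counter exceeds their database spend by more than tolerance."""
    drift: Dict[str, float] = {}
    for end_user_id, counter in counters.items():
        gap = counter - db_spend.get(end_user_id, 0.0)
        if gap > tolerance:
            drift[end_user_id] = gap
    return drift
```

Here `db_spend` would be filled from the /end_user/info responses shown above. A large gap for a user whose database spend is near $0 matches the symptom in this report.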

Successful request metadata showing unreleased reservation:

{
  "budget_reservation": {
    "entries": [{
      "entity_id": "177392",
      "counter_key": "spend:end_user:177392",
      "reserved_cost": 49.9012673,
      "applied_adjustment": 0
    }],
    "finalized": false,
    "reserved_cost": 49.9012673
  }
}
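To scan many logged requests for this pattern, a small filter over the metadata is enough. This sketch assumes only the field names visible in the JSON above (`budget_reservation`, `entries`, `finalized`, `reserved_cost`); `unfinalized_reservations` is a hypothetical helper, not part of LiteLLM.

```python
from typing import Any, Dict, List


def unfinalized_reservations(metadata: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return reservation entries that still hold budget on a request not marked finalized."""
    reservation = metadata.get("budget_reservation") or {}
    if reservation.get("finalized", False):
        return []
    return [
        entry
        for entry in reservation.get("entries", [])
        if entry.get("reserved_cost", 0) > 0
    ]


# The metadata from this report, reproduced as a Python dict.
sample = {
    "budget_reservation": {
        "entries": [{
            "entity_id": "177392",
            "counter_key": "spend:end_user:177392",
            "reserved_cost": 49.9012673,
            "applied_adjustment": 0,
        }],
        "finalized": False,
        "reserved_cost": 49.9012673,
    }
}

for entry in unfinalized_reservations(sample):
    print(entry["counter_key"], entry["reserved_cost"])
```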

Analysis

reserve_budget_for_request() (introduced in PR #24682) atomically increments the Redis counter before each LLM call, but _finalize_budget_reservation() is either not being called, failing silently, or not properly decrementing the reserved amount. The counter only goes up, hitting $50 within ~4 minutes of normal traffic.
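The suspected failure mode can be illustrated with a toy in-memory model. This is not LiteLLM's code: `SpendCounter`, `reserve`, `finalize`, and the per-request figures (a 1.0 reservation against a 0.001 actual cost) are hypothetical and chosen only to show how a reserve-then-finalize counter behaves when the finalize step is skipped.

```python
class BudgetExceeded(Exception):
    """Raised when a reservation would push the counter past the budget."""


class SpendCounter:
    """In-memory stand-in for a shared spend counter (hypothetical, for illustration)."""

    def __init__(self, max_budget: float) -> None:
        self.max_budget = max_budget
        self.value = 0.0

    def reserve(self, estimated_cost: float) -> float:
        """Add the estimate to the counter up front, or refuse if it does not fit."""
        if self.value + estimated_cost > self.max_budget:
            raise BudgetExceeded(f"counter={self.value}, max={self.max_budget}")
        self.value += estimated_cost
        return estimated_cost

    def finalize(self, reserved: float, actual_cost: float) -> None:
        """Replace the reserved estimate with the actual cost of the request."""
        self.value += actual_cost - reserved


def simulate(requests: int, call_finalize: bool) -> tuple:
    """Return (accepted request count, final counter value)."""
    counter = SpendCounter(max_budget=50.0)
    accepted = 0
    for _ in range(requests):
        try:
            reserved = counter.reserve(1.0)
        except BudgetExceeded:
            break
        accepted += 1
        if call_finalize:
            counter.finalize(reserved, actual_cost=0.001)
    return accepted, counter.value


print(simulate(60, call_finalize=False))  # blocked once the counter reaches 50.0
print(simulate(60, call_finalize=True))   # all 60 accepted, counter stays near 0.06
```

In this model, skipping `finalize` blocks the user after 50 requests even though the real cost of those requests is only about $0.05, which has the same shape as the gap between the Redis counter and the database spend reported above.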

Related issues

#24675 #25386 #26233 #22019

Impact

~6000+ active end users in production. Intermittent 429 failures cycling every ~4 minutes. Users experience unpredictable blocking despite being well within their $50 daily budget.
