litellm: [BUG] Virtual key BudgetExceededError uses stale spend while /key/info shows spend below max_budget



What happened?

We saw failed proxy requests for a team-scoped virtual key where LiteLLM rejected the request with BudgetExceededError, but the current key/team spend reported by management APIs was still below the configured budget.

This looks related to #27639, but this case is for a virtual key max_budget check rather than an end-user budget.

Relevant LiteLLM version

1.84.0

From /health/readiness:

{
  "status": "healthy",
  "db": "connected",
  "cache": "redis",
  "litellm_version": "1.84.0",
  "use_aiohttp_transport": true,
  "log_level": "WARNING"
}

Runtime from traceback: Python 3.13 in /app/.venv/lib/python3.13/site-packages/litellm/....

Setup

LiteLLM proxy with:

  • Postgres DB connected
  • Redis cache connected
  • Team-scoped virtual keys
  • budget_duration: 30d
  • OpenRouter model group

Example sanitized virtual key config:

{
  "key_alias": "<service>-Dev",
  "team_id": "<team-id>",
  "models": [
    "openrouter/qwen/qwen3.5-27b",
    "openrouter/qwen/qwen3.6-flash"
  ],
  "max_budget": 10.0,
  "budget_duration": "30d",
  "budget_reset_at": "2026-06-01T00:00:00Z",
  "blocked": null,
  "soft_budget_cooldown": false
}

The team budget was much higher:

{
  "team_alias": "<team>",
  "spend": 3.546499379999998,
  "max_budget": 100.0,
  "budget_duration": "30d"
}

Actual behavior

/spend/logs/v2 showed 15 failures for the virtual key in one day. 13 of the 15 were BudgetExceededError failures raised before the LLM call was made.

All failures had:

{
  "status": "failure",
  "model": "openrouter/qwen/qwen3.5-27b",
  "spend": 0.0,
  "total_tokens": 0,
  "request_duration_ms": 0,
  "api_base": "",
  "model_id": ""
}

The actual error was only visible under metadata.error_information:

{
  "metadata": {
    "error_information": {
      "error_code": "429",
      "error_class": "BudgetExceededError",
      "llm_provider": "",
      "error_message": "Budget has been exceeded! Current cost: 10.607015384999993, Max budget: 10.0"
    },
    "user_api_key_alias": "<service>-Dev",
    "user_api_key_team_alias": "<team>",
    "user_api_key_team_id": "<team-id>"
  }
}

Between a few minutes and a few hours later, the management APIs showed that the same key was not over budget:

{
  "key_alias": "<service>-Dev",
  "spend": 3.536490479999998,
  "max_budget": 10.0,
  "budget_duration": "30d",
  "budget_reset_at": "2026-06-01T00:00:00+00:00",
  "blocked": null,
  "soft_budget_cooldown": false
}

So the failed requests saw a Current cost of roughly 10.48-10.60 against a Max budget of 10.0, while /key/info and /key/list later showed spend of about 3.536 against the same max_budget of 10.0.
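To quantify the mismatch for a given failure, the cost in the error message can be compared with the spend returned by /key/info. The following is a minimal sketch; the regular expression is an assumption based on the single message format shown in this report, and `budget_gap` is a hypothetical helper, not part of LiteLLM.

```python
import re

# Pattern inferred from the one sample message in this report (an assumption,
# not a documented LiteLLM message format).
_BUDGET_RE = re.compile(
    r"Current cost:\s*(?P<cost>[0-9.]+),\s*Max budget:\s*(?P<budget>[0-9.]+)"
)


def budget_gap(error_message: str, reported_spend: float) -> dict:
    """Compare the cost used by the budget check with the spend from /key/info."""
    match = _BUDGET_RE.search(error_message)
    if match is None:
        raise ValueError("message does not match the expected budget error format")
    checked_cost = float(match.group("cost"))
    max_budget = float(match.group("budget"))
    return {
        "checked_cost": checked_cost,
        "max_budget": max_budget,
        "reported_spend": reported_spend,
        # Spend seen by the check that the management API does not account for.
        "unexplained_spend": checked_cost - reported_spend,
        "reported_over_budget": reported_spend >= max_budget,
    }


gap = budget_gap(
    "Budget has been exceeded! Current cost: 10.607015384999993, Max budget: 10.0",
    reported_spend=3.536490479999998,
)
print(gap)
```

With the values from this report, the check saw about 7.07 more spend than /key/info reported, and the reported spend alone would not have exceeded the budget.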

Expected behavior

The virtual key budget check should be consistent with the spend shown by /key/info / /key/list, or the API should expose which Redis/reservation counter is being used for the budget decision.

If a stale Redis reservation/counter can trigger the check, it should be finalized/decremented reliably so a virtual key is not blocked while its DB/current spend is below budget.
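The failure mode suspected here can be illustrated with a toy model. This is a sketch under stated assumptions only: it assumes the budget check compares committed spend plus in-flight reservations against max_budget, and every name (`ReservationBudget`, `reserved_request`, `BudgetBlocked`) is hypothetical. It does not reproduce LiteLLM's actual counter logic, which this report does not show.

```python
from contextlib import contextmanager


class BudgetBlocked(Exception):
    """Raised by the toy model when the counter exceeds the budget."""


class ReservationBudget:
    """In-memory stand-in for a reserve-then-settle spend counter."""

    def __init__(self, max_budget: float, committed: float = 0.0):
        self.max_budget = max_budget
        self.committed = committed  # spend already recorded (what /key/info would show)
        self.reserved = 0.0         # estimated cost of in-flight requests

    def reserve(self, estimate: float) -> None:
        if self.committed + self.reserved + estimate > self.max_budget:
            raise BudgetBlocked(
                f"counter={self.committed + self.reserved}, committed={self.committed}"
            )
        self.reserved += estimate


@contextmanager
def reserved_request(budget: ReservationBudget, estimate: float):
    """Reserve before the call and always release afterwards."""
    budget.reserve(estimate)
    outcome = {"actual_cost": 0.0}  # caller sets this when the call succeeds
    try:
        yield outcome
    finally:
        # Runs on success and on failure, so no reservation is left behind.
        budget.reserved -= estimate
        budget.committed += outcome["actual_cost"]


# Leaky path: reservations are never released, so the key is eventually
# blocked while committed spend is still 3.5 out of 10.0.
leaky = ReservationBudget(max_budget=10.0, committed=3.5)
for _ in range(6):
    leaky.reserve(1.0)

# Reliable path: failed calls release their reservation in `finally`.
reliable = ReservationBudget(max_budget=10.0, committed=3.5)
for _ in range(20):
    try:
        with reserved_request(reliable, 1.0):
            raise ConnectionError("upstream call failed")
    except ConnectionError:
        pass
```

In the leaky case the next `reserve(1.0)` raises `BudgetBlocked` even though committed spend is 3.5, which mirrors the symptom described in this report. In the reliable case the counter returns to the committed value after every failed call.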

Additional logging/API issue

For these failed logs, /spend/logs/v2 returns top-level error fields as null:

{
  "error_code": null,
  "error_message": null,
  "error_class": null
}

But the real error exists at:

metadata.error_information.error_code
metadata.error_information.error_class
metadata.error_information.error_message

This makes it hard for UI and API consumers to filter or summarize failures unless they know to inspect the nested metadata.

Expected: top-level error_code, error_class, and error_message should be populated from metadata.error_information for failed proxy requests, or the docs/API response schema should make this nesting explicit.
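Until the top-level fields are populated, a consumer can copy them up on the client side. This is a minimal sketch that assumes each log row is a plain dict shaped like the examples in this report; `promote_error_fields` is a hypothetical helper, not a LiteLLM function.

```python
ERROR_FIELDS = ("error_code", "error_class", "error_message")


def promote_error_fields(log: dict) -> dict:
    """Return a copy of a log row with null top-level error fields filled
    from metadata.error_information. Non-null top-level values are kept."""
    info = (log.get("metadata") or {}).get("error_information") or {}
    promoted = dict(log)
    for field in ERROR_FIELDS:
        if promoted.get(field) is None:
            promoted[field] = info.get(field)
    return promoted


row = {
    "status": "failure",
    "error_code": None,
    "error_message": None,
    "error_class": None,
    "metadata": {
        "error_information": {
            "error_code": "429",
            "error_class": "BudgetExceededError",
            "error_message": "Budget has been exceeded! Current cost: 10.607015384999993, Max budget: 10.0",
        }
    },
}
fixed = promote_error_fields(row)
print(fixed["error_code"], fixed["error_class"])
```

The function returns a shallow copy, so the original row keeps its null fields and can still be compared against the raw API response.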

Failure summary

For the affected key on one day:

{
  "total_failures": 15,
  "by_error_class": [
    {"error_class": "APIError", "count": 2},
    {"error_class": "BudgetExceededError", "count": 13}
  ],
  "by_model": [
    {"model": "openrouter/qwen/qwen3.5-27b", "count": 15}
  ]
}

The two APIError entries were unrelated OpenRouter parse errors. The recurring issue is the 13 virtual-key BudgetExceededError failures.
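A summary in this shape can be derived from the raw log rows. The sketch below assumes the rows are dicts shaped like the examples in this report and that the error class must be read from the nested metadata when the top-level field is null; `summarize_failures` is a hypothetical helper and the sample rows are synthetic stand-ins for the real logs.

```python
from collections import Counter


def _error_class(row: dict) -> str:
    """Prefer the top-level field, fall back to metadata.error_information."""
    info = (row.get("metadata") or {}).get("error_information") or {}
    return row.get("error_class") or info.get("error_class") or "unknown"


def summarize_failures(logs: list) -> dict:
    """Count failed rows by error class and by model."""
    failures = [row for row in logs if row.get("status") == "failure"]
    by_class = Counter(_error_class(row) for row in failures)
    by_model = Counter(row.get("model") or "unknown" for row in failures)
    return {
        "total_failures": len(failures),
        "by_error_class": [
            {"error_class": name, "count": count}
            for name, count in sorted(by_class.items())
        ],
        "by_model": [
            {"model": name, "count": count}
            for name, count in sorted(by_model.items())
        ],
    }


def _row(error_class: str) -> dict:
    # Synthetic row with the same nesting as the failed logs in this report.
    return {
        "status": "failure",
        "model": "openrouter/qwen/qwen3.5-27b",
        "error_class": None,
        "metadata": {"error_information": {"error_class": error_class}},
    }


logs = [_row("BudgetExceededError") for _ in range(13)] + [_row("APIError") for _ in range(2)]
summary = summarize_failures(logs)
print(summary)
```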

Impact

Production requests from the dev virtual key were rejected with HTTP 429 despite current key/team spend being below budget, causing intermittent service failures and confusing failure logs.
