litellm - ✅(Solved) Fix [Bug] custom_llm_provider not propagated to budget_limiter.async_log_success_event for /v1/messages + /v1/embeddings [1 pull requests, 1 participants]

GitHub stats
BerriAI/litellm#26701 · Fetched 2026-04-29 06:12:38
Comments: 0 · Participants: 1 · Timeline: 2 (cross-referenced ×1, labeled ×1) · Reactions: 0

Error Message

When provider_budget_config is enabled (e.g., anthropic: 5.0/24h), every call to /v1/messages (Anthropic format) and /v1/embeddings triggers a ValueError in the budget-limiter callback. Calls succeed (200 OK), but stderr floods with traceback from budget_limiter.async_log_success_event.

Fix Action

Workaround

Disable provider_budget_config entirely. This loses the hard-cap protection but stops the spam. We replaced it with a Prometheus alert on litellm_spend_metric_total as a soft-warning fallback.

PR fix notes

PR #553: Add LiteLLM section to Other group (3 alerting rules)

Description (problem / solution / changelog)

Context

LiteLLM is a widely-used LLM-gateway/proxy that exposes Prometheus metrics via its built-in callback. Currently there's no LiteLLM section in this repo, despite its adoption as an OpenAI/Anthropic-compatible proxy in many production stacks.

What this PR adds

3 alerting rules under a new LiteLLM service in the Other group:

  1. LiteLLM provider spend over budget — soft-warning on cumulative 24h spend per model-name regex. Useful when LiteLLM's native provider_budget_config hard-cap is unavailable, disabled, or buggy (we hit such a bug, see BerriAI/litellm#26701).

  2. LiteLLM proxy failed requests rate high — error-rate ratio alert for downstream LLM provider availability/auth issues.

  3. LiteLLM request latency p95 high — histogram-quantile alert for downstream provider response-time degradation.

Validation

  • All 3 rules: promtool check rules returns SUCCESS: 3 rules found.
  • Validated on a real LiteLLM v1.83.7 production deployment.
  • The spend rule (AnthropicSpend24hOverBudget in our deployment) was end-to-end tested via real haiku-call → alert fires → Telegram-routed → resolved post-revert.

Notes for reviewers

  • The (claude-|anthropic/).* regex in the spend-rule example is just one provider-pattern; users will customize for their own providers (openai-, gpt-, gemini-, etc.). The description explicitly notes this.
  • The spend-counter has a known first-value-problem on brand-new series (PromQL increase() needs ≥2 datapoints with growth-difference). Documented in the rule's comments: field.
  • All 4 metrics used by the 3 rules (litellm_spend_metric_total, litellm_proxy_failed_requests_metric_total, litellm_proxy_total_requests_metric_total, litellm_request_total_latency_metric_bucket) are exposed by LiteLLM's built-in prometheus callback (no separate exporter needed).
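
The first-value caveat on the spend counter can be illustrated with a simplified model of increase(). This is only a sketch: simple_increase is a hypothetical helper, it ignores the extrapolation and counter-reset handling that real Prometheus performs, and the sample values are invented.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, float]  # (unix_timestamp, counter_value)


def simple_increase(samples: Sequence[Sample]) -> Optional[float]:
    """Return last - first for the samples inside one query window.

    Simplified stand-in for PromQL increase(): real Prometheus also
    extrapolates to the window edges and corrects for counter resets.
    """
    if len(samples) < 2:
        # Fewer than two samples means there is no difference to compute,
        # so a brand-new series yields no result and the alert cannot fire.
        return None
    return samples[-1][1] - samples[0][1]


# A series that first appears with some spend already on the counter.
new_series = [(1000.0, 4.2)]
# The same series after two more scrapes.
grown_series = [(1000.0, 4.2), (1060.0, 4.9), (1120.0, 5.6)]

print(simple_increase(new_series))
print(simple_increase(grown_series))
```

In this model the 4.2 that is already on the counter when the series first appears is never counted as growth, which is why the rule can under-report spend on a fresh series.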

Reference

Changed files

  • _data/rules.yml (modified, +25/-0)


Bug

When provider_budget_config is enabled (e.g., anthropic: 5.0/24h), every call to /v1/messages (Anthropic format) and /v1/embeddings triggers a ValueError in the budget-limiter callback. Calls succeed (200 OK), but stderr floods with traceback from budget_limiter.async_log_success_event.

Reproducer

LiteLLM v1.83.7, config:

litellm_settings:
  callbacks: ["prometheus"]

provider_budget_config:
  anthropic:
    budget_limit: 5.0
    time_period: "24h"

model_list:
  - model_name: claude-haiku-4-5-direct-anthropic
    litellm_params:
      model: anthropic/claude-haiku-4-5-20251001
      api_key: os.environ/ANTHROPIC_API_KEY

Then call:

curl -X POST http://litellm:4000/v1/messages \
  -H "x-api-key: $LITELLM_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-haiku-4-5-direct-anthropic","max_tokens":10,"messages":[{"role":"user","content":"hi"}]}'

Returns 200 + valid response, BUT stderr emits ValueError from router_strategy/budget_limiter.py complaining about missing custom_llm_provider in the kwargs/data dict.

Same behavior for /v1/embeddings calls.
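
For anyone who prefers to script the reproducer, here is a rough Python equivalent of the curl call using only the standard library. The helper name build_messages_request and the placeholder key are assumptions for illustration; the URL, headers and payload are copied from the curl command above. An /v1/embeddings variant would need its own model name and payload, which the report does not show.

```python
import json
import urllib.request


def build_messages_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/messages call from the reproducer."""
    payload = {
        "model": "claude-haiku-4-5-direct-anthropic",
        "max_tokens": 10,
        "messages": [{"role": "user", "content": "hi"}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_messages_request("http://litellm:4000", "sk-placeholder")
print(req.get_method(), req.full_url)

# To send it against a running proxy and then check stderr for the traceback:
#     with urllib.request.urlopen(req) as resp:
#         print(resp.status)
```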

Frequency

In our production deployment: 306 ValueError tracebacks / 2 hours during normal operation (call volume ~100 req/h split between /v1/messages and /v1/embeddings).

Workaround

Disable provider_budget_config entirely. This loses the hard-cap protection but stops the spam. We replaced it with a Prometheus alert on litellm_spend_metric_total as a soft-warning fallback.

Root-cause hypothesis

The provider_budget_config callback async_log_success_event reads custom_llm_provider from data (or kwargs), but the request-routing layer for /v1/messages and /v1/embeddings does NOT inject custom_llm_provider into the kwargs the way /v1/chat/completions does. We tried adding custom_llm_provider: under litellm_params: in YAML, but it had no effect (LiteLLM reads data.get at deployment-top-level, not from litellm_params).
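
If the hypothesis holds, one conceivable fallback is to derive the provider from the provider/ prefix of the configured model string, since the deployment above uses anthropic/claude-haiku-4-5-20251001. The sketch below is not LiteLLM code: provider_from_model is a hypothetical helper, and whether the prefix is a safe source for budget accounting is an assumption that maintainers would need to confirm.

```python
from typing import Optional


def provider_from_model(model: str) -> Optional[str]:
    """Return the text before the first "/" in a model string, if any.

    Hypothetical fallback: assumes the deployment's model is written as
    "<provider>/<model-name>", as in the reproducer config.
    """
    prefix, sep, rest = model.partition("/")
    if not sep or not prefix or not rest:
        # No prefix, or an empty side of the slash: nothing to infer.
        return None
    return prefix


print(provider_from_model("anthropic/claude-haiku-4-5-20251001"))
print(provider_from_model("claude-haiku-4-5-direct-anthropic"))
```

The second call shows why the public model_name alone is not enough: it carries no provider/ prefix, so the lookup would have to use the model value under litellm_params.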

Distinct from existing issues

  • #24770 (UI lets model-names without provider/-prefix → budget tracking fails) — our model-config has the anthropic/-prefix correctly, bug appears on every call regardless of UI involvement.
  • #4849 (counter resets on restart) — different problem.
  • #19929 (counter +2 instead of +1) — different problem.
  • #17415 (Bedrock metrics not updating) — different problem.

Environment

  • LiteLLM proxy v1.83.7 (latest stable)
  • Deployment via systemd on Ubuntu 24.04
  • Python 3.12
  • Anthropic provider via anthropic/ prefix routing

extent analysis

TL;DR

The most likely fix is to modify the budget_limiter.async_log_success_event callback to handle cases where custom_llm_provider is missing from the kwargs.

Guidance

  • Verify that the custom_llm_provider key is indeed missing from the kwargs passed to budget_limiter.async_log_success_event by adding a debug log statement before the line that raises the ValueError.
  • Check the request-routing layer for /v1/messages and /v1/embeddings to see why custom_llm_provider is not being injected into the kwargs, and modify it to include this key if necessary.
  • Consider adding a default value or a fallback mechanism in budget_limiter.async_log_success_event to handle cases where custom_llm_provider is missing.
  • Review the litellm_params configuration to ensure that custom_llm_provider is not being overridden or ignored.
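
The first guidance point (confirming where the key is missing) could be done with a small debug helper like the one below. It is a sketch under assumptions: provider_locations and log_provider_locations are hypothetical names, and the two locations checked (top-level kwargs and a nested litellm_params dict) are guesses at where the key might live, not a statement of LiteLLM's internal layout.

```python
import logging
from typing import Any, Dict

logger = logging.getLogger("budget_limiter_debug")


def provider_locations(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Report where custom_llm_provider appears in a callback's kwargs."""
    litellm_params = kwargs.get("litellm_params") or {}
    return {
        "kwargs": kwargs.get("custom_llm_provider"),
        "kwargs.litellm_params": litellm_params.get("custom_llm_provider"),
    }


def log_provider_locations(kwargs: Dict[str, Any]) -> None:
    """Emit one debug line per call; safe to run before the raising line."""
    logger.debug(
        "custom_llm_provider lookup: %s (top-level keys: %s)",
        provider_locations(kwargs),
        sorted(kwargs),
    )


# Example with an invented kwargs dict that lacks the key everywhere.
sample_kwargs = {"model": "claude-haiku-4-5-direct-anthropic", "litellm_params": {}}
log_provider_locations(sample_kwargs)
print(provider_locations(sample_kwargs))
```

Comparing the logged output for a /v1/chat/completions call against a /v1/messages call should show whether the key is absent or merely stored somewhere else.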

Example

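# In budget_limiter.py (illustrative sketch, not the actual LiteLLM source)
import logging

logger = logging.getLogger(__name__)

# Callback hooks are coroutines and receive kwargs as a plain dict argument.
async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
    custom_llm_provider = kwargs.get("custom_llm_provider")
    if custom_llm_provider is None:
        # Provider is missing (seen on /v1/messages and /v1/embeddings):
        # warn and skip the budget update instead of raising ValueError.
        logger.warning("custom_llm_provider is missing from kwargs; skipping budget update")
        return
    # Rest of the function remains the same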

Notes

The provided workaround of disabling provider_budget_config entirely may not be desirable as it loses the hard-cap protection. The suggested modifications to budget_limiter.async_log_success_event should be tested thoroughly to ensure they do not introduce any new issues.

Recommendation

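Modify the budget_limiter.async_log_success_event callback to tolerate a missing custom_llm_provider key. This is more targeted than disabling provider_budget_config, but it only silences the traceback; the root cause (the routing layer for /v1/messages and /v1/embeddings not injecting the key) still needs its own fix if spend on those endpoints is to count against the budget.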
