litellm - 💡(How to fix) Fix Tag-budget enforcement silently skipped on x-litellm-tags header path

litellm · 2026-05-08 17:48:58

When a client passes tags via the documented x-litellm-tags: <tag> HTTP header (and the issuing key has metadata.allow_client_tags: true), the per-tag budget gate in _tag_max_budget_check is not enforced, even after the tag's accumulated spend exceeds max_budget. Requests are silently allowed through with HTTP 200. Passing the same tags in the request body ({"tags": [...]} or {"metadata": {"tags": [...]}}) works correctly and returns HTTP 400 budget_exceeded.

Error Message

Expected: HTTP 400 with {"error": {"type": "budget_exceeded", "message": "Budget has been exceeded! Tag=tenant:acme ...", ...}}

Actual: HTTP 200, request processed normally.

Root Cause

The auth chain at litellm/proxy/auth/user_api_key_auth.py calls common_checks which calls _tag_max_budget_check (litellm/proxy/auth/auth_checks.py:3487). That function calls get_tags_from_request_body(request_body) (litellm/proxy/common_utils/http_parsing_utils.py:418), which only reads request_body["tags"] and request_body["metadata"]["tags"].

The x-litellm-tags header is parsed and merged into data["metadata"]["tags"] by LiteLLMProxyRequestSetup.add_request_tag_to_metadata (litellm/proxy/litellm_pre_call_utils.py:872), but that runs inside add_litellm_data_to_request, after the auth chain has already completed.

Result: on header-tagged requests, get_tags_from_request_body returns [], the tag-budget loop iterates over nothing, and _tag_max_budget_check returns silently. Spend tracking is unaffected (it runs post-call), but enforcement is bypassed.
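The ordering problem can be reproduced in isolation. The sketch below is a simplified model, not LiteLLM code: get_tags_from_body, merge_header_tags, and handle_request are hypothetical stand-ins for the functions named above, and they only mimic the behaviour this report describes.

```python
from typing import Dict, List


def get_tags_from_body(body: dict) -> List[str]:
    """Stand-in for the body-only tag lookup described in this report."""
    tags = list(body.get("tags") or [])
    metadata = body.get("metadata") or {}
    tags.extend(metadata.get("tags") or [])
    return tags


def merge_header_tags(body: dict, headers: Dict[str, str]) -> None:
    """Stand-in for the later step that copies header tags into the body."""
    raw = headers.get("x-litellm-tags", "")
    header_tags = [t.strip() for t in raw.split(",") if t.strip()]
    if header_tags:
        body.setdefault("metadata", {}).setdefault("tags", []).extend(header_tags)


def handle_request(body: dict, headers: Dict[str, str]) -> List[str]:
    """Return the tags the budget gate sees when it runs before the merge."""
    tags_seen_by_gate = get_tags_from_body(body)  # auth chain runs first
    merge_header_tags(body, headers)              # header merge runs afterwards
    return tags_seen_by_gate


header_request = handle_request({"model": "m"}, {"x-litellm-tags": "tenant:acme"})
body_request = handle_request({"model": "m", "tags": ["tenant:acme"]}, {})
print(header_request)  # []
print(body_request)    # ['tenant:acme']
```

The header-tagged request reaches the gate with an empty tag list, so there is nothing to check, while the body-tagged request is seen as expected.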

Reproduction

Tested on v1.83.14-stable.patch.2 (upstream image, fresh docker-compose with bundled Postgres + Redis).

1. Generate a virtual key (used below as TEST_KEY):

curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer ${MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"metadata": {"tags": ["caller-id:probe"], "allow_client_tags": true}}'

2. Create a budget and a tag bound to it:

curl -X POST http://localhost:4000/budget/new -H "Authorization: Bearer ${MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"max_budget": 0.10, "budget_duration": "1d"}'
# → returns budget_id=B
curl -X POST http://localhost:4000/tag/new -H "Authorization: Bearer ${MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"name": "tenant:acme", "budget_id": "B"}'

3. Drive enough traffic with header-passed tags to exceed $0.10:

curl -X POST http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer ${TEST_KEY}" \
  -H "Content-Type: application/json" \
  -H "x-litellm-tags: tenant:acme" \
  -d '{"model": "...", "messages": [...]}'

4. After spend (visible via /spend/tags) exceeds max_budget, send another request with the same key and header.

Expected: HTTP 400 with {"error": {"type": "budget_exceeded", "message": "Budget has been exceeded! Tag=tenant:acme ...", ...}}

Actual: HTTP 200, request processed normally. LiteLLM_TagTable.spend continues to grow unbounded.

If you change step 3 to send tags in the body (-d '{"tags": ["tenant:acme"], ...}'), step 4 correctly returns HTTP 400. So the enforcement function works; it just isn't seeing the header tags.
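For scripting the comparison, the two transports differ only in where the tags travel. The following is a minimal sketch using only the Python standard library; build_chat_request is a hypothetical helper, the model name is a placeholder, and the requests are built but deliberately not sent.

```python
import json
import urllib.request
from typing import List


def build_chat_request(base_url: str, api_key: str, payload: dict,
                       tags: List[str], via_header: bool) -> urllib.request.Request:
    """Build a chat-completions request carrying tags in the header or the body."""
    body = dict(payload)
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if via_header:
        headers["x-litellm-tags"] = ",".join(tags)  # header path (not enforced)
    else:
        body["tags"] = list(tags)                   # body path (enforced)
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


payload = {"model": "placeholder-model", "messages": []}
header_variant = build_chat_request(
    "http://localhost:4000", "sk-test", payload, ["tenant:acme"], via_header=True
)
body_variant = build_chat_request(
    "http://localhost:4000", "sk-test", payload, ["tenant:acme"], via_header=False
)
```

Passing either object to urllib.request.urlopen against a running proxy would send it; per this report, only the body variant is rejected once the tag is over budget.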

Suggested fix

Two equivalent approaches:

Option A (minimal): read the x-litellm-tags header directly in _tag_max_budget_check and union with body tags before evaluating. Keeps the existing call chain.

Option B (cleaner): move add_request_tag_to_metadata (or a header-only variant of it) to run before common_checks, so request_body["metadata"]["tags"] is already merged when budget gates fire.
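Option B could look roughly like the sketch below. It is not LiteLLM's implementation: merge_header_tags_before_checks is a hypothetical name, the key_metadata dict is an assumed shape for the key's metadata, and gating on allow_client_tags is an assumption based on the precondition stated in this report.

```python
import copy
from typing import Mapping


def merge_header_tags_before_checks(body: dict, headers: Mapping[str, str],
                                    key_metadata: dict) -> dict:
    """Return a copy of the body with header tags merged into metadata.tags."""
    merged = copy.deepcopy(body)
    # Assumption: header tags only count when the key allows client tags.
    if not key_metadata.get("allow_client_tags"):
        return merged
    # HTTP header names are case-insensitive, so match on the lowered name.
    raw = next((v for k, v in headers.items() if k.lower() == "x-litellm-tags"), "")
    header_tags = [t.strip() for t in raw.split(",") if t.strip()]
    if not header_tags:
        return merged
    metadata = merged.get("metadata") or {}
    existing = list(metadata.get("tags") or [])
    # Append header tags that are not already present, keeping order stable.
    metadata["tags"] = existing + [t for t in header_tags if t not in existing]
    merged["metadata"] = metadata
    return merged
```

Running this step before the budget gate means the existing body-only lookup sees header tags without any change to the gate itself.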

Option A localizes the change to one function and one file. Sample:

async def _tag_max_budget_check(
    request_body: dict,
    request_headers: Optional[dict],   # <-- new
    prisma_client: ...,
    ...
):
    tags = get_tags_from_request_body(request_body=request_body)
    if request_headers and request_headers.get("x-litellm-tags"):
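        # Skip empty segments such as those left by a trailing comma.
        header_tags = [t.strip() for t in request_headers["x-litellm-tags"].split(",") if t.strip()]
        tags = list(dict.fromkeys([*tags, *header_tags]))  # union, de-duplicated, order preserved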
    if not tags:
        return
    # ... existing loop

The caller in common_checks already has access to the request, so threading the headers through is a local change.

Test gap

tests/local_testing/test_router_budget_limiter.py::test_tag_budgets_e2e_test_expect_to_fail is the only enforcement test for tag budgets, and it passes tags via metadata={"tags": [...]} (body path), which is why CI didn't catch this. A header-path test should be added alongside the fix.
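A header-path test might take the following shape. This is a self-contained sketch against a stand-in gate, not the real test or the real _tag_max_budget_check: tag_budget_gate, BudgetExceeded, and the in-memory spend and budget dicts are all hypothetical.

```python
import asyncio
from typing import Dict, Optional


class BudgetExceeded(Exception):
    """Hypothetical stand-in for the proxy's budget_exceeded error."""


async def tag_budget_gate(body: dict, headers: Optional[Dict[str, str]],
                          spend: Dict[str, float], max_budget: Dict[str, float]) -> None:
    """Stand-in gate that reads tags from both the body and the header."""
    tags = list(body.get("tags") or [])
    tags += list((body.get("metadata") or {}).get("tags") or [])
    raw = (headers or {}).get("x-litellm-tags", "")
    tags += [t.strip() for t in raw.split(",") if t.strip()]
    for tag in dict.fromkeys(tags):  # de-duplicate, keep order
        limit = max_budget.get(tag)
        if limit is not None and spend.get(tag, 0.0) > limit:
            raise BudgetExceeded(f"Budget has been exceeded! Tag={tag}")


def gate_rejects(body: dict, headers: Optional[Dict[str, str]]) -> bool:
    """Run the gate with a tag already over its budget and report rejection."""
    spend = {"tenant:acme": 0.25}
    max_budget = {"tenant:acme": 0.10}
    try:
        asyncio.run(tag_budget_gate(body, headers, spend, max_budget))
    except BudgetExceeded:
        return True
    return False


def test_body_path_is_enforced():
    assert gate_rejects({"metadata": {"tags": ["tenant:acme"]}}, None)


def test_header_path_is_enforced():
    assert gate_rejects({}, {"x-litellm-tags": "tenant:acme"})


def test_untagged_request_passes():
    assert not gate_rejects({}, {})
```

In the real suite the second test would be the new one: same budget setup as the existing body-path test, with the tag supplied only through the header.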

Severity

High for any deployment that uses x-litellm-tags for tenant/customer identification (the documented pattern in the proxy docs). Tag budgets configured via that path silently fail open: spend grows unbounded and ops only finds out from billing.
