litellm - ✅(Solved) Fix [Bug]: Redis user_api_key_cache deserialization always fails for team-scoped keys (LiteLLM_UserTable.user_id required) [1 pull requests, 1 participants]

GitHub stats
BerriAI/litellm#27874, fetched 2026-05-14 03:30:02
Comments
0
Participants
1
Timeline
0
Reactions
0

After upgrading to a build that includes #26202 (the CacheCodec token-verification query optimization), the Redis-backed user_api_key_cache emits a continuous stream of ERROR-level deserialization failures for team-scoped virtual keys. Functionally requests still succeed (the codec correctly treats a ValidationError as a cache miss and falls through to the DB), but every team-scoped lookup pays an extra DB round-trip and the logs are very noisy.


PR fix notes

PR #26202: Litellm token verification query optimization

Description (problem / solution / changelog)

Relevant issues

Problem

In Kubernetes or any multi-replica deployment, per-process memory cache is not shared. Requests can hit different pods, so in-memory-only user_api_key_cache does not deduplicate work across nodes and can yield inconsistent views of the same key.

Using Redis with DualCache shares cached key metadata across replicas and reduces redundant database lookups, provided that serialized values are JSON-safe on write and that reads rehydrate to the types the proxy expects (e.g. UserAPIKeyAuth).

The problem is amplified when each pod runs multiple workers. Each worker is a separate subprocess, so workers do not share memory, and the number of DB calls in this path scales as num_pods × num_workers_in_each_pod × cache_miss_ratio.
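As a rough illustration of that scaling, the sketch below simply evaluates the product for a sample deployment. The function name and the sample numbers are hypothetical and are not taken from the PR.

```python
def expected_db_lookups(num_pods: int, workers_per_pod: int, cache_miss_ratio: float) -> float:
    """Evaluate num_pods x workers_per_pod x cache_miss_ratio.

    Every worker process holds its own in-memory cache, so the same key
    can miss independently in each worker of each pod.
    """
    if not 0.0 <= cache_miss_ratio <= 1.0:
        raise ValueError("cache_miss_ratio must be between 0 and 1")
    return num_pods * workers_per_pod * cache_miss_ratio


# Hypothetical deployment: 4 pods, 2 workers each, every worker cold for the key.
print(expected_db_lookups(num_pods=4, workers_per_pod=2, cache_miss_ratio=1.0))
```

With a shared cache in front of the database, only the first of those workers should need to reach the DB for a given key.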

Without a single encode/decode rule, Redis often returns a dict after JSON round-trip while memory may hold BaseModel instances. Code that assumes attributes (e.g. .spend) then fails with AttributeError, and spend/cache updates become fragile.
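The dict-versus-model mismatch can be seen in a few lines. This is a minimal sketch assuming Pydantic v2; `CachedKey` is a hypothetical stand-in for a cached auth object, not a LiteLLM class, and the JSON round-trip stands in for a Redis write followed by a read.

```python
import json

from pydantic import BaseModel


class CachedKey(BaseModel):  # hypothetical stand-in, not a LiteLLM model
    token: str
    spend: float = 0.0


in_memory_hit = CachedKey(token="abc", spend=1.5)

# A JSON-backed store hands back a plain dict after a write/read cycle.
redis_like_hit = json.loads(json.dumps(in_memory_hit.model_dump(mode="json")))

print(in_memory_hit.spend)  # attribute access works on the model
try:
    redis_like_hit.spend  # a dict has no .spend attribute
except AttributeError as exc:
    print(f"AttributeError: {exc}")

# Rehydrating restores the type that attribute-based code expects.
rehydrated = CachedKey.model_validate(redis_like_hit)
```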

Approach

CacheCodec (litellm/proxy/common_utils/cache_pydantic_utils.py) centralizes the boundary:

  • serialize: validate + model_dump(mode="json", exclude_none=True) when writing (aligned with Redis / json.dumps).
  • deserialize: model_validate on dict hits; ValidationError → treat as miss (None) + warning log (no bad data served).

Scope note: CacheCodec is for Pydantic BaseModel + dict only; stdlib dataclasses are not supported—convert at the call site (e.g. dataclasses.asdict) or use a Pydantic model.
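The encode/decode boundary described above might look roughly like the following. This is a sketch under stated assumptions (Pydantic v2, module-level functions, a hypothetical `TeamInfo` model and logger name); it is not the code in cache_pydantic_utils.py, whose exact signatures are not shown here.

```python
import logging
from typing import Any, Optional, Type, TypeVar

from pydantic import BaseModel, ValidationError

logger = logging.getLogger("cache_codec_sketch")  # hypothetical logger name
T = TypeVar("T", bound=BaseModel)


def serialize(value: Any, model_type: Type[T]) -> dict:
    """Validate dict input, then dump to a JSON-safe dict without None fields."""
    model = value if isinstance(value, model_type) else model_type.model_validate(value)
    return model.model_dump(mode="json", exclude_none=True)


def deserialize(cached: Any, model_type: Type[T]) -> Optional[T]:
    """Return a model on a valid hit, or None so the caller treats it as a miss."""
    if cached is None:
        return None
    if isinstance(cached, model_type):
        return cached
    try:
        return model_type.model_validate(cached)
    except ValidationError as exc:
        logger.warning("validation failed for %s: %s", model_type.__name__, exc)
        return None


class TeamInfo(BaseModel):  # hypothetical model used only for this example
    team_id: str
    spend: float = 0.0
    team_alias: Optional[str] = None


payload = serialize(TeamInfo(team_id="t1"), TeamInfo)
good_hit = deserialize(payload, TeamInfo)
bad_hit = deserialize({"spend": 1.0}, TeamInfo)  # team_id missing: treated as a miss
```

Returning None on a validation failure keeps bad data out of the auth path, at the cost of a fallback lookup whenever the cached payload no longer matches the model.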


Changes

Code changes (summary)

| Area | Change |
| --- | --- |
| cache_pydantic_utils.py | CacheCodec.serialize / deserialize, validation-on-read behavior, dataclass disclaimer in docstrings |
| proxy_server.py | _update_key_cache: deserialize before attribute access; serialize(..., UserAPIKeyAuth) for pipeline writes |
| auth_checks.py (and related) | Use CacheCodec when reading/writing cached key/team/project objects |
| Caching / auth guards | Avoid None Redis keys and noisy errors where payloads are missing (e.g. team id / team object) |
| Tests | tests/test_litellm/proxy/common_utils/test_cache_codec.py |

Testing

| Status | Item |
| --- | --- |
| Done | Unit tests for CacheCodec in tests/test_litellm/ |
| Done | End-to-end: Proxy with Redis-backed user_api_key_cache, multiple replicas (or workers), verify auth + spend paths, no dict vs model errors in logs |
| Done | One efficacy check: prove the Redis round-trip by writing a JSON-safe payload, reading it from another process / cold memory, and asserting that behavior matches the single-node baseline (e.g. key spend update or get_key_object) |

Load Test (4 Pods)

Without Redis — combined view only (SELECT v.* … FROM "LiteLLM_VerificationToken" AS v …)

user_api_key_cache in-memory TTL: 60 seconds (UserAPIKeyCacheTTLEnum; no general_settings.user_api_key_cache_ttl override unless you set one).

| Window | calls | total_ms | avg_ms | rows |
| --- | --- | --- | --- | --- |
| 5 min | 1249 | 269.35 | 0.22 | 1249 |
| 10 min | 1858 | 394.99 | 0.21 | 1858 |

With Redis — combined view only (same query)

With Redis enabled, the same default 60s TTL is applied to both in-memory and Redis when user_api_key_cache_ttl is not set (proxy_server mirrors TTL on both layers).

| Window | calls | total_ms | avg_ms | rows |
| --- | --- | --- | --- | --- |
| 5 min | 52 | 35.36 | 0.68 | 51 |
| 10 min | 93 | 35.36 | 0.68 | 51 |
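To put the two measurements side by side, the sketch below computes the fraction of query executions avoided. It assumes the call counts read as 1249 and 1858 without Redis and 52 and 93 with Redis; the helper name is hypothetical.

```python
def call_reduction(calls_without_redis: int, calls_with_redis: int) -> float:
    """Fraction of query executions avoided when the shared cache is enabled."""
    return 1 - calls_with_redis / calls_without_redis


# Call counts as read from the two load-test tables (5 min and 10 min windows).
for window, before, after in [("5 min", 1249, 52), ("10 min", 1858, 93)]:
    print(f"{window}: {call_reduction(before, after):.1%} fewer calls")
```

Under that reading, both windows show roughly 95% fewer executions of the combined-view query.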

Locust Results

  • 15 different virtual keys
  • 200 concurrent requests, with a 10 second ramp up
  • The left is without the redis cache and the right is with the cache
  • 8 instances / pods of litellm run
<img width="3016" height="1819" alt="image" src="https://github.com/user-attachments/assets/580b1d4a-9531-478e-8161-2f1eb5c356c9" />

Changed files

  • .gitignore (modified, +2/-2)
  • litellm/caching/dual_cache.py (modified, +19/-0)
  • litellm/caching/redis_cache.py (modified, +9/-1)
  • litellm/integrations/prometheus.py (modified, +1/-0)
  • litellm/proxy/auth/auth_checks.py (modified, +165/-133)
  • litellm/proxy/auth/handle_jwt.py (modified, +16/-12)
  • litellm/proxy/auth/user_api_key_auth.py (modified, +35/-23)
  • litellm/proxy/common_utils/cache_coordinator.py (modified, +22/-3)
  • litellm/proxy/common_utils/cache_pydantic_utils.py (added, +93/-0)
  • litellm/proxy/common_utils/expired_ui_session_key_cleanup_manager.py (modified, +2/-2)
  • litellm/proxy/common_utils/user_api_key_cache.py (added, +162/-0)
  • litellm/proxy/management_endpoints/access_group_endpoints.py (modified, +9/-11)
  • litellm/proxy/management_endpoints/key_management_endpoints.py (modified, +10/-10)
  • litellm/proxy/management_endpoints/ui_sso.py (modified, +11/-10)
  • litellm/proxy/management_helpers/team_member_permission_checks.py (modified, +2/-2)
  • litellm/proxy/proxy_server.py (modified, +124/-81)
  • litellm/proxy/utils.py (modified, +3/-2)
  • tests/proxy_unit_tests/test_auth_checks.py (modified, +9/-2)
  • tests/proxy_unit_tests/test_user_api_key_auth.py (modified, +6/-1)
  • tests/test_litellm/caching/test_dual_cache.py (modified, +70/-0)
  • tests/test_litellm/proxy/auth/test_auth_checks.py (modified, +19/-17)
  • tests/test_litellm/proxy/auth/test_handle_jwt.py (modified, +6/-4)
  • tests/test_litellm/proxy/common_utils/test_cache_codec.py (added, +126/-0)
  • tests/test_litellm/proxy/common_utils/test_user_api_key_cache.py (added, +219/-0)
  • tests/test_litellm/proxy/management_endpoints/test_access_group_endpoints.py (modified, +43/-14)
  • tests/test_litellm/proxy/management_endpoints/test_team_endpoints.py (modified, +3/-0)
  • tests/test_litellm/proxy/test_redis_auth_cache_flag.py (added, +145/-0)


Version

litellm/litellm-database:1.84.0-rc.1

Symptom

Repeating pair of log lines, once per team-scoped request:

WARNING cache_pydantic_utils.py:87
  CacheCodec.deserialize: validation failed for LiteLLM_UserTable
  (1 validation error for LiteLLM_UserTable
   user_id
     Field required [type=missing, input_value={'team_alias': '<uuid>', ..., 'teams': []}, input_type=dict])

ERROR user_api_key_cache.py:126
  UserApiKeyCache.async_get_cache failed to deserialize cached value
  for key='<uuid>' model_type=LiteLLM_UserTable

The input_value snippet shows team_alias is set but user_id is absent. The same handful of UUIDs cycles in the logs — these are our team-scoped virtual keys (no associated user).

Root cause (likely)

Inconsistency introduced by CacheCodec (litellm/proxy/common_utils/cache_pydantic_utils.py):

  • Writer: CacheCodec.serialize uses model_dump(mode="json", exclude_none=True).
  • Reader: CacheCodec.deserialize uses model_validate, and LiteLLM_UserTable.user_id is declared required.

For any cached LiteLLM_UserTable instance where user_id is None — i.e. team-scoped API keys with no associated user — the writer strips the field via exclude_none=True, then the reader rejects the cached payload because user_id is required. The entry is guaranteed to fail deserialization on every read, for the lifetime of the key.
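The writer/reader mismatch can be reproduced in isolation. The sketch below assumes Pydantic v2 and a hypothetical `UserRowSketch` model whose `user_id` is required but nullable; the actual LiteLLM_UserTable definition is not shown in this issue, so the field declarations here are assumptions.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


class UserRowSketch(BaseModel):  # hypothetical, not the real LiteLLM_UserTable
    user_id: Optional[str]  # no default: required, though None is accepted
    team_alias: Optional[str] = None
    teams: List[str] = Field(default_factory=list)


row = UserRowSketch(user_id=None, team_alias="team-a")

# Writer side: None-valued fields are dropped, so the user_id key disappears.
payload = row.model_dump(mode="json", exclude_none=True)
print(payload)

# Reader side: the missing key fails the required-field check on every read.
try:
    UserRowSketch.model_validate(payload)
    error_types = []
except ValidationError as exc:
    error_types = [err["type"] for err in exc.errors()]
print(error_types)
```

The reported error type is `missing`, which matches the `Field required [type=missing, ...]` text in the log excerpt.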

Reproduction

  1. Deploy LiteLLM proxy ≥ the build containing #26202 with Redis-backed user_api_key_cache (multi-replica).
  2. Create a virtual key scoped to a team with no user (so the underlying LiteLLM_UserTable row has user_id = NULL).
  3. Send requests using that key.
  4. Observe the ERROR/WARNING pair above repeating on every cache lookup.

Impact

  • Functional: none. Requests succeed; the codec treats the validation failure as a miss.
  • Operational: every team-scoped auth check round-trips to Postgres instead of being served from Redis — defeats the optimization #26202 was designed to deliver for exactly this multi-replica scenario.
  • Observability: ERROR-level logs at high cardinality; pollutes log aggregation / alerting.

Suggested fixes (for discussion)

One of:

  1. Make LiteLLM_UserTable.user_id Optional[str] = None for the cache model (it can legitimately be None for team-scoped keys).
  2. Have CacheCodec.serialize drop exclude_none=True for models with required fields, so absent values round-trip as None rather than missing.
  3. Use model_dump(mode="json") without exclude_none for LiteLLM_UserTable specifically, or have the reader tolerate missing optional-by-semantics fields.
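The suggestions above can be compared with a small round-trip check. This is a sketch assuming Pydantic v2; `StrictRow` and `RelaxedRow` are hypothetical models standing in for the cached type, and neither is a proposed patch to LiteLLM.

```python
from typing import Optional

from pydantic import BaseModel


class StrictRow(BaseModel):  # user_id required but nullable, as in the failing case
    user_id: Optional[str]
    team_alias: Optional[str] = None


class RelaxedRow(BaseModel):  # option 1: give user_id a default of None
    user_id: Optional[str] = None
    team_alias: Optional[str] = None


# Option 1: the key may be stripped on write because the reader has a default.
relaxed = RelaxedRow(team_alias="team-a")
relaxed_payload = relaxed.model_dump(mode="json", exclude_none=True)
relaxed_back = RelaxedRow.model_validate(relaxed_payload)

# Options 2 and 3: keep None values in the payload so the required key is present.
strict = StrictRow(user_id=None, team_alias="team-a")
strict_payload = strict.model_dump(mode="json")
strict_back = StrictRow.model_validate(strict_payload)
```

Option 1 changes the model and keeps payloads small; options 2 and 3 leave the model alone and store explicit nulls instead.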

Configuration

general_settings:
  user_api_key_cache_ttl: 5

(Redis configured via REDIS_HOST / REDIS_PORT env vars; standard Bitnami redis chart deployed in-cluster.)
