litellm - 💡(How to fix) Fix [Bug]: Grace period — old API key rejected immediately despite row in LiteLLM_DeprecatedVerificationToken (Enterprise, proxy v1.82.3) [2 comments, 2 participants]

Error Message

Proxy log lines from kubectl logs … filtered for deprecated|grace|regenerate|rotation|error (timestamps from one run; pod name may differ):
[pod/.../litellm] {"timestamp":"...","level":"ERROR","fields":{"message":"Error in PostgreSQL connection: Error { kind: Closed, cause: None }"},"target":"quaint::connector::postgres"}
API response for /key/info?key=$OLD_KEY after regenerate (body):
{"error":{"message":"Key not found in database","type":"not_found_error","param":"key","code":"404"

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

A bug happened!

Edition: LiteLLM Enterprise. We need a clear fix or upgrade path.

Image: ghcr.io/berriai/litellm-database:main-v1.82.3-stable (mirrored to ECR).

Per Virtual Keys → Key Rotations, POST /key/{key}/regenerate with "grace_period": "…" should keep both the old and new virtual keys working until revoke_at.

Observed: Immediately after regenerate, requests using the pre-rotate sk-… fail with auth errors consistent with token_not_found_in_db / “Unable to find token in cache or LiteLLM_VerificationTokenTable”, while the new key works.

Database (self-hosted Postgres): After regenerate, SELECT token, active_token_id, revoke_at FROM "LiteLLM_DeprecatedVerificationToken" shows a row where:

token = SHA-256 of the deprecated (pre-rotate) key material
active_token_id = hash aligning with the new active key row
revoke_at = rotate time + grace (behaves as expected)

So the grace-period row is persisted; the mismatch appears to be between the stored row and the runtime auth path (lookup, cache, or ordering), not a missing table.
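One way to confirm which row belongs to a given key is to hash the plaintext key locally and look the digest up in the table. The sketch below assumes the token column holds the lowercase hex SHA-256 of the full key string (as described above), that sha256sum from GNU coreutils and psql are installed, and that a DATABASE_URL variable points at the proxy's Postgres. The names hash_key and lookup_deprecated are hypothetical helpers, not part of LiteLLM.

```shell
# Hypothetical helpers for inspecting the grace-period row; not part of LiteLLM.

# Print the lowercase hex SHA-256 digest of the exact key string.
# printf '%s' avoids hashing a trailing newline.
hash_key() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# Look up the grace-period row for a plaintext key.
# Assumption: DATABASE_URL is a Postgres connection string for the proxy database.
lookup_deprecated() {
  digest=$(hash_key "$1")
  psql "$DATABASE_URL" -c \
    "SELECT token, active_token_id, revoke_at, now() AS db_now
       FROM \"LiteLLM_DeprecatedVerificationToken\"
      WHERE token = '$digest';"
}
```

Running lookup_deprecated "$OLD_KEY" right after the regenerate call should return one row if the digest assumption holds; comparing revoke_at with db_now shows whether the grace window is still open from the database's point of view.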

Steps to Reproduce

Shell (values are redacted; use your own base URL, master key, and keys):

export DEV_URL="https://<litellm-host>"   # no trailing slash; the paths below start with /
export PROXY_MASTER_KEY='<redacted-master-key>'

1) Create a disposable virtual key

curl -sS -w "\nHTTP:%{http_code}\n" \
  -H "Authorization: Bearer $PROXY_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -X POST "$DEV_URL/key/generate" \
  -d '{"models":["openai-gpt-4.1-nano-2025-04-14"],"metadata":{"purpose":"rotation-test"},"key_alias":"rotation-thorough-litellm-investigation","max_budget":1.0}'

export OLD_KEY='sk-OLD_REDACTED'   # from response "key" field

2) Sanity: chat with OLD_KEY → HTTP 200

curl -sS -w "\nHTTP:%{http_code}\n" \
  -H "Authorization: Bearer $OLD_KEY" \
  -H "Content-Type: application/json" \
  "$DEV_URL/v1/chat/completions" \
  -d '{"model":"openai-gpt-4.1-nano-2025-04-14","messages":[{"role":"user","content":"Say hello"}],"max_tokens":10}'

3) Regenerate with grace (master auth)

curl -sS -w "\nHTTP:%{http_code}\n" \
  -H "Authorization: Bearer $PROXY_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -X POST "$DEV_URL/key/$OLD_KEY/regenerate" \
  -d '{"grace_period":"1h","metadata":{"purpose":"rotation-thorough-after"}}'
export NEW_KEY='sk-NEW_REDACTED'   # from response "key" field

4) Immediately: chat with OLD_KEY → 401 (unexpected per docs)

curl -sS -w "\nHTTP:%{http_code}\n" \
  -H "Authorization: Bearer $OLD_KEY" \
  -H "Content-Type: application/json" \
  "$DEV_URL/v1/chat/completions" \
  -d '{"model":"openai-gpt-4.1-nano-2025-04-14","messages":[{"role":"user","content":"Say hello"}],"max_tokens":10}'

5) Chat with NEW_KEY → 200

curl -sS -w "\nHTTP:%{http_code}\n" \
  -H "Authorization: Bearer $NEW_KEY" \
  -H "Content-Type: application/json" \
  "$DEV_URL/v1/chat/completions" \
  -d '{"model":"openai-gpt-4.1-nano-2025-04-14","messages":[{"role":"user","content":"Say hello"}],"max_tokens":10}'

6) /key/info for old key → 404 (observed in our run)

curl -sS "$DEV_URL/key/info?key=$OLD_KEY" -H "Authorization: Bearer $PROXY_MASTER_KEY"
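The expected and observed status codes in the steps above can be checked mechanically. The following is a minimal sketch that parses the HTTP:<code> trailer written by the -w option used in steps 1 to 5. The names http_code_of and check_step are hypothetical helpers introduced only for illustration.

```shell
# Hypothetical helpers for scoring the reproduction steps; not part of LiteLLM.

# Extract the status code from output produced by curl -w "\nHTTP:%{http_code}\n".
# If several trailer lines are present, the last one wins.
http_code_of() {
  printf '%s\n' "$1" | sed -n 's/^HTTP:\([0-9][0-9]*\)$/\1/p' | tail -n 1
}

# Print PASS or FAIL for one step; return non-zero when the code differs.
# Arguments: <label> <expected_code> <captured_curl_output>
check_step() {
  label=$1 expected=$2 output=$3
  actual=$(http_code_of "$output")
  if [ "$actual" = "$expected" ]; then
    echo "PASS $label: HTTP $actual"
  else
    echo "FAIL $label: expected HTTP $expected, got HTTP ${actual:-none}"
    return 1
  fi
}
```

To use it, capture the output of each curl call in a variable and pass it along with the status the documentation leads you to expect, for example check_step "old key during grace" 200 "$out" for step 4. With the behavior reported here, that check prints a FAIL line showing 401.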

Relevant log output

Proxy log lines from kubectl logs … filtered for deprecated|grace|regenerate|rotation|error (timestamps from one run; pod name may differ):
[pod/.../litellm] {"message": "Key regeneration requested: key_alias=rotation-thorough-litellm-investigation", ...}
[pod/.../litellm] {"message": "10.x.x.x:x - \"POST /key/sk-...REDACTED/regenerate HTTP/1.1\" 200", ...}
Unfiltered lines around the same window also showed transient DB connection errors from the Prisma stack (may be unrelated noise):
[pod/.../litellm] {"timestamp":"...","level":"ERROR","fields":{"message":"Error in PostgreSQL connection: Error { kind: Closed, cause: None }"},"target":"quaint::connector::postgres"}
API response for /key/info?key=$OLD_KEY after regenerate (body):
{"error":{"message":"Key not found in database","type":"not_found_error","param":"key","code":"404"
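The log filter described above can be reproduced with grep. This is a sketch only: the pattern is the one quoted in the report, but the -i flag is an assumption, since the report does not say whether matching was case-sensitive. The function name filter_rotation_logs is hypothetical.

```shell
# Hypothetical helper: keep only rotation-related lines from proxy logs on stdin.
# -E enables the | alternation; -i (an assumption) makes the match case-insensitive,
# so lines with "level":"ERROR" are kept as well.
filter_rotation_logs() {
  grep -Ei 'deprecated|grace|regenerate|rotation|error'
}
```

A typical invocation pipes the pod logs through it, for example kubectl logs <pod-name> --since=10m | filter_rotation_logs, where <pod-name> is a placeholder for your LiteLLM pod. Without -i, the Postgres connection errors shown above would not match, which fits the report listing them as unfiltered lines.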

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

v1.82.3 (main-v1.82.3-stable image)

extent analysis

TL;DR

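The root cause is not confirmed. Because the grace-period row is present in LiteLLM_DeprecatedVerificationToken, the likely fault is in the runtime auth path: the proxy's token lookup or authentication cache does not appear to honor the deprecated key after regeneration.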

Guidance

Confirm that revoke_at for the deprecated key is still in the future when compared against the database clock, so the grace period has not already elapsed.
Check whether the authentication cache rejects the old key (for example through a stale entry) before the deprecated-token table is consulted, which would explain the immediate 401 errors.
Review the database connection errors from the Prisma stack to determine whether they coincide with the failing requests or are unrelated noise.
Repeat the regeneration test with a longer grace period, retrying the old key at intervals, to see whether the failure persists for the whole window.
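The last check can be scripted by retrying the old key at fixed intervals. The sketch below is a generic polling loop plus a probe built from the chat request in the reproduction steps. It assumes DEV_URL and OLD_KEY are exported as shown there; poll_probe and old_key_ok are hypothetical names.

```shell
# Hypothetical helpers for retrying the old key during the grace window; not part of LiteLLM.

# Run a probe command repeatedly and report when it first succeeds.
# Usage: poll_probe <attempts> <sleep_seconds> <command> [args...]
poll_probe() {
  attempts=$1 delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      echo "succeeded on attempt $i"
      return 0
    fi
    # Sleep between attempts, but not after the final one.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  echo "still failing after $attempts attempts"
  return 1
}

# Probe: succeeds only when the old key receives HTTP 200 from the proxy.
# Assumption: DEV_URL and OLD_KEY are set as in the reproduction steps.
old_key_ok() {
  code=$(curl -sS -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $OLD_KEY" \
    -H "Content-Type: application/json" \
    "$DEV_URL/v1/chat/completions" \
    -d '{"model":"openai-gpt-4.1-nano-2025-04-14","messages":[{"role":"user","content":"Say hello"}],"max_tokens":10}')
  [ "$code" = "200" ]
}
```

For example, poll_probe 10 30 old_key_ok retries every 30 seconds for about five minutes. If the old key starts working after a delay, stale cache state becomes a more plausible explanation; if it never works inside the grace window, the lookup path is the more likely suspect. Neither outcome is conclusive on its own.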

Example

No code-level fix is available here: the fault appears to lie in the proxy's authentication flow and cache management, and correcting it requires changes inside the LiteLLM implementation.

Notes

The issue might be specific to the LiteLLM Proxy version (v1.82.3) and the self-hosted Postgres database configuration. Further investigation is needed to determine the root cause.

Recommendation

No confirmed fix is identified. Possible workarounds, none of them verified against v1.82.3, are to adjust the authentication cache settings (for example the cache expiration time) or to change the rotation procedure so that clients do not depend on the old key during the grace period.
