litellm - ✅(Solved) Fix [Bug]: previous_models in metadata leaks cross-request data and bloats spend logs [1 pull request, 1 participant]

GitHub stats
BerriAI/litellm#24965 · Fetched 2026-04-08 02:24:48
Comments: 0 · Participants: 1 · Timeline events: 2 · Reactions: 0
Timeline (top): cross-referenced ×1, labeled ×1

Error Message

previous_model = {
    "litellm_call_id": kwargs.get("litellm_call_id"),
    "litellm_trace_id": kwargs.get("litellm_trace_id"),
    "model": kwargs.get("model"),
    "exception_type": type(e).__name__,
    "exception_string": str(e)[:200],  # truncated
}

PR fix notes

PR #25040: Fix: previous_models leaking cross-request data and bloating spend logs

Description (problem / solution / changelog)

Fixes #24965

Description

This PR resolves two vulnerabilities in the Router tracking layer:

  1. Cross-request data leakage: Removed the instance-level self.previous_models list, which every request on the same Router shared. It was leaking dictionary references and sensitive API keys across parallel requests during concurrent failure logging.
  2. Metadata bloat: log_retry() now stores a small, targeted subset of fields (litellm_call_id, litellm_trace_id, model, the exception type, and an exception_string truncated to 200 characters) instead of dumping the entire kwargs payload into previous_models, which was adding megabytes of duplicated tool specifications.

Empirical Verification

To demonstrate the leak and the fix, we run a local script, test_issue_24965.py, that instantiates a Router and calls log_retry() directly for two independent failed requests, one after the other.

import json

from litellm import Router


def main():
    router = Router(model_list=[{"model_name": "test", "litellm_params": {"model": "gpt-3.5-turbo"}}])

    # Request 1: carries a secret and a large, repeated tool schema (the bloat).
    kwargs_1 = {
        "litellm_call_id": "call_1",
        "model": "test",
        "api_key": "USER_1_SECRET_KEY_123",
        "tools": [{"type": "function", "function": {"name": "large_tool"}}] * 100,
        "litellm_metadata": {"user": "user1"},
    }
    router.log_retry(kwargs_1, Exception("HTTP 429 RateLimit"))

    # Request 2: an unrelated request from a different user, failed right after request 1.
    kwargs_2 = {
        "litellm_call_id": "call_2",
        "model": "test",
        "api_key": "USER_2_SECRET_KEY_456",
        "litellm_metadata": {"user": "user2"},
    }
    out_2 = router.log_retry(kwargs_2, Exception("HTTP 500 Internal Error"))

    # Request 2's previous_models should mention only request 2's own failed attempt.
    prev_models = out_2.get("litellm_metadata", {}).get("previous_models", [])
    print(json.dumps(prev_models, indent=2))


if __name__ == "__main__":
    main()

Before Logs

$ python3 /tmp/test_issue_24965.py > test_output.log 2>&1
Simulating Request 1 Failure...
Simulating Request 2 Failure...

--- Output of Request 2 'previous_models' ---
[FAIL] LEAK DETECTED: Request 2 explicitly contains Request 1 'USER_1_SECRET_KEY_123'.
[FAIL] BLOAT DETECTED: Request 2 contains full copied 'tools' schema from Request 1.

After Logs

$ python3 /tmp/test_issue_24965.py > test_output.log 2>&1
Simulating Request 1 Failure...
Simulating Request 2 Failure...

--- Output of Request 2 'previous_models' ---
[PASS] No cross-request leakage detected.
[PASS] No massive tool kwargs bloat detected.

Request 2 previous_models keys & values (sans large tools):
[
  {
    "litellm_call_id": "call_2",
    "litellm_trace_id": null,
    "model": "test",
    "exception_type": "Exception",
    "exception_string": "HTTP 500 Internal Error"
  }
]

Changed files

  • litellm/integrations/websearch_interception/handler.py (modified, +10/-0)
  • litellm/proxy/guardrails/guardrail_hooks/aim/aim.py (modified, +1/-1)
  • litellm/router.py (modified, +25/-27)
  • litellm/types/proxy/guardrails/guardrail_hooks/javelin.py (modified, +6/-4)
  • tests/litellm/test_router_log_retry.py (added, +65/-0)
  • tests/test_litellm/integrations/websearch_interception/test_websearch_interception_handler.py (modified, +127/-0)


Bug

previous_models in the Router's request metadata has two issues:

1. Cross-request data leakage

self.previous_models is stored on the Router instance (router.py:544), not per-request. When request A fails and request B fails next, request B's previous_models contains request A's full metadata — including a different user's user_api_key_auth, end_user_id, tools, and extra_headers (which may contain OAuth bearer tokens).

Example: a request from user UVk9MB0JsREqR7tH1liddPw= with 10 tools had previous_models containing a completely different request from user U2UlBAuuRG0q_faub5t4xyA= with 19 tools and 3.2MB of base64 image data.
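
The mechanism can be reproduced without LiteLLM at all. The following is a minimal sketch, not the actual router.py code: SharedHistoryRouter and PerRequestHistoryRouter are hypothetical stand-ins, and the request fields are placeholders. It only shows why a list held on the instance exposes one request's data to the next, while a list held in the request's own metadata does not.

```python
import copy
import json


class SharedHistoryRouter:
    """Hypothetical stand-in for the buggy behavior: history lives on the instance."""

    def __init__(self):
        self.previous_models = []  # one list shared by every request

    def log_retry(self, kwargs, e):
        # Copy the whole request into the shared list, then expose that list.
        self.previous_models.append(copy.deepcopy(kwargs))
        kwargs.setdefault("litellm_metadata", {})["previous_models"] = self.previous_models
        return kwargs


class PerRequestHistoryRouter:
    """Hypothetical stand-in for the fix: history lives in the request's own metadata."""

    def log_retry(self, kwargs, e):
        history = kwargs.setdefault("litellm_metadata", {}).setdefault("previous_models", [])
        history.append({
            "litellm_call_id": kwargs.get("litellm_call_id"),
            "exception_type": type(e).__name__,
        })
        return kwargs


def make_request(call_id, api_key):
    # Placeholder request; real kwargs carry far more fields.
    return {"litellm_call_id": call_id, "model": "test", "api_key": api_key}


def second_request_sees_first_key(router):
    """Fail request A, then request B; report whether B's history holds A's key."""
    router.log_retry(make_request("call_a", "KEY_OF_USER_A"), RuntimeError("429"))
    out_b = router.log_retry(make_request("call_b", "KEY_OF_USER_B"), RuntimeError("500"))
    history = out_b["litellm_metadata"]["previous_models"]
    return "KEY_OF_USER_A" in json.dumps(history)


print(second_request_sees_first_key(SharedHistoryRouter()))      # True: leak
print(second_request_sees_first_key(PerRequestHistoryRouter()))  # False: isolated
```

In the shared variant the leak also runs the other way: request A's metadata points at the same list, so it later gains request B's entry too.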

2. Massive metadata bloat

log_retry (router.py:5850) copies almost everything from kwargs into each previous_models entry — tools, full metadata dict, user_api_key_auth dumps, extra_headers with auth tokens, etc. With 3 retries and large tool schemas, this adds megabytes of redundant data to the spend log for each failed request.

Each failed attempt already has its own spend log entry with full error details, so duplicating all this data in previous_models is redundant.
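
To get a rough feel for the size difference, the sketch below serializes both kinds of entry for a made-up request. The helper names (full_dump_entry, lightweight_entry, serialized_size), the 20 synthetic tools, and the placeholder header are assumptions for illustration; real payloads and the exact set of copied fields will differ.

```python
import json


def full_dump_entry(kwargs, e):
    """Approximates the behavior described above: most of kwargs copied per attempt."""
    entry = dict(kwargs)
    entry["exception_type"] = type(e).__name__
    entry["exception_string"] = str(e)
    return entry


def lightweight_entry(kwargs, e):
    """Only identifiers and a truncated error string."""
    return {
        "litellm_call_id": kwargs.get("litellm_call_id"),
        "litellm_trace_id": kwargs.get("litellm_trace_id"),
        "model": kwargs.get("model"),
        "exception_type": type(e).__name__,
        "exception_string": str(e)[:200],
    }


def serialized_size(entry):
    # Character count of the JSON form, as a proxy for stored size.
    return len(json.dumps(entry))


# Synthetic request: 20 tools with 500-character descriptions (assumed sizes).
tools = [
    {"type": "function", "function": {"name": f"tool_{i}", "description": "x" * 500}}
    for i in range(20)
]
kwargs = {
    "litellm_call_id": "call_1",
    "model": "test",
    "tools": tools,
    "extra_headers": {"Authorization": "Bearer placeholder"},
}
error = RuntimeError("HTTP 429 RateLimit")

retries = 3
full_bytes = retries * serialized_size(full_dump_entry(kwargs, error))
light_bytes = retries * serialized_size(lightweight_entry(kwargs, error))
print(f"full dump: {full_bytes} chars, lightweight: {light_bytes} chars")
```

Because the full dump scales with the tool schemas and the retry count, while the lightweight entry stays roughly constant per attempt, the gap widens as requests get larger.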

Suggested Fix

Replace the full kwargs dump with a lightweight reference — just the request ID and error type for each failed attempt:

previous_model = {
    "litellm_call_id": kwargs.get("litellm_call_id"),
    "litellm_trace_id": kwargs.get("litellm_trace_id"),
    "model": kwargs.get("model"),
    "exception_type": type(e).__name__,
    "exception_string": str(e)[:200],  # truncated
}

This preserves the retry breadcrumb trail (which deployment failed and why) without leaking cross-request data or bloating the logs. The full details are already in each attempt's own spend log entry.

Environment

  • LiteLLM version: 1.81.14
  • Observed on gsk-prod cluster with Azure gpt-4.1

extent analysis

TL;DR

Replace the full kwargs dump in previous_models with a lightweight reference containing the request ID, error type, and other essential information to prevent cross-request data leakage and metadata bloat.

Guidance

  • Identify the self.previous_models attribute in router.py and modify it to store data per-request instead of per-instance to prevent cross-request data leakage.
  • Update the log_retry function in router.py to only store a lightweight reference to the failed attempt, including the request ID and error type, instead of copying all kwargs.
  • Verify that the previous_models data is correctly stored and retrieved for each request by checking the spend log entries for failed requests.
  • Consider implementing additional logging or monitoring to detect and prevent similar data leakage issues in the future.

Example

previous_model = {
    "litellm_call_id": kwargs.get("litellm_call_id"),
    "litellm_trace_id": kwargs.get("litellm_trace_id"),
    "model": kwargs.get("model"),
    "exception_type": type(e).__name__,
    "exception_string": str(e)[:200],  # truncated
}

Notes

This fix assumes that the litellm_call_id and litellm_trace_id are unique identifiers for each request and can be used to reference the full details of the failed attempt in the spend log.
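
If that assumption holds, a breadcrumb can be resolved back to the full record when needed. The sketch below uses a hypothetical in-memory list of spend log rows keyed by litellm_call_id; the row layout and the resolve_breadcrumbs helper are illustrative only and do not reflect the actual spend log schema.

```python
def resolve_breadcrumbs(previous_models, spend_logs):
    """Pair each lightweight breadcrumb with its full spend log row, if one exists."""
    # Index the (hypothetical) spend log rows by call ID for constant-time lookup.
    by_call_id = {row["litellm_call_id"]: row for row in spend_logs}
    return [(crumb, by_call_id.get(crumb["litellm_call_id"])) for crumb in previous_models]


# Hypothetical data: two failed attempts, only one of which has a stored row.
previous_models = [
    {"litellm_call_id": "call_1", "exception_type": "RateLimitError"},
    {"litellm_call_id": "call_2", "exception_type": "InternalServerError"},
]
spend_logs = [
    {"litellm_call_id": "call_1", "model": "test", "error": "HTTP 429 RateLimit (full details)"},
]

for crumb, row in resolve_breadcrumbs(previous_models, spend_logs):
    status = row["error"] if row else "no spend log entry found"
    print(crumb["litellm_call_id"], crumb["exception_type"], "->", status)
```

A missing row is returned as None rather than raising, so a breadcrumb whose spend log entry was never written is still reported.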

Recommendation

Apply the suggested fix to replace the full kwargs dump with a lightweight reference, as it addresses both the cross-request data leakage and metadata bloat issues.
