litellm: store_prompts_in_spend_logs: false still stores full embedding response vectors in SpendLogs

GitHub stats
BerriAI/litellm#24928, fetched 2026-04-08 02:23:42
Comments: 0
Participants: 1
Timeline: 0
Reactions: 0


Bug Description

With store_prompts_in_spend_logs: false in general_settings, embedding responses (full float vectors) are still written to the response column in LiteLLM_SpendLogs. This causes significant DB bloat — each embedding request stores the complete vector array (e.g. 1024 floats for bge-m3).

Expected Behavior

With store_prompts_in_spend_logs: false, the response column should remain {} for all call types, including embedding and aembedding.

Actual Behavior

The response column contains the full embedding response including all float vectors, despite the flag being set to false.

Steps to Reproduce

  1. Set store_prompts_in_spend_logs: false in general_settings
  2. Send an embedding request through the proxy
  3. Query the DB:
    SELECT call_type, length(response::text)
    FROM "LiteLLM_SpendLogs"
    WHERE call_type IN ('embedding', 'aembedding')
    ORDER BY "startTime" DESC
    LIMIT 5;
  4. Observe that response contains the full vector data instead of {}
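
The query in step 3 returns one (call_type, length) pair per row. An empty JSON object rendered as text is `{}`, which is two characters, so any larger length suggests that a payload was stored. A minimal Python sketch of that check follows; the function name `rows_with_stored_response` and the sample lengths are illustrative and do not come from the issue.

```python
from typing import Iterable, List, Tuple

# An empty JSON object rendered as text ("{}") is two characters long.
EMPTY_JSON_LENGTH = len("{}")


def rows_with_stored_response(rows: Iterable[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Return the (call_type, length) rows whose response is larger than '{}'."""
    return [
        (call_type, length)
        for call_type, length in rows
        if length > EMPTY_JSON_LENGTH
    ]


# Hypothetical query results: two bloated rows and one clean row.
sample = [("aembedding", 18342), ("embedding", 2), ("aembedding", 17980)]
flagged = rows_with_stored_response(sample)
print(flagged)
```

If the flag were honored, the returned list would be empty for rows written after the setting was applied.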

Environment

  • LiteLLM version: 1.82.3
  • Deployment: Docker (LiteLLM Proxy)
  • DB: PostgreSQL
  • Embedding model: BAAI/bge-m3 (1024 dimensions) via vLLM backend

Config (relevant section)

general_settings:
  store_prompts_in_spend_logs: false
  maximum_spend_logs_retention_period: "7d"

litellm_settings:
  callbacks: custom_callbacks.proxy_handler_instance
  cache: true
  cache_params:
    type: redis
    supported_call_types:
      - completion
      - acompletion

Impact

Each bge-m3 embedding response is ~15-20 KB of JSON. At high request volume this bloats the DB quickly. Chat completions may or may not be affected — not yet verified.
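
To put the per-response size in context, the sketch below multiplies it out over the 7-day retention period from the config. The request volume of 100,000 embedding calls per day is a hypothetical figure chosen for illustration, and the result is the raw JSON size only: PostgreSQL may compress large values via TOAST, so actual on-disk growth can differ.

```python
KB = 1024
GIB = 1024 ** 3


def retained_response_bytes(requests_per_day: int, bytes_per_response: int, retention_days: int) -> int:
    """Raw JSON bytes held in the response column once retention reaches steady state."""
    return requests_per_day * bytes_per_response * retention_days


# Assumptions: 100,000 requests/day (hypothetical), 15-20 KB per response
# (from the issue), 7 days of retention (from the config).
low = retained_response_bytes(100_000, 15 * KB, 7)
high = retained_response_bytes(100_000, 20 * KB, 7)
print(f"{low / GIB:.1f} to {high / GIB:.1f} GiB of raw response JSON")
```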

Workaround

Periodic cleanup:

UPDATE "LiteLLM_SpendLogs"
SET response = '{}'
WHERE call_type IN ('embedding', 'aembedding')
  AND response::text <> '{}';  -- skip rows that are already empty
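
On a large table, a single UPDATE touches every matching row in one transaction. A possible variant is to run the cleanup in small batches, committing after each one. The Python sketch below works against any DB-API connection; it assumes that `request_id` is the primary key of "LiteLLM_SpendLogs" and that the driver uses `%s` placeholders (as psycopg2 does). The function name and batch size are illustrative.

```python
from typing import Any

# Assumption: "request_id" is the primary key of "LiteLLM_SpendLogs".
# The %s placeholder follows the psycopg2 parameter style.
BATCH_SQL = """
UPDATE "LiteLLM_SpendLogs"
SET response = '{}'
WHERE request_id IN (
    SELECT request_id
    FROM "LiteLLM_SpendLogs"
    WHERE call_type IN ('embedding', 'aembedding')
      AND response::text <> '{}'
    LIMIT %s
);
"""


def clear_embedding_responses(conn: Any, batch_size: int = 5000) -> int:
    """Blank stored embedding responses in batches; return the total rows updated."""
    total = 0
    while True:
        cur = conn.cursor()
        try:
            cur.execute(BATCH_SQL, (batch_size,))
            updated = cur.rowcount  # rows changed by this batch (-1 if unknown)
        finally:
            cur.close()
        conn.commit()  # commit per batch so locks are held only briefly
        if updated <= 0:
            return total
        total += updated
```

A scheduled job could open a connection with the PostgreSQL driver already in use and call `clear_embedding_responses(conn)` on it. This remains a workaround: new embedding rows continue to arrive with full vectors until the flag is honored.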

Related Issues

  • #15641 — inverse bug: store_prompts_in_spend_logs: true does NOT store messages/response
  • #23636 — data stored in proxy_server_request instead of messages

Extent Analysis

TL;DR

A permanent fix requires the code that writes LiteLLM_SpendLogs to honor store_prompts_in_spend_logs: false for embedding and aembedding calls. Until then, the periodic cleanup query mitigates the DB bloat.

Guidance

  • Review the code responsible for handling embedding requests and storing responses in the LiteLLM_SpendLogs table to ensure it correctly checks the store_prompts_in_spend_logs flag.
  • Verify that the flag is being passed correctly to the relevant functions or methods.
  • Consider adding logging or debugging statements to track the value of the flag and the response data being stored.
  • Evaluate the periodic cleanup workaround provided, but note that it is not a permanent fix: new rows keep arriving with full vectors, and a single UPDATE over a large table can be slow and leaves dead row versions for VACUUM to reclaim.

Example

The issue does not include the relevant LiteLLM code sections, so no snippet from the codebase is available.

Notes

The root cause of the issue is likely a bug in the code handling embedding requests, where the store_prompts_in_spend_logs flag is not being correctly checked or applied. The provided workaround may help mitigate the issue, but a permanent fix will require updating the code.
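
As an illustration of the kind of check the notes describe, the sketch below gates the stored response on the flag regardless of call type. It is hypothetical: `build_spend_log_response` is not a LiteLLM function, and the issue does not show the real code path.

```python
import json
from typing import Any, Mapping


def build_spend_log_response(response: Mapping[str, Any], store_prompts_in_spend_logs: bool) -> str:
    """Return the JSON text to store in the response column (hypothetical helper)."""
    if not store_prompts_in_spend_logs:
        # Same empty value for every call type, embeddings included.
        return "{}"
    return json.dumps(response)


# Shortened stand-in for an embedding response; real vectors have 1024 floats.
embedding_response = {
    "object": "list",
    "data": [{"index": 0, "embedding": [0.1, 0.2, 0.3]}],
}
print(build_spend_log_response(embedding_response, store_prompts_in_spend_logs=False))
```

The point of the sketch is that the decision depends only on the flag, so embedding calls cannot bypass it by taking a different branch.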

Recommendation

Apply the workaround by periodically running the provided SQL update statement to clean up the response column in the LiteLLM_SpendLogs table, but also prioritize investigating and fixing the underlying code issue to prevent further DB bloat.
