litellm - 💡(How to fix) Fix [Bug]: LiteLLM_SpendLogToolIndex is never pruned — grows unbounded, orphaned after spend log retention cleanup [1 pull requests]

LiteLLM_SpendLogToolIndex is insert-only and has no retention / pruning anywhere in the codebase. When the spend-log retention cleanup (maximum_spend_logs_retention_period) deletes old LiteLLM_SpendLogs rows, the matching SpendLogToolIndex rows are left behind, so orphaned tool-index rows accumulate and the table grows without bound — even when retention is correctly configured.

Root Cause

  1. No cascade. LiteLLM_SpendLogToolIndex has no foreign key to LiteLLM_SpendLogs; its only constraint is PRIMARY KEY (request_id, tool_name) (see schema.prisma and the ..._add_spend_log_tool_index/migration.sql). So deleting a SpendLogs row does not remove its tool-index rows.

  2. Cleanup deletes SpendLogs only. SpendLogCleanup._delete_old_logs (litellm/proxy/db/db_transaction_queue/spend_log_cleanup.py) runs:

    DELETE FROM "LiteLLM_SpendLogs"
    WHERE "request_id" IN (
        SELECT "request_id" FROM "LiteLLM_SpendLogs"
        WHERE "startTime" < $1::timestamptz LIMIT $2
    )

    It never touches LiteLLM_SpendLogToolIndex.

  3. No delete path exists anywhere. The only references to SpendLogToolIndex in the repo are the insert path (litellm/proxy/db/spend_log_tool_index.py, called from utils.py when spend logs are written), the read endpoints (management_endpoints/tool_management_endpoints.py), the Prisma schema, and the create migration. There is no delete on this table anywhere.
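The three points above can be reproduced in miniature. The following is a minimal sketch, not LiteLLM code: it uses an in-memory SQLite database as a stand-in for Postgres, and the table names (spend_logs, spend_log_tool_index) and the reduced column set are simplified assumptions. It only shows that, without a foreign key, deleting the parent row leaves the tool-index rows in place.

```python
import sqlite3

# Hypothetical, simplified schema: only the columns the report mentions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE spend_logs (
        request_id TEXT PRIMARY KEY,
        start_time TEXT NOT NULL
    );
    -- No foreign key to spend_logs, mirroring the reported schema.
    CREATE TABLE spend_log_tool_index (
        request_id TEXT NOT NULL,
        tool_name  TEXT NOT NULL,
        start_time TEXT NOT NULL,
        PRIMARY KEY (request_id, tool_name)
    );
    """
)
conn.execute("INSERT INTO spend_logs VALUES ('req-1', '2024-01-01T00:00:00')")
conn.executemany(
    "INSERT INTO spend_log_tool_index VALUES (?, ?, ?)",
    [
        ("req-1", "search", "2024-01-01T00:00:00"),
        ("req-1", "calculator", "2024-01-01T00:00:00"),
    ],
)

# A retention cleanup that only deletes from the parent table.
conn.execute("DELETE FROM spend_logs WHERE start_time < '2024-06-01T00:00:00'")

remaining_logs = conn.execute("SELECT count(*) FROM spend_logs").fetchone()[0]
orphaned_index_rows = conn.execute(
    "SELECT count(*) FROM spend_log_tool_index"
).fetchone()[0]
print(remaining_logs, orphaned_index_rows)
```

In this sketch the parent table ends up empty while both tool-index rows survive, which is the orphaning behaviour described above.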

Fix Action

Fixed

Impact

With retention enabled, SpendLogs is bounded but SpendLogToolIndex keeps growing forever. In a tool-heavy deployment we observed this table reach ~12M rows, of which ~28% had start_time older than the configured 30-day retention window (i.e. their parent SpendLogs rows had already been deleted — pure orphans). The table accumulates at a rate proportional to tool-using traffic with no upper bound.

Reproduction

  1. Set maximum_spend_logs_retention_period (e.g. "30d") and maximum_spend_logs_cleanup_cron.
  2. Send tool-using traffic so SpendLogToolIndex is populated.
  3. Wait past the retention window so the cleanup deletes old SpendLogs rows.
  4. Query SELECT count(*) FROM "LiteLLM_SpendLogToolIndex" WHERE start_time < now() - interval '30 days'; the result is the number of orphaned rows, and it keeps growing because the cleanup never removes them.
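The start_time query in step 4 infers orphans from age. A stricter check is an anti-join that counts tool-index rows with no surviving parent. The sketch below illustrates that idea under stated assumptions: SQLite stands in for Postgres, the table names are hypothetical simplifications, and count_orphans is an illustrative helper, not something that exists in LiteLLM.

```python
import sqlite3


def count_orphans(conn: sqlite3.Connection) -> int:
    """Count tool-index rows whose parent spend log no longer exists."""
    row = conn.execute(
        """
        SELECT count(*)
        FROM tool_index AS t
        LEFT JOIN spend_logs AS s ON s.request_id = t.request_id
        WHERE s.request_id IS NULL
        """
    ).fetchone()
    return row[0]


# Hypothetical, simplified schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE spend_logs (request_id TEXT PRIMARY KEY);
    CREATE TABLE tool_index (
        request_id TEXT NOT NULL,
        tool_name  TEXT NOT NULL,
        PRIMARY KEY (request_id, tool_name)
    );
    """
)
# One request still has its spend log; the other was already cleaned up.
conn.execute("INSERT INTO spend_logs VALUES ('req-new')")
conn.executemany(
    "INSERT INTO tool_index VALUES (?, ?)",
    [("req-new", "search"), ("req-old", "search"), ("req-old", "fetch")],
)

orphans = count_orphans(conn)
print(orphans)
```

On a large production table an anti-join like this can be expensive, so the start_time count from step 4 remains the cheaper first check.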

Suggested fix

SpendLogToolIndex already has a start_time column, so it can be pruned on the same cutoff with the same batched pattern. In _delete_old_logs (or a sibling call in cleanup_old_spend_logs), add a batched delete:

await prisma_client.db.execute_raw(
    """
    DELETE FROM "LiteLLM_SpendLogToolIndex"
    WHERE "ctid" IN (
        SELECT "ctid" FROM "LiteLLM_SpendLogToolIndex"
        WHERE "start_time" < $1::timestamptz
        LIMIT $2
    )
    """,
    cutoff_date,
    self.batch_size,
)

A dedicated index on start_time would make this delete efficient; the existing index is (tool_name, start_time), which does not serve a start_time-only range scan well. Alternatively, add an ON DELETE CASCADE foreign key from SpendLogToolIndex.request_id to SpendLogs.request_id, though the start_time batched delete is lighter and matches the existing cleanup design.
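To show how the batched delete would behave when repeated until nothing is left to prune, here is a minimal sketch. It is not the LiteLLM implementation: it runs against in-memory SQLite, uses SQLite's rowid where the Postgres statement above uses ctid, and the names tool_index, idx_tool_index_start_time and prune_tool_index are hypothetical. It also assumes the cleanup keeps issuing batches until one deletes zero rows.

```python
import sqlite3


def prune_tool_index(conn: sqlite3.Connection, cutoff: str, batch_size: int) -> int:
    """Delete rows older than `cutoff` in batches; return the total deleted."""
    total_deleted = 0
    while True:
        cursor = conn.execute(
            """
            DELETE FROM tool_index
            WHERE rowid IN (
                SELECT rowid FROM tool_index
                WHERE start_time < ?
                LIMIT ?
            )
            """,
            (cutoff, batch_size),
        )
        conn.commit()  # each batch gets its own short transaction
        if cursor.rowcount == 0:
            break
        total_deleted += cursor.rowcount
    return total_deleted


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tool_index (
        request_id TEXT NOT NULL,
        tool_name  TEXT NOT NULL,
        start_time TEXT NOT NULL,
        PRIMARY KEY (request_id, tool_name)
    );
    -- Dedicated index so the range scan on start_time does not need tool_name.
    CREATE INDEX idx_tool_index_start_time ON tool_index (start_time);
    """
)
old_rows = [(f"old-{i}", "search", "2024-01-01T00:00:00") for i in range(25)]
new_rows = [(f"new-{i}", "search", "2024-12-01T00:00:00") for i in range(5)]
conn.executemany("INSERT INTO tool_index VALUES (?, ?, ?)", old_rows + new_rows)
conn.commit()

deleted = prune_tool_index(conn, cutoff="2024-06-01T00:00:00", batch_size=10)
remaining = conn.execute("SELECT count(*) FROM tool_index").fetchone()[0]
print(deleted, remaining)
```

Small batches keep each delete short, at the cost of more round trips; the batch size here is arbitrary and chosen only to force several iterations.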

The same gap likely applies to other derived/aggregate tables, but SpendLogToolIndex is the fastest-growing one in tool-heavy deployments.
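For comparison, the ON DELETE CASCADE alternative mentioned in the suggested fix can be sketched the same way. This is an illustration under assumptions, not a proposed migration: SQLite stands in for Postgres, the table names are hypothetical, and SQLite needs foreign keys switched on per connection. On a real deployment, adding such a constraint would presumably also require removing existing orphans first and ensuring the parent spend-log row exists before its tool-index rows are inserted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE spend_logs (
        request_id TEXT PRIMARY KEY,
        start_time TEXT NOT NULL
    );
    CREATE TABLE tool_index (
        request_id TEXT NOT NULL
            REFERENCES spend_logs (request_id) ON DELETE CASCADE,
        tool_name  TEXT NOT NULL,
        PRIMARY KEY (request_id, tool_name)
    );
    """
)
conn.execute("INSERT INTO spend_logs VALUES ('req-1', '2024-01-01T00:00:00')")
conn.executemany(
    "INSERT INTO tool_index VALUES (?, ?)",
    [("req-1", "search"), ("req-1", "calculator")],
)

# Deleting the parent row now removes its tool-index rows as well.
conn.execute("DELETE FROM spend_logs WHERE start_time < '2024-06-01T00:00:00'")

index_rows_after_delete = conn.execute(
    "SELECT count(*) FROM tool_index"
).fetchone()[0]
print(index_rows_after_delete)
```

With the cascade in place the cleanup query itself would not need to change, but every parent delete then carries the cost of the child deletes, which is the trade-off the report notes when preferring the batched start_time delete.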

Version

Observed on v1.83.14; the cleanup logic on latest main is unchanged (still deletes LiteLLM_SpendLogs only).
