litellm - ✅(Solved) Fix [Bug]: api_base is always empty in LiteLLM_SpendLogs for embedding calls (aembedding call_type) [1 pull requests, 1 comments, 1 participants]

BerriAI/litellm#23768 (fetched 2026-04-08 00:49:12)
Fix Action

Fixed

PR fix notes

PR #23770: fix(spend_tracking): populate api_base in SpendLogs for embedding calls

Description (problem / solution / changelog)

Summary

Fixes #23768 — api_base column in LiteLLM_SpendLogs is always empty for embedding requests (call_type = 'aembedding'), making per-endpoint monitoring impossible for embedding models.

Root Cause

The Azure and OpenAI embedding code paths did not include api_base in the additional_args passed to logging_obj.pre_call() and logging_obj.post_call(). The _pre_call() method in litellm_logging.py sets litellm_params["api_base"] from additional_args.get("api_base", ""), so without it the value remains empty and propagates to both StandardLoggingPayload and SpendLogsPayload.

Chat completion calls work correctly because their handlers already pass api_base in additional_args.
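
The propagation described above can be sketched in a few lines. This is a simplified stand-in, not the actual litellm code: the `FakeLogging` class and its `pre_call` signature are hypothetical and only mirror the `additional_args.get("api_base", "")` behaviour quoted in the root cause.

```python
# Simplified stand-in for the logging object; NOT the real litellm class.
class FakeLogging:
    def __init__(self):
        self.litellm_params = {}

    def pre_call(self, input, api_key, additional_args=None):
        additional_args = additional_args or {}
        # Mirrors the behaviour described above: a missing key becomes "".
        self.litellm_params["api_base"] = additional_args.get("api_base", "")


api_base = "https://my-endpoint-1.openai.azure.com/"

# Before the fix: the embedding path did not forward api_base.
before = FakeLogging()
before.pre_call(input=["hello"], api_key="sk-placeholder", additional_args={"headers": {}})

# After the fix: the handler forwards api_base explicitly.
after = FakeLogging()
after.pre_call(
    input=["hello"],
    api_key="sk-placeholder",
    additional_args={"headers": {}, "api_base": api_base},
)

print(repr(before.litellm_params["api_base"]))
print(repr(after.litellm_params["api_base"]))
```

In this sketch the only difference between the two calls is the extra `"api_base"` key, which is the shape of the change the PR describes for the provider handlers.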

Changes

Provider fixes

  • Azure embedding() (litellm/llms/azure/azure.py): Add "api_base": api_base to pre_call additional_args
  • Azure aembedding() (litellm/llms/azure/azure.py): Add "api_base": api_base to post_call additional_args in both success and error paths
  • OpenAI aembedding() (litellm/llms/openai/openai.py): Add "api_base": api_base to post_call additional_args in success, OpenAIError, and generic Exception paths
  • Cohere sync embedding() (litellm/llms/cohere/embed/handler.py): Add "api_base": embed_url to pre_call additional_args

Tests

  • test_get_logging_payload_api_base_populated_for_embedding_calls: Verifies that api_base is populated in the SpendLogs payload when litellm_params contains it for an embedding call
  • test_get_logging_payload_api_base_empty_when_not_in_litellm_params: Verifies graceful fallback to empty string when api_base is not available
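
The two tests listed above could look roughly like the following. This is a sketch under stated assumptions: `build_payload` is a hypothetical stand-in that reproduces only the `litellm_params.get("api_base", "")` lookup, since the real `get_logging_payload()` signature and return type are not shown on this page.

```python
# Hypothetical stand-in for the payload builder; the real get_logging_payload()
# takes more arguments and returns many more fields.
def build_payload(kwargs: dict) -> dict:
    litellm_params = kwargs.get("litellm_params") or {}
    return {
        "call_type": kwargs.get("call_type", ""),
        "api_base": litellm_params.get("api_base", ""),
    }


def test_api_base_populated_for_embedding_calls():
    endpoint = "https://my-endpoint-1.openai.azure.com/"
    kwargs = {"call_type": "aembedding", "litellm_params": {"api_base": endpoint}}
    assert build_payload(kwargs)["api_base"] == endpoint


def test_api_base_empty_when_not_in_litellm_params():
    kwargs = {"call_type": "aembedding", "litellm_params": {}}
    # Graceful fallback: a missing key yields an empty string, not an error.
    assert build_payload(kwargs)["api_base"] == ""


test_api_base_populated_for_embedding_calls()
test_api_base_empty_when_not_in_litellm_params()
```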

Evidence from production

-- Before fix: api_base always empty for embeddings
SELECT model_group, model, api_base, COUNT(*) FROM "LiteLLM_SpendLogs"
WHERE model_group = 'text-embedding-3-large' GROUP BY 1, 2, api_base;

 model_group             | model                          | api_base | count
-------------------------+--------------------------------+----------+-------
 text-embedding-3-large  | azure/text-embedding-3-large   |          | 26901

-- Chat completions work fine (for comparison)
SELECT model_group, model, api_base, COUNT(*) FROM "LiteLLM_SpendLogs"
WHERE model_group = 'gpt-4.1-mini' GROUP BY 1, 2, api_base;

 model_group   | model                | api_base                                  | count
---------------+----------------------+-------------------------------------------+-------
 gpt-4.1-mini  | azure/gpt-4.1-mini   | https://my-endpoint-1.openai.azure.com/   | 11115
 gpt-4.1-mini  | azure/gpt-4.1-mini   | https://my-endpoint-2.openai.azure.com/   | 10928

Testing

All 46 tests in test_spend_tracking_utils.py pass (including 2 new ones):

======================== 46 passed, 1 warning in 4.68s =========================

Changed files

  • litellm/llms/azure/azure.py (modified, +6/-2)
  • litellm/llms/cohere/embed/handler.py (modified, +3/-3)
  • litellm/llms/openai/openai.py (modified, +12/-7)
  • tests/test_litellm/proxy/spend_tracking/test_spend_tracking_utils.py (modified, +319/-130)


What happened?

The api_base column in LiteLLM_SpendLogs is always empty for embedding requests (call_type = 'aembedding'), while it is correctly populated for chat completion requests (call_type = 'acompletion').

This makes it impossible to build per-endpoint monitoring dashboards (e.g. Grafana) that track RPM/TPM utilization, latency, or token usage per Azure OpenAI endpoint for embedding models.
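
To illustrate why an empty api_base defeats per-endpoint dashboards, here is a minimal sketch with made-up rows (the token counts and the tuple layout are assumptions, not data from the issue). Grouping by endpoint separates the chat completion traffic, while all embedding traffic collapses into a single empty-string bucket.

```python
from collections import defaultdict

# Hypothetical spend-log rows: (call_type, api_base, total_tokens).
rows = [
    ("acompletion", "https://my-endpoint-1.openai.azure.com/", 120),
    ("acompletion", "https://my-endpoint-2.openai.azure.com/", 80),
    ("aembedding", "", 40),
    ("aembedding", "", 60),
]

# Sum tokens per (call_type, api_base), as a per-endpoint dashboard query would.
tokens_per_endpoint = defaultdict(int)
for call_type, api_base, total_tokens in rows:
    tokens_per_endpoint[(call_type, api_base)] += total_tokens

for key, total in sorted(tokens_per_endpoint.items()):
    print(key, total)
```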

Evidence from database

-- Embedding calls: api_base is always empty
SELECT model_group, model, api_base, COUNT(*) as cnt
FROM "LiteLLM_SpendLogs"
WHERE model_group = 'text-embedding-3-large'
  AND team_id != 'litellm-internal-health-check'
GROUP BY 1, 2, api_base
ORDER BY cnt DESC;

 model_group             | model                          | api_base | cnt
-------------------------+--------------------------------+----------+-------
 text-embedding-3-large  | azure/text-embedding-3-large   |          | 26901

-- Chat completion calls: api_base is correctly populated
SELECT model_group, model, api_base, COUNT(*) as cnt
FROM "LiteLLM_SpendLogs"
WHERE model_group = 'gpt-4.1-mini'
  AND team_id != 'litellm-internal-health-check'
  AND "startTime" >= NOW() - INTERVAL '1 day'
GROUP BY 1, 2, api_base
ORDER BY cnt DESC;

 model_group   | model                | api_base                                           | cnt
---------------+----------------------+----------------------------------------------------+-------
 gpt-4.1-mini  | azure/gpt-4.1-mini   | https://my-endpoint-1.openai.azure.com/            | 11115
 gpt-4.1-mini  | azure/gpt-4.1-mini   | https://my-endpoint-2.openai.azure.com/            | 10928
 gpt-4.1-mini  | azure/gpt-4.1-mini   | https://my-endpoint-3.openai.azure.com/            | 10583

Root cause

In litellm/proxy/spend_tracking/spend_tracking_utils.py, the get_logging_payload() function extracts api_base from litellm_params:

api_base=litellm_params.get("api_base", ""),

For chat completion calls (acompletion), litellm_params["api_base"] is correctly populated with the provider endpoint URL. However, for embedding calls (aembedding), the api_base key is either missing or empty in litellm_params by the time it reaches the spend logging callback.

This appears to be because the embedding code path does not propagate api_base into litellm_params the same way the completion code path does.

Secondary issue: health check embedding requests have empty model_group

Additionally, health check embedding requests (from litellm-internal-health-check) are logged with an empty model_group, while the model field is set to text-embedding-3-large (without the azure/ prefix). This is inconsistent with how chat model health checks are logged and creates ~730k orphaned rows that can't be associated with any model group:

SELECT model_group, model, team_id, COUNT(*) as cnt
FROM "LiteLLM_SpendLogs"
WHERE model = 'text-embedding-3-large' AND (model_group IS NULL OR model_group = '')
GROUP BY 1, 2, team_id;

 model_group | model                   | team_id                          | cnt
-------------+-------------------------+----------------------------------+--------
             | text-embedding-3-large  | litellm-internal-health-check    | 730088

Expected behavior

  1. api_base should be populated in LiteLLM_SpendLogs for embedding calls, just as it is for chat completion calls
  2. Health check requests for embedding models should have model_group populated consistently with non-health-check requests

Related issues

This was previously reported in #7317 (Dec 2024) but was auto-closed as stale without being fixed.

Are you a ML Ops Team?

No

What LiteLLM version are you on?

v1.67.4.dev4 (ghcr.io/berriai/litellm:main-v1.67.4.dev4)

Twitter / LinkedIn details

No response

extent analysis

Fix Plan

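One possible approach is to patch get_logging_payload() in litellm/proxy/spend_tracking/spend_tracking_utils.py so that it falls back to another source when api_base is missing for embedding calls, and to fill in model_group for health check embedding requests. Note that PR #23770 fixed the api_base problem upstream instead, by passing api_base in additional_args from the provider embedding handlers, and its description does not mention the model_group issue. The steps in this plan are illustrative and do not reflect the merged change.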

Step 1: Modify get_logging_payload() function

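# Sketch only: the real get_logging_payload() takes more arguments and builds a
# larger payload. "endpoint_url" is a hypothetical fallback key, not a confirmed
# litellm_params field, and call_type is passed in explicitly here.
def resolve_api_base(litellm_params: dict, call_type: str) -> str:
    api_base = litellm_params.get("api_base") or ""
    if not api_base and call_type == "aembedding":
        # Fall back for embedding calls when api_base was never propagated.
        api_base = litellm_params.get("endpoint_url") or ""
    return api_base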

Step 2: Populate model_group for health check embedding requests

# Sketch only: model and team_id are passed in explicitly; where the real code
# reads them from is not shown in the issue.
def resolve_model_group(model_group: str, model: str, team_id: str) -> str:
    if not model_group and team_id == "litellm-internal-health-check":
        # Regular requests log model_group without a provider prefix
        # (e.g. "text-embedding-3-large"), so strip any "azure/"-style prefix
        # rather than adding one. Assumes the group name equals the model name.
        model_group = model.split("/")[-1]
    return model_group

Verification

To verify the fix, run the following SQL queries:

SELECT model_group, model, api_base, COUNT(*) as cnt
FROM "LiteLLM_SpendLogs"
WHERE model_group = 'text-embedding-3-large'
  AND team_id != 'litellm-internal-health-check'
GROUP BY 1, 2, api_base
ORDER BY cnt DESC;

SELECT model_group, model, team_id, COUNT(*) as cnt
FROM "LiteLLM_SpendLogs"
WHERE model = 'text-embedding-3-large' AND team_id = 'litellm-internal-health-check'
GROUP BY 1, 2, team_id;

The api_base field should be populated for embedding calls, and the model_group field should be populated for health check embedding requests.
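
The first check can also be scripted. The sketch below is self-contained and uses an in-memory SQLite table with made-up rows purely for illustration ("team-a" and the row contents are hypothetical); a real deployment would run the equivalent query against the proxy's PostgreSQL database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "LiteLLM_SpendLogs" ('
    "model_group TEXT, model TEXT, api_base TEXT, team_id TEXT)"
)
# Made-up sample rows: one populated, one empty, one health check.
rows = [
    ("text-embedding-3-large", "azure/text-embedding-3-large",
     "https://my-endpoint-1.openai.azure.com/", "team-a"),
    ("text-embedding-3-large", "azure/text-embedding-3-large", "", "team-a"),
    ("", "text-embedding-3-large", "", "litellm-internal-health-check"),
]
conn.executemany('INSERT INTO "LiteLLM_SpendLogs" VALUES (?, ?, ?, ?)', rows)


def count_empty_api_base(conn, model_group):
    """Count non-health-check rows whose api_base is NULL or empty."""
    cur = conn.execute(
        'SELECT COUNT(*) FROM "LiteLLM_SpendLogs" '
        "WHERE model_group = ? "
        "AND team_id != 'litellm-internal-health-check' "
        "AND (api_base IS NULL OR api_base = '')",
        (model_group,),
    )
    return cur.fetchone()[0]


empty = count_empty_api_base(conn, "text-embedding-3-large")
print(empty)
```

Rows logged before the upgrade presumably keep their empty api_base, so against real data it may help to restrict the count to a recent "startTime" window, as the chat completion query in the issue does.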
