litellm - ✅(Solved) Fix [Bug]: api_base is always empty in LiteLLM_SpendLogs for embedding calls (aembedding call_type) [1 pull requests, 1 comments, 1 participants]

litellm · 2026-03-16 18:54:28

GitHub stats

BerriAI/litellm#23768 • Fetched 2026-04-08 00:49:12

Author

ClemDNL

Participants

ClemDNL

Timeline (top)

referenced ×3, cross-referenced ×1, labeled ×1

Fix Action

Fixed

Fixed by PR: fix(spend_tracking): populate api_base in SpendLogs for embedding calls (https://github.com/BerriAI/litellm/pull/23770)

PR fix notes

PR #23770: fix(spend_tracking): populate api_base in SpendLogs for embedding calls

Repository: BerriAI/litellm
Author: ClemDNL
State: closed | merged: True
Link: https://github.com/BerriAI/litellm/pull/23770

Description (problem / solution / changelog)

Summary

Fixes #23768: the api_base column in LiteLLM_SpendLogs is always empty for embedding requests (call_type = 'aembedding'), which makes per-endpoint monitoring impossible for embedding models.

Root Cause

The Azure and OpenAI embedding code paths did not include api_base in the additional_args passed to logging_obj.pre_call() and logging_obj.post_call(). The _pre_call() method in litellm_logging.py sets litellm_params["api_base"] from additional_args.get("api_base", ""), so without it the value remains empty and propagates to both StandardLoggingPayload and SpendLogsPayload.

Chat completion calls work correctly because their handlers already pass api_base in additional_args.
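
The propagation described above can be illustrated with a toy model. This is a minimal sketch, not LiteLLM's actual implementation: `ToyLogging`, its simplified `pre_call` signature, and the `spend_log_api_base` helper are hypothetical stand-ins, and the `complete_input_dict` entry is only an illustrative extra argument. Only the `additional_args.get("api_base", "")` lookup mirrors what the PR describes.

```python
class ToyLogging:
    """Hypothetical stand-in for the logging object; not LiteLLM's real class."""

    def __init__(self):
        self.litellm_params = {}

    def pre_call(self, additional_args=None):
        additional_args = additional_args or {}
        # Mirrors the lookup described in the PR: a missing key becomes "".
        self.litellm_params["api_base"] = additional_args.get("api_base", "")


def spend_log_api_base(logging_obj):
    # Downstream payloads read whatever pre_call stored (hypothetical helper).
    return logging_obj.litellm_params.get("api_base", "")


# Before the fix: the embedding handler omitted api_base.
before = ToyLogging()
before.pre_call(additional_args={"complete_input_dict": {"input": ["hello"]}})

# After the fix: the handler passes api_base alongside the other arguments.
after = ToyLogging()
after.pre_call(
    additional_args={
        "complete_input_dict": {"input": ["hello"]},
        "api_base": "https://my-endpoint-1.openai.azure.com/",
    }
)

print(repr(spend_log_api_base(before)))
print(repr(spend_log_api_base(after)))
```

In this model the first call leaves `api_base` as an empty string, which is the state the empty column in the spend logs reflects.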

Changes

Provider fixes

Azure embedding() (litellm/llms/azure/azure.py): Add "api_base": api_base to pre_call additional_args
Azure aembedding() (litellm/llms/azure/azure.py): Add "api_base": api_base to post_call additional_args in both success and error paths
OpenAI aembedding() (litellm/llms/openai/openai.py): Add "api_base": api_base to post_call additional_args in success, OpenAIError, and generic Exception paths
Cohere sync embedding() (litellm/llms/cohere/embed/handler.py): Add "api_base": embed_url to pre_call additional_args
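
The reason the fix touches the success path and every error path can be sketched as follows. This is an assumption-laden toy, not the code from the PR: `toy_aembedding`, `RecordingLogger`, and `ProviderError` (standing in for OpenAIError) are hypothetical names, and the `post_call` signature is simplified.

```python
import asyncio


class ProviderError(Exception):
    """Hypothetical provider-specific error (stands in for OpenAIError)."""


class RecordingLogger:
    """Hypothetical logger that records what post_call receives."""

    def __init__(self):
        self.calls = []

    def post_call(self, original_response, additional_args):
        self.calls.append((original_response, additional_args))


async def toy_aembedding(send_request, logging_obj, api_base):
    # Every exit path logs with api_base, so failed calls are attributable too.
    try:
        response = await send_request()
        logging_obj.post_call(response, {"api_base": api_base})
        return response
    except ProviderError as e:
        logging_obj.post_call(str(e), {"api_base": api_base})
        raise
    except Exception as e:
        logging_obj.post_call(str(e), {"api_base": api_base})
        raise


async def ok():
    return {"data": [[0.1, 0.2]]}


async def provider_failure():
    raise ProviderError("rate limited")


async def generic_failure():
    raise RuntimeError("connection reset")


API_BASE = "https://my-endpoint-1.openai.azure.com/"
logger = RecordingLogger()


async def main():
    for request in (ok, provider_failure, generic_failure):
        try:
            await toy_aembedding(request, logger, API_BASE)
        except Exception:
            pass  # the error is re-raised after logging; ignored here for the demo


asyncio.run(main())
print([args["api_base"] for _, args in logger.calls])
```

If any one branch skipped the `api_base` entry, calls that exit through that branch would be logged without an endpoint.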

Tests

test_get_logging_payload_api_base_populated_for_embedding_calls: Verifies that api_base is populated in the SpendLogs payload when litellm_params contains it for an embedding call
test_get_logging_payload_api_base_empty_when_not_in_litellm_params: Verifies graceful fallback to empty string when api_base is not available
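
The two cases these tests cover might look roughly like the sketch below. `build_spend_payload` is a hypothetical stand-in for get_logging_payload() that models only the api_base lookup quoted in the issue; the real function takes different arguments and returns many more fields.

```python
def build_spend_payload(kwargs):
    """Hypothetical stand-in for get_logging_payload(); only api_base is modeled."""
    litellm_params = kwargs.get("litellm_params") or {}
    return {
        "call_type": kwargs.get("call_type", ""),
        # Same lookup as quoted in the issue: a missing key falls back to "".
        "api_base": litellm_params.get("api_base", ""),
    }


def test_api_base_populated_for_embedding_calls():
    kwargs = {
        "call_type": "aembedding",
        "litellm_params": {"api_base": "https://my-endpoint-1.openai.azure.com/"},
    }
    payload = build_spend_payload(kwargs)
    assert payload["api_base"] == "https://my-endpoint-1.openai.azure.com/"


def test_api_base_empty_when_not_in_litellm_params():
    kwargs = {"call_type": "aembedding", "litellm_params": {}}
    payload = build_spend_payload(kwargs)
    assert payload["api_base"] == ""
```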

Evidence from production

-- Before fix: api_base always empty for embeddings
SELECT model_group, model, api_base, COUNT(*) FROM "LiteLLM_SpendLogs"
WHERE model_group = 'text-embedding-3-large' GROUP BY 1, 2, api_base;

 model_group             | model                          | api_base | count
-------------------------+--------------------------------+----------+-------
 text-embedding-3-large  | azure/text-embedding-3-large   |          | 26901

-- Chat completions work fine (for comparison)
SELECT model_group, model, api_base, COUNT(*) FROM "LiteLLM_SpendLogs"
WHERE model_group = 'gpt-4.1-mini' GROUP BY 1, 2, api_base;

 model_group   | model                | api_base                                  | count
---------------+----------------------+-------------------------------------------+-------
 gpt-4.1-mini  | azure/gpt-4.1-mini   | https://my-endpoint-1.openai.azure.com/   | 11115
 gpt-4.1-mini  | azure/gpt-4.1-mini   | https://my-endpoint-2.openai.azure.com/   | 10928

Testing

All 46 tests in test_spend_tracking_utils.py pass (including 2 new ones):

======================== 46 passed, 1 warning in 4.68s =========================

Changed files

litellm/llms/azure/azure.py (modified, +6/-2)
litellm/llms/cohere/embed/handler.py (modified, +3/-3)
litellm/llms/openai/openai.py (modified, +12/-7)
tests/test_litellm/proxy/spend_tracking/test_spend_tracking_utils.py (modified, +319/-130)

What happened?

The api_base column in LiteLLM_SpendLogs is always empty for embedding requests (call_type = 'aembedding'), while it is correctly populated for chat completion requests (call_type = 'acompletion').

This makes it impossible to build per-endpoint monitoring dashboards (e.g. Grafana) that track RPM/TPM utilization, latency, or token usage per Azure OpenAI endpoint for embedding models.

Evidence from database

-- Embedding calls: api_base is always empty
SELECT model_group, model, api_base, COUNT(*) as cnt
FROM "LiteLLM_SpendLogs"
WHERE model_group = 'text-embedding-3-large'
  AND team_id != 'litellm-internal-health-check'
GROUP BY 1, 2, api_base
ORDER BY cnt DESC;

 model_group             | model                          | api_base | cnt
-------------------------+--------------------------------+----------+-------
 text-embedding-3-large  | azure/text-embedding-3-large   |          | 26901

-- Chat completion calls: api_base is correctly populated
SELECT model_group, model, api_base, COUNT(*) as cnt
FROM "LiteLLM_SpendLogs"
WHERE model_group = 'gpt-4.1-mini'
  AND team_id != 'litellm-internal-health-check'
  AND "startTime" >= NOW() - INTERVAL '1 day'
GROUP BY 1, 2, api_base
ORDER BY cnt DESC;

 model_group   | model                | api_base                                           | cnt
---------------+----------------------+----------------------------------------------------+-------
 gpt-4.1-mini  | azure/gpt-4.1-mini   | https://my-endpoint-1.openai.azure.com/            | 11115
 gpt-4.1-mini  | azure/gpt-4.1-mini   | https://my-endpoint-2.openai.azure.com/            | 10928
 gpt-4.1-mini  | azure/gpt-4.1-mini   | https://my-endpoint-3.openai.azure.com/            | 10583

Root cause

In litellm/proxy/spend_tracking/spend_tracking_utils.py, the get_logging_payload() function extracts api_base from litellm_params:

api_base=litellm_params.get("api_base", ""),

For chat completion calls (acompletion), litellm_params["api_base"] is correctly populated with the provider endpoint URL. However, for embedding calls (aembedding), the api_base key is either missing or empty in litellm_params by the time it reaches the spend logging callback.

This appears to be because the embedding code path does not propagate api_base into litellm_params the same way the completion code path does.

Secondary issue: health check embedding requests have empty `model_group`

Additionally, health check embedding requests (from litellm-internal-health-check) are logged with an empty model_group, while the model field is set to text-embedding-3-large (without the azure/ prefix). This is inconsistent with how chat model health checks are logged and creates ~730k orphaned rows that can't be associated with any model group:

SELECT model_group, model, team_id, COUNT(*) as cnt
FROM "LiteLLM_SpendLogs"
WHERE model = 'text-embedding-3-large' AND (model_group IS NULL OR model_group = '')
GROUP BY 1, 2, team_id;

 model_group | model                   | team_id                          | cnt
-------------+-------------------------+----------------------------------+--------
             | text-embedding-3-large  | litellm-internal-health-check    | 730088

Expected behavior

api_base should be populated in LiteLLM_SpendLogs for embedding calls, just as it is for chat completion calls
Health check requests for embedding models should have model_group populated consistently with non-health-check requests

Related issues

This was previously reported in #7317 (Dec 2024) but was auto-closed as stale without being fixed.

What LiteLLM version are you on?

v1.67.4.dev4 (ghcr.io/berriai/litellm:main-v1.67.4.dev4)

extent analysis

Fix Plan

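This plan patches the logging side: get_logging_payload() in litellm/proxy/spend_tracking/spend_tracking_utils.py falls back to another source when api_base is empty for embedding calls, and fills in model_group for health check embedding requests. The merged PR #23770 took a different route and fixed the provider handlers so that api_base reaches litellm_params in the first place (see the PR fix notes above). The steps below are therefore a workaround sketch, and the keys they read from litellm_params (call_type, endpoint_url, team_id, model_group, model) are assumptions rather than confirmed fields.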

Step 1: Modify `get_logging_payload()` function

def get_logging_payload(litellm_params: dict) -> dict:
    # Sketch only: the real function takes more arguments and returns more fields.
    api_base = litellm_params.get("api_base", "")
    if not api_base and litellm_params.get("call_type") == "aembedding":
        # Assumed fallback: "endpoint_url" is a hypothetical key, not a
        # confirmed litellm_params field.
        api_base = litellm_params.get("endpoint_url", "")
    return {
        "api_base": api_base,
    }

Step 2: Populate `model_group` for health check embedding requests

def get_logging_payload(litellm_params: dict) -> dict:
    # Sketch only: the real function takes more arguments and returns more fields.
    model_group = litellm_params.get("model_group", "")
    if not model_group and litellm_params.get("team_id") == "litellm-internal-health-check":
        # Use the bare model name, without a provider prefix such as "azure/",
        # to match how model_group appears for regular requests
        # (e.g. text-embedding-3-large).
        model_group = litellm_params.get("model", "").split("/")[-1]
    return {
        "model_group": model_group,
    }

Verification

To verify the fix, run the following SQL queries:

SELECT model_group, model, api_base, COUNT(*) as cnt
FROM "LiteLLM_SpendLogs"
WHERE model_group = 'text-embedding-3-large'
  AND team_id != 'litellm-internal-health-check'
GROUP BY 1, 2, api_base
ORDER BY cnt DESC;

SELECT model_group, model, team_id, COUNT(*) as cnt
FROM "LiteLLM_SpendLogs"
WHERE model = 'text-embedding-3-large' AND team_id = 'litellm-internal-health-check'
GROUP BY 1, 2, team_id;

The api_base field should be populated for embedding calls, and the model_group field should be populated for health check embedding requests.
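
The first verification query can be turned into a pass/fail check. The sketch below runs against an in-memory SQLite table purely for illustration: the four-column schema is a simplification of LiteLLM_SpendLogs, the three rows are made-up sample data, and `team-a` is a hypothetical team id. Against a real deployment the same WHERE clause would run on the production database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema: only the columns the check needs.
conn.execute(
    'CREATE TABLE "LiteLLM_SpendLogs" '
    "(model_group TEXT, model TEXT, api_base TEXT, team_id TEXT)"
)

# Made-up sample rows: one fixed, one still missing api_base, one health check.
rows = [
    ("text-embedding-3-large", "azure/text-embedding-3-large",
     "https://my-endpoint-1.openai.azure.com/", "team-a"),
    ("text-embedding-3-large", "azure/text-embedding-3-large", "", "team-a"),
    ("", "text-embedding-3-large", "", "litellm-internal-health-check"),
]
conn.executemany('INSERT INTO "LiteLLM_SpendLogs" VALUES (?, ?, ?, ?)', rows)

# Count embedding rows outside health checks that still lack an api_base.
missing = conn.execute(
    """
    SELECT COUNT(*) FROM "LiteLLM_SpendLogs"
    WHERE model_group = 'text-embedding-3-large'
      AND team_id != 'litellm-internal-health-check'
      AND (api_base IS NULL OR api_base = '')
    """
).fetchone()[0]

print(f"embedding rows without api_base: {missing}")
```

A count of zero on rows logged after the upgrade would indicate the fix is in effect; rows written before the fix keep their empty api_base, so filtering on a recent time window helps.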
