litellm - ✅(Solved) Fix [Bug]: HuggingFace embedding call does not forward extra_headers (e.g. X-HF-Bill-To) [1 pull requests, 1 participants]

litellm · 2026-03-13 00:51:10

GitHub stats

BerriAI/litellm#23502 • Fetched 2026-04-08 00:44:00

Author

zhongheng-cheng-softbank-co-jp

Participants

zhongheng-cheng-softbank-co-jp

Timeline (top)

referenced ×9, cross-referenced ×3, labeled ×3, closed ×1

Error Message

No error is raised. The headers are silently ignored because the handler defaults headers={} and merges with auth headers only.

Root Cause

The X-HF-Bill-To header never reaches the HuggingFace API because headers is not passed from main.py:5096 to huggingface_embed.embedding().

Fix Action

Fixed

Fixed by PR: fix(huggingface): pass headers to HF embedding call so extra_headers are forwarded (https://github.com/BerriAI/litellm/pull/23504)

PR fix notes

PR #23504: fix(huggingface): pass headers to HF embedding call so extra_headers are forwarded

Repository: BerriAI/litellm
Author: zhongheng-cheng-softbank-co-jp
State: closed | merged: False
Link: https://github.com/BerriAI/litellm/pull/23504

Description (problem / solution / changelog)

Relevant issues

Fixes #23502

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement (see details)
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix ✅ Test

Changes

The embedding() function in litellm/main.py accepts a headers parameter containing user-specified extra_headers. All other provider branches (openrouter, vercel_ai_gateway, bedrock, etc.) forward this parameter to their respective handlers, but the huggingface branch does not.

This causes headers like X-HF-Bill-To (used to bill HuggingFace Inference API usage to an organization) to be silently dropped.

The HuggingFace embedding handler (litellm/llms/huggingface/embedding/handler.py) already accepts a headers parameter and merges it with auth headers via validate_environment() — it simply never receives the caller's headers.
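
The dropped-header behaviour can be reproduced in miniature. The sketch below is not LiteLLM's code: handler_embedding, dispatch_buggy, dispatch_fixed, and the Authorization value are hypothetical stand-ins that only mirror the shape described above, namely a handler that defaults headers to an empty dict and merges it with auth headers, and a dispatcher that either omits or forwards the argument. The merge order (auth headers win on conflict) is also an assumption of this sketch.

```python
from typing import Dict, Optional


def handler_embedding(api_key: str, headers: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Hypothetical stand-in for a provider handler: returns the headers it would send."""
    caller_headers = headers or {}  # empty when the dispatcher passes nothing
    auth_headers = {"Authorization": f"Bearer {api_key}"}
    # Merge caller headers with auth headers; auth wins on conflict in this sketch.
    return {**caller_headers, **auth_headers}


def dispatch_buggy(api_key: str, headers: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    # Bug shape: the dispatcher holds `headers` but never forwards it.
    return handler_embedding(api_key=api_key)


def dispatch_fixed(api_key: str, headers: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    # Fix shape: forward the caller's headers to the handler.
    return handler_embedding(api_key=api_key, headers=headers)


extra = {"X-HF-Bill-To": "my-org"}
buggy_sent = dispatch_buggy("hf_dummy", headers=extra)
fixed_sent = dispatch_fixed("hf_dummy", headers=extra)

print("X-HF-Bill-To" in buggy_sent)  # False: the header was silently dropped
print("X-HF-Bill-To" in fixed_sent)  # True: the header reaches the handler
```

Because the handler falls back to an empty dict, the buggy path raises nothing, which matches the silent failure reported in the issue.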

Changed files

litellm/main.py (modified, +1/-0)
tests/test_litellm/llms/huggingface/embedding/test_huggingface_embedding_handler.py (modified, +15/-0)

Code Example

import litellm

# extra_headers should be forwarded to HuggingFace API
response = litellm.embedding(
    model="huggingface/BAAI/bge-small-en-v1.5",
    input=["hello world"],
    extra_headers={"X-HF-Bill-To": "my-org"},  # This header is silently dropped
)

---

No error is raised — the headers are silently ignored because the handler defaults
headers={} and merges with auth headers only.

RAW_BUFFER

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When calling litellm.embedding() with custom_llm_provider="huggingface", the headers parameter from embedding() in litellm/main.py is not passed to huggingface_embed.embedding(). This means any extra_headers set by the user (e.g. X-HF-Bill-To for HuggingFace organization billing) are silently dropped.

Other providers in the same function (e.g. openrouter, vercel_ai_gateway, bedrock) already pass headers correctly. Only the huggingface branch is missing it.

Steps to Reproduce

import litellm

# extra_headers should be forwarded to HuggingFace API
response = litellm.embedding(
    model="huggingface/BAAI/bge-small-en-v1.5",
    input=["hello world"],
    extra_headers={"X-HF-Bill-To": "my-org"},  # This header is silently dropped
)

The X-HF-Bill-To header never reaches the HuggingFace API because headers is not passed from main.py:5096 to huggingface_embed.embedding().

Relevant log output

No error is raised — the headers are silently ignored because the handler defaults
headers={} and merges with auth headers only.

What part of LiteLLM is this about?

SDK (litellm Python package)

What LiteLLM version are you on ?

v1.82.1

Twitter / LinkedIn details

No response

extent analysis

Fix Plan

To fix the issue, forward the headers parameter that embedding() in litellm/main.py already receives to huggingface_embed.embedding().

Here are the steps:

In litellm/main.py, add headers=headers to the huggingface_embed.embedding() call in the huggingface branch of embedding().
Leave the handler (litellm/llms/huggingface/embedding/handler.py) unchanged. Per the PR description, it already accepts a headers parameter and merges it with auth headers via validate_environment().

Code Changes

# In litellm/main.py, inside embedding() (schematic; unrelated arguments elided as ...)
if custom_llm_provider == "huggingface":
    return huggingface_embed.embedding(
        ...,              # existing arguments stay as they are
        headers=headers,  # the one-line fix: forward the caller's headers
    )

# In litellm/llms/huggingface/embedding/handler.py: no change needed.
# Per the PR description, the handler already accepts a headers parameter
# and merges it with the auth headers via validate_environment().

Verification

To verify the fix, you can use the following code:

import litellm

response = litellm.embedding(
    model="huggingface/BAAI/bge-small-en-v1.5",
    input=["hello world"],
    extra_headers={"X-HF-Bill-To": "my-org"},
)

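# X-HF-Bill-To is a request header, so it cannot be read back from `response`;
# the embedding response does not expose the outgoing HTTP request.
print(len(response.data))  # confirms only that the call itself succeeded

If the fix is successful, the X-HF-Bill-To header is present in the outgoing request to the HuggingFace API. Confirm this by inspecting the request itself, for example by patching the HTTP client in a unit test or by routing the call through a logging proxy.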
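
One way to check forwarding without any network call is to assert on the arguments the handler receives. The sketch below uses only unittest.mock from the standard library. embedding_dispatch is a hypothetical stand-in for the huggingface branch of embedding(), not LiteLLM's actual function, and it is not the test added in the PR.

```python
from typing import Dict, List, Optional
from unittest.mock import MagicMock


def embedding_dispatch(handler, model: str, input: List[str],
                       headers: Optional[Dict[str, str]] = None):
    """Hypothetical dispatcher with the fix applied: forwards headers to the handler."""
    return handler.embedding(model=model, input=input, headers=headers)


# The mock records every call, so no HuggingFace request is made.
mock_handler = MagicMock()
embedding_dispatch(
    mock_handler,
    model="huggingface/BAAI/bge-small-en-v1.5",
    input=["hello world"],
    headers={"X-HF-Bill-To": "my-org"},
)

# Inspect what the handler was called with, not what it returned.
mock_handler.embedding.assert_called_once()
forwarded = mock_handler.embedding.call_args.kwargs["headers"]
print(forwarded)
```

Asserting on the call arguments targets the exact failure mode in this issue: a dispatcher that drops headers would still return a valid response, so only the recorded call reveals the omission.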

Extra Tips

Make sure to test the fix thoroughly to ensure that it works as expected and does not introduce any new issues. Additionally, consider adding logging or monitoring to detect any similar issues in the future.
