litellm - ✅(Solved) Fix [Bug]: Gemini embedding batch request sends `max_tokens` causing 400 BadRequestError [1 pull requests, 1 participants]

thiagomendonca-eu · 2026-03-21T14:14:34Z



GitHub stats

BerriAI/litellm#24293•Fetched 2026-04-08 01:13:31


Author

thiagomendonca-eu

Participants

thiagomendonca-eu

Timeline (top)

subscribed ×2, cross-referenced ×1, labeled ×1, referenced ×1

Error Message

litellm.BadRequestError: GeminiException BadRequestError - {
  "error": {
    "code": 400,
    "message": "Invalid JSON payload received. Unknown name \"max_tokens\" at 'requests[0]': Cannot find field.",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "field": "requests[0]",
            "description": "Invalid JSON payload received. Unknown name \"max_tokens\" at 'requests[0]': Cannot find field."
          }
        ]
      }
    ]
  }
}
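
When triaging this kind of 400, it can help to pull the offending field names out of the error body programmatically. The following is a minimal sketch, assuming you have already stripped the `litellm.BadRequestError: GeminiException BadRequestError - ` prefix so that only the JSON remains; `unknown_fields` is a hypothetical helper, not part of LiteLLM.

```python
import json
import re

# Hypothetical helper (not a LiteLLM API): extracts the names that a Google API
# error body reports as unknown fields.
UNKNOWN_NAME = re.compile(r'Unknown name "([^"]+)"')


def unknown_fields(error_body: str) -> list[str]:
    """Return the reported unknown field names, in order, without duplicates."""
    payload = json.loads(error_body)
    names: list[str] = []
    for detail in payload.get("error", {}).get("details", []):
        for violation in detail.get("fieldViolations", []):
            match = UNKNOWN_NAME.search(violation.get("description", ""))
            if match and match.group(1) not in names:
                names.append(match.group(1))
    return names


# A trimmed copy of the error body shown above.
sample = json.dumps({
    "error": {
        "code": 400,
        "status": "INVALID_ARGUMENT",
        "details": [{
            "@type": "type.googleapis.com/google.rpc.BadRequest",
            "fieldViolations": [{
                "field": "requests[0]",
                "description": "Invalid JSON payload received. Unknown name "
                               "\"max_tokens\" at 'requests[0]': Cannot find field.",
            }],
        }],
    }
})
print(unknown_fields(sample))
```

For the body above, the helper reports `max_tokens` as the only rejected field.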

Fix Action

Workaround

Pin the LiteLLM image to v1.81.9:

image: ghcr.io/berriai/litellm:main-v1.81.9-stable

PR fix notes

PR #24370: fix(gemini): filter params from embedding requests

Repository: BerriAI/litellm
Author: Chesars
State: closed | merged: True
Link: https://github.com/BerriAI/litellm/pull/24370

Description (problem / solution / changelog)

Relevant issues

Fixes https://github.com/berriAI/litellm/issues/24293

Pre-Submission checklist

[x] I have added testing in the tests/test_litellm/ directory (adding at least 1 test is a hard requirement)
[x] My PR passes all unit tests on make test-unit
[x] My PR's scope is as isolated as possible, it only solves 1 specific problem
[ ] I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

The Gemini batch embedding transformation spreads all optional_params into the request body via **gemini_params. Params like max_tokens (injected by add_provider_specific_params_to_optional_params for non-OpenAI providers) reach the Gemini API and cause a 400 BadRequestError.

drop_params: true doesn't prevent this because the param is re-injected after the drop_params check runs.

Extract _filter_embed_params() that maps dimensions/task_type and keeps only the fields Gemini embeddings accept (outputDimensionality, taskType, title). Applied to both transform_openai_input_gemini_content and transform_openai_input_gemini_embed_content.
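
The filtering described above can be sketched as an allow-list applied after name mapping. This is a rough illustration only: the actual `_filter_embed_params()` in the PR may have a different signature and behavior, and the pairing of `dimensions` with `outputDimensionality` and `task_type` with `taskType` is an assumption inferred from the description. The function and constant names below are hypothetical.

```python
# Sketch only, not the code merged in the PR.
# Assumption: these are the only optional fields the embedding endpoints accept.
SUPPORTED_EMBED_FIELDS = {"outputDimensionality", "taskType", "title"}

# Assumption: OpenAI-style names map to Gemini names like this.
OPENAI_TO_GEMINI = {"dimensions": "outputDimensionality", "task_type": "taskType"}


def filter_embed_params(optional_params: dict) -> dict:
    """Rename OpenAI-style params, then drop everything not on the allow-list."""
    mapped = {OPENAI_TO_GEMINI.get(k, k): v for k, v in optional_params.items()}
    return {k: v for k, v in mapped.items() if k in SUPPORTED_EMBED_FIELDS}


params = {
    "dimensions": 256,
    "task_type": "RETRIEVAL_DOCUMENT",
    "max_tokens": 1024,   # injected upstream, not valid for embeddings
    "temperature": 0.2,   # likewise unsupported
}
filtered = filter_embed_params(params)

# Only the filtered params are spread into the (simplified) request body.
request = {"content": {"parts": [{"text": "hello"}]}, **filtered}
print(request)
```

An allow-list is the safer choice here than a deny-list of known-bad names, because any parameter injected upstream in the future is dropped by default instead of reaching the API.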

Tests added

test_filter_embed_params_drops_unsupported — verifies max_tokens/temperature are filtered
test_filter_embed_params_keeps_supported — verifies dimensions/task_type/title pass through
test_batch_embed_content_drops_max_tokens — integration test for batchEmbedContents path
test_embed_content_drops_max_tokens — integration test for embedContent path

Changed files

litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_transformation.py (modified, +15/-10)
tests/litellm/llms/vertex_ai/test_gemini_batch_embeddings.py (modified, +48/-0)

Original issue report (BerriAI/litellm#24293)

What happened?

When calling Gemini embeddings through the LiteLLM proxy, the batch embedding handler sends a max_tokens parameter in the request payload. The Gemini API does not accept max_tokens for embedding requests, returning a 400 BadRequestError.

This is a regression — v1.81.9 works correctly, but the latest main-stable (v1.82.3 pulled on 2026-03-21) sends the invalid parameter.

Error

litellm.BadRequestError: GeminiException BadRequestError - {
  "error": {
    "code": 400,
    "message": "Invalid JSON payload received. Unknown name \"max_tokens\" at 'requests[0]': Cannot find field.",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "field": "requests[0]",
            "description": "Invalid JSON payload received. Unknown name \"max_tokens\" at 'requests[0]': Cannot find field."
          }
        ]
      }
    ]
  }
}

Stack Trace

File "/usr/lib/python3.13/site-packages/litellm/main.py", line 4585, in aembedding
    response = await init_response
File "/usr/lib/python3.13/site-packages/litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_handler.py", line 322, in async_batch_embeddings
    response = await async_handler.post(...)
File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 513, in post
    raise e
File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 469, in post
    response.raise_for_status()

Relevant Code

The issue is in litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_handler.py. The max_tokens parameter is being included in the batch embedding request payload sent to the Gemini API, but this field is not part of the Gemini Batch Embed Content API schema.

How to Reproduce

1. Configure LiteLLM proxy with a Gemini embedding model (e.g., gemini/text-embedding-004)
2. Send an embedding request through the proxy
3. The request fails with the above 400 BadRequestError
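
The reproduction steps can be sketched with the standard library alone. Everything environment-specific here is an assumption: that the proxy listens on `localhost:4000`, that it exposes the OpenAI-compatible `/v1/embeddings` route, that the placeholder key is accepted, and that the model is registered under the name used in the issue. Adjust these to your deployment.

```python
import json
import urllib.error
import urllib.request

# Assumptions (not from the issue): local proxy address, route, and API key.
PROXY_URL = "http://localhost:4000/v1/embeddings"
API_KEY = "sk-placeholder"


def build_embedding_request(
    texts: list[str], model: str = "gemini/text-embedding-004"
) -> urllib.request.Request:
    """Build an OpenAI-style embedding request aimed at the proxy."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        PROXY_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> tuple[int, str]:
    """Send the request and return (status, body), including for HTTP errors."""
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8")
```

Calling `send(build_embedding_request(["hello world"]))` against an affected build should surface the error shown above, while a build that includes the fix should return embeddings.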

LiteLLM Config

litellm_settings:
  drop_params: true   # Note: drop_params is enabled but doesn't prevent this

Note: Even with drop_params: true in litellm_settings, the error still occurs — suggesting the parameter is added after the drop_params logic runs, or drop_params doesn't apply to the embedding batch handler.
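
The ordering problem suspected in this note, and confirmed in the PR description, can be illustrated with a toy pipeline. The two functions below are hypothetical stand-ins that only mimic the order of operations; they are not LiteLLM's functions and do not reflect its internals.

```python
# Toy model of the ordering issue. All names are hypothetical.
SUPPORTED = {"dimensions"}


def drop_unsupported(params: dict) -> dict:
    """Stand-in for the drop_params check: keep only supported params."""
    return {k: v for k, v in params.items() if k in SUPPORTED}


def reinject_provider_params(params: dict, passthrough: dict) -> dict:
    """Stand-in for a later step that adds provider-specific params back."""
    return {**params, **passthrough}


user_params = {"dimensions": 256, "max_tokens": 1024}

after_drop = drop_unsupported(user_params)
final = reinject_provider_params(after_drop, {"max_tokens": 1024})

print(after_drop)  # max_tokens has been dropped at this point
print(final)       # max_tokens is back, because re-injection ran after the drop
```

Because the drop happens first, any step that runs afterwards can reintroduce the parameter, which is why enabling `drop_params` does not help in this case.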

Expected Behavior

The batch embedding request should NOT include max_tokens in the payload sent to the Gemini API. Embedding endpoints don't use max_tokens.

Environment

LiteLLM version (broken): latest main-stable (pulled 2026-03-21, Python 3.13)
LiteLLM version (working): v1.81.9
Docker image: ghcr.io/berriai/litellm:main-stable
Embedding model: gemini/text-embedding-004
Provider: Google Gemini (not Vertex AI)

Workaround

Pin the LiteLLM image to v1.81.9:

image: ghcr.io/berriai/litellm:main-v1.81.9-stable

extent analysis

Fix Plan

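To fix the issue, max_tokens (and any other parameter the embedding endpoints do not define) has to be kept out of the batch embedding request payload sent to the Gemini API. The error surfaces in batch_embed_content_handler.py, but per PR #24370 the request body is built by the transformation in batch_embed_content_transformation.py, so the change belongs there.

Here are the steps:

Locate the batch_embed_content_transformation.py file in litellm/llms/vertex_ai/gemini_embeddings/.
Find transform_openai_input_gemini_content and transform_openai_input_gemini_embed_content, where optional_params are spread into each request body via **gemini_params.
Filter the params before spreading them so that only the fields Gemini embeddings accept (outputDimensionality, taskType, title) remain.

Example code snippet (illustrative; the request shape is simplified and the variable names are placeholders, not LiteLLM's code):

text = "hello world"
optional_params = {"max_tokens": 1024, "outputDimensionality": 256}

# Before: every optional param is spread into the request body,
# so max_tokens reaches the Gemini API.
request = {"content": {"parts": [{"text": text}]}, **optional_params}

# After: keep only the fields Gemini embeddings accept.
allowed = {"outputDimensionality", "taskType", "title"}
request = {
    "content": {"parts": [{"text": text}]},
    **{k: v for k, v in optional_params.items() if k in allowed},
}

Adjusting drop_params alone is not enough: per the PR, max_tokens is re-injected by add_provider_specific_params_to_optional_params after the drop_params check runs, so the filter has to sit at the transformation step.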

Verification

To verify that the fix worked, you can:

Send an embedding request through the LiteLLM proxy.
Check the request payload sent to the Gemini API to ensure it does not include the max_tokens parameter.
Verify that the request is successful and returns the expected response.
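
The payload check in the second step can be automated once you have captured the outgoing request body, for example from debug logs. The sketch below assumes each entry under `requests` may carry only the keys listed in `ALLOWED_REQUEST_KEYS`; that set and the helper name are assumptions for illustration, not taken from LiteLLM.

```python
# Assumption: these are the only keys a single embedding request may carry.
ALLOWED_REQUEST_KEYS = {"model", "content", "taskType", "title", "outputDimensionality"}


def unexpected_keys(payload: dict) -> dict[int, set[str]]:
    """Map each request index to the keys that are not on the allow-list."""
    problems: dict[int, set[str]] = {}
    for index, request in enumerate(payload.get("requests", [])):
        extra = set(request) - ALLOWED_REQUEST_KEYS
        if extra:
            problems[index] = extra
    return problems


# Example payloads: one as sent by an affected build, one after the fix.
broken = {
    "requests": [{
        "model": "models/text-embedding-004",
        "content": {"parts": [{"text": "hi"}]},
        "max_tokens": 1024,
    }]
}
fixed = {
    "requests": [{
        "model": "models/text-embedding-004",
        "content": {"parts": [{"text": "hi"}]},
    }]
}

print(unexpected_keys(broken))
print(unexpected_keys(fixed))
```

An empty result means no unexpected keys were found in any request of the batch.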

Extra Tips

Run the existing embedding tests after the change to confirm that supported parameters (dimensions, task_type, title) still pass through.
Add a regression test asserting that max_tokens is absent from the payload for embedding requests; PR #24370 adds test_batch_embed_content_drops_max_tokens and test_embed_content_drops_max_tokens for the two request paths.
If you cannot upgrade to a release containing the fix, pin the image to v1.81.9 as described in the workaround.
