litellm - ✅ (Solved) Fix [Bug]: Gemini embedding batch request sends `max_tokens` causing 400 BadRequestError [1 pull request, 1 participant]

GitHub stats
BerriAI/litellm#24293 · Fetched 2026-04-08 01:13:31
Comments: 0
Participants: 1
Timeline: 5
Reactions: 0
Timeline (top): subscribed ×2, cross-referenced ×1, labeled ×1, referenced ×1

Error Message

litellm.BadRequestError: GeminiException BadRequestError - {
  "error": {
    "code": 400,
    "message": "Invalid JSON payload received. Unknown name \"max_tokens\" at 'requests[0]': Cannot find field.",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "field": "requests[0]",
            "description": "Invalid JSON payload received. Unknown name \"max_tokens\" at 'requests[0]': Cannot find field."
          }
        ]
      }
    ]
  }
}

Fix Action

Workaround

Pin the LiteLLM image to v1.81.9:

image: ghcr.io/berriai/litellm:main-v1.81.9-stable

PR fix notes

PR #24370: fix(gemini): filter params from embedding requests

Description (problem / solution / changelog)

Relevant issues

Fixes https://github.com/berriAI/litellm/issues/24293

Pre-Submission checklist

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

The Gemini batch embedding transformation spreads all optional_params into the request body via **gemini_params. Params like max_tokens (injected by add_provider_specific_params_to_optional_params for non-OpenAI providers) reach the Gemini API and cause a 400 BadRequestError.

drop_params: true doesn't prevent this because the param is re-injected after the drop_params check runs.
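
The ordering problem described above can be illustrated with a toy model. This is only a sketch: `drop_unsupported` and `inject_provider_params` are hypothetical stand-ins, not LiteLLM functions, and the parameter values are placeholders.

```python
from typing import Any, Dict, Set


def drop_unsupported(params: Dict[str, Any], supported: Set[str]) -> Dict[str, Any]:
    """Hypothetical stand-in for a drop_params-style check."""
    return {k: v for k, v in params.items() if k in supported}


def inject_provider_params(params: Dict[str, Any], extra: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for a later step that merges extra params back in."""
    return {**params, **extra}


supported = {"dimensions"}
user_params = {"dimensions": 256, "max_tokens": 100}

# Step 1: the drop check removes max_tokens.
after_drop = drop_unsupported(user_params, supported)

# Step 2: a later merge adds it back, so the earlier check has no lasting effect.
final_params = inject_provider_params(after_drop, {"max_tokens": 100})
```

Because the merge runs after the check, the only reliable place to filter is the point where the request body is built.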

Extract _filter_embed_params() that maps dimensions/task_type and keeps only the fields Gemini embeddings accept (outputDimensionality, taskType, title). Applied to both transform_openai_input_gemini_content and transform_openai_input_gemini_embed_content.
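
A minimal sketch of the allow-list approach described above. It is not the code from the PR: the function names, the rename table, and the request shape are assumptions based on the PR description and the public batchEmbedContents request format.

```python
from typing import Any, Dict, List

# Assumption: the three fields the PR description says Gemini embeddings accept.
ALLOWED_GEMINI_EMBED_FIELDS = {"outputDimensionality", "taskType", "title"}

# Assumption: OpenAI-style names mapped to Gemini names, per the PR description.
PARAM_RENAMES = {"dimensions": "outputDimensionality", "task_type": "taskType"}


def filter_embed_params(optional_params: Dict[str, Any]) -> Dict[str, Any]:
    """Rename known params, then keep only the allow-listed fields."""
    renamed = {PARAM_RENAMES.get(k, k): v for k, v in optional_params.items()}
    return {k: v for k, v in renamed.items() if k in ALLOWED_GEMINI_EMBED_FIELDS}


def build_batch_request(
    model: str, texts: List[str], optional_params: Dict[str, Any]
) -> Dict[str, Any]:
    """Build a batchEmbedContents-style body with filtered params on each request."""
    params = filter_embed_params(optional_params)
    return {
        "requests": [
            {"model": model, "content": {"parts": [{"text": text}]}, **params}
            for text in texts
        ]
    }
```

An allow-list fails safe: any parameter introduced upstream later is dropped by default instead of reaching the API.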

Tests added

  • test_filter_embed_params_drops_unsupported — verifies max_tokens/temperature are filtered
  • test_filter_embed_params_keeps_supported — verifies dimensions/task_type/title pass through
  • test_batch_embed_content_drops_max_tokens — integration test for batchEmbedContents path
  • test_embed_content_drops_max_tokens — integration test for embedContent path

Changed files

  • litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_transformation.py (modified, +15/-10)
  • tests/litellm/llms/vertex_ai/test_gemini_batch_embeddings.py (modified, +48/-0)

Original issue report

What happened?

When calling Gemini embeddings through the LiteLLM proxy, the batch embedding handler sends a max_tokens parameter in the request payload. The Gemini API does not accept max_tokens for embedding requests, returning a 400 BadRequestError.

This is a regression — v1.81.9 works correctly, but the latest main-stable (v1.82.3 pulled on 2026-03-21) sends the invalid parameter.

Error

litellm.BadRequestError: GeminiException BadRequestError - {
  "error": {
    "code": 400,
    "message": "Invalid JSON payload received. Unknown name \"max_tokens\" at 'requests[0]': Cannot find field.",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "field": "requests[0]",
            "description": "Invalid JSON payload received. Unknown name \"max_tokens\" at 'requests[0]': Cannot find field."
          }
        ]
      }
    ]
  }
}

Stack Trace

File "/usr/lib/python3.13/site-packages/litellm/main.py", line 4585, in aembedding
    response = await init_response
File "/usr/lib/python3.13/site-packages/litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_handler.py", line 322, in async_batch_embeddings
    response = await async_handler.post(...)
File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 513, in post
    raise e
File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 469, in post
    response.raise_for_status()

Relevant Code

The issue is in litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_handler.py. The max_tokens parameter is being included in the batch embedding request payload sent to the Gemini API, but this field is not part of the Gemini Batch Embed Content API schema.

How to Reproduce

  1. Configure LiteLLM proxy with a Gemini embedding model (e.g., gemini/text-embedding-004)
  2. Send an embedding request through the proxy
  3. The request fails with the above 400 BadRequestError
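
The reproduction steps can be sketched as a request against the proxy's OpenAI-compatible embeddings endpoint. The base URL, API key, and model name below are placeholders and assumptions; the model name must match whatever the proxy config exposes. The snippet only builds the request, it does not send it.

```python
import json
import urllib.request


def build_embedding_request(
    base_url: str, api_key: str, model: str, text: str
) -> urllib.request.Request:
    """Build (but do not send) a POST to the proxy's /v1/embeddings endpoint."""
    body = json.dumps({"model": model, "input": [text]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/embeddings",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder values; adjust to the local proxy setup.
req = build_embedding_request(
    "http://localhost:4000", "sk-placeholder", "gemini/text-embedding-004", "hello world"
)
# Sending it with urllib.request.urlopen(req) on an affected version is what
# the issue reports as failing with the 400 BadRequestError.
```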

LiteLLM Config

litellm_settings:
  drop_params: true   # Note: drop_params is enabled but doesn't prevent this

Note: Even with drop_params: true in litellm_settings, the error still occurs — suggesting the parameter is added after the drop_params logic runs, or drop_params doesn't apply to the embedding batch handler.

Expected Behavior

The batch embedding request should NOT include max_tokens in the payload sent to the Gemini API. Embedding endpoints don't use max_tokens.

Environment

  • LiteLLM version (broken): latest main-stable (pulled 2026-03-21, Python 3.13)
  • LiteLLM version (working): v1.81.9
  • Docker image: ghcr.io/berriai/litellm:main-stable
  • Embedding model: gemini/text-embedding-004
  • Provider: Google Gemini (not Vertex AI)

Workaround

Pin the LiteLLM image to v1.81.9:

image: ghcr.io/berriai/litellm:main-v1.81.9-stable

Extended analysis

Fix Plan

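To fix the issue, max_tokens (and any other parameter the Gemini embedding API does not accept) must be kept out of the batch embedding request payload sent to the Gemini API.

Here are the steps, following PR #24370:

  • Locate batch_embed_content_transformation.py in litellm/llms/vertex_ai/gemini_embeddings/. This is the file the PR modifies; per the PR description, the parameters are spread into the request here, while the handler named in the stack trace is where the request is posted.
  • Find transform_openai_input_gemini_content and transform_openai_input_gemini_embed_content, where optional_params are spread into each request via **gemini_params.
  • Filter the params before spreading them, so that only the fields Gemini embeddings accept (outputDimensionality, taskType, title) are kept.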

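Example payloads (illustrative only; the values are placeholders and the request shape follows the public batchEmbedContents format, not LiteLLM's source):

input_text = "hello world"  # placeholder input
max_tokens = 256            # placeholder value

# Before: every optional param is spread into each request
payload_before = {
    "requests": [
        {
            "model": "models/text-embedding-004",
            "content": {"parts": [{"text": input_text}]},
            "max_tokens": max_tokens,  # rejected by Gemini as an unknown field
        }
    ]
}

# After: only fields the embedding API accepts remain
payload_after = {
    "requests": [
        {
            "model": "models/text-embedding-004",
            "content": {"parts": [{"text": input_text}]},
        }
    ]
}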

Relying on drop_params is not an alternative here: per the PR notes, max_tokens is re-injected after the drop_params check runs, so the filtering has to happen where the request body is built.

Verification

To verify that the fix worked, you can:

  • Send an embedding request through the LiteLLM proxy.
  • Check the request payload sent to the Gemini API to ensure it does not include the max_tokens parameter.
  • Verify that the request is successful and returns the expected response.
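
The payload check in the second step can be sketched as a small helper. The allowed-field set is an assumption (the two structural fields plus the three optional fields named in the PR description), and `unknown_request_fields` is a hypothetical name, not part of LiteLLM.

```python
from typing import Any, Dict, FrozenSet, List, Tuple

# Assumption: structural fields plus the optional fields named in the PR description.
ALLOWED_REQUEST_FIELDS: FrozenSet[str] = frozenset(
    {"model", "content", "taskType", "title", "outputDimensionality"}
)


def unknown_request_fields(
    payload: Dict[str, Any], allowed: FrozenSet[str] = ALLOWED_REQUEST_FIELDS
) -> List[Tuple[int, str]]:
    """Return (request index, field name) pairs for fields outside the allow-list."""
    return [
        (index, key)
        for index, request in enumerate(payload.get("requests", []))
        for key in request
        if key not in allowed
    ]


captured = {
    "requests": [
        {"model": "models/text-embedding-004", "content": {"parts": [{"text": "hi"}]}},
        {"model": "models/text-embedding-004", "content": {"parts": [{"text": "yo"}]}, "max_tokens": 1},
    ]
}
violations = unknown_request_fields(captured)
```

An empty result for a captured payload indicates that no unexpected fields such as max_tokens are being sent.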

Extra Tips

  • Test both the batchEmbedContents and embedContent paths, as the tests added in the PR do.
  • Add a regression test asserting that max_tokens (and other unsupported params such as temperature) never appear in the payload for embedding requests.
  • Until a release containing the fix is deployed, the pinned v1.81.9 image remains the workaround.
