litellm - ✅(Solved) Fix [Bug]: gemini-embedding-2-preview does not return embeddings separately [2 pull requests, 3 comments, 2 participants]

litellm · 2026-03-20 12:33:55

GitHub stats

BerriAI/litellm#24209 • Fetched 2026-04-08 01:09:24

Author

ladrians

Participants

ladrians

mvrodrig

Timeline (top)

commented ×3, labeled ×3, cross-referenced ×2, subscribed ×2

Fix Action

Fixed

Fixed by PR: fix(gemini): return separate embeddings for multimodal inputs (https://github.com/BerriAI/litellm/pull/24337)
Fixed by PR: feat(gemini): support combined multimodal embeddings via nested input (https://github.com/BerriAI/litellm/pull/24341)

PR fix notes

PR #24337: fix(gemini): return separate embeddings for multimodal inputs

Repository: BerriAI/litellm
Author: Chesars
State: closed | merged: True
Link: https://github.com/BerriAI/litellm/pull/24337

Description (problem / solution / changelog)

Relevant issues

Fixes #24209

Pre-Submission checklist

I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

When passing multiple multimodal inputs (text + images, audio, etc.) to the Gemini embedding endpoint, LiteLLM returned 1 combined embedding instead of N separate embeddings. Multimodal inputs were incorrectly routed to the embedContent endpoint (which aggregates all parts into one vector) instead of keeping them as separate requests. Also fixes embedding response indices always being 0 instead of incrementing per input.
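The index fix described above can be pictured with a minimal sketch. This is not LiteLLM's actual transformation code; the helper name `build_embedding_response` is hypothetical and only illustrates the OpenAI-compatible shape, where N vectors become N entries whose `index` increments from 0.

```python
from typing import Dict, List


def build_embedding_response(model: str, vectors: List[List[float]]) -> Dict:
    """Hypothetical helper: wrap N vectors in an OpenAI-style embedding response."""
    return {
        "object": "list",
        "model": model,
        "data": [
            # enumerate() gives each input its own position instead of a constant 0
            {"object": "embedding", "index": i, "embedding": vec}
            for i, vec in enumerate(vectors)
        ],
    }


resp = build_embedding_response(
    "gemini-embedding-2-preview", [[0.1, 0.2], [0.3, 0.4]]
)
print([item["index"] for item in resp["data"]])  # [0, 1]
```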

Changed files

litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_handler.py (modified, +19/-4)
litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_transformation.py (modified, +67/-39)
tests/test_litellm/llms/vertex_ai/gemini_embeddings/__init__.py (added, +0/-0)
tests/test_litellm/llms/vertex_ai/gemini_embeddings/test_batch_embed_content_transformation.py (added, +231/-0)

PR #24341: feat(gemini): support combined multimodal embeddings via nested input

Repository: BerriAI/litellm
Author: Chesars
State: closed | merged: True
Link: https://github.com/BerriAI/litellm/pull/24341

Description (problem / solution / changelog)

Relevant issues

Related to #24209

Pre-Submission checklist

I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🆕 New Feature

Changes

Gemini's embedContent endpoint can combine multiple inputs (text + image) into a single embedding vector. Previously this was only possible accidentally through a bug. Now users can opt-in to combined embeddings by using nested lists in the input parameter:

# Separate (default, OpenAI-compatible): N inputs → N embeddings
input=["text", "image"]         → 2 embeddings

# Combined (new): nested list → 1 combined embedding
input=[["text", "image"]]       → 1 embedding

# Mixed
input=[["text", "image"], "x"]  → 2 embeddings (1 combined + 1 separate)

Flat list input continues to work exactly as before — each element produces its own embedding. Tested via SDK and proxy.
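The three input shapes above follow one rule: each top-level element yields one embedding, and a nested list is treated as a single combined element. A minimal sketch of that rule, assuming a hypothetical helper `group_inputs` (this is not the function used in the PR):

```python
from typing import List, Union

EmbeddingInput = Union[str, List[str]]


def group_inputs(inputs: List[EmbeddingInput]) -> List[List[str]]:
    """Hypothetical helper: return one group of parts per expected embedding."""
    groups: List[List[str]] = []
    for item in inputs:
        if isinstance(item, list):
            groups.append(list(item))  # nested list -> one combined embedding
        else:
            groups.append([item])      # plain element -> its own embedding
    return groups


print(len(group_inputs(["text", "image"])))         # 2
print(len(group_inputs([["text", "image"]])))       # 1
print(len(group_inputs([["text", "image"], "x"])))  # 2
```

The number of groups equals the number of embeddings the caller should expect back, which keeps the flat-list case OpenAI-compatible.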

Changed files

docs/my-website/docs/embedding/supported_embedding.md (modified, +50/-0)
litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_transformation.py (modified, +59/-23)
litellm/types/llms/openai.py (modified, +1/-1)
tests/test_litellm/llms/vertex_ai/gemini_embeddings/test_batch_embed_content_transformation.py (modified, +63/-0)

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

Using the standard endpoint to get gemini-embedding-2-preview embeddings returns a single element instead of a list of embeddings.

Steps to Reproduce

Using this curl command:

curl -X POST http://localhost:4000/embeddings \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "dimensions":2,
    "model": "gemini-embedding-2-preview",
    "input": [
      "The food was delicious and the waiter...",
      "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII"
    ]
  }'

it returns the following:

{"model":"gemini-embedding-2-preview","data":[{"embedding":[0.99655586,-0.08292416],"index":0,"object":"embedding"}],"object":"list","usage":{"completion_tokens":0,"prompt_tokens":0,"total_tokens":0,"completion_tokens_details":null,"prompt_tokens_details":null}}

Notice that it returns only one element in the embeddings list.

Relevant log output

The important part is that the embeddings list should match the list of input elements (2 in this case). It is true that this new embedding model can aggregate a list into a single embedding, but that should be set explicitly; by default it should work as standard batched embeddings.
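The expectation above can be checked mechanically. The sketch below parses the response from the report (with the usage block left out for brevity) and compares the number of returned embeddings with the number of inputs; `check_embedding_count` is a hypothetical helper, not part of LiteLLM.

```python
import json

# Response body copied from the report, usage block omitted.
raw = (
    '{"model":"gemini-embedding-2-preview",'
    '"data":[{"embedding":[0.99655586,-0.08292416],"index":0,"object":"embedding"}],'
    '"object":"list"}'
)


def check_embedding_count(response: dict, num_inputs: int) -> bool:
    """Return True when there is one embedding per input, indexed 0..N-1."""
    data = response.get("data", [])
    indices = [item["index"] for item in data]
    return len(data) == num_inputs and indices == list(range(num_inputs))


reported = json.loads(raw)
print(check_embedding_count(reported, 2))  # False: one embedding for two inputs
```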

What part of LiteLLM is this about?

SDK (litellm Python package)

What LiteLLM version are you on ?

v1.82.3

Twitter / LinkedIn details

No response

extent analysis

Fix Plan

The request in the issue is already correct and needs no extra parameter. The bug was in LiteLLM itself: multimodal inputs were routed to Gemini's embedContent endpoint, which aggregates all parts into one vector, and every response index was 0. PR #24337 fixes both, so a flat input list returns one embedding per element.

Here are the steps:

Upgrade LiteLLM from v1.82.3 to a release that includes PR #24337.
Re-send the original request unchanged, through the proxy or through the SDK.

Example Code

The curl command is the same as in the report:

curl -X POST http://localhost:4000/embeddings \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "dimensions":2,
    "model": "gemini-embedding-2-preview",
    "input": [
      "The food was delicious and the waiter...",
      "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII"
    ]
  }'

And the equivalent call with the LiteLLM Python SDK would be the following. The gemini/ provider prefix and an API key supplied through the environment are assumptions about your setup:

import litellm

# Flat list: each element should produce its own embedding
# (the default behaviour once PR #24337 is included).
response = litellm.embedding(
    model="gemini/gemini-embedding-2-preview",  # provider prefix is an assumption
    input=[
        "The food was delicious and the waiter...",
        "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII",
    ],
    dimensions=2,
)

# Print the response and the number of embeddings (2 are expected here)
print(response)
print(len(response.data))

Verification

To verify that the fix worked, check the response from the endpoint. It should now return a list of embeddings, one for each input element: the data field should contain one object per input, each with its own embedding field and an index that increments from 0.

For example (the second vector is a placeholder, and the usage block is omitted):

{
  "model": "gemini-embedding-2-preview",
  "data": [
    {
      "embedding": [0.99655586, -0.08292416],
      "index": 0,
      "object": "embedding"
    },
    {
      "embedding": [0.12345678, 0.90123456],
      "index": 1,
      "object": "embedding"
    }
  ],
  "object": "list"
}
