litellm - ✅(Solved) Fix [Bug]: gemini-embedding-2-preview does not return embeddings separately [2 pull requests, 3 comments, 2 participants]

GitHub stats
BerriAI/litellm#24209 · Fetched 2026-04-08 01:09:24
Comments: 3 · Participants: 2 · Timeline: 13 · Reactions: 0
Timeline (top): commented ×3, labeled ×3, cross-referenced ×2, subscribed ×2

Fix Action

Fixed

PR fix notes

PR #24337: fix(gemini): return separate embeddings for multimodal inputs

Description (problem / solution / changelog)

Relevant issues

Fixes #24209

Pre-Submission checklist

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

When passing multiple multimodal inputs (text + images, audio, etc.) to the Gemini embedding endpoint, LiteLLM returned 1 combined embedding instead of N separate embeddings. Multimodal inputs were incorrectly routed to the embedContent endpoint (which aggregates all parts into one vector) instead of keeping them as separate requests. Also fixes embedding response indices always being 0 instead of incrementing per input.
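
The behavior described above can be sketched as follows. This is a minimal illustration, not LiteLLM's actual implementation: the function names `build_batch_request` and `to_openai_data` are hypothetical, and the request field names (`requests`, `content`, `parts`, `inline_data`) are assumptions about the shape of a Gemini batch embedding request. The point is only that each input gets its own request entry and each returned vector gets its own incrementing index.

```python
from typing import Any


def build_batch_request(model: str, inputs: list[str]) -> dict[str, Any]:
    """Build one request entry per input so N inputs yield N embeddings."""
    requests = []
    for item in inputs:
        if item.startswith("data:") and "," in item:
            # Data URI of the form "data:<mime>;base64,<payload>"
            header, payload = item.split(",", 1)
            mime_type = header[len("data:"):].split(";", 1)[0]
            part: dict[str, Any] = {
                "inline_data": {"mime_type": mime_type, "data": payload}
            }
        else:
            part = {"text": item}
        # Each input stays in its own entry instead of being merged into one
        requests.append({"model": f"models/{model}", "content": {"parts": [part]}})
    return {"requests": requests}


def to_openai_data(vectors: list[list[float]]) -> list[dict[str, Any]]:
    """Wrap vectors OpenAI-style, with an index that increments per input."""
    return [
        {"object": "embedding", "index": i, "embedding": vec}
        for i, vec in enumerate(vectors)
    ]


body = build_batch_request(
    "gemini-embedding-2-preview", ["hello", "data:image/png;base64,AAAA"]
)
print(len(body["requests"]))  # 2
```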

Changed files

  • litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_handler.py (modified, +19/-4)
  • litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_transformation.py (modified, +67/-39)
  • tests/test_litellm/llms/vertex_ai/gemini_embeddings/__init__.py (added, +0/-0)
  • tests/test_litellm/llms/vertex_ai/gemini_embeddings/test_batch_embed_content_transformation.py (added, +231/-0)

PR #24341: feat(gemini): support combined multimodal embeddings via nested input

Description (problem / solution / changelog)

Relevant issues

Related to #24209

Pre-Submission checklist

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🆕 New Feature

Changes

Gemini's embedContent endpoint can combine multiple inputs (text + image) into a single embedding vector. Previously this was only possible accidentally through a bug. Now users can opt-in to combined embeddings by using nested lists in the input parameter:

# Separate (default, OpenAI-compatible): N inputs → N embeddings
input=["text", "image"]  # → 2 embeddings

# Combined (new): nested list → 1 combined embedding
input=[["text", "image"]]  # → 1 embedding

# Mixed
input=[["text", "image"], "x"]  # → 2 embeddings (1 combined + 1 separate)

Flat list input continues to work exactly as before — each element produces its own embedding. Tested via SDK and proxy.
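
The grouping rule above can be expressed in a few lines. The helper `group_inputs` below is hypothetical and only models the mapping from the input parameter to the number of embeddings; it is not taken from the PR.

```python
from typing import Union

InputElement = Union[str, list[str]]


def group_inputs(inputs: list[InputElement]) -> list[list[str]]:
    """Return one group per expected embedding.

    A plain string becomes its own group (separate embedding); a nested
    list is kept together as one group (combined embedding).
    """
    groups: list[list[str]] = []
    for element in inputs:
        if isinstance(element, list):
            groups.append(list(element))
        else:
            groups.append([element])
    return groups


print(len(group_inputs(["text", "image"])))         # 2
print(len(group_inputs([["text", "image"]])))       # 1
print(len(group_inputs([["text", "image"], "x"])))  # 2
```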

Changed files

  • docs/my-website/docs/embedding/supported_embedding.md (modified, +50/-0)
  • litellm/llms/vertex_ai/gemini_embeddings/batch_embed_content_transformation.py (modified, +59/-23)
  • litellm/types/llms/openai.py (modified, +1/-1)
  • tests/test_litellm/llms/vertex_ai/gemini_embeddings/test_batch_embed_content_transformation.py (modified, +63/-0)

Original issue

Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When the standard endpoint is used to get gemini-embedding-2-preview embeddings, it returns a single element instead of a list of embeddings.

Steps to Reproduce

Using this curl command:

curl -X POST http://localhost:4000/embeddings \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "dimensions":2,
    "model": "gemini-embedding-2-preview",
    "input": [
      "The food was delicious and the waiter...",
      "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII"
    ]
  }'

It returns the following:

{"model":"gemini-embedding-2-preview","data":[{"embedding":[0.99655586,-0.08292416],"index":0,"object":"embedding"}],"object":"list","usage":{"completion_tokens":0,"prompt_tokens":0,"total_tokens":0,"completion_tokens_details":null,"prompt_tokens_details":null}}

Notice that it returns only one element in the embeddings list.

Relevant log output

The important part is that the number of embeddings in the returned list should match the number of input elements (2 in this case). It is true that this new embedding model can aggregate a list into a single embedding, but that should be requested explicitly; by default it should work as standard batched embeddings.

What part of LiteLLM is this about?

SDK (litellm Python package)

What LiteLLM version are you on ?

v1.82.3

Twitter / LinkedIn details

No response

extent analysis

Fix Plan

The request in the report was already correct. The bug was in LiteLLM, which routed multimodal inputs to the embedContent endpoint and so aggregated all parts into one vector. No extra request parameter is needed to get one embedding per input.

Here are the steps:

  • Upgrade LiteLLM to a version that includes PR #24337 (the issue was reported on v1.82.3).
  • Resend the original request with a flat input list; each element should produce its own embedding.
  • To get a single combined embedding on purpose, wrap the inputs in a nested list, as described in PR #24341.

Example Code

The curl command is the same as in the report:

curl -X POST http://localhost:4000/embeddings \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "dimensions":2,
    "model": "gemini-embedding-2-preview",
    "input": [
      "The food was delicious and the waiter...",
      "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII"
    ]
  }'

An equivalent call through the LiteLLM Python SDK might look like this. The "gemini/" provider prefix and the GEMINI_API_KEY environment variable are assumptions for direct SDK use; the report only shows the model name as configured on the proxy.

import litellm

# Assumes GEMINI_API_KEY is set in the environment.
response = litellm.embedding(
    model="gemini/gemini-embedding-2-preview",  # provider prefix is an assumption
    input=[
        "The food was delicious and the waiter...",
        "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII",
    ],
    dimensions=2,
)

# After the fix, the response should hold one embedding per input element.
print(response)

Verification

To verify that the fix worked, check the response from the endpoint. It should now return a list of embeddings, one for each input element. The data field in the response should contain a list of objects, each with an embedding field and an index that increments per input.

For example (the values of the second vector are placeholders, not real output):

{
  "model": "gemini-embedding-2-preview",
  "data": [
    {
      "embedding": [0.99655586, -0.08292416],
      "index": 0,
      "object": "embedding"
    },
    {
      "embedding": [0.12345678, 0.90123456],
      "index": 1,
      "object": "embedding"
    }
  ],
  "object": "list"
}

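The verification step can also be automated. The sketch below uses a hypothetical helper, `check_embedding_response`, that compares the number of returned embeddings with the number of inputs and checks that the indices increment. It is run here against the response from the original report (with the usage field omitted for brevity), so it flags the bug.

```python
import json


def check_embedding_response(raw: str, n_inputs: int) -> list[str]:
    """Return a list of problems found; an empty list means the shape looks right."""
    data = json.loads(raw).get("data", [])
    problems = []
    if len(data) != n_inputs:
        problems.append(f"expected {n_inputs} embeddings, got {len(data)}")
    indices = [item.get("index") for item in data]
    expected = list(range(len(data)))
    if indices != expected:
        problems.append(f"indices are {indices}, expected {expected}")
    return problems


# Response body from the original report: one embedding for two inputs
reported = (
    '{"model":"gemini-embedding-2-preview","data":[{"embedding":'
    '[0.99655586,-0.08292416],"index":0,"object":"embedding"}],"object":"list"}'
)
print(check_embedding_response(reported, n_inputs=2))
# ['expected 2 embeddings, got 1']
```

Against a fixed LiteLLM version, the same call on the new response body should return an empty list.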