litellm - ✅(Solved) Fix [Bug]: `vertex_ai/gemini-embedding-2-preview` routes to `:predict` endpoint, returns `FAILED_PRECONDITION` [4 pull requests, 2 comments, 2 participants]

chiruno-9 · 2026-03-13T03:33:36Z

# PR #23518: fix: route vertex_ai gemini-embedding-2-preview to correct endpoint

- Repository: BerriAI/litellm
- Author: alvinttang
- State: open | merged: False
- Link: https://github.com/BerriAI/litellm/pull/23518

## Description (problem / solution / changelog)

## Summary

- Adds a model-name fallback so that `vertex_ai/gemini-embedding-*` models always route to the `embedContent` endpoint (via `GoogleBatchEmbeddings`) instead of the legacy `:predict` endpoint.
- When `get_model_info()` cannot resolve the `uses_embed_content` flag (e.g. stale or missing model cost map), the code now checks if the model name starts with `gemini-embedding` and sets `uses_embed_content = True` as a fallback.
- Fix applied in both `litellm/main.py` (embedding dispatch) and `litellm/llms/vertex_ai/common_utils.py` (URL construction).

Fixes #23508

## Motivation

`vertex_ai/gemini-embedding-2-preview` returns `400 FAILED_PRECONDITION` because the legacy `:predict` endpoint is not supported for newer `gemini-embedding-*` models. PR #23322 added routing via `uses_embed_content` in the model cost map, but when model info lookup fails (exception path), requests still fall through to `:predict`. This adds a defensive name-based check so routing is correct regardless of model cost map state.

## Test plan

- [ ] Existing test `test_vertex_ai_text_only_embedding_uses_embed_content` continues to pass
- [ ] Verify `litellm.embedding(model="vertex_ai/gemini-embedding-2-preview", ...)` uses `:embedContent` endpoint
- [ ] Verify `vertex_ai/gemini-embedding-001` (older model) still works via its existing path

## Changed files

- `litellm/llms/vertex_ai/common_utils.py` (modified, +5/-0)
- `litellm/main.py` (modified, +5/-0)

---

# PR #23520: fix(vertex_ai): fix embedding routing for gemini-embedding models

- Repository: BerriAI/litellm
- Author: gambletan
- State: open | merged: False
- Link: https://github.com/BerriAI/litellm/pull/23520

## Description (problem / solution / changelog)

## Summary

Fixes #23508: `vertex_ai/gemini-embedding-2-preview` routes to `:predict` endpoint and returns `FAILED_PRECONDITION`.

**Two bugs fixed:**

1. **Hardcoded `/v1/` in `_get_embedding_url`** (`common_utils.py` line 273): The non-digit model branch ignored the `vertex_api_version` parameter and always used `/v1/`, while the digit branch correctly used `{vertex_api_version}`. Fixed to use the parameter consistently.
2. **Silent fallthrough when `get_model_info` fails**: Both `_get_embedding_url` and `main.py` routing wrap `get_model_info` in a bare `except Exception` that defaults `uses_embed_content=False`, causing the request to fall through to the legacy `:predict` handler. Added a model-name fallback: any model starting with `gemini-embedding` is routed to the `embedContent` endpoint and `GoogleBatchEmbeddings` handler, matching Google's API design where these models only support `embedContent`, not `:predict`.

## Changes

- `litellm/llms/vertex_ai/common_utils.py`: Fix `_get_embedding_url` to use `vertex_api_version` instead of hardcoded `/v1/`; add `gemini-embedding` prefix fallback for `uses_embed_content`
- `litellm/main.py`: Add `gemini-embedding` prefix fallback in `vertex_ai` embedding routing so models route to `GoogleBatchEmbeddings` even if model info lookup fails

## Test plan

- [ ] Verify `vertex_ai/gemini-embedding-2-preview` text-only embedding calls use `:embedContent` endpoint
- [ ] Verify `vertex_ai/gemini-embedding-001` still uses `:predict` endpoint (no regression)
- [ ] Verify `gemini/gemini-embedding-2-preview` still works via Gemini API path
- [ ] Existing tests in `tests/litellm/llms/vertex_ai/test_gemini_batch_embeddings.py` pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Changed files

- `litellm/llms/vertex_ai/common_utils.py` (modified, +7/-1)
- `litellm/main.py` (modified, +6/-0)

---

# PR #23620: test: add regression test for vertex_ai embedding-2 routing

- Repository: BerriAI/litellm
- Author: gambletan
- State: open | merged: False
- Link: https://github.com/BerriAI/litellm/pull/23620

## Description (problem / solution / changelog)

## Summary

Adds a regression test for #23508 that verifies `vertex_ai/gemini-embedding-2-preview` is routed through `GoogleBatchEmbeddings` (`:embedContent` endpoint) and does **not** fall through to the legacy `vertex_embedding` handler that uses `:predict`.

The core routing fix was already merged in #23322, but there was no test specifically asserting:

- The legacy `vertex_embedding.embedding()` handler is **not** called
- The `:predict` endpoint is **not** used in the URL
- The request body uses `embedContent` format (`content` key) rather than `predict` format (`instances` key)

This test catches the exact failure mode described in the issue: Google dropped `:predict` support for `gemini-embedding-2-preview`, so the leg
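
The name-based fallback described in PR #23518 and PR #23520 can be sketched roughly as follows. This is an illustration only, not the code from either PR: `resolve_uses_embed_content` and the `lookup` callable are hypothetical stand-ins for the `get_model_info()` call, and only the `gemini-embedding` prefix and the `uses_embed_content` flag come from the PR descriptions.

```python
from typing import Callable, Mapping

# Prefix quoted in the PR descriptions; every other name here is hypothetical.
EMBED_CONTENT_PREFIX = "gemini-embedding"


def resolve_uses_embed_content(
    model: str,
    lookup: Callable[[str], Mapping[str, object]],
) -> bool:
    """Decide whether a model should be sent to the embedContent endpoint.

    `lookup` stands in for a model-info lookup that may raise when the
    model cost map is stale or missing.
    """
    try:
        return bool(lookup(model).get("uses_embed_content", False))
    except Exception:
        # Lookup failed: use the model name instead of silently
        # defaulting to False (the fallthrough the PRs describe).
        return model.startswith(EMBED_CONTENT_PREFIX)


def failing_lookup(model: str) -> Mapping[str, object]:
    """Simulate a missing cost-map entry."""
    raise KeyError(model)


print(resolve_uses_embed_content("gemini-embedding-2-preview", failing_lookup))
print(resolve_uses_embed_content("some-other-model", failing_lookup))
```

Under this rule, a failed lookup would also send `gemini-embedding-001` to `embedContent`, so the test-plan items about the older model appear to depend on the lookup succeeding for it.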


GitHub stats

BerriAI/litellm#23508 • Fetched 2026-04-08 00:43:55

Author: chiruno-9

Participants: chiruno-9, gambletan

Timeline (top): referenced ×4, cross-referenced ×3, labeled ×3, commented ×2

Error Message

Error: Vertex_aiException BadRequestError - {"error":{"code":400,"message":"Precondition check failed.","status":"FAILED_PRECONDITION"}}
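
When this failure has to be recognised programmatically (for example, to tell it apart from other 400 responses), the JSON payload can be pulled out of the message text. The sketch below works only on the string shown above; `extract_vertex_status` is a hypothetical helper, and treating everything from the first `{` onward as the payload is an assumption rather than something litellm guarantees.

```python
import json
from typing import Optional

# The error text exactly as reported in the issue.
ERROR_TEXT = (
    'Vertex_aiException BadRequestError - {"error":{"code":400,'
    '"message":"Precondition check failed.","status":"FAILED_PRECONDITION"}}'
)


def extract_vertex_status(message: str) -> Optional[str]:
    """Return the `status` field of the embedded JSON error, if present."""
    start = message.find("{")
    if start == -1:
        return None
    try:
        payload = json.loads(message[start:])
    except json.JSONDecodeError:
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if isinstance(error, dict):
        return error.get("status")
    return None


print(extract_vertex_status(ERROR_TEXT))  # FAILED_PRECONDITION
```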

Root Cause

PR #23322 added the GoogleBatchEmbeddings handler for this model, but only wired it up for custom_llm_provider == "gemini" (line 5140 in main.py). The vertex_ai path (line 5193) still falls through to vertex_embedding.embedding(), which calls the legacy :predict endpoint:

POST .../models/gemini-embedding-2-preview:predict  → 400 FAILED_PRECONDITION

Note: vertex_ai/gemini-embedding-001 works with :predict, but Google dropped :predict support for the newer gemini-embedding-2-preview. The vertex_ai/ prefix needs to route through GoogleBatchEmbeddings as well.
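
To make the difference between the two endpoints concrete, the sketch below builds the URL and request body for each. The URL layout follows the public Vertex AI REST pattern for publisher models, and the nested body shapes are assumptions based on the `instances` versus `content` keys mentioned in the PR #23620 description; `build_embedding_request` is a hypothetical helper, not a litellm function.

```python
from typing import Any, Dict, Tuple


def build_embedding_request(
    project: str,
    location: str,
    model: str,
    text: str,
    use_embed_content: bool,
    api_version: str = "v1",
) -> Tuple[str, Dict[str, Any]]:
    """Return (url, json_body) for a single text embedding call."""
    # api_version is applied to every branch, never hardcoded per branch.
    base = (
        f"https://{location}-aiplatform.googleapis.com/{api_version}"
        f"/projects/{project}/locations/{location}"
        f"/publishers/google/models/{model}"
    )
    if use_embed_content:
        # embedContent-style body: top-level "content" key (assumed shape).
        return f"{base}:embedContent", {"content": {"parts": [{"text": text}]}}
    # Legacy predict-style body: top-level "instances" key (assumed shape).
    return f"{base}:predict", {"instances": [{"content": text}]}


url, body = build_embedding_request(
    "your-project-id",
    "us-central1",
    "gemini-embedding-2-preview",
    "Hello, world!",
    use_embed_content=True,
)
print(url)
print(body)
```

Passing `api_version` through a single shared base string is one way to avoid the kind of hardcoded `/v1/` branch that PR #23520 reports in `_get_embedding_url`.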

Code Example

import litellm

litellm.embedding(
    model="vertex_ai/gemini-embedding-2-preview",
    input=["Hello, world!"],
    vertex_project="your-project-id",
    vertex_location="us-central1",
)

---

from google import genai
client = genai.Client(vertexai=True, project="your-project-id", location="us-central1")
result = client.models.embed_content(model="gemini-embedding-2-preview", contents="Hello, world!")
# works fine

---

POST .../models/gemini-embedding-2-preview:predict  → 400 FAILED_PRECONDITION

---


Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

Version

SDK: 1.82.1 (latest on PyPI)
Proxy: 1.81.0

Both SDK and Proxy are affected. The proxy health check also fails, marking the model as permanently unhealthy.

Reproduce

import litellm

litellm.embedding(
    model="vertex_ai/gemini-embedding-2-preview",
    input=["Hello, world!"],
    vertex_project="your-project-id",
    vertex_location="us-central1",
)

Error: Vertex_aiException BadRequestError - {"error":{"code":400,"message":"Precondition check failed.","status":"FAILED_PRECONDITION"}}

The same model works fine via google-genai SDK directly:

from google import genai
client = genai.Client(vertexai=True, project="your-project-id", location="us-central1")
result = client.models.embed_content(model="gemini-embedding-2-preview", contents="Hello, world!")
# works fine

Root Cause

POST .../models/gemini-embedding-2-preview:predict  → 400 FAILED_PRECONDITION

Additional Notes

The LiteLLM blog post is dated March 2025, but gemini-embedding-2-preview was released by Google on March 10, 2026. The date appears to be an error.

Steps to Reproduce

Described above.

Relevant log output

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on ?

v1.81.0


extent analysis

Fix Plan

To fix the issue, we need to update the litellm library to route the vertex_ai path through GoogleBatchEmbeddings for the gemini-embedding-2-preview model.

Step-by-Step Solution

1. Update the main.py file: Modify the main.py file to include the vertex_ai path in the GoogleBatchEmbeddings handler.
2. Add a conditional statement: Add a conditional statement to check if the model is gemini-embedding-2-preview and if the provider is vertex_ai.
3. Use the GoogleBatchEmbeddings handler: If the condition is met, use the GoogleBatchEmbeddings handler instead of the legacy :predict endpoint.

Example Code

def select_embedding_handler(custom_llm_provider: str, model: str) -> str:
    """Name the handler to use; real call signatures are omitted because the issue does not show them."""
    if custom_llm_provider == "gemini" or (
        custom_llm_provider == "vertex_ai" and model == "gemini-embedding-2-preview"
    ):
        # Use the GoogleBatchEmbeddings handler (:embedContent endpoint)
        return "GoogleBatchEmbeddings"
    # Otherwise use vertex_embedding.embedding(), which calls the legacy :predict endpoint
    return "vertex_embedding.embedding"

Verification

To verify that the fix worked, run the following code:

import litellm

litellm.embedding(
    model="vertex_ai/gemini-embedding-2-preview",
    input=["Hello, world!"],
    vertex_project="your-project-id",
    vertex_location="us-central1",
)

If the fix is successful, the call returns an embedding response instead of raising the 400 FAILED_PRECONDITION error shown above.
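
A stricter check than "no exception" is to inspect the outgoing request, which is what the regression test in PR #23620 is described as asserting. A minimal sketch, assuming the URL and JSON body have already been captured (for example from a mocked HTTP client); `check_embed_content_request` is a hypothetical helper, not part of litellm or its test suite.

```python
from typing import Any, List, Mapping


def check_embed_content_request(url: str, body: Mapping[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the request looks right."""
    problems: List[str] = []
    if ":predict" in url:
        problems.append("request still targets the legacy :predict endpoint")
    if ":embedContent" not in url:
        problems.append("request does not target :embedContent")
    if "instances" in body:
        problems.append("body uses the predict-style 'instances' key")
    if "content" not in body:
        problems.append("body is missing the embedContent-style 'content' key")
    return problems


# A request shaped like the failing one from the issue.
bad = check_embed_content_request(
    ".../models/gemini-embedding-2-preview:predict",
    {"instances": [{"content": "Hello, world!"}]},
)
for problem in bad:
    print(problem)
```

Returning a list of problems rather than raising on the first one makes it easier to see at a glance whether both the URL and the body are wrong.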

Extra Tips

Make sure to update the litellm library to the latest version after applying the fix.
If you encounter any issues, check the LiteLLM documentation and GitHub repository for any updates or known issues related to the gemini-embedding-2-preview model.

