litellm - ✅(Solved) Fix [Bug]: Gemini custom api_base with full model route gets duplicate path appended [4 pull requests, 1 participant]

GitHub stats
BerriAI/litellm#26979 · Fetched 2026-05-02 05:28:12
Comments: 0
Participants: 1
Timeline events: 5 (cross-referenced ×4, labeled ×1)
Reactions: 0

Fix Action

Fixed

PR fix notes

PR #24786: fix(gemini): prevent duplicate model path generation when api_base includes full route

Description (problem / solution / changelog)

Fixes https://github.com/OpenHands/OpenHands/issues/13645 by checking in _check_custom_proxy whether api_base already contains the full proxy route, so that the route is not appended a second time.

Changed files

  • litellm/llms/vertex_ai/vertex_llm_base.py (modified, +6/-1)

PR #26980: fix(gemini): avoid duplicate model route for full api_base

Description (problem / solution / changelog)

Summary

  • avoid appending a duplicate /models/{model}:{endpoint} route when Gemini api_base already includes the full route
  • handle api_base values that already end with /models/{model} by appending only :{endpoint}
  • add regression coverage for full-route and model-prefix custom api_base cases

Context

This is a follow-up/replacement for #24786, which has been inactive since 2026-03-30.

The bug is still affecting downstream integrations in OpenHands:

  • OpenHands/OpenHands#14028
  • OpenHands/OpenHands#14029

Tracking issue:

  • Closes #26979

Test Plan

  • uv run pytest tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py -k 'custom_proxy or full_route or api_base'
  • uv run ruff check litellm/llms/vertex_ai/vertex_llm_base.py tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py

Changed files

  • litellm/llms/vertex_ai/vertex_llm_base.py (modified, +8/-1)
  • litellm/router_strategy/tag_based_routing.py (modified, +2/-1)
  • tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py (modified, +41/-5)
  • tests/test_litellm/router_strategy/test_router_tag_routing.py (modified, +28/-0)

PR #26982: fix(gemini): avoid duplicate model route for full api_base

Description (problem / solution / changelog)

Summary

  • avoid appending a duplicate /models/{model}:{endpoint} route when Gemini api_base already includes the full route
  • handle api_base values that already end with /models/{model} by appending only :{endpoint}
  • add regression coverage for full-route and model-prefix custom api_base cases

Context

This retargets the same fix through the OSS staging path after the direct-to-main PR hit the repository's main-branch source guard.

Prior PR / context:

  • supersedes #26980 for merge routing purposes
  • follow-up/replacement for #24786, which has been inactive since 2026-03-30

The bug is still affecting downstream integrations in OpenHands:

  • OpenHands/OpenHands#14028
  • OpenHands/OpenHands#14029

Tracking issue:

  • Closes #26979

Test Plan

  • uv run pytest tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py -k 'custom_proxy or full_route or api_base or credential_project_validation'
  • uv run ruff check litellm/llms/vertex_ai/vertex_llm_base.py tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py

Changed files

  • .github/workflows/check-lazy-openapi-snapshot.yml (added, +75/-0)
  • litellm-proxy-extras/pyproject.toml (modified, +2/-2)
  • litellm/caching/dual_cache.py (modified, +5/-0)
  • litellm/caching/redis_cache.py (modified, +6/-4)
  • litellm/constants.py (modified, +1/-0)
  • litellm/litellm_core_utils/initialize_dynamic_callback_params.py (modified, +13/-4)
  • litellm/litellm_core_utils/redact_messages.py (modified, +59/-43)
  • litellm/llms/anthropic/chat/transformation.py (modified, +48/-22)
  • litellm/llms/base_llm/rerank/transformation.py (modified, +1/-0)
  • litellm/llms/bedrock/base_aws_llm.py (modified, +30/-0)
  • litellm/llms/cohere/rerank/transformation.py (modified, +1/-0)
  • litellm/llms/cohere/rerank_v2/transformation.py (modified, +1/-0)
  • litellm/llms/custom_httpx/llm_http_handler.py (modified, +20/-6)
  • litellm/llms/deepinfra/rerank/transformation.py (modified, +1/-0)
  • litellm/llms/fireworks_ai/rerank/transformation.py (modified, +1/-0)
  • litellm/llms/hosted_vllm/rerank/transformation.py (modified, +1/-0)
  • litellm/llms/huggingface/rerank/transformation.py (modified, +1/-0)
  • litellm/llms/jina_ai/rerank/transformation.py (modified, +5/-1)
  • litellm/llms/milvus/vector_stores/transformation.py (modified, +9/-3)
  • litellm/llms/nvidia_nim/rerank/ranking_transformation.py (modified, +2/-0)
  • litellm/llms/nvidia_nim/rerank/transformation.py (modified, +1/-0)
  • litellm/llms/vertex_ai/common_utils.py (modified, +47/-0)
  • litellm/llms/vertex_ai/gemini/transformation.py (modified, +3/-10)
  • litellm/llms/vertex_ai/image_generation/vertex_imagen_transformation.py (modified, +9/-3)
  • litellm/llms/vertex_ai/rerank/transformation.py (modified, +9/-1)
  • litellm/llms/vertex_ai/vertex_embeddings/embedding_handler.py (modified, +12/-3)
  • litellm/llms/vertex_ai/vertex_embeddings/transformation.py (modified, +22/-5)
  • litellm/llms/vertex_ai/vertex_embeddings/types.py (modified, +2/-1)
  • litellm/llms/vertex_ai/vertex_llm_base.py (modified, +8/-1)
  • litellm/llms/voyage/rerank/transformation.py (modified, +5/-1)
  • litellm/llms/watsonx/rerank/transformation.py (modified, +1/-0)
  • litellm/main.py (modified, +1/-0)
  • litellm/proxy/_experimental/mcp_server/db.py (modified, +99/-44)
  • litellm/proxy/_experimental/mcp_server/discoverable_endpoints.py (modified, +26/-11)
  • litellm/proxy/_experimental/mcp_server/mcp_server_manager.py (modified, +73/-17)
  • litellm/proxy/_lazy_features.py (added, +432/-0)
  • litellm/proxy/_lazy_openapi_snapshot.json (added, +31651/-0)
  • litellm/proxy/_lazy_openapi_snapshot.py (added, +70/-0)
  • litellm/proxy/_types.py (modified, +9/-4)
  • litellm/proxy/auth/user_api_key_auth.py (modified, +6/-1)
  • litellm/proxy/client/README.md (modified, +17/-16)
  • litellm/proxy/client/cli/commands/auth.py (modified, +39/-18)
  • litellm/proxy/common_utils/callback_utils.py (modified, +14/-3)
  • litellm/proxy/common_utils/reset_budget_job.py (modified, +85/-35)
  • litellm/proxy/common_utils/static_asset_utils.py (added, +52/-0)
  • litellm/proxy/db/spend_counter_reseed.py (modified, +9/-5)
  • litellm/proxy/google_endpoints/endpoints.py (modified, +68/-103)
  • litellm/proxy/guardrails/guardrail_hooks/bedrock_guardrails.py (modified, +9/-1)
  • litellm/proxy/guardrails/guardrail_hooks/pillar/pillar.py (modified, +2/-0)
  • litellm/proxy/hooks/key_management_event_hooks.py (modified, +24/-12)
  • litellm/proxy/hooks/user_management_event_hooks.py (modified, +9/-3)
  • litellm/proxy/litellm_pre_call_utils.py (modified, +170/-12)
  • litellm/proxy/management_endpoints/internal_user_endpoints.py (modified, +8/-3)
  • litellm/proxy/management_endpoints/key_management_endpoints.py (modified, +29/-7)
  • litellm/proxy/management_endpoints/mcp_management_endpoints.py (modified, +13/-2)
  • litellm/proxy/management_endpoints/team_callback_endpoints.py (modified, +133/-1)
  • litellm/proxy/management_endpoints/team_endpoints.py (modified, +83/-28)
  • litellm/proxy/management_endpoints/ui_sso.py (modified, +296/-53)
  • litellm/proxy/management_helpers/audit_logs.py (modified, +26/-2)
  • litellm/proxy/pass_through_endpoints/pass_through_endpoints.py (modified, +8/-7)
  • litellm/proxy/proxy_server.py (modified, +353/-229)
  • litellm/rerank_api/main.py (modified, +21/-8)
  • litellm/router_strategy/tag_based_routing.py (modified, +2/-1)
  • litellm/types/router.py (modified, +2/-0)
  • pyproject.toml (modified, +1/-1)
  • tests/litellm_utils_tests/test_utils.py (modified, +1/-1)
  • tests/llm_translation/test_anthropic_completion.py (modified, +1/-1)
  • tests/logging_callback_tests/test_logging_redaction_e2e_test.py (modified, +37/-41)
  • tests/otel_tests/test_e2e_model_access.py (modified, +25/-5)
  • tests/proxy_unit_tests/test_audit_logs_proxy.py (modified, +118/-2)
  • tests/proxy_unit_tests/test_get_favicon.py (modified, +20/-41)
  • tests/proxy_unit_tests/test_get_image.py (modified, +16/-53)
  • tests/proxy_unit_tests/test_proxy_routes.py (modified, +17/-0)
  • tests/proxy_unit_tests/test_proxy_server.py (modified, +8/-4)
  • tests/proxy_unit_tests/test_proxy_utils.py (modified, +42/-10)
  • tests/test_litellm/caching/test_redis_cache.py (modified, +44/-0)
  • tests/test_litellm/litellm_core_utils/test_redact_messages.py (modified, +207/-3)
  • tests/test_litellm/llms/anthropic/chat/test_anthropic_chat_transformation.py (modified, +34/-0)
  • tests/test_litellm/llms/bedrock/test_base_aws_llm.py (modified, +134/-0)
  • tests/test_litellm/llms/custom_httpx/test_llm_http_handler.py (modified, +73/-1)
  • tests/test_litellm/llms/vertex_ai/image_generation/test_vertex_ai_image_generation_transformation.py (modified, +14/-0)
  • tests/test_litellm/llms/vertex_ai/rerank/test_vertex_ai_rerank_transformation.py (modified, +16/-0)
  • tests/test_litellm/llms/vertex_ai/rerank/test_vertex_ai_rerank_userlabels_e2e.py (added, +182/-0)
  • tests/test_litellm/llms/vertex_ai/test_vertex_ai_common_utils.py (modified, +64/-0)
  • tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py (modified, +41/-5)
  • tests/test_litellm/proxy/_experimental/mcp_server/test_db_credentials.py (added, +402/-0)
  • tests/test_litellm/proxy/_experimental/mcp_server/test_discoverable_endpoints.py (modified, +259/-5)
  • tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server_manager.py (modified, +370/-6)
  • tests/test_litellm/proxy/auth/test_cli_auth.py (modified, +26/-6)
  • tests/test_litellm/proxy/auth/test_onboarding.py (modified, +257/-23)
  • tests/test_litellm/proxy/client/cli/test_auth_commands.py (modified, +52/-16)
  • tests/test_litellm/proxy/common_utils/test_reset_budget_job.py (modified, +156/-0)
  • tests/test_litellm/proxy/common_utils/test_static_asset_utils.py (added, +97/-0)
  • tests/test_litellm/proxy/google_endpoints/test_google_api_endpoints.py (modified, +100/-449)
  • tests/test_litellm/proxy/guardrails/guardrail_hooks/test_bedrock_guardrails.py (modified, +54/-0)
  • tests/test_litellm/proxy/guardrails/test_pillar_guardrails.py (modified, +32/-0)
  • tests/test_litellm/proxy/management_endpoints/test_team_callback_endpoints.py (added, +294/-0)
  • tests/test_litellm/proxy/management_endpoints/test_team_endpoints.py (modified, +302/-7)
  • tests/test_litellm/proxy/management_endpoints/test_ui_sso.py (modified, +320/-65)
  • tests/test_litellm/proxy/pass_through_endpoints/test_passthrough_auth_default.py (added, +136/-0)

PR #26983: fix(gemini): avoid duplicate model route for full api_base

Description (problem / solution / changelog)

Summary

  • avoid appending a duplicate /models/{model}:{endpoint} route when Gemini api_base already includes the full route
  • handle api_base values that already point at /models/{model} prefixes without duplicating the model path
  • add regression coverage for both full-route and full-model-prefix custom api_base handling

Validation

  • uv run pytest tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py -k 'custom_proxy or full_route or api_base or credential_project_validation'
  • uv run ruff check litellm/llms/vertex_ai/vertex_llm_base.py tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py

Context

This is the clean OSS-staging re-open of the same fix. The previous PR was routed incorrectly, and its branch ancestry pulled in unrelated changes.

Related:

  • Closes #26979
  • Continues #24786
  • OpenHands/OpenHands#14028
  • OpenHands/OpenHands#14029
  • OpenHands/docs#465

Changed files

  • litellm/llms/vertex_ai/vertex_llm_base.py (modified, +8/-1)
  • tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py (modified, +41/-5)


Describe the bug

When model="gemini/..." is used with a custom api_base that already includes the full Gemini route (or the /models/{model} prefix), LiteLLM appends the model route again inside _check_custom_proxy(), producing an invalid URL.

This showed up downstream in OpenHands as:

  • OpenHands/OpenHands#14028
  • OpenHands/OpenHands#14029

Example shape of the bad duplication:

  • input api_base already contains something like .../v1beta/models/gemini-2.5-flash-lite:generateContent
  • LiteLLM builds .../v1beta/models/gemini-2.5-flash-lite:generateContent/models/gemini-2.5-flash-lite:generateContent
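The duplication can be reproduced in isolation with plain string handling. The sketch below is not LiteLLM's code: naive_build_url is a hypothetical helper that unconditionally appends the route, which is the behavior this issue describes.

```python
def naive_build_url(api_base: str, model: str, endpoint: str) -> str:
    """Hypothetical helper: always appends the model route, as the issue describes."""
    return f"{api_base}/models/{model}:{endpoint}"


api_base = (
    "https://proxy.example.com/generativelanguage.googleapis.com"
    "/v1beta/models/gemini-2.5-flash-lite:generateContent"
)
url = naive_build_url(api_base, "gemini-2.5-flash-lite", "generateContent")

# The route now appears twice in the resulting URL.
route = "/models/gemini-2.5-flash-lite:generateContent"
print(url.count(route))
```

Counting occurrences of the route is a cheap way to detect the bad shape in logs or tests without making a network call.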

Why I'm opening this issue even though there are related threads

I found related items, but they are not the same thing:

  • BerriAI/litellm#24786 — this appears to be the exact code fix, but it has been open/stale since 2026-03-30 and does not seem to have landed.
  • BerriAI/litellm#4317 — older general custom api_base / Vertex AI thread, not this exact duplicated full-route regression.
  • BerriAI/litellm#23846 — adjacent Gemini custom api_base issue, but specifically about context caching + model=None.
  • BerriAI/litellm#26724 — adjacent feature request about detecting URL-templated providers, not this concrete duplicate-path bug.
  • BerriAI/litellm#26702 — adjacent auth behavior (gemini_api_key vs azure_ad_token), not this route duplication bug.

So I am not treating those as duplicates to close over this report. This issue is meant to explicitly connect the still-reproducible downstream bug with the already-existing stale fix path.

Repro

A representative failing pattern is:

import litellm

litellm.completion(
    model="gemini/gemini-2.5-flash-lite",
    messages=[{"role": "user", "content": "ping"}],
    api_base="https://proxy.example.com/generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-lite:generateContent",
    api_key="test-key",
)

The provider should treat that as an already-complete route (or at least detect /models/{model} and avoid appending the model path twice).

Expected behavior

For Gemini custom api_base, _check_custom_proxy() should:

  1. use api_base as-is if it already ends with /models/{model}:{endpoint}
  2. append only :{endpoint} if it already ends with /models/{model}
  3. otherwise construct {api_base.rstrip('/')}/models/{model}:{endpoint}

Relevant code path

litellm/llms/vertex_ai/vertex_llm_base.py in _check_custom_proxy().

Note on existing fix

The logic in #24786 looks like the right direction, but it appears to be missing targeted regression coverage for the exact endswith('/models/{model}:{endpoint}') and endswith('/models/{model}') branches.
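A minimal sketch of what that targeted coverage could look like, assuming the three rules listed under "Expected behavior". resolve_gemini_url is a hypothetical stand-alone function used only for illustration; the tests in the linked PRs exercise _check_custom_proxy() itself and may be structured differently.

```python
def resolve_gemini_url(api_base: str, model: str, endpoint: str) -> str:
    """Hypothetical stand-in for the URL logic described under Expected behavior."""
    if api_base.endswith(f"/models/{model}:{endpoint}"):
        return api_base
    if api_base.endswith(f"/models/{model}"):
        return f"{api_base}:{endpoint}"
    return f"{api_base.rstrip('/')}/models/{model}:{endpoint}"


BASE = "https://proxy.example.com/v1beta"
MODEL = "gemini-2.5-flash-lite"
ENDPOINT = "generateContent"
EXPECTED = f"{BASE}/models/{MODEL}:{ENDPOINT}"

# One case per branch, plus a trailing-slash variant of the fallback branch.
CASES = {
    "full_route": EXPECTED,
    "model_prefix": f"{BASE}/models/{MODEL}",
    "bare_base": BASE,
    "bare_base_trailing_slash": f"{BASE}/",
}


def run_cases() -> dict:
    """Resolve every case and return the results keyed by case name."""
    return {name: resolve_gemini_url(base, MODEL, ENDPOINT) for name, base in CASES.items()}


for name, url in run_cases().items():
    assert url == EXPECTED, f"{name}: got {url}"
```

Every input shape should resolve to the same final URL, so a single expected value keeps the cases easy to extend.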

Environment

  • LiteLLM version: still reproducible downstream as of 2026-05-01
  • Provider: gemini/... with custom api_base
  • Affected downstream integration: OpenHands/OpenHands#14029

Extent analysis

TL;DR

The most likely fix is to modify the _check_custom_proxy() function in litellm/llms/vertex_ai/vertex_llm_base.py to correctly handle custom api_base URLs that already include the full Gemini route or the /models/{model} prefix.

Guidance

  • Review the existing fix in #24786 and consider applying it, as it appears to address the issue.
  • Check whether your custom api_base already ends with /models/{model} or /models/{model}:{endpoint}; per the issue, that is the input shape that triggers the duplicated path.
  • Test the _check_custom_proxy() function with different input scenarios to ensure it handles all possible cases, including the endswith('/models/{model}:{endpoint}') and endswith('/models/{model}') branches.
  • Consider adding targeted regression coverage for these specific cases to prevent similar issues in the future.
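For callers who cannot pick up the fix yet, one option that follows from the second bullet is to normalize api_base before handing it to LiteLLM. This is a sketch under an assumption: strip_gemini_route is a hypothetical helper, not part of LiteLLM, and it relies on the issue's description that the library appends /models/{model}:{endpoint} to whatever base it receives.

```python
import re


def strip_gemini_route(api_base: str, model: str) -> str:
    """Hypothetical helper: drop a trailing /models/{model}[:endpoint] from api_base."""
    pattern = rf"/models/{re.escape(model)}(?::[A-Za-z]+)?/?$"
    return re.sub(pattern, "", api_base).rstrip("/")


full = "https://proxy.example.com/v1beta/models/gemini-2.5-flash-lite:generateContent"
print(strip_gemini_route(full, "gemini-2.5-flash-lite"))
```

A base that carries no model route passes through unchanged apart from a trailing slash, so the helper is safe to apply to every configured value.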

Example

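def _check_custom_proxy(api_base: str, model: str, endpoint: str) -> str:
    # Simplified sketch of the expected behavior described in the issue.
    if api_base.endswith(f"/models/{model}:{endpoint}"):
        # Full route already present: use api_base unchanged.
        return api_base
    if api_base.endswith(f"/models/{model}"):
        # Model prefix present: append only the endpoint suffix.
        return f"{api_base}:{endpoint}"
    # Bare base URL: build the full route.
    return f"{api_base.rstrip('/')}/models/{model}:{endpoint}"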

Notes

The provided example code snippet is based on the expected behavior described in the issue and may need to be adapted to the actual implementation in litellm/llms/vertex_ai/vertex_llm_base.py.

Recommendation

Update _check_custom_proxy() as described above, and add regression tests for both the full-route and the model-prefix cases so the fix stays in place.


