litellm - 💡(How to fix) Fix [Bug]: VertexAI - unable to use 3rd party model - Qwen 3.5 with dedicated endpoint [1 participants]

BerriAI/litellm#22965 (fetched 2026-04-08 00:39:13)
Comments: 0 · Participants: 1 · Timeline events: 4 (labeled ×3, subscribed ×1) · Reactions: 0

Error Message

09:08:58 - LiteLLM Proxy:ERROR: common_request_processing.py:1121 - litellm.proxy.proxy_server._handle_llm_api_exception(): Exception occured - litellm.InternalServerError: Vertex_aiException InternalServerError - Cannot connect to host mg-endpoint-cedf4bc7-2e0e-466c-bdd5-8ab69ef4fa63.europe-west4-wmarusiak-homelab-ai.prediction.vertexai.goog:443 ssl:<ssl.SSLContext object at 0x7bf6f8bb6170> [Name or service not known]. Received Model Group=Qwen-3.5-Vertex-Final
INFO: 172.26.0.26:38618 - "POST /chat/completions HTTP/1.1" 500 Internal Server Error

Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

Hi.

I am running the following version of LiteLLM in Docker.

Version: 1.81.14
Summary: Library to easily interface with LLM API providers
Home-page: https://litellm.ai
Author: BerriAI
Author-email:
License: MIT
Location: /usr/lib/python3.13/site-packages
Requires: aiohttp, click, fastuuid, httpx, importlib-metadata, jinja2, jsonschema, openai, pydantic, python-dotenv, tiktoken, tokenizers
Required-by: semantic-router

I have some local Ollama models working together with Open WebUI, plus one Vertex AI endpoint.

Docker Compose

services:
  db:
    image: postgres:16
    container_name: llm-db
    restart: always
    environment:
      POSTGRES_DB: ${POSTGRES_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - /mnt/AppData/llm-stack/postgres:/var/lib/postgresql/data
      - /etc/localtime:/etc/localtime:ro
    ports:
      - "${DB_PORT}:5432"
    networks:
      - frontend
    logging:
      driver: loki
      options:
        loki-url: "http://127.0.0.1:3100/loki/api/v1/push"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -d ${POSTGRES_DB} -U ${POSTGRES_USER}"]
      interval: 5s
      timeout: 5s
      retries: 5

  litellm:
    image: ghcr.io/berriai/litellm:main-stable
    container_name: llm-litellm
    restart: unless-stopped
    depends_on:
      db:
        condition: service_healthy
    command:
      - "--config=/app/config.yaml"
    ports:
      - "${LITELLM_PORT}:4000"
    environment:
      DATABASE_URL: "postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@llm-db:5432/${POSTGRES_DB}"
      STORE_MODEL_IN_DB: "True"
      LITELLM_MASTER_KEY: ${LLM_MASTER_KEY}
      TZ: ${TZ}
    volumes:
      - /mnt/AppData/llm-stack/litellm/config.yaml:/app/config.yaml:ro
      - /mnt/AppData/llm-stack/litellm/google-key.json:/app/google-key.json:ro
      - /etc/localtime:/etc/localtime:ro
    networks:
      - frontend
    logging:
      driver: loki
      options:
        loki-url: "http://127.0.0.1:3100/loki/api/v1/push"

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: llm-webui
    restart: unless-stopped
    depends_on:
      - litellm
    ports:
      - "${WEBUI_PORT}:8080"
    environment:
      OPENAI_API_BASE_URL: "http://litellm:4000"
      OPENAI_API_KEY: ${LLM_MASTER_KEY}
      ENABLE_OLLAMA_API: "False"
      ENABLE_RAG_WEB_SEARCH: "False"
      TZ: ${TZ}
      OAUTH_CLIENT_ID: ${OAUTH_CLIENT_ID}
      OAUTH_CLIENT_SECRET: ${OAUTH_CLIENT_SECRET}
      OPENID_PROVIDER_URL: ${OPENID_PROVIDER_URL}
      OAUTH_PROVIDER_NAME: ${OAUTH_PROVIDER_NAME}
      OPENID_REDIRECT_URI: ${OPENID_REDIRECT_URI}
      WEBUI_URL: ${WEBUI_URL}
      ENABLE_OAUTH_SIGNUP: ${ENABLE_OAUTH_SIGNUP}
      OAUTH_MERGE_USERS_BY_EMAIL: ${OAUTH_MERGE_USERS_BY_EMAIL}
      WEBUI_NAME: ${WEBUI_NAME}
    volumes:
      - /mnt/AppData/llm-stack/open-webui:/app/backend/data
      - /etc/localtime:/etc/localtime:ro
    networks:
      - frontend
    logging:
      driver: loki
      options:
        loki-url: "http://127.0.0.1:3100/loki/api/v1/push"

  ollama:
    image: ollama/ollama:latest
    container_name: llm-ollama
    restart: unless-stopped
    user: "${PUID}:${PGID}"
    ports:
      - "${OLLAMA_PORT}:11434" # Local access (e.g., from a MacBook)
    volumes:
      - /home/wojcieh/ollama-storage:/models
      - /mnt/AppData/llm-stack/ollama:/config
      - /etc/localtime:/etc/localtime:ro
    networks:
      - frontend
    environment:
      - TZ=${TZ}
      - OLLAMA_KEEP_ALIVE=24h
      - OLLAMA_MODELS=/models
      - OLLAMA_INTEL_GPU=true
    devices:
      - /dev/dri:/dev/dri
    group_add:
      - "109"
    logging:
      driver: loki
      options:
        loki-url: "http://127.0.0.1:3100/loki/api/v1/push"
networks:
  frontend:
    external: true

LiteLLM config

model_list:
  - model_name: "Gemini Vertex AI"
    litellm_params:
      model: "vertex_ai/gemini-3-pro-preview"
      vertex_project: "wmarusiak-homelab-ai"
      vertex_location: "global"
      vertex_search_engine_id: "projects/XXXXXXXX/locations/eu/collections/default_collection/dataStores/homelab-ds_XXXXXXXX"
      tools:
        - google_search: {}
        - code_execution: {}
      stream: false
      drop_params: true
      request_timeout: 60
  - model_name: "qwen3.5:9b"
    litellm_params:
      model: ollama/qwen3.5:9b
      api_base: http://llm-ollama:11434
      num_ctx: 4096
  - model_name: "qwen3.5-9b-abliterated"
    litellm_params:
      model: ollama/lukey03/qwen3.5-9b-abliterated:latest
      api_base: http://llm-ollama:11434
      num_ctx: 4096
  - model_name: "Qwen-3.5-Vertex-Final"
    litellm_params:
      model: "vertex_ai/qwen/qwen-35b-dedicated"
      api_base: "https://mg-endpoint-XXX.europe-XX-wmarusiak-homelab-ai.prediction.vertexai.goog/v1/projects/wmarusiak-homelab-ai/locations/europe-west4/endpoints/mg-endpoint-XXX3:predict"

I deployed qwen_qwen3_5-35b-a3b-mg-one-click-deploy to a dedicated Vertex AI endpoint, but I cannot use that endpoint through LiteLLM.

Your documentation at https://docs.litellm.ai/docs/providers/vertex_self_deployed states that I should use the following api_base:

api_base: https://ENDPOINT.us-central1-PROJECT.prediction.vertexai.goog/v1/projects/PROJECT_ID/locations/us-central1/endpoints/ENDPOINT_ID:predict

Is ENDPOINT the endpoint ID or the dedicated endpoint URL? Should the part after ENDPOINT really be the GCP location followed by PROJECT? And what is PROJECT: the project ID or the project number?

<img width="2113" height="1576" alt="Image" src="https://github.com/user-attachments/assets/ff33459e-8914-427a-bc07-91b5378d2871" />

Could you please clarify in the documentation what each placeholder means?

Steps to Reproduce

  1. Deploy a dedicated Vertex AI endpoint.
  2. Use a Qwen 3.5 model.
  3. Try to use LiteLLM with the Vertex AI custom model.

Relevant log output

09:08:58 - LiteLLM Proxy:ERROR: common_request_processing.py:1121 - litellm.proxy.proxy_server._handle_llm_api_exception(): Exception occured - litellm.InternalServerError: Vertex_aiException InternalServerError - Cannot connect to host mg-endpoint-cedf4bc7-2e0e-466c-bdd5-8ab69ef4fa63.europe-west4-wmarusiak-homelab-ai.prediction.vertexai.goog:443 ssl:<ssl.SSLContext object at 0x7bf6f8bb6170> [Name or service not known]. Received Model Group=Qwen-3.5-Vertex-Final
Available Model Group Fallbacks=None LiteLLM Retried: 2 times, LiteLLM Max Retries: 2
Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/aiohttp/connector.py", line 1562, in _create_direct_connection
    hosts = await self._resolve_host(host, port, traces=traces)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/aiohttp/connector.py", line 1178, in _resolve_host
    return await asyncio.shield(resolved_host_task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/aiohttp/connector.py", line 1209, in _resolve_host_with_throttle
    addrs = await self._resolver.resolve(host, port, family=self._family)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/aiohttp/resolver.py", line 40, in resolve
    infos = await self._loop.getaddrinfo(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<5 lines>...
    )
    ^
  File "uvloop/loop.pyx", line 1529, in getaddrinfo
socket.gaierror: [Errno -2] Name or service not known

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/aiohttp_transport.py", line 61, in map_aiohttp_exceptions
    yield
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/aiohttp_transport.py", line 297, in handle_async_request
    response = await self._make_aiohttp_request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<6 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/aiohttp_transport.py", line 275, in _make_aiohttp_request
    response = await client_session.request(**request_kwargs).__aenter__()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/aiohttp/client.py", line 1510, in __aenter__
    self._resp: _RetType = await self._coro
                           ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/aiohttp/client.py", line 779, in _request
    resp = await handler(req)
           ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/aiohttp/client.py", line 734, in _connect_and_send_request
    conn = await self._connector.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        req, traces=traces, timeout=real_timeout
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/aiohttp/connector.py", line 672, in connect
    proto = await self._create_connection(req, traces, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/aiohttp/connector.py", line 1239, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/aiohttp/connector.py", line 1568, in _create_direct_connection
    raise ClientConnectorDNSError(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorDNSError: Cannot connect to host mg-endpoint-cedf4bc7-2e0e-466c-bdd5-8ab69ef4fa63.europe-west4-wmarusiak-homelab-ai.prediction.vertexai.goog:443 ssl:<ssl.SSLContext object at 0x7bf6f8bb6170> [Name or service not known]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 175, in _make_common_async_call
    response = await async_httpx_client.post(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/logging_utils.py", line 297, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 474, in post
    return await self.single_connection_post_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<7 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 721, in single_connection_post_request
    response = await client.send(req, stream=stream)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/httpx/_client.py", line 1629, in send
    response = await self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<4 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/httpx/_client.py", line 1657, in _send_handling_auth
    response = await self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<3 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/httpx/_client.py", line 1694, in _send_handling_redirects
    response = await self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/httpx/_client.py", line 1730, in _send_single_request
    response = await transport.handle_async_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/aiohttp_transport.py", line 296, in handle_async_request
    with map_aiohttp_exceptions():
         ~~~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/lib/python3.13/contextlib.py", line 162, in __exit__
    self.gen.throw(value)
    ~~~~~~~~~~~~~~^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/aiohttp_transport.py", line 75, in map_aiohttp_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: Cannot connect to host mg-endpoint-cedf4bc7-2e0e-466c-bdd5-8ab69ef4fa63.europe-west4-wmarusiak-homelab-ai.prediction.vertexai.goog:443 ssl:<ssl.SSLContext object at 0x7bf6f8bb6170> [Name or service not known]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/main.py", line 612, in acompletion
    response = await init_response
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 691, in acompletion_stream_function
    completion_stream, _response_headers = await self.make_async_call_stream_helper(
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<15 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 750, in make_async_call_stream_helper
    response = await self._make_common_async_call(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 202, in _make_common_async_call
    raise self._handle_error(e=e, provider_config=provider_config)
          ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 4642, in _handle_error
    raise provider_config.get_error_class(
    ...<3 lines>...
    )
litellm.llms.openai.common_utils.OpenAIError: Cannot connect to host mg-endpoint-cedf4bc7-2e0e-466c-bdd5-8ab69ef4fa63.europe-west4-wmarusiak-homelab-ai.prediction.vertexai.goog:443 ssl:<ssl.SSLContext object at 0x7bf6f8bb6170> [Name or service not known]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/proxy/proxy_server.py", line 6264, in chat_completion
    result = await base_llm_response_processor.base_process_llm_request(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<16 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/proxy/common_request_processing.py", line 856, in base_process_llm_request
    responses = await llm_responses
                ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1533, in acompletion
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1509, in acompletion
    response = await self.async_function_with_fallbacks(**kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5104, in async_function_with_fallbacks
    return await self.async_function_with_fallbacks_common_utils(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<8 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5061, in async_function_with_fallbacks_common_utils
    raise original_exception
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5095, in async_function_with_fallbacks
    response = await self.async_function_with_retries(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5336, in async_function_with_retries
    raise original_exception
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5201, in async_function_with_retries
    response = await self.make_call(original_function, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5347, in make_call
    response = await response
               ^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1891, in _acompletion
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1837, in _acompletion
    response = await _response
               ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/utils.py", line 2041, in wrapper_async
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/utils.py", line 1862, in wrapper_async
    result = await original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/main.py", line 631, in acompletion
    raise exception_type(
          ~~~~~~~~~~~~~~^
        model=model,
        ^^^^^^^^^^^^
    ...<3 lines>...
        extra_kwargs=kwargs,
        ^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2398, in exception_type
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 1455, in exception_type
    raise litellm.InternalServerError(
    ...<9 lines>...
    )
litellm.exceptions.InternalServerError: litellm.InternalServerError: Vertex_aiException InternalServerError - Cannot connect to host mg-endpoint-cedf4bc7-2e0e-466c-bdd5-8ab69ef4fa63.europe-west4-wmarusiak-homelab-ai.prediction.vertexai.goog:443 ssl:<ssl.SSLContext object at 0x7bf6f8bb6170> [Name or service not known]. Received Model Group=Qwen-3.5-Vertex-Final
Available Model Group Fallbacks=None LiteLLM Retried: 2 times, LiteLLM Max Retries: 2
INFO:     172.26.0.26:38618 - "POST /chat/completions HTTP/1.1" 500 Internal Server Error

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

1.81.14

extent analysis

Fix Plan

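To resolve the connection failure, correct the api_base URL in your LiteLLM configuration. The log shows a DNS error (Name or service not known), which means the host part of the URL does not exist, so check that part first.

  1. Identify the correct endpoint URL: Following the template in the LiteLLM documentation, the URL has the form https://<ENDPOINT_ID>.<LOCATION>-<PROJECT>.prediction.vertexai.goog/v1/projects/<PROJECT_ID>/locations/<LOCATION>/endpoints/<ENDPOINT_ID>:predict. Note the dot, not a hyphen, between <ENDPOINT_ID> and <LOCATION>.
  2. Replace <ENDPOINT_ID>: Use the actual ID of your dedicated endpoint.
  3. Replace <LOCATION>: Use the location where your endpoint is deployed (e.g., europe-west4).
  4. Replace <PROJECT> and <PROJECT_ID>: In the path, use your GCP project ID as in the documentation template. For the host, copy the dedicated DNS name shown for the endpoint in the Google Cloud console. Vertex AI documents that name as <ENDPOINT_ID>.<LOCATION>-<PROJECT_NUMBER>.prediction.vertexai.goog, so the host most likely needs the numeric project number rather than the project ID used in the failing URL.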

Example:

model_list:
  - model_name: "Qwen-3.5-Vertex-Final"
    litellm_params:
      model: "vertex_ai/qwen/qwen-35b-dedicated"
      api_base: "https://mg-endpoint-XXX.europe-west4-XXXXXXX.prediction.vertexai.goog/v1/projects/XXXXXXX/locations/europe-west4/endpoints/mg-endpoint-XXX:predict"

Replace every XXX placeholder in the api_base URL with the real values for your endpoint.
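Assembling the URL from its parts makes it harder to mix up the host and path segments. The helper below is a minimal sketch that follows the template quoted from the LiteLLM documentation in the issue; the function name `build_api_base` and its parameter names are hypothetical, and whether `host_project` must be the project ID or the project number is an assumption to confirm against the DNS name shown for your endpoint.

```python
def build_api_base(endpoint_id: str, location: str,
                   host_project: str, path_project: str) -> str:
    """Assemble an api_base URL following the documentation template.

    host_project is the project segment of the host name (assumption:
    take it from the endpoint's dedicated DNS name); path_project is
    the project segment of the request path.
    """
    parts = {
        "endpoint_id": endpoint_id,
        "location": location,
        "host_project": host_project,
        "path_project": path_project,
    }
    for name, value in parts.items():
        # Reject empty or whitespace-only values before building the URL.
        if not value or not value.strip():
            raise ValueError(f"{name} must not be empty")

    host = f"{endpoint_id}.{location}-{host_project}.prediction.vertexai.goog"
    path = (f"/v1/projects/{path_project}/locations/{location}"
            f"/endpoints/{endpoint_id}:predict")
    return f"https://{host}{path}"


if __name__ == "__main__":
    # All values here are placeholders, not real identifiers.
    print(build_api_base("ENDPOINT_ID", "europe-west4",
                         "PROJECT_NUMBER", "PROJECT_ID"))
```

The printed string can be pasted into the `api_base` field of the model entry.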

Verification

After updating the configuration, restart your LiteLLM service and call the dedicated endpoint again. The fix worked if POST /chat/completions no longer returns 500 Internal Server Error and the LiteLLM log no longer shows the Name or service not known error for the Vertex AI host.
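Because the root cause in the traceback is `socket.gaierror` raised during host resolution, the host name can also be checked before restarting anything. The following is a rough sketch using only the Python standard library; `host_resolves` is a hypothetical helper name, and it tests DNS resolution only, not authentication or the model itself.

```python
import socket
from urllib.parse import urlsplit


def host_resolves(api_base: str) -> bool:
    """Return True if the host in api_base resolves, False on a DNS error."""
    host = urlsplit(api_base).hostname
    if not host:
        raise ValueError(f"no host found in {api_base!r}")
    try:
        # Same kind of lookup that fails in the traceback (getaddrinfo).
        socket.getaddrinfo(host, 443)
    except socket.gaierror:
        return False
    return True


if __name__ == "__main__":
    import sys

    # Usage: python check_dns.py "<your api_base URL>"
    for candidate in sys.argv[1:]:
        status = "resolves" if host_resolves(candidate) else "does NOT resolve"
        print(f"{candidate}: host {status}")
```

Run it from inside the LiteLLM container if possible, so the lookup uses the same resolver as the proxy.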

Extra Tips

  • Ensure that your GCP project ID and endpoint ID are correct.
  • Double-check the location of your endpoint and update the api_base URL accordingly.
  • If you're still experiencing issues, try checking the Vertex AI documentation for any specific requirements or restrictions on using dedicated endpoints.
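The first two tips can be partly automated by comparing the host and path segments of the URL against each other. This is an illustrative sketch, not part of LiteLLM: `check_api_base` and both regular expressions are hypothetical and assume the URL shape from the documentation template quoted in the issue. It cannot tell whether the project segment in the host is correct, only whether the endpoint and location segments agree.

```python
import re
from urllib.parse import urlsplit

# Host shape assumed from the template: ENDPOINT.LOCATION-PROJECT.prediction.vertexai.goog
HOST_RE = re.compile(
    r"^(?P<endpoint>[^.]+)\.(?P<rest>[^.]+)\.prediction\.vertexai\.goog$"
)
# Path shape assumed from the template.
PATH_RE = re.compile(
    r"^/v1/projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/endpoints/(?P<endpoint>[^/:]+):predict$"
)


def check_api_base(api_base: str) -> list[str]:
    """Return a list of inconsistencies found in api_base (empty if none)."""
    parts = urlsplit(api_base)
    problems = []
    host_m = HOST_RE.match(parts.hostname or "")
    path_m = PATH_RE.match(parts.path)
    if not host_m:
        problems.append("host does not match ENDPOINT.LOCATION-PROJECT.prediction.vertexai.goog")
    if not path_m:
        problems.append("path does not match /v1/projects/.../locations/.../endpoints/...:predict")
    if host_m and path_m:
        # urlsplit lowercases the host, so compare case-insensitively.
        if host_m["endpoint"] != path_m["endpoint"].lower():
            problems.append("endpoint ID differs between host and path")
        if not host_m["rest"].startswith(path_m["location"].lower() + "-"):
            problems.append("location differs between host and path")
    return problems


if __name__ == "__main__":
    import sys

    for candidate in sys.argv[1:]:
        issues = check_api_base(candidate)
        print(candidate, "->", issues or "segments are consistent")
```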
