litellm - [Bug]: Vertex Gemini traffic_type is only stored in _hidden_params and not exposed in normalized proxy response or headers

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When calling Vertex AI Gemini models through LiteLLM Proxy, LiteLLM appears to know the Vertex traffic type internally, but does not expose it consistently in the public/normalized response.

For my Gemini alias, the proxy config is:

model_list:
  - model_name: gemini-25-flash-lite-dedicated
    litellm_params:
      model: vertex_ai/gemini-2.5-flash-lite
      vertex_project: os.environ/VERTEXAI_PROJECT
      vertex_location: os.environ/VERTEXAI_LOCATION
      vertex_credentials: os.environ/GOOGLE_APPLICATION_CREDENTIALS
      extra_headers:
        X-Vertex-AI-LLM-Request-Type: dedicated
        custom-llm-provider: vertex_ai

litellm_settings:
  return_response_headers: true

Observed behavior:

- LiteLLM sends the request successfully to Vertex AI.
- LiteLLM internally records traffic_type.
- For Gemini, however, the value is only visible in _hidden_params.provider_specific_fields.traffic_type.
- The normalized response body does not include usage.extra_properties.google.traffic_type.
- The proxy response headers do not expose a stable traffic type header either, even with return_response_headers: true.

I had to add a custom callback to:

- read _hidden_params.provider_specific_fields.traffic_type
- backfill usage.extra_properties.google.traffic_type
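
The report does not include the callback itself. As a minimal sketch of the backfill step, assuming the response and the hidden parameters are available as plain dictionaries, the logic could look like the following. The function name backfill_traffic_type, the helper _child, and the dict-based shapes are illustrative assumptions, not LiteLLM API.

```python
from copy import deepcopy
from typing import Any, Dict, Optional


def _child(parent: Dict[str, Any], key: str) -> Dict[str, Any]:
    """Return parent[key] as a dict, creating or replacing it if it is not one."""
    child = parent.get(key)
    if not isinstance(child, dict):
        child = {}
        parent[key] = child
    return child


def backfill_traffic_type(
    response: Dict[str, Any], hidden_params: Optional[Dict[str, Any]]
) -> Dict[str, Any]:
    """Copy traffic_type from hidden params into usage.extra_properties.google.

    Both arguments are assumed to be plain dicts (hypothetical shapes based on
    the field paths named in the report). The input response is not mutated.
    """
    provider_fields = (hidden_params or {}).get("provider_specific_fields") or {}
    traffic_type = provider_fields.get("traffic_type")
    if traffic_type is None:
        # Nothing recorded internally, so there is nothing to backfill.
        return response

    patched = deepcopy(response)
    usage = _child(patched, "usage")
    extra = _child(usage, "extra_properties")
    google = _child(extra, "google")
    # Keep any value that is already present in the normalized response.
    google.setdefault("traffic_type", traffic_type)
    return patched
```

How such a function is attached to the proxy (for example through LiteLLM's custom callback mechanism) depends on the LiteLLM version and is not shown in the report.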

Expected behavior: LiteLLM should expose Vertex traffic_type consistently in the public normalized response for Vertex Gemini, the same way it already does for some other Vertex model routes like Vertex OpenAI/Qwen.

In other words, LiteLLM already seems to have the data, but it is not surfacing it consistently.

Steps to Reproduce

Start LiteLLM Proxy with a Vertex Gemini alias like this:

model_list:
  - model_name: gemini-25-flash-lite-dedicated
    litellm_params:
      model: vertex_ai/gemini-2.5-flash-lite
      vertex_project: os.environ/VERTEXAI_PROJECT
      vertex_location: os.environ/VERTEXAI_LOCATION
      vertex_credentials: os.environ/GOOGLE_APPLICATION_CREDENTIALS
      extra_headers:
        X-Vertex-AI-LLM-Request-Type: dedicated
        custom-llm-provider: vertex_ai

litellm_settings:
  return_response_headers: true

Send a request through LiteLLM Proxy:

curl -sS \
  -D /tmp/headers.txt \
  -o /tmp/body.json \
  -X POST http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-local-litellm-dev" \
  -H "Content-Type: application/json" \
  --data '{"model":"gemini-25-flash-lite-dedicated","messages":[{"role":"user","content":"Hello, what model are you? Answer in one sentence."}],"temperature":0.6,"max_tokens":256}'

Inspect the headers and body:

cat /tmp/headers.txt
jq '.usage.extra_properties.google.traffic_type // .usageMetadata.trafficType // "not_found"' /tmp/body.json

Observe that:

- the request succeeds
- the response body does not contain usage.extra_properties.google.traffic_type
- the proxy headers do not contain a stable traffic type header
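
The same check can be scripted. The sketch below mirrors the jq fallback chain from the inspection step in Python and adds a simple scan of the saved header dump. The helper names find_traffic_type and traffic_headers are hypothetical, and the header scan looks for any header whose name contains "traffic", since the report does not name a specific header.

```python
import json
from typing import Any, Dict, List


def find_traffic_type(body: Dict[str, Any]) -> str:
    """Look for the traffic type in the normalized field, then the raw Vertex field."""
    usage = body.get("usage") or {}
    google = (usage.get("extra_properties") or {}).get("google") or {}
    normalized = google.get("traffic_type")
    if normalized is not None:
        return normalized
    raw = (body.get("usageMetadata") or {}).get("trafficType")
    return raw if raw is not None else "not_found"


def traffic_headers(raw_headers: str) -> List[str]:
    """Return header lines whose name mentions 'traffic' (case-insensitive)."""
    matches = []
    for line in raw_headers.splitlines():
        name, sep, _value = line.partition(":")
        # Lines without a colon (such as the HTTP status line) are skipped.
        if sep and "traffic" in name.lower():
            matches.append(line.strip())
    return matches


def inspect_files(headers_path: str, body_path: str) -> None:
    """Print the findings for the files written by the curl command."""
    with open(headers_path, encoding="utf-8") as fh:
        print("traffic headers:", traffic_headers(fh.read()))
    with open(body_path, encoding="utf-8") as fh:
        print("traffic type in body:", find_traffic_type(json.load(fh)))
```

Calling inspect_files("/tmp/headers.txt", "/tmp/body.json") after the curl step prints both results in one go.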

Compare with Vertex OpenAI/Qwen routes, where LiteLLM does expose usage.extra_properties.google.traffic_type.

This also reproduces for my -no-custom-provider Gemini alias, so it does not seem specific to the custom-llm-provider header.

Relevant log output

=== Request ===
  URL: https://aiplatform.googleapis.com/v1/projects/<redacted-project>/locations/global/publishers/google/models/gemini-2.5-flash-lite:generateContent
  X-Vertex-AI-LLM-Request-Type: dedicated

  === Response Status ===
  HTTP 200

  === Response Headers ===
  HTTP/2 200
  x-vertex-ai-llm-request-type: dedicated
  content-type: application/json; charset=UTF-8
  vary: X-Origin
  vary: Referer
  vary: Origin,Accept-Encoding
  date: <redacted-timestamp>
  server: scaffolding on HTTPServer2
  x-xss-protection: 0
  x-frame-options: SAMEORIGIN
  x-content-type-options: nosniff
  alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
  accept-ranges: none

  === Response Body ===
  {
    "candidates": [
      {
        "content": {
          "role": "model",
          "parts": [
            {
              "text": "I am a large language model, trained by Google."
            }
          ]
        },
        "finishReason": "STOP",
        "avgLogprobs": -0.00013389628888531163
      }
    ],
    "usageMetadata": {
      "promptTokenCount": 12,
      "candidatesTokenCount": 11,
      "totalTokenCount": 23,
      "trafficType": "PROVISIONED_THROUGHPUT",
      "promptTokensDetails": [
        {
          "modality": "TEXT",
          "tokenCount": 12
        }
      ],
      "candidatesTokensDetails": [
        {
          "modality": "TEXT",
          "tokenCount": 11
        }
      ]
    },
    "modelVersion": "gemini-2.5-flash-lite",
    "createTime": "<redacted-timestamp>",
    "responseId": "<redacted-response-id>"
  }
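
The raw Vertex body in this log already carries everything needed for the normalized field. The following sketch shows the mapping the report expects, from usageMetadata to an OpenAI-style usage object with extra_properties.google.traffic_type. It illustrates the target shape only and is not LiteLLM's actual transformation code; normalize_usage is a hypothetical name.

```python
from typing import Any, Dict


def normalize_usage(vertex_body: Dict[str, Any]) -> Dict[str, Any]:
    """Map a raw Vertex generateContent body to an OpenAI-style usage dict."""
    meta = vertex_body.get("usageMetadata") or {}
    usage: Dict[str, Any] = {
        "prompt_tokens": meta.get("promptTokenCount", 0),
        "completion_tokens": meta.get("candidatesTokenCount", 0),
        "total_tokens": meta.get("totalTokenCount", 0),
    }
    traffic_type = meta.get("trafficType")
    if traffic_type is not None:
        # Field path taken from the report's expected behavior.
        usage["extra_properties"] = {"google": {"traffic_type": traffic_type}}
    return usage


# Values copied from the usageMetadata block in the log output.
example_body = {
    "usageMetadata": {
        "promptTokenCount": 12,
        "candidatesTokenCount": 11,
        "totalTokenCount": 23,
        "trafficType": "PROVISIONED_THROUGHPUT",
    }
}
example_usage = normalize_usage(example_body)
```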

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on ?

v1.82.3

Twitter / LinkedIn details

No response

extent analysis

TL;DR

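LiteLLM already records the Vertex traffic_type for Gemini responses, but only in _hidden_params.provider_specific_fields.traffic_type. The report does not identify any proxy configuration setting that exposes it: return_response_headers: true is already enabled and the value still does not appear in the normalized response body or the headers. Until LiteLLM surfaces the field for Vertex Gemini, the reporter's custom callback is the available workaround.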

Guidance

- Confirm that return_response_headers is enabled in the LiteLLM Proxy configuration. In the report it is already set to true and the traffic type still does not appear in the headers, so this setting alone is not sufficient.
- Confirm that traffic_type is recorded internally by inspecting _hidden_params.provider_specific_fields.traffic_type, for example from a custom callback.
- Add a custom callback that backfills usage.extra_properties.google.traffic_type, as the reporter did.
- Check the LiteLLM Proxy documentation and release notes for your version for any option that exposes traffic_type for Vertex Gemini; the report does not identify one.

Example

The report describes the callback workaround but does not include its code.

Notes

The issue seems to be specific to Vertex Gemini models and not a general problem with LiteLLM Proxy. The workaround of adding a custom callback to backfill the usage.extra_properties.google.traffic_type field may be necessary until a more permanent solution is found.

Recommendation

Apply workaround: Add a custom callback to backfill the usage.extra_properties.google.traffic_type field, as this is the most straightforward solution to expose the traffic type consistently for Vertex Gemini models.
