litellm - ✅(Solved) Fix [Bug]: Google-native :generateContent route silently drops correlation metadata (call_id, tags, user) from spend logs [3 pull requests, 1 participant]

litellm · 2026-04-17 14:35:52

GitHub stats

BerriAI/litellm#25956•Fetched 2026-04-18 05:52:52

Author

dkindlund

Participants

dkindlund

Timeline (top)

cross-referenced ×3, labeled ×1, mentioned ×1, subscribed ×1

The Google-native /v1beta/models/{model}:generateContent and /v1beta/models/{model}:streamGenerateContent routes silently drop all client-provided correlation metadata before it reaches LiteLLM_SpendLogs. Spend records are written with the correct token counts, cost, model, and timestamps — but every queryable correlation field is empty:

| Client sends | Spend log field | Behavior |
| --- | --- | --- |
| `x-litellm-call-id: <uuid>` header | `request_id` | Silently overridden by Gemini's `response.id` |
| `x-litellm-tags: scan_id=<uuid>` header | `request_tags` | Silently dropped (`[]`) |
| `user: <uuid>` in request body | `user` / `end_user` | Silently dropped (`""`) |

Because none of these survive, callers cannot filter spend logs to attribute spend back to a specific scan, run, agent invocation, or end user. This blocks any per-call cost-attribution use case downstream of the proxy.

The same call patterns work correctly on /v1/chat/completions (OpenAI-compat) and /v1/messages (Anthropic-compat) on the same proxy — so this is a Google-native-specific gap, not a global misconfiguration.

PR fix notes

PR #25952: fix(proxy): honor client-supplied x-litellm-call-id in spend log request_id

Repository: BerriAI/litellm
Author: dkindlund
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/25952

Description (problem / solution / changelog)

Summary

When a client sends x-litellm-call-id: <my-uuid> explicitly, today the proxy silently discards it from the spend log: the spend record exists, but with an unrelated provider response id as request_id, so /spend/logs/v2?request_id=<my-uuid> returns empty.

Root cause

get_spend_logs_id (litellm/proxy/spend_tracking/spend_tracking_utils.py:166) does:

id = response_obj.get("id") or kwargs.get("litellm_call_id")

response_obj.id is the provider's own response id (e.g. Gemini's "QanhadapC_val7oP15PyuQM", OpenAI's "chatcmpl-...", Anthropic's "msg_..."). It's always present, so it always wins — litellm_call_id is never used to populate request_id in the spend log.

This is most painful on the Google-native :generateContent route, where the response also lacks x-litellm-* headers (see #25500), so clients can't even read back the resolved id from the response. But the underlying defect is cross-route.

Fix

Track whether litellm_call_id was supplied by the client via x-litellm-call-id (vs. an auto-generated UUID). When the client supplied it, prefer it over response.id for the spend log request_id. When the call_id is auto-generated, fall back to existing behavior (response.id first, then the auto-uuid).

This preserves backward compatibility for callers that don't set the header (existing tests at tests/test_keys.py:519, tests/test_spend_logs.py:115/181, etc., still assert request_id == response["id"]).
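The precedence rule described in this section can be sketched as follows. This is a minimal illustration, not the PR's code: `resolve_spend_log_request_id` is a hypothetical stand-in for `get_spend_logs_id`, and it assumes the `litellm_call_id_from_client` flag is readable from the same `kwargs` as `litellm_call_id`. The `aretrieve_batch` carve-out mentioned under "Files changed" is not modelled.

```python
from typing import Any, Mapping, Optional


def resolve_spend_log_request_id(
    response_obj: Mapping[str, Any], kwargs: Mapping[str, Any]
) -> Optional[str]:
    """Hypothetical stand-in for get_spend_logs_id; not the real signature."""
    response_id = response_obj.get("id")
    call_id = kwargs.get("litellm_call_id")

    # Assumption: the flag travels in kwargs next to litellm_call_id.
    if kwargs.get("litellm_call_id_from_client") and call_id:
        return call_id  # the client-supplied header value wins

    # Default path is unchanged: provider response id first, then the call id.
    return response_id or call_id
```

Callers that never send the header keep getting the provider's response id, which is why the existing assertions on `request_id == response["id"]` would still hold under this rule.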

Files changed

litellm/proxy/common_request_processing.py — set data["litellm_call_id_from_client"] = True when the header is present (covers /v1/chat/completions, /v1/messages, etc.)
litellm/proxy/google_endpoints/endpoints.py — same flag for both :generateContent and :streamGenerateContent
litellm/proxy/spend_tracking/spend_tracking_utils.py — get_spend_logs_id honors the new flag
tests/test_litellm/proxy/spend_tracking/test_spend_tracking_utils.py — 6 new tests (TestGetSpendLogsId) covering both default and flagged precedence, plus the aretrieve_batch carve-out
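The request-side half of the change (marking a call id as client-supplied) might look roughly like the sketch below. `apply_call_id_header` is a hypothetical helper, and the sketch assumes the header mapping already has lower-cased keys; the actual edits in `common_request_processing.py` and `endpoints.py` may be structured differently.

```python
import uuid
from typing import Any, Dict, Mapping


def apply_call_id_header(headers: Mapping[str, str], data: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: record the call id and whether the client chose it."""
    client_call_id = headers.get("x-litellm-call-id")
    if client_call_id:
        data["litellm_call_id"] = client_call_id
        data["litellm_call_id_from_client"] = True
    else:
        # No header: fall back to an auto-generated UUID and leave the flag unset.
        data["litellm_call_id"] = str(uuid.uuid4())
    return data
```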

Manual testing

# Without the fix
curl -X POST "https://<proxy>/v1beta/models/gemini-2.5-pro:generateContent" \
  -H "x-goog-api-key: sk-..." \
  -H "x-litellm-call-id: my-uuid-123" \
  -H "Content-Type: application/json" \
  -d '{"contents": [{"parts": [{"text": "hi"}], "role": "user"}]}'

curl "https://<proxy>/spend/logs/v2?request_id=my-uuid-123"
# → data: []  (BAD)

# With the fix
# → data: [{"request_id": "my-uuid-123", ...}]  (GOOD)

Tests

tests/test_litellm/proxy/spend_tracking/test_spend_tracking_utils.py::TestGetSpendLogsId::test_default_prefers_response_id PASSED
tests/test_litellm/proxy/spend_tracking/test_spend_tracking_utils.py::TestGetSpendLogsId::test_default_falls_back_to_litellm_call_id PASSED
tests/test_litellm/proxy/spend_tracking/test_spend_tracking_utils.py::TestGetSpendLogsId::test_client_supplied_flag_prefers_litellm_call_id PASSED
tests/test_litellm/proxy/spend_tracking/test_spend_tracking_utils.py::TestGetSpendLogsId::test_client_supplied_flag_falls_back_to_response_id PASSED
tests/test_litellm/proxy/spend_tracking/test_spend_tracking_utils.py::TestGetSpendLogsId::test_client_supplied_flag_false_uses_default_precedence PASSED
tests/test_litellm/proxy/spend_tracking/test_spend_tracking_utils.py::TestGetSpendLogsId::test_aretrieve_batch_unchanged PASSED

Test plan:

Add unit tests for new behavior
Verify existing tests asserting request_id == response["id"] are unaffected (default path unchanged)
Manual end-to-end on Google-native :generateContent route

Part of a larger cluster of bugs in the Google-native /v1beta/... route (separate detailed issue forthcoming). Companion fixes:

#25500 (in flight) — outbound x-litellm-* response headers on Google-native routes
#24097 — streaming success_callback silently skipped on Google-native streaming
Forthcoming PR — metadata / user propagation into spend logs on Google-native non-streaming

Tracking issue

#25956 — full root-cause writeup for the Google-native correlation cluster (this PR is Fix #1 of 4 defects)

Changed files

litellm/proxy/common_request_processing.py (modified, +10/-3)
litellm/proxy/google_endpoints/endpoints.py (modified, +20/-6)
litellm/proxy/spend_tracking/spend_tracking_utils.py (modified, +6/-0)
tests/test_litellm/proxy/spend_tracking/test_spend_tracking_utils.py (modified, +93/-0)

PR #25955: fix(google_genai): propagate user from kwargs to logging obj in agenerate_content

Repository: BerriAI/litellm
Author: dkindlund
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/25955

Description (problem / solution / changelog)

Summary

The Google-native /v1beta/models/{model}:generateContent and /v1beta/models/{model}:streamGenerateContent routes silently drop the user field from the spend log. Spend records are written with the correct cost and tokens, but user: "" and end_user: "", blocking per-end-user attribution on this route.

Root cause

The Google-native handler in litellm/proxy/google_endpoints/endpoints.py reads user from the request body via add_litellm_data_to_request, which correctly populates data["user"]. That value flows through function_setup and then through llm_router.agenerate_content(**data) into litellm.agenerate_content, where kwargs["user"] is set.

But in setup_generate_content_call (litellm/google_genai/main.py:185), the call to litellm_logging_obj.update_from_kwargs doesn't pass user:

litellm_logging_obj.update_from_kwargs(
    kwargs=kwargs,
    model=model,
    optional_params=dict(generate_content_config_dict),
    litellm_params={
        "litellm_call_id": litellm_call_id,
    },
    custom_llm_provider=custom_llm_provider,
)

Without user=..., update_from_kwargs defers to its default user=None, so logging_obj.user and model_call_details["user"] are None, and the spend log row shows an empty user field.

OpenAI-compat /v1/chat/completions works because litellm.completion calls logging.update_environment_variables(user=user, ...) with the resolved user value (litellm/main.py:1597, 4806, 6325, 6496, 6746).

Fix

Pass user=kwargs.get("user") to update_from_kwargs so the logging object reflects what the client sent.
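To make the effect of the missing argument concrete, here is a small sketch with a stub in place of the real logging object. `StubLoggingObj` is hypothetical and models only the one behaviour described above: `user` comes from the explicit argument and defaults to `None`.

```python
from typing import Any, Dict, Optional


class StubLoggingObj:
    """Hypothetical stand-in for the logging object; only user handling is modelled."""

    def __init__(self) -> None:
        self.model_call_details: Dict[str, Any] = {}

    def update_from_kwargs(self, kwargs: Dict[str, Any], user: Optional[str] = None) -> None:
        # As described above: `user` is taken from the explicit argument,
        # not looked up inside `kwargs`.
        self.model_call_details["user"] = user


request_kwargs = {"user": "my-end-user-uuid-456"}

before = StubLoggingObj()
before.update_from_kwargs(kwargs=request_kwargs)  # user omitted, as on main

after = StubLoggingObj()
after.update_from_kwargs(kwargs=request_kwargs, user=request_kwargs.get("user"))
```

In the `before` case the stored user stays `None` even though the client sent one, which matches the empty `user` / `end_user` columns in the spend log.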

Files changed

litellm/google_genai/main.py — pass user through (one-line change)
tests/test_litellm/google_genai/test_google_genai_main.py — regression test using a real Logging instance with stubbed update_from_kwargs to verify user is propagated

Tests

Added test_setup_generate_content_call_propagates_user_to_logging_obj which:

Constructs a real Logging instance (required by Pydantic on GenerateContentSetupResult)
Stubs update_from_kwargs
Calls setup_generate_content_call with user="my-end-user-uuid-456"
Asserts update_from_kwargs.call_args.kwargs["user"] == "my-end-user-uuid-456"

Verified the test fails on main and passes with this commit.
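The assertion pattern used by the regression test can be sketched with `unittest.mock` alone. `setup_call` below is a hypothetical, heavily reduced stand-in for `setup_generate_content_call`, and a `MagicMock` replaces the real `Logging` instance that the actual test constructs.

```python
from unittest.mock import MagicMock


def setup_call(logging_obj, **kwargs):
    """Hypothetical, reduced stand-in for setup_generate_content_call."""
    logging_obj.update_from_kwargs(kwargs=kwargs, user=kwargs.get("user"))


def test_user_is_forwarded():
    logging_obj = MagicMock()  # the real test uses a real Logging instance
    setup_call(logging_obj, model="gemini-2.5-pro", user="my-end-user-uuid-456")
    # call_args.kwargs holds the keyword arguments of the most recent call.
    assert logging_obj.update_from_kwargs.call_args.kwargs["user"] == "my-end-user-uuid-456"


test_user_is_forwarded()
```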

tests/test_litellm/google_genai/test_google_genai_main.py::test_setup_generate_content_call_propagates_user_to_logging_obj PASSED

Manual testing

# Without the fix
curl -X POST "https://<proxy>/v1beta/models/gemini-2.5-pro:generateContent" \
  -H "x-goog-api-key: sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "hi"}], "role": "user"}],
    "user": "my-end-user-uuid-456"
  }'

curl "https://<proxy>/spend/logs/v2?api_key=<hashed>"
# → {... "user": "", "end_user": ""}  (BAD)

# With the fix
# → {... "user": "my-end-user-uuid-456", "end_user": "my-end-user-uuid-456"}  (GOOD)

Part of a larger cluster of bugs in the Google-native /v1beta/... route (separate detailed issue forthcoming). Companion fixes:

#25500 (in flight) — outbound x-litellm-* response headers on Google-native routes
#25952 — x-litellm-call-id precedence in spend log request_id
#24097 — streaming success_callback silently skipped on Google-native streaming

Note: tags propagation

The developer's repro report also noted request_tags: [] (empty) when sending x-litellm-tags. From code inspection, add_litellm_data_to_request and update_from_kwargs should propagate metadata["tags"] correctly through the agenerate_content path, so tags may already work — or there may be a separate downstream gap in how StandardLoggingPayload._get_request_tags reads from litellm_params.metadata for the agenerate_content call_type. That investigation is out of scope for this PR. The user fix here is independent and unblocks per-end-user attribution today; tags can be addressed in a follow-up.

Tracking issue

#25956 — full root-cause writeup for the Google-native correlation cluster (this PR is Fix #3 of 4 defects)

Changed files

litellm/google_genai/main.py (modified, +1/-0)
tests/test_litellm/google_genai/test_google_genai_main.py (modified, +65/-8)

PR #25960: fix(google_genai): route streaming chunks to GeminiPassthroughLoggingHandler so success_callbacks fire (closes #24097, supersedes #24114)

Repository: BerriAI/litellm
Author: dkindlund
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/25960

Description (problem / solution / changelog)

Summary

Fixes #24097.

Supersedes #24114 (closed by author without merging on 2026-03-19). This revival preserves the structural fix and incorporates Greptile's review feedback that was outstanding when the original was closed.

What was broken

For the Google-native streaming endpoints /models/{model}:streamGenerateContent and /v1beta/models/{model}:streamGenerateContent, the streaming iterator was tagging collected chunks as EndpointType.VERTEX_AI. The downstream _route_streaming_logging_to_handler had no VERTEX_AI branch that knew how to parse Google GenAI native chunks, so async_complete_streaming_response was never set and every function-based and CustomLogger success callback was silently skipped on stream end.

Sync callers were doubly broken: __next__ re-raised StopIteration without ever invoking the logging route at all.

The fix

Add EndpointType.GOOGLE_GENAI = "google-genai" to the enum.
In the async iterator, tag chunks as GOOGLE_GENAI, pass the real /models/{model}:streamGenerateContent URL, and forward the explicit model kwarg so downstream handlers do not have to fall back to URL parsing.
In the sync iterator, mirror the async path on StopIteration so sync callers also receive callbacks. (This was the gap PR #24114 left open.)
Add a GOOGLE_GENAI routing branch in _route_streaming_logging_to_handler that dispatches to GeminiPassthroughLoggingHandler._handle_logging_gemini_collected_chunks.
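The routing idea in the steps above can be sketched as a dispatch on the endpoint type. Everything here except the `GOOGLE_GENAI = "google-genai"` value is an assumption made for illustration: the `VERTEX_AI` string value, the handler functions, and `route_streaming_logging` are hypothetical and much simpler than the real handlers.

```python
from enum import Enum
from typing import Callable, Dict, List


class EndpointType(Enum):
    # Only the GOOGLE_GENAI value comes from the PR; the other value is a placeholder.
    VERTEX_AI = "vertex-ai"
    GOOGLE_GENAI = "google-genai"


def handle_vertex_chunks(raw_bytes: List[bytes], model: str) -> str:
    return f"vertex handler parsed {len(raw_bytes)} chunk(s) for {model}"


def handle_gemini_chunks(raw_bytes: List[bytes], model: str) -> str:
    return f"gemini handler parsed {len(raw_bytes)} chunk(s) for {model}"


HANDLERS: Dict[EndpointType, Callable[[List[bytes], str], str]] = {
    EndpointType.VERTEX_AI: handle_vertex_chunks,
    EndpointType.GOOGLE_GENAI: handle_gemini_chunks,
}


def route_streaming_logging(endpoint_type: EndpointType, raw_bytes: List[bytes], model: str) -> str:
    """Hypothetical reduction of the routing step: pick a chunk parser by endpoint type."""
    return HANDLERS[endpoint_type](raw_bytes, model)
```

Passing `model` explicitly, as the PR does, means the chosen handler does not have to recover it from the URL.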

Tests

tests/test_litellm/proxy/pass_through_endpoints/llm_provider_handlers/test_google_genai_streaming_callbacks.py adds 7 regression tests, all runtime-behavior based:

| Test | What it covers |
| --- | --- |
| `TestEndpointTypeEnum::test_google_genai_member_exists` | Enum has `GOOGLE_GENAI = "google-genai"` |
| `TestEndpointTypeEnum::test_existing_endpoint_types_preserved` | Sanity guard for VERTEX_AI/ANTHROPIC/OPENAI/GENERIC |
| `test_async_iterator_routes_with_google_genai_endpoint_type` | Async iterator routes with right enum + URL + model |
| `test_sync_iterator_routes_with_google_genai_endpoint_type` | New: sync iterator does the same (closes #24114 gap) |
| `test_streaming_handler_routes_google_genai_to_gemini_handler` | Handler dispatches `GOOGLE_GENAI` to Gemini handler with `model` kwarg |
| `test_streaming_handler_does_not_route_vertex_ai_to_gemini_handler` | Regression guard: VERTEX_AI still goes to Vertex handler |
| `test_callbacks_actually_fire_for_google_genai_endpoint` | End-to-end: uses real `Logging` instance and real `CustomLogger` subclass; verifies `async_log_success_event` actually fires (not just that the right code path was taken) |

Verified locally: 6 of 7 fail on main without this commit, all 7 pass with it.

============================== 7 passed in 0.54s ===============================

Differences from PR #24114 (Greptile feedback addressed)

| Greptile concern | How this revival addresses it |
| --- | --- |
| P0: pre-setting `async_complete_streaming_response` triggered early-return guard | Not pre-set; relies on `async_success_handler`'s own pass-through branch to set it. |
| P1: end-to-end test mocked `async_success_handler`, hiding the regression | New `test_callbacks_actually_fire_for_google_genai_endpoint` uses a real `Logging` instance and a real `CustomLogger` subclass, so it actually exercises the dispatch. |
| P2: dead-code assertion message tuple `mock.assert_called_once(), ("msg")` | Replaced with positional `assert mock.call_count == 1, "msg"`. |
| P2: `inspect.getsource` would falsely fail on a comment containing `VERTEX_AI` | Replaced with runtime-behavior tests that drive iteration end-to-end and inspect captured kwargs. |
| P2: `sys.path.insert` working-directory hack | Removed. The standard pytest config makes `litellm` importable. |
| P2: `raw_bytes` type mismatch (`bytes` vs `List[bytes]`) | Test stubs use `List[bytes]` matching the real signature. |
| P2: missing `model` kwarg in iterator → handler call | Iterator now passes `model=self.model` explicitly; handler test asserts forwarding. |
| P2: hardcoded `url_route="/v1/generateContent"` did not match `/models/{model}:streamGenerateContent` | Iterator now uses `f"/models/{self.model}:streamGenerateContent"`. |
| Functional gap (not raised in #24114, but real): sync iterator never invoked logging on `StopIteration` | Fixed. New `_handle_sync_streaming_logging` mirrors the async path. |
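The sync-iterator gap in the last row can be illustrated with a generic wrapper that logs once when the underlying stream is exhausted. `LoggedSyncStream` and its `on_complete` callback are hypothetical names for this sketch; the PR's `_handle_sync_streaming_logging` is not reproduced here.

```python
from typing import Callable, Iterator, List


class LoggedSyncStream:
    """Hypothetical wrapper: collect chunks and log once the stream is exhausted."""

    def __init__(self, source: Iterator[bytes], on_complete: Callable[[List[bytes]], None]) -> None:
        self._source = source
        self._on_complete = on_complete
        self._collected: List[bytes] = []
        self._logged = False

    def __iter__(self) -> "LoggedSyncStream":
        return self

    def __next__(self) -> bytes:
        try:
            chunk = next(self._source)
        except StopIteration:
            # Re-raising here without logging is the gap described above:
            # no callback would ever see the completed stream.
            if not self._logged:
                self._logged = True
                self._on_complete(self._collected)
            raise
        self._collected.append(chunk)
        return chunk


seen: List[List[bytes]] = []
chunks = list(LoggedSyncStream(iter([b"a", b"b"]), seen.append))
```

The `_logged` guard keeps the callback from firing twice if a caller invokes `next()` again after exhaustion.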

Also dropped unrelated formatting-only changes that were carried in the original PR (team_endpoints.py, litellm_logging.py) to keep this PR's scope focused.

Files changed

litellm/types/passthrough_endpoints/pass_through_endpoints.py (+1 enum value)
litellm/google_genai/streaming_iterator.py (refactored async logging into _build_logging_kwargs; new sync logging path; switched to GOOGLE_GENAI)
litellm/proxy/pass_through_endpoints/streaming_handler.py (added GOOGLE_GENAI routing branch + import)
tests/test_litellm/proxy/pass_through_endpoints/llm_provider_handlers/test_google_genai_streaming_callbacks.py (new file, 7 tests)

Credits

Original structural approach by @awais786 in #24114. Sync iterator fix, test rewrite, and end-to-end callback-fires test added in this revival.

Tracking

Part of the Google-native correlation cluster — see #25956 for the unified writeup. Companion PRs:

#25500 — outbound x-litellm-* response headers (in flight)
#25952 — x-litellm-call-id precedence in spend log request_id (Fix #1 in tracking issue #25956)
#25955 — user propagation in agenerate_content logging (Fix #3 in tracking issue #25956)

Changed files

litellm/google_genai/streaming_iterator.py (modified, +58/-12)
litellm/proxy/pass_through_endpoints/streaming_handler.py (modified, +20/-0)
litellm/types/passthrough_endpoints/pass_through_endpoints.py (modified, +1/-0)
tests/test_litellm/proxy/pass_through_endpoints/llm_provider_handlers/test_google_genai_streaming_callbacks.py (added, +397/-0)

Affected version

Reproduced on v1.83.3. Code inspection confirms the bugs exist in main as of this filing.

Reproduction

# Send a Google-native call with full correlation metadata
curl -X POST "https://<proxy>/v1beta/models/gemini-2.5-pro:generateContent" \
  -H "x-goog-api-key: sk-..." \
  -H "x-litellm-call-id: my-correlation-uuid-123" \
  -H "x-litellm-tags: scan_id=abc,run=42" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "hello"}], "role": "user"}],
    "user": "my-user-uuid-456"
  }'

# Try to find the spend record by any correlation field
curl "https://<proxy>/spend/logs/v2?request_id=my-correlation-uuid-123"
# → data: []  (empty)

# Find it by api_key + time window — record exists
curl "https://<proxy>/spend/logs/v2?api_key=<hashed>&start_date=...&end_date=..."
# → record found, but with:
#   request_id: "QanhadapC_val7oP15PyuQM"   ← Gemini's auto-id, not ours
#   request_tags: []                          ← dropped
#   user: ""                                  ← dropped
#   end_user: ""                              ← dropped

Root cause: Google-native bypass of standard wiring

All three failures share one root cause: the Google-native handlers in litellm/proxy/google_endpoints/endpoints.py bypass the base_process_llm_request codepath that every other proxy route uses. That codepath is where LiteLLM normally:

Wires response headers (x-litellm-*) so clients can read back what call_id was used
Sets up streaming logging hooks (async_complete_streaming_response)
Builds the StandardLoggingPayload from data["metadata"] / data["user"]
Resolves the spend log request_id from litellm_call_id (with response.id as fallback)

Because the Google-native handler does its own thing, each of those wirings has to be re-implemented manually — and the original implementation is incomplete.

                 ┌──────────────────────────────────────────────┐
                 │ google_endpoints/endpoints.py bypasses        │
                 │ base_process_llm_request                      │
                 └───────────────┬──────────────────────────────┘
                                 │
        ┌────────────────────────┼────────────────────────┬───────────────────────┐
        ▼                        ▼                        ▼                       ▼
 outbound headers         streaming callbacks       inbound metadata→        call_id precedence
 (x-litellm-*)            silently skipped          spend log dropped        in spend log
                                                    (tags/user/etc.)         (Gemini id wins)
        │                        │                        │                       │
        ▼                        ▼                        ▼                       ▼
   PR #25500                PR #25960                   PR #25955          PR #25952
   (open, ready             (open, supersedes           (open)             (open)
   for re-review)           closed #24114,
                            closes #24097)

The four related defects

1. Outbound `x-litellm-*` response headers missing — TRACKED

Already addressed by PR #25500 ("feat(proxy): LiteLLM headers on Google native generateContent routes"). Adds build_litellm_proxy_success_headers_from_llm_response and wires it into both streaming and non-streaming Google-native routes. Greptile rated it 5/5; just awaiting maintainer re-review.

2. Streaming success callbacks silently skipped — TRACKED (Fix #2)

Documented as Issue #24097 with a thorough root-cause writeup: model_call_details["async_complete_streaming_response"] is never set on the Google-native streaming path, so function-based success_callback entries are silently skipped inside async_success_handler.

An initial fix attempt was made in PR #24114 ("fix(google-genai): route streaming chunks to GeminiPassthroughLoggingHandler") by @awais786 — closed by the author the same day (2026-03-19) without merging, with several Greptile review concerns outstanding.

Fix proposed in PR #25960 — supersedes #24114 with the structural approach preserved (EndpointType.GOOGLE_GENAI enum, streaming_iterator routing change, GOOGLE_GENAI branch in _route_streaming_logging_to_handler) and incorporates Greptile's outstanding feedback:

Removed inspect.getsource source-text tests in favor of runtime-behavior tests
Removed sys.path manipulation hack
Fixed dead-code assertion-message-as-tuple pattern
Iterator now passes model explicitly + uses real streamGenerateContent URL
Added end-to-end test using a real Logging instance + real CustomLogger subclass (the original mocked async_success_handler itself, masking the regression)
Plus a functional gap PR #24114 left open: sync iterator's __next__ never invoked the logging route on StopIteration, so sync callers were silently broken even with the async-side fix. PR #25960 mirrors the async path on the sync side.

3. Inbound `metadata` / `user` / `tags` not propagated to spend log — NEW (Fix #3)

add_litellm_data_to_request correctly populates data["metadata"]["tags"] (from x-litellm-tags header) and data["user"] (from request body) — verified by reading the function. But these fields don't survive into the StandardLoggingPayload that the spend-log writer consumes. The agenerate_content codepath re-parses kwargs through GenericLiteLLMParams and routes through a different logging path than acompletion, where the propagation never happens.

Fix proposed in PR #25955, which covers user; that PR defers x-litellm-tags propagation to a follow-up.

4. `litellm_call_id` overridden by `response_obj.id` in spend log — NEW (Fix #1)

google_endpoints/endpoints.py:61-63 correctly reads x-litellm-call-id into data["litellm_call_id"]. But get_spend_logs_id in litellm/proxy/spend_tracking/spend_tracking_utils.py:172-174 does:

id = response_obj.get("id") or kwargs.get("litellm_call_id")

Gemini's response always includes its own id field, so it always wins and the client's call_id is silently discarded. This is technically a cross-route bug (any provider that returns its own id would override a client-supplied call_id), but it surfaces most painfully on Google-native because the defect #1 fix in PR #25500 isn't yet merged: clients can't read the call_id back from the response either, so they can't recover it.
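The precedence problem is a direct consequence of how Python's `or` evaluates. The snippet below replays the quoted expression with the values from the reproduction; the dictionaries are simplified stand-ins for the real response object and kwargs.

```python
response_obj = {"id": "QanhadapC_val7oP15PyuQM"}         # provider response id
kwargs = {"litellm_call_id": "my-correlation-uuid-123"}  # from x-litellm-call-id

# Same expression as quoted above: `or` returns its first truthy operand.
request_id = response_obj.get("id") or kwargs.get("litellm_call_id")

# The call id is only reached when the response carries no id at all.
fallback_id = {}.get("id") or kwargs.get("litellm_call_id")
```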

Fix proposed in PR #25952.

Workaround until fixes land

If the proxy operator can dedicate one virtual API key per scan/worker/run, then api_key + time_window becomes a reliable correlation pair on the Google-native route:

GET /spend/logs/v2?api_key=<hashed-per-scan-key>&start_date=<scan_start>&end_date=<scan_end>

This is the only pattern we found that survives the Google-native logging path today.
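A small helper for building that query is sketched below. The path and the `api_key`, `start_date`, and `end_date` parameter names are taken from the request above; the timestamp format and the example host are assumptions, so check what your proxy version accepts.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def spend_logs_query(proxy_base: str, hashed_key: str, start: datetime, end: datetime) -> str:
    """Build the workaround query; the timestamp format is an assumption."""
    params = {
        "api_key": hashed_key,
        "start_date": start.strftime("%Y-%m-%d %H:%M:%S"),
        "end_date": end.strftime("%Y-%m-%d %H:%M:%S"),
    }
    return f"{proxy_base.rstrip('/')}/spend/logs/v2?{urlencode(params)}"


url = spend_logs_query(
    "https://proxy.example.com",  # placeholder host
    "hashed-per-scan-key",
    datetime(2026, 4, 17, 14, 0, tzinfo=timezone.utc),
    datetime(2026, 4, 17, 15, 0, tzinfo=timezone.utc),
)
```

Keeping the window as tight as the scan itself matters here, because any other traffic on the same key inside the window would be attributed to the scan.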

Cross-references

PR #25500 — fixes defect #1 (outbound response headers)
Issue #24097 — documents defect #2 (streaming callbacks)
PR #25960 — fixes defect #2; supersedes #24114, closes #24097
PR #24114 — earlier fix attempt for #24097 by @awais786; closed by author same day, never merged. Superseded by #25960.
Issue #24945 — adjacent: litellm_metadata overridden on /v1/messages (same family of "non-OpenAI route loses metadata")
PR #24964 — earlier version of #25500 merged into a now-stale staging branch; superseded
PR #25955 — fixes defect #3 (this issue)
PR #25952 — fixes defect #4 (this issue)

extent analysis

TL;DR

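The correlation gaps are addressed by three open PRs from the issue author: PR #25952 (honor a client-supplied x-litellm-call-id as the spend log request_id), PR #25955 (propagate user into the logging object), and PR #25960 (fire success callbacks on Google-native streaming). PR #25500 covers the missing x-litellm-* response headers. None had merged when this page was fetched.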

Guidance

Review and merge PR #25955 to fix inbound user propagation; tags propagation is left to a follow-up, per that PR's note.
Review and merge PR #25952 to fix the litellm_call_id override issue.
Verify that the fixes resolve the correlation metadata issues by testing the Google-native routes with the updated code.
Consider implementing a temporary workaround using dedicated virtual API keys per scan/worker/run, as described in the issue, until the fixes are fully deployed.

Example

No code snippet is provided, as the issue is complex and requires a thorough review of the proposed PRs.

Notes

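The tags and user gaps are specific to the Google-native handlers. The request_id precedence defect is described in PR #25952 as cross-route, although it surfaces most painfully on Google-native.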
The proposed fixes are still in the review process, and additional testing may be necessary to ensure their correctness.

Recommendation

Apply the workaround using dedicated virtual API keys per scan/worker/run until the fixes are fully deployed and verified, as this provides a reliable correlation pair on the Google-native route.
