litellm - ✅ (Solved) Anthropic passthrough endpoint emits Prometheus metrics with model name missing provider prefix [1 pull request, 1 participant]

GitHub stats
BerriAI/litellm#25250, fetched 2026-04-08 03:02:29
Comments: 0 · Participants: 1 · Timeline: 1 (labeled ×1) · Reactions: 0

Root Cause

In the Anthropic passthrough logging handler, kwargs["model"] is populated from the Bedrock API response body, which returns the raw model ID without the bedrock/ LiteLLM provider prefix. The standard /chat/completions path preserves the prefix because the model goes through LiteLLM's router resolution.

The PrometheusLogger.async_log_success_event reads the model from kwargs.get("model", "") (prometheus.py line ~795) and uses it directly as the Prometheus label value, so the inconsistency propagates into metrics.
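
A minimal sketch of why the mismatch matters. It uses a plain dictionary as a stand-in for a labelled metric; it is not LiteLLM's or any Prometheus client's code, and `record_latency` and `series` are hypothetical names chosen for illustration:

```python
from collections import defaultdict

# Stand-in for a labelled metric: one list of observations per label value.
series = defaultdict(list)

def record_latency(kwargs, seconds):
    """Mimic a logger that uses kwargs["model"] directly as the label value."""
    model_label = kwargs.get("model", "")
    series[model_label].append(seconds)

# /chat/completions path: the provider prefix is preserved.
record_latency({"model": "bedrock/global.anthropic.claude-opus-4-6-v1"}, 1.2)
# /v1/messages passthrough path: raw model ID taken from the response body.
record_latency({"model": "global.anthropic.claude-opus-4-6-v1"}, 0.9)

# Two keys exist for one underlying model, so the traffic is split.
print(sorted(series))
```

Because the label value is taken verbatim, any difference in the string produces a separate series, even when both requests hit the same model.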

Fix Action

Workaround

We're currently monkey-patching PrometheusLogger at startup to normalize kwargs["model"] before metric emission — prepending bedrock/ when a bare Bedrock model ID is detected.
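
The reporter's patch is not shown, so the following is only a sketch of what such a workaround might look like. `StandInLogger` is a hypothetical stand-in for the class being patched, the method signature is assumed, and the rule for spotting a "bare Bedrock model ID" (no `/` and contains `anthropic.`) is an assumption made for this example:

```python
import asyncio

def normalize_model(model):
    """Prepend "bedrock/" to a bare Bedrock model ID (heuristic, assumed)."""
    # Assumption: a bare Bedrock ID has no "/" and contains "anthropic.".
    if model and "/" not in model and "anthropic." in model:
        return "bedrock/" + model
    return model

class StandInLogger:
    """Hypothetical stand-in for the logger class being patched."""

    def __init__(self):
        self.seen = []

    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        # Record the label value that would be emitted.
        self.seen.append(kwargs.get("model", ""))

def patch_logger(cls):
    """Wrap the success hook so kwargs["model"] is normalized first."""
    original = cls.async_log_success_event

    async def patched(self, kwargs, response_obj, start_time, end_time):
        fixed = dict(kwargs)  # copy so the caller's dict is not mutated
        fixed["model"] = normalize_model(kwargs.get("model", ""))
        return await original(self, fixed, response_obj, start_time, end_time)

    cls.async_log_success_event = patched

patch_logger(StandInLogger)
logger = StandInLogger()
asyncio.run(logger.async_log_success_event(
    {"model": "global.anthropic.claude-opus-4-6-v1"}, None, 0.0, 1.0))
```

Patching the class once at startup means every logger instance sees the normalized name, at the cost of depending on an internal method that may change between releases.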

PR fix notes

PR #25502: fix(proxy): include provider prefix in kwargs['model'] for Anthropic passthrough metrics

Description (problem / solution / changelog)

Relevant issues

Fixes #25250

Pre-Submission checklist

  • Signed CLA
  • Scope isolated — single bug fix
  • Added testing in tests/test_litellm/
  • Black formatted

Summary

  • _create_anthropic_response_logging_payload set kwargs["model"] to the raw model ID from the API response body (e.g. global.anthropic.claude-opus-4-6-v1) instead of the provider-prefixed form (bedrock/global.anthropic.claude-opus-4-6-v1).
  • This caused Prometheus to emit duplicate metric series for the same underlying model — one from /chat/completions (with prefix) and one from /v1/messages (without).
  • The fix uses the already-computed model_for_cost which carries the correct custom_llm_provider/ prefix.
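
The PR diff is not reproduced here, so the snippet below is not the PR's code. It is a hedged sketch of the prefixing behaviour the summary and test plan describe (prefix added, no duplicate prefix, no change when there is no provider); `with_provider_prefix` is a hypothetical helper name:

```python
from typing import Optional

def with_provider_prefix(model: str, custom_llm_provider: Optional[str]) -> str:
    """Return model in "<provider>/<model>" form without doubling the prefix."""
    if not custom_llm_provider:
        # No provider known: pass the model name through unchanged.
        return model
    prefix = custom_llm_provider + "/"
    if model.startswith(prefix):
        # Already prefixed: avoid "bedrock/bedrock/...".
        return model
    return prefix + model

print(with_provider_prefix("global.anthropic.claude-opus-4-6-v1", "bedrock"))
```

Deriving the prefix from the provider, rather than hard-coding `bedrock/`, keeps the same logic usable for other providers routed through the passthrough handler.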

Test plan

  • Added TestAnthropicPassthroughModelPrefixInKwargs with 3 tests: prefix present, no duplicate prefix, no-provider passthrough
  • 13 of 14 tests pass (the 1 failure is pre-existing and unrelated to this change)

Changed files

  • litellm/proxy/pass_through_endpoints/llm_provider_handlers/anthropic_passthrough_logging_handler.py (modified, +4/-4)
  • tests/test_litellm/proxy/pass_through_endpoints/llm_provider_handlers/test_anthropic_passthrough_logging_handler.py (modified, +185/-65)

Bug Description

The /v1/messages Anthropic passthrough endpoint sets kwargs["model"] to the raw Bedrock model ID (e.g., global.anthropic.claude-opus-4-6-v1) instead of the provider-prefixed model name (e.g., bedrock/global.anthropic.claude-opus-4-6-v1). This causes duplicate Prometheus metric series on litellm_llm_api_latency_metric — the same underlying model appears under two different model labels.

How to Reproduce

  1. Configure LiteLLM with a Bedrock model and the prometheus callback:

model_list:
  - model_name: claude-opus-4-6
    litellm_params:
      model: bedrock/global.anthropic.claude-opus-4-6-v1

litellm_settings:
  callbacks: ["prometheus"]

  2. Send a request via the Anthropic passthrough endpoint:

curl -X POST http://localhost:4000/v1/messages \
  -H "x-api-key: sk-..." \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-opus-4-6", "max_tokens": 100, "messages": [{"role": "user", "content": "hello"}]}'

  3. Send the same request via the standard endpoint:

curl -X POST http://localhost:4000/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-opus-4-6", "max_tokens": 100, "messages": [{"role": "user", "content": "hello"}]}'

  4. Check Prometheus metrics:

curl http://localhost:4000/metrics | grep litellm_llm_api_latency

Expected: All metric series show model="bedrock/global.anthropic.claude-opus-4-6-v1"

Actual: Two separate series appear:

  • model="bedrock/global.anthropic.claude-opus-4-6-v1" (from /chat/completions)
  • model="global.anthropic.claude-opus-4-6-v1" (from /v1/messages)
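
To check for this split without reading the output by eye, a small script can group `model` label values that differ only by a provider prefix. This is a sketch under assumptions: `find_split_models` is a hypothetical helper, and the sample text is an illustrative stand-in for `/metrics` output, not captured output:

```python
import re
from collections import defaultdict

# Matches the model label only; \b keeps it from matching litellm_model="...".
MODEL_LABEL = re.compile(r'\bmodel="([^"]*)"')

def find_split_models(metrics_text):
    """Return models whose label appears both with and without a prefix."""
    groups = defaultdict(set)
    for line in metrics_text.splitlines():
        if line.startswith("#"):
            continue  # skip HELP/TYPE comment lines
        match = MODEL_LABEL.search(line)
        if match:
            label = match.group(1)
            bare = label.split("/", 1)[-1]  # drop a leading "provider/" if present
            groups[bare].add(label)
    return {bare: sorted(labels) for bare, labels in groups.items() if len(labels) > 1}

# Illustrative sample only; real output has more labels and metrics.
sample = '''
# TYPE litellm_llm_api_latency_metric histogram
litellm_llm_api_latency_metric_count{model="bedrock/global.anthropic.claude-opus-4-6-v1"} 1.0
litellm_llm_api_latency_metric_count{model="global.anthropic.claude-opus-4-6-v1"} 1.0
'''

print(find_split_models(sample))
```

An empty result would indicate that every model appears under a single label value; the script could be fed the body returned by the `/metrics` endpoint from step 4.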

Impact

  • Duplicate metric series for every Bedrock model (doubles metric cardinality)
  • The two series record different subsets of traffic with different latency characteristics, making dashboards and monitors unreliable when grouping by model label
  • The litellm_model and host labels are also N/A on the passthrough series, suggesting the logging payload is incomplete for this code path
  • Affects all Prometheus-based dashboards and monitors that group by model label

Environment

  • LiteLLM version: v1.80.8.rc.1
  • Provider: AWS Bedrock (cross-region inference profiles)
  • Callbacks: prometheus, datadog_llm_observability

extent analysis

TL;DR

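Update the Anthropic passthrough logging handler so that kwargs["model"] carries the provider-prefixed model name (e.g. bedrock/global.anthropic.claude-opus-4-6-v1) before metrics are logged. PR #25502 does this by reusing the already-computed model_for_cost, which carries the custom_llm_provider/ prefix, rather than hard-coding bedrock/.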

Guidance

  • Verify that the kwargs["model"] value is being populated correctly from the Bedrock API response body in the Anthropic passthrough logging handler.
  • Check the prometheus.py file, specifically around line 795, to ensure that the PrometheusLogger.async_log_success_event method is using the correct model value from kwargs.
  • Consider implementing a fix similar to the existing monkey-patch workaround, but as a permanent solution, to prepend the bedrock/ prefix to the kwargs["model"] value when a bare Bedrock model ID is detected.
  • Test the fix by sending requests via both the Anthropic passthrough endpoint and the standard endpoint, and verify that the Prometheus metrics show only one series with the correct model label.

Example

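# Example of how to prepend the bedrock/ prefix to kwargs["model"].
# Uses .get() so a missing or empty model does not raise or become "bedrock/".
model = kwargs.get("model") or ""
if model and not model.startswith("bedrock/"):
    kwargs["model"] = "bedrock/" + model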

Notes

This fix assumes that the bedrock/ prefix is always required for Bedrock models. If this is not the case, additional logic may be needed to determine when to prepend the prefix.

Recommendation

Apply a workaround by prepending the bedrock/ prefix to the kwargs["model"] value, as this will resolve the issue with duplicate metric series and ensure accurate logging.
