litellm - ✅(Solved) Fix [Bug]: /v1/responses/compact fails during router_settings.fallbacks failover — 'Unknown parameter: metadata' (hardcoded 'metadata' key in run_async_fallback) [1 pull requests, 2 comments, 2 participants]

litellm · 2026-04-09 05:38:01

GitHub stats

BerriAI/litellm#25402 • Fetched 2026-04-10 03:41:19

Author: beveradb

Participants: beveradb, jeanibarz

Timeline (top): commented ×2, labeled ×2, cross-referenced ×1

Error Message

litellm.InternalServerError: This is a mock exception for model=gpt-4.1, to trigger a fallback. Fallbacks=[{'gpt-4.1': ['gpt-4.1-paid']}]. Received Model Group=gpt-4.1 Available Model Group Fallbacks=['gpt-4.1-paid'] Error doing the fallback: litellm.BadRequestError: OpenAIException - { "error": { "message": "Unknown parameter: 'metadata'.", "type": "invalid_request_error", "param": "metadata", "code": "unknown_parameter" } }

Root Cause

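The root cause is in litellm/router_utils/fallback_event_handlers.py::run_async_fallback at line 133 (identical between tag v1.82.3 and the tip of main), which unconditionally writes to the "metadata" key in kwargs instead of the "litellm_metadata" key that Responses API routes use.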

Fix Action

Fix proposed (the PR was still open and unmerged when this page was fetched)

Proposed in PR: fix(router): use correct metadata key in run_async_fallback for Responses API (https://github.com/BerriAI/litellm/pull/25454)

PR fix notes

PR #25454: fix(router): use correct metadata key in run_async_fallback for Responses API

Repository: BerriAI/litellm
Author: jeanibarz
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/25454

Description (problem / solution / changelog)

Relevant issues

Fixes #25402

What's the problem?

run_async_fallback in fallback_event_handlers.py hardcodes "metadata" when updating the model_group during fallback:

kwargs.setdefault("metadata", {}).update(
    {"model_group": kwargs.get("model", None)}
)

Responses API routes (/v1/responses, /v1/responses/compact, etc.) use "litellm_metadata" as their metadata key — set by _update_kwargs_before_fallbacks(metadata_variable_name="litellm_metadata") in _ageneric_api_call_with_fallbacks.

The hardcoded "metadata" creates a stale top-level key in kwargs that leaks into the request body. Azure OpenAI rejects it:

Before (fallback on /v1/responses/compact):

400 Unknown parameter: 'metadata'

After (fallback works correctly):

200 OK — metadata written to litellm_metadata, not leaked to provider

Fix

Detect which metadata key is already active in kwargs:

_metadata_key = (
    "litellm_metadata" if "litellm_metadata" in kwargs else "metadata"
)
kwargs.setdefault(_metadata_key, {}).update(...)

This is consistent with how the rest of the router determines the metadata variable name. Since _update_kwargs_before_fallbacks runs before the fallback handler, the correct key is always present in kwargs by the time we reach this code.
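The key-detection step described above can be illustrated outside LiteLLM with plain dictionaries. This is a minimal sketch, not the PR's code: the helper name update_model_group and the sample kwargs are hypothetical, and only the two-line detection logic is taken from the PR description.

```python
from typing import Any, Dict


def update_model_group(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: record model_group under the metadata key already in use."""
    # Same detection rule as the PR description: prefer "litellm_metadata"
    # when the caller has already placed it in kwargs.
    metadata_key = "litellm_metadata" if "litellm_metadata" in kwargs else "metadata"
    kwargs.setdefault(metadata_key, {}).update(
        {"model_group": kwargs.get("model", None)}
    )
    return kwargs


# Responses-API-style call: "litellm_metadata" is present, so no
# top-level "metadata" key is created.
responses_call = update_model_group(
    {"model": "gpt-4.1-paid", "litellm_metadata": {}}
)

# Standard call: no "litellm_metadata", so "metadata" is used as before.
standard_call = update_model_group({"model": "gpt-4.1-paid"})

print(sorted(responses_call))
print(sorted(standard_call))
```

The sketch relies on the same precondition the PR states: the correct key must already be present in kwargs before this step runs, otherwise the detection falls back to "metadata".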

Pre-Submission checklist

I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

litellm/router_utils/fallback_event_handlers.py: Detect active metadata key (litellm_metadata vs metadata) instead of hardcoding "metadata"
tests/test_litellm/router_utils/test_fallback_event_handlers.py: 2 regression tests — one for Responses API routes (litellm_metadata), one for standard routes (metadata)

Changed files

litellm/router_utils/fallback_event_handlers.py (modified, +4/-1)
tests/test_litellm/router_utils/test_fallback_event_handlers.py (added, +94/-0)

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

/v1/responses/compact fails during router_settings.fallbacks failover from a primary model to its fallback, with Azure returning 400 Unknown parameter: 'metadata'. The same failover on /v1/responses (non-compact) works correctly and returns 200 from the fallback deployment.

Root cause is in litellm/router_utils/fallback_event_handlers.py::run_async_fallback at line 133 (currently identical between tag v1.82.3 and tip of main):

kwargs.setdefault("metadata", {}).update(
    {"model_group": kwargs.get("model", None)}
)  # update model_group used, if fallbacks are done

This unconditionally writes to the "metadata" key in kwargs. For Responses API routes (paths containing "responses" per LITELLM_METADATA_ROUTES in litellm/proxy/litellm_pre_call_utils.py:65-70), the proxy's correct metadata-variable name is "litellm_metadata", not "metadata". The fallback path writes a top-level "metadata" dict into kwargs that then gets forwarded as a request-body parameter to the upstream provider. Azure OpenAI's v1-compat /responses/compact endpoint rejects the unknown parameter with HTTP 400.
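The effect of the hardcoded key can be reproduced with plain dictionaries. The sketch below is illustrative only: hardcoded_update is a hypothetical stand-in that copies the quoted statement, and the sample kwargs are assumptions about what a Responses API call might carry at that point.

```python
from typing import Any, Dict


def hardcoded_update(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in that repeats the hardcoded write quoted above."""
    kwargs.setdefault("metadata", {}).update(
        {"model_group": kwargs.get("model", None)}
    )
    return kwargs


# Assumed shape of kwargs on a Responses API route: router-side metadata
# lives under "litellm_metadata" and there is no "metadata" key yet.
responses_kwargs = {
    "model": "gpt-4.1-paid",
    "input": "hi",
    "litellm_metadata": {"model_group": "gpt-4.1"},
}

after = hardcoded_update(dict(responses_kwargs))

# A new top-level "metadata" key now sits next to "litellm_metadata".
print(sorted(after))
# The key the route actually uses was never updated for the fallback model.
print(after["litellm_metadata"])
```

Any code that later forwards unrecognized top-level kwargs to the provider would then send this new "metadata" entry in the request body, which matches the 400 reported above.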

Non-compact /v1/responses doesn't hit this bug because its code path sanitizes the metadata kwarg before forwarding (Responses API handlers explicitly use litellm_metadata and drop unknown top-level keys). Only compact is affected.

The happy path (no failover, direct deployment selection) works correctly. The fallback path is the only trigger.

Steps to Reproduce

Configure a LiteLLM proxy with at least one Responses-API-capable deployment and its router_settings.fallbacks target:

model_list:
  - model_name: gpt-4.1
    litellm_params:
      model: openai/gpt-4.1
      api_base: https://<azure-v1-compat-host>/openai/v1/
      api_key: <primary-key>
  - model_name: gpt-4.1-paid
    litellm_params:
      model: openai/gpt-4.1
      api_base: https://<different-azure-v1-compat-host>/openai/v1/
      api_key: <fallback-key>

router_settings:
  fallbacks:
    - gpt-4.1: [gpt-4.1-paid]
  num_retries: 2
  enable_pre_call_checks: true

Confirm that the fallback target works when called directly:

curl -sS -o /tmp/direct.json -w "STATUS=%{http_code}\n" \
  -H "Authorization: Bearer $LITELLM_KEY" \
  -H "Content-Type: application/json" \
  "$PROXY_URL/v1/responses/compact" \
  -d '{"model":"gpt-4.1-paid","input":[{"role":"user","content":"hi"},{"role":"assistant","content":"hello"},{"role":"user","content":"thanks"},{"role":"assistant","content":"sure"},{"role":"user","content":"bye"},{"role":"assistant","content":"bye"}]}'
# STATUS=200

Force a fallback on /v1/responses (non-compact) — works correctly:

curl -sS -o /tmp/resp.json -w "STATUS=%{http_code}\n" \
  -H "Authorization: Bearer $LITELLM_KEY" \
  -H "Content-Type: application/json" \
  "$PROXY_URL/v1/responses" \
  -d '{"model":"gpt-4.1","input":"Reply with exactly OK","mock_testing_fallbacks":true}'
# STATUS=200, x-litellm-model-api-base points at fallback host

Force a fallback on /v1/responses/compact — fails:

curl -sS -o /tmp/compact.json -w "STATUS=%{http_code}\n" \
  -H "Authorization: Bearer $LITELLM_KEY" \
  -H "Content-Type: application/json" \
  "$PROXY_URL/v1/responses/compact" \
  -d '{"model":"gpt-4.1","input":[{"role":"user","content":"hi"},{"role":"assistant","content":"hello"},{"role":"user","content":"thanks"},{"role":"assistant","content":"sure"},{"role":"user","content":"bye"},{"role":"assistant","content":"bye"}],"mock_testing_fallbacks":true}'
# STATUS=500
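For scripted reproduction, the failing request can also be built from Python using only the standard library. This is a sketch under assumptions: the helper names build_compact_request and send are hypothetical, and the URL path, headers, payload, and mock_testing_fallbacks flag are copied from the curl commands above.

```python
import json
import urllib.error
import urllib.request

# Same six-message conversation used in the curl commands above.
CONVERSATION = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello"},
    {"role": "user", "content": "thanks"},
    {"role": "assistant", "content": "sure"},
    {"role": "user", "content": "bye"},
    {"role": "assistant", "content": "bye"},
]


def build_compact_request(
    proxy_url: str, api_key: str, model: str, force_fallback: bool
) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/responses/compact."""
    body = {"model": model, "input": CONVERSATION}
    if force_fallback:
        # Flag taken from the issue; it makes the proxy raise a mock error
        # for the primary model so that the fallback path runs.
        body["mock_testing_fallbacks"] = True
    return urllib.request.Request(
        proxy_url.rstrip("/") + "/v1/responses/compact",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status, including error statuses."""
    try:
        with urllib.request.urlopen(request, timeout=60) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling send(build_compact_request(proxy_url, key, "gpt-4.1", True)) against an affected proxy should correspond to the STATUS=500 case above, and passing "gpt-4.1-paid" with force_fallback=False to the direct STATUS=200 case.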

Relevant log output

litellm.InternalServerError: This is a mock exception for model=gpt-4.1, to trigger a fallback.
Fallbacks=[{'gpt-4.1': ['gpt-4.1-paid']}].
Received Model Group=gpt-4.1
Available Model Group Fallbacks=['gpt-4.1-paid']
Error doing the fallback: litellm.BadRequestError: OpenAIException - {
  "error": {
    "message": "Unknown parameter: 'metadata'.",
    "type": "invalid_request_error",
    "param": "metadata",
    "code": "unknown_parameter"
  }
}

What part of LiteLLM is this about?

Proxy — Router fallbacks on the /v1/responses/compact endpoint.

What LiteLLM version are you on?

v1.82.3 (proxy runtime). Also verified the same file is byte-identical on main (tip as of 2026-04-09) — fix is not yet upstream.

Proposed fix

In run_async_fallback, derive the metadata-variable name from the call type (or pass it through from the caller) instead of hardcoding "metadata". One minimal approach:

# current (litellm/router_utils/fallback_event_handlers.py:133)
kwargs.setdefault("metadata", {}).update(
    {"model_group": kwargs.get("model", None)}
)

# fixed: pick the right key based on the active responses-API route
# (the LITELLM_METADATA_ROUTES import from the original proposal is dropped: it was never used)
_metadata_key = (
    "litellm_metadata"
    if kwargs.get("original_function") is not None
       and kwargs["original_function"].__name__ in (
           "acompact_responses",
           "aresponses",
           "aget_responses",
           "adelete_responses",
           "acancel_responses",
           "alist_input_items",
       )
    else "metadata"
)
kwargs.setdefault(_metadata_key, {}).update(
    {"model_group": kwargs.get("model", None)}
)

A cleaner fix would be to pass metadata_variable_name down from async_function_with_fallbacks_common_utils → run_async_fallback, matching how _ageneric_api_call_with_fallbacks already passes metadata_variable_name="litellm_metadata" to _update_kwargs_before_fallbacks (router.py:3703).
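The pass-through alternative can be sketched as follows. The function run_fallback_step and its signature are hypothetical and do not mirror LiteLLM's real internals; only the parameter name metadata_variable_name and its two values come from the text above.

```python
from typing import Any, Dict


def run_fallback_step(
    kwargs: Dict[str, Any],
    fallback_model: str,
    metadata_variable_name: str = "metadata",
) -> Dict[str, Any]:
    """Hypothetical stand-in for one fallback iteration.

    The caller states which metadata key its route uses, so the handler
    does not need to guess from kwargs or from function names.
    """
    kwargs["model"] = fallback_model
    kwargs.setdefault(metadata_variable_name, {}).update(
        {"model_group": fallback_model}
    )
    return kwargs


# A Responses API caller would pass its own key explicitly.
responses_kwargs = run_fallback_step(
    {"model": "gpt-4.1", "input": "hi"},
    fallback_model="gpt-4.1-paid",
    metadata_variable_name="litellm_metadata",
)

# Other callers keep the default and see the previous behaviour.
chat_kwargs = run_fallback_step(
    {"model": "gpt-4.1", "messages": []},
    fallback_model="gpt-4.1-paid",
)
```

Compared with matching on original_function.__name__, an explicit parameter keeps the list of Responses API entry points in one place (the caller) and still works when the key is not yet present in kwargs.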

Either approach fixes the symptom. Add Azure-compact-fallback regression tests using the existing mock_testing_fallbacks: true harness.

extent analysis

TL;DR

The most likely fix is to modify the run_async_fallback function in litellm/router_utils/fallback_event_handlers.py to derive the metadata-variable name from the call type instead of hardcoding "metadata".

Guidance

Identify the correct metadata key: Determine the correct metadata key based on the active responses-API route. For Responses API routes, the correct key is "litellm_metadata", not "metadata".
Modify the run_async_fallback function: Update the run_async_fallback function to use the correct metadata key. This can be done by checking the original_function name and using the correct key accordingly.
Pass metadata_variable_name down from async_function_with_fallbacks_common_utils: A cleaner fix would be to pass metadata_variable_name down from async_function_with_fallbacks_common_utils to run_async_fallback, matching how _ageneric_api_call_with_fallbacks already passes metadata_variable_name="litellm_metadata" to _update_kwargs_before_fallbacks.
Add regression tests: Add Azure-compact-fallback regression tests using the existing mock_testing_fallbacks: true harness to ensure the fix works correctly.

Example

# fixed: pick the right key based on the active responses-API route
# (the LITELLM_METADATA_ROUTES import from the original proposal is dropped: it was never used)
_metadata_key = (
    "litellm_metadata"
    if kwargs.get("original_function") is not None
       and kwargs["original_function"].__name__ in (
           "acompact_responses",
           "aresponses",
           "aget_responses",
           "adelete_responses",
           "acancel_responses",
           "alist_input_items",
       )
    else "metadata"
)
kwargs.setdefault(_metadata_key, {}).update(
    {"model_group": kwargs.get("model", None)}
)

Notes

The proposed fix assumes that the original_function name is available in the kwargs dictionary. If this is not the case, an alternative approach may be needed.

Recommendation

Modify run_async_fallback so that it writes model_group to the metadata key the active route actually uses ("litellm_metadata" for Responses API routes, "metadata" otherwise). This should resolve the 400 Unknown parameter: 'metadata' error that Azure returns when a fallback is triggered on the /v1/responses/compact endpoint.
