litellm - ✅(Solved) Fix [Bug]: Fallback reuses mutated kwargs after Bedrock timeout — `tool_choice` sent without `tools` to Azure OpenAI [1 pull requests, 1 participants]

litellm · 2026-03-29 19:13:11

GitHub stats

BerriAI/litellm#24764 • Fetched 2026-04-08 01:49:18

Author

karllasnascimento

Participants

karllasnascimento

Timeline (top)

labeled ×3, cross-referenced ×2, referenced ×2

Error Message

openai.BadRequestError: 400 - 'tool_choice' is only allowed when 'tools' are specified

Root Cause

In run_async_fallback() (fallback_event_handlers.py), the fallback loop iterates over fallback_model_group and reuses the same kwargs dict for each attempt:

for mg in fallback_model_group:
    # ...
    kwargs["model"] = mg
    response = await litellm_router.async_function_with_fallbacks(*args, **kwargs)

No deep copy of kwargs is made before each fallback attempt. If the previous provider's handler mutated kwargs (or any nested dict within it), the fallback model receives corrupted parameters.
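The failure mode can be reproduced without LiteLLM. The sketch below is a toy model, not the library's call chain: provider_a, provider_b, call_with_fallback, make_request, and the nested optional_params key are all hypothetical names chosen for illustration. It shows that **kwargs unpacking gives the callee a new top-level dict but still shares every nested object, so a destructive .pop() in one handler is visible to the next unless each attempt gets its own deep copy.

```python
import copy


def provider_a(**kwargs):
    # Hypothetical first provider: pops a key from a nested dict, then fails.
    kwargs["optional_params"].pop("tools", None)
    raise TimeoutError("simulated provider timeout")


def provider_b(**kwargs):
    # Hypothetical fallback provider: rejects tool_choice without tools.
    params = kwargs["optional_params"]
    if "tool_choice" in params and "tools" not in params:
        raise ValueError("'tool_choice' is only allowed when 'tools' are specified")
    return "ok"


def call_with_fallback(handlers, kwargs, isolate):
    """Try each handler in order; deep-copy kwargs per attempt when isolate is True."""
    last_error = None
    for handler in handlers:
        attempt_kwargs = copy.deepcopy(kwargs) if isolate else kwargs
        try:
            return handler(**attempt_kwargs)
        except TimeoutError as exc:
            last_error = exc  # in this toy, only timeouts trigger the next handler
    raise last_error


def make_request():
    # Builds a fresh request so each experiment starts from the same state.
    return {
        "model": "primary",
        "optional_params": {
            "tools": [{"type": "function", "function": {"name": "classify"}}],
            "tool_choice": "required",
        },
    }
```

Calling call_with_fallback([provider_a, provider_b], make_request(), isolate=False) raises the ValueError from provider_b, while the same call with isolate=True returns "ok".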

Note on safe_deep_copy: While converse_transformation.py does use safe_deep_copy inside _prepare_request_params(), this only protects optional_params during the Bedrock-specific transformation. It does not protect the kwargs dict that is reused by run_async_fallback() across different providers. The converse_handler.py still performs destructive .pop() calls on the original optional_params dict before calling _async_transform_request() (for keys like stream, model_id, fake_stream, aws_access_key_id, etc. — see converse_handler.py lines 267–317).

Additionally, other parts of the call chain (e.g., completion() in main.py, _acompletion() in router.py) may also modify kwargs in ways that leave them in an inconsistent state for the fallback provider.
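A complementary defence is for a handler to avoid mutating its input at all. The following is only a sketch of that idea, not a description of how converse_handler.py is or should be written; split_params and the key set passed to it are hypothetical.

```python
def split_params(optional_params, transport_keys):
    """Return (transport, remaining) without mutating optional_params.

    The split is shallow: nested values are still shared with the input,
    so this prevents key removal but not in-place edits of nested objects.
    """
    transport = {k: optional_params[k] for k in transport_keys if k in optional_params}
    remaining = {k: v for k, v in optional_params.items() if k not in transport_keys}
    return transport, remaining


original = {"stream": True, "model_id": "abc", "tools": [{"type": "function"}]}
transport, remaining = split_params(original, {"stream", "model_id", "fake_stream"})
```

Because the function builds two new dicts instead of calling .pop(), the caller's dict keeps all of its keys regardless of what the handler extracts.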

Fix Action

Fix proposed

Proposed in PR: fix: deep copy kwargs in run_async_fallback to prevent mutation across providers (https://github.com/BerriAI/litellm/pull/24768). The PR was open and unmerged when this page was fetched.

PR fix notes

PR #24768: fix: deep copy kwargs in run_async_fallback to prevent mutation across providers

Repository: BerriAI/litellm
Author: karllasnascimento
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/24768

Description (problem / solution / changelog)

Relevant issues

Fixes #24764

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

When run_async_fallback iterates over fallback model groups, provider-specific transformations (e.g., Bedrock's converse_handler popping tools and tool_choice from optional_params) mutate the shared kwargs dict in place. This causes subsequent fallback providers (e.g., Azure OpenAI) to receive corrupted kwargs — specifically, tool_choice without tools, triggering a 400 error.

This PR adds a safe_deep_copy(kwargs) call at the start of each fallback attempt inside run_async_fallback, ensuring each provider gets a clean copy of kwargs. This follows the same pattern already used in litellm/litellm_core_utils/fallback_utils.py.
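Request kwargs can hold objects that copy.deepcopy cannot handle (a threading.Lock, for example, raises TypeError), which is one reason to prefer a tolerant helper over a bare copy.deepcopy(kwargs). The sketch below illustrates that idea only: tolerant_deep_copy is a hypothetical name, and this is not the implementation of LiteLLM's safe_deep_copy, whose exact behaviour should be checked in the source.

```python
import copy
import threading


def tolerant_deep_copy(data):
    """Deep-copy each top-level value; keep the original reference if copying fails."""
    if not isinstance(data, dict):
        try:
            return copy.deepcopy(data)
        except Exception:
            return data
    result = {}
    for key, value in data.items():
        try:
            result[key] = copy.deepcopy(value)
        except Exception:
            result[key] = value  # uncopyable (e.g. a lock): share the reference
    return result


kwargs = {"tools": [{"type": "function"}], "lock": threading.Lock()}
copied = tolerant_deep_copy(kwargs)
```

Here copied["tools"] is an independent list, while copied["lock"] is the same lock object as in kwargs. The trade-off is that any value that could not be copied remains shared between attempts.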

Files changed:

litellm/router_utils/fallback_event_handlers.py: Import safe_deep_copy and call kwargs = safe_deep_copy(kwargs) at the beginning of the try block in the fallback loop.
tests/test_litellm/test_fallback_kwargs_mutation.py: Two new mocked tests verifying that kwargs are not mutated between fallback attempts and that the caller's original kwargs remain unchanged.

Related issues:

#24051 — Gemini schema mutation in add_object_type breaks fallback to OpenAI (same family of bug, different provider)
#10136 — LiteLLM Router carries over completion parameters across requests

Changed files

litellm/router_utils/fallback_event_handlers.py (modified, +13/-0)
tests/test_litellm/test_fallback_kwargs_mutation.py (added, +183/-0)

Raw issue text

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When a Bedrock model times out and the router triggers a fallback to Azure OpenAI, the fallback request fails with:

openai.BadRequestError: 400 - 'tool_choice' is only allowed when 'tools' are specified

What happens:

1. A request arrives with both tools and tool_choice parameters.
2. LiteLLM routes it to Bedrock (bedrock/converse/...).
3. During Bedrock request processing, converse_handler.py pops several keys from optional_params (e.g., stream, aws_access_key_id); these .pop() calls mutate the shared kwargs dict.
4. Bedrock times out (e.g., 120 s × 3 attempts, given timeout: 120 and num_retries: 2).
5. run_async_fallback() in fallback_event_handlers.py reuses the same kwargs dict without making a deep copy.
6. The fallback model (Azure OpenAI) receives kwargs in an inconsistent state: tool_choice is present but tools is missing.
7. Azure OpenAI rejects the request with HTTP 400.

Expected behavior:

The fallback should receive the original, unmodified kwargs so it can process the request with both tools and tool_choice intact, returning a structured tool_calls response identical to what the primary model would have returned.

Root cause analysis

In run_async_fallback() (fallback_event_handlers.py), the fallback loop iterates over fallback_model_group and reuses the same kwargs dict for each attempt:

for mg in fallback_model_group:
    # ...
    kwargs["model"] = mg
    response = await litellm_router.async_function_with_fallbacks(*args, **kwargs)

No deep copy of kwargs is made before each fallback attempt. If the previous provider's handler mutated kwargs (or any nested dict within it), the fallback model receives corrupted parameters.

Proposed fix

Primary fix: make a deep copy of kwargs before each fallback attempt in run_async_fallback():

import copy

for mg in fallback_model_group:
    if mg == original_model_group:
        continue
    try:
        fallback_kwargs = copy.deepcopy(kwargs)
        fallback_kwargs = litellm_router.log_retry(kwargs=fallback_kwargs, e=original_exception)
        fallback_kwargs["model"] = mg
        # ... rest of the loop using fallback_kwargs instead of kwargs

Optional additional safeguard: a sanitization check to catch any remaining edge cases where this inconsistency could arise:

# If tool_choice is present but tools is not, remove tool_choice
if "tool_choice" in fallback_kwargs and "tools" not in fallback_kwargs:
    fallback_kwargs.pop("tool_choice", None)

Steps to Reproduce

Configure a Bedrock model as primary with Azure OpenAI models as default_fallbacks:

model_list:
  - model_name: my-model
    litellm_params:
      model: bedrock/openai.gpt-oss-20b-1:0
      # ... aws credentials

  - model_name: gpt-4o-mini-2024-07-18
    litellm_params:
      model: azure/gpt-4o-mini-2024-07-18
      # ... azure credentials

router_settings:
  timeout: 120
  num_retries: 2
  default_fallbacks: ["gpt-4o-mini-2024-07-18"]

Send a request with tools and tool_choice:

import asyncio

import litellm


async def main():
    # "my-model" is the alias defined in the config above.
    response = await litellm.acompletion(
        model="my-model",
        messages=[{"role": "user", "content": "Classify this text"}],
        tools=[{"type": "function", "function": {"name": "text_classification", "parameters": {"type": "object", "properties": {"label": {"type": "string"}}}}}],
        tool_choice={"type": "function", "function": {"name": "text_classification"}},
    )
    print(response)


asyncio.run(main())

If the Bedrock model times out, the fallback to Azure fails with:

openai.BadRequestError: 400 - 'tool_choice' is only allowed when 'tools' are specified

Relevant log output

Stack trace from the fallback attempt (Azure OpenAI, ~98.6ms):

litellm/main.py line 620 → acompletion
  llms/azure/azure.py line 491 → acompletion
    llms/azure/azure.py line 172 → azure_client.chat.completions.create(**data, timeout=timeout)
      → openai.BadRequestError: 400 - 'tool_choice' is only allowed when 'tools' are specified

The `data` dict that reaches Azure no longer contains `tools`, only `tool_choice`.

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on ?

v1.82.3-stable (also reproduced on v1.81.14-stable)

extent analysis

Fix Plan

To resolve the issue, give each fallback attempt in run_async_fallback() its own deep copy of kwargs. The steps are:

Import the copy module: import copy (the PR uses LiteLLM's safe_deep_copy helper instead).
Create a deep copy of kwargs for each fallback attempt:

for mg in fallback_model_group:
    if mg == original_model_group:
        continue
    try:
        fallback_kwargs = copy.deepcopy(kwargs)
        fallback_kwargs = litellm_router.log_retry(kwargs=fallback_kwargs, e=original_exception)
        fallback_kwargs["model"] = mg
        # ... rest of the loop using fallback_kwargs instead of kwargs

Optionally, add a sanitization check to catch any remaining edge cases:

# If tool_choice is present but tools is not, remove tool_choice
if "tool_choice" in fallback_kwargs and "tools" not in fallback_kwargs:
    fallback_kwargs.pop("tool_choice", None)

Verification

To verify the fix, follow these steps:

Configure a Bedrock model as primary with Azure OpenAI models as default_fallbacks.
Send a request with tools and tool_choice.
If the Bedrock model times out, the fallback to Azure should now succeed without raising a BadRequestError.
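The same checks can be run offline, without Bedrock or Azure credentials. The sketch below assumes a simplified stand-in for the fallback loop: run_fallback_sketch and fake_call are hypothetical and do not mirror the signature of litellm's run_async_fallback. It asserts the two properties the PR's tests are described as covering: the fallback attempt receives intact tools, and the caller's original kwargs are unchanged.

```python
import asyncio
import copy


async def run_fallback_sketch(call, model_groups, kwargs):
    """Hypothetical stand-in for a fallback loop; not litellm's run_async_fallback."""
    last_error = None
    for mg in model_groups:
        attempt = copy.deepcopy(kwargs)  # fresh copy per attempt
        attempt["model"] = mg
        try:
            return await call(**attempt)
        except TimeoutError as exc:
            last_error = exc
    raise last_error


received = []


async def fake_call(**kwargs):
    received.append(copy.deepcopy(kwargs))  # snapshot what each attempt received
    if kwargs["model"] == "primary":
        kwargs["tools"].clear()  # simulate a destructive provider handler
        raise TimeoutError("simulated timeout")
    return "fallback-ok"


request = {"tools": [{"type": "function"}], "tool_choice": "required"}
expected = copy.deepcopy(request)
result = asyncio.run(run_fallback_sketch(fake_call, ["primary", "fallback"], request))

assert result == "fallback-ok"
assert request == expected  # caller's kwargs untouched
assert received[1]["tools"] == expected["tools"]  # fallback saw intact tools
```

Snapshotting the arguments inside fake_call, rather than inspecting them after the run, avoids reading a dict that a later step has already mutated.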

Extra Tips

Test the fix across tool_choice forms ("auto", "required", and a named function), and with requests that carry no tools at all.
Consider logging which tool-related keys are present in kwargs before each fallback attempt, so that any remaining mutation shows up in the logs.
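One way to add such logging is sketched below. It assumes tools and tool_choice sit at the top level of kwargs; summarize_tool_params, log_attempt, and the "fallback_debug" logger name are hypothetical and not part of LiteLLM. The summary records only which keys are present and how many tools there are, so message content and credentials stay out of the logs.

```python
import logging

logger = logging.getLogger("fallback_debug")


def summarize_tool_params(kwargs):
    """Return a small, log-safe summary of the tool-related keys in kwargs."""
    tools = kwargs.get("tools")
    return {
        "model": kwargs.get("model"),
        "has_tools": bool(tools),
        "tool_count": len(tools) if isinstance(tools, list) else 0,
        "has_tool_choice": "tool_choice" in kwargs,
    }


def log_attempt(kwargs):
    """Log the summary; warn when tool_choice is present without tools."""
    summary = summarize_tool_params(kwargs)
    if summary["has_tool_choice"] and not summary["has_tools"]:
        logger.warning("tool_choice without tools: %s", summary)
    else:
        logger.debug("fallback attempt: %s", summary)
    return summary
```

Calling log_attempt(fallback_kwargs) at the top of each loop iteration would flag the inconsistent state before the request reaches the provider.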
