litellm - [Bug]: Responses API bridge double-strips provider prefix from model when GPT-5.4+ request has both tools and reasoning_effort



Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When a Chat Completions request to a GPT-5.4+ model (e.g. gpt-5.4, gpt-5.5, gpt-5.5-pro) contains both tools and reasoning_effort, LiteLLM auto-routes the call through the Responses API bridge (responses_api_bridge_check() in litellm/main.py). The bridge transformer fails to propagate the resolved custom_llm_provider to the downstream litellm.responses() call. responses() then re-runs get_llm_provider() with custom_llm_provider=None, which re-detects the provider from the model string and strips a second provider/ prefix. The model field sent in the outgoing /v1/responses HTTP request is one segment shorter than it should be.

For a deployment configured as model: "openai/openai/openai/gpt-5.5":

  • Normal Chat Completions flow (no bridge): correctly strips one prefix → upstream receives openai/openai/gpt-5.5
  • Bridge flow (tools + reasoning_effort): strips two prefixes → upstream receives openai/gpt-5.5
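The difference between the two flows can be illustrated with a toy model of prefix resolution. This is only a sketch: `strip_provider_prefix` and `KNOWN_PROVIDERS` are hypothetical stand-ins invented for illustration, not LiteLLM's actual `get_llm_provider()` logic.

```python
from typing import Optional, Tuple

# Assumed subset of provider names, for illustration only.
KNOWN_PROVIDERS = {"openai", "azure", "anthropic"}


def strip_provider_prefix(model: str) -> Tuple[str, Optional[str]]:
    """Toy stand-in: strip one leading 'provider/' segment if it is a known provider."""
    head, sep, rest = model.partition("/")
    if sep and head in KNOWN_PROVIDERS:
        return rest, head
    return model, None


configured = "openai/openai/openai/gpt-5.5"

# Normal flow: the prefix is resolved exactly once.
normal_model, provider = strip_provider_prefix(configured)

# Bridge flow: the already-resolved model is resolved a second time
# because the provider found in the first pass was not carried along.
bridge_model, _ = strip_provider_prefix(normal_model)

print(normal_model)  # openai/openai/gpt-5.5
print(bridge_model)  # openai/gpt-5.5
```

The second call has no way to know that the leading `openai/` is now part of the upstream model name rather than a routing prefix, which is the behavior described above.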

If the upstream API enforces model-name allow-lists (which is common when LiteLLM is fronting another OpenAI-compatible proxy or gateway), this surfaces as a 401 key_model_access_denied. If it doesn't, the silently-wrong model name still produces incorrect routing or upstream rejection.

Either parameter alone works fine. Only the combination triggers it. The same bug also fires for any second prefix that happens to be in litellm.provider_list (e.g. openai/azure/..., openai/anthropic/...).
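To check which configured model strings would be exposed to this, a small helper along the following lines might be useful. It is a sketch under assumptions: `PROVIDERS` is a hand-written subset standing in for litellm.provider_list, and `loses_extra_segment` is a hypothetical name, not a LiteLLM function.

```python
# Assumed subset standing in for litellm.provider_list.
PROVIDERS = {"openai", "azure", "anthropic"}


def loses_extra_segment(configured_model: str) -> bool:
    """Return True if the segment after the routing prefix is itself a provider name.

    In that case a second prefix-strip pass would remove it as well.
    """
    parts = configured_model.split("/")
    return len(parts) >= 3 and parts[0] in PROVIDERS and parts[1] in PROVIDERS


for name in (
    "openai/openai/openai/gpt-5.5",
    "openai/azure/gpt-5.5",
    "openai/gpt-5.5",
    "openai/my-gateway/gpt-5.5",
):
    print(name, loses_extra_segment(name))
```

Only the first two entries are flagged; a plain `openai/gpt-5.5` or a second segment that is not a provider name leaves nothing for a second strip to remove.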

Root cause (verified against v1.85.1-stable source):

  1. completion() strips the first prefix correctly at litellm/main.py L1405-1410, yielding model="openai/openai/gpt-5.5", custom_llm_provider="openai".
  2. responses_api_bridge_check() returns mode="responses" at main.py L951+ and the bridge handler is invoked at main.py L1693-1707.
  3. The bridge transformer constructs request_data at completion_extras/litellm_responses_transformation/transformation.py L385-457 but omits custom_llm_provider from the kwargs passed to litellm.responses().
  4. responses() calls _resolve_model_provider_for_responses at responses/main.py L978-983, which re-invokes get_llm_provider() with custom_llm_provider=None.
  5. The prefix-strip branch at litellm_core_utils/get_llm_provider_logic.py L234-236 fires again because openai/openai/gpt-5.5 still starts with the recognized provider openai/, producing model="openai/gpt-5.5".
  6. The corrupted model string is written to the HTTP body at llms/openai/responses/transformation.py L132-135.

Suggested fix: propagate custom_llm_provider from the bridge transformer into the litellm.responses() call so the second get_llm_provider() invocation sees an explicit provider and skips the prefix-strip. Alternatively, _resolve_model_provider_for_responses could accept and honor an already-resolved provider passed via kwargs.
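The idea behind the suggested fix can be sketched with a toy resolver that honors an explicitly passed provider. `resolve_provider` and `KNOWN_PROVIDERS` are hypothetical and only mimic the behavior described in this issue; the real change would live in the bridge transformer or in _resolve_model_provider_for_responses.

```python
from typing import Optional, Tuple

# Assumed subset of provider names, for illustration only.
KNOWN_PROVIDERS = {"openai", "azure", "anthropic"}


def resolve_provider(
    model: str, custom_llm_provider: Optional[str] = None
) -> Tuple[str, Optional[str]]:
    """Toy resolver: skip prefix detection when the provider is already known."""
    if custom_llm_provider is not None:
        # Provider was resolved upstream; leave the model string untouched.
        return model, custom_llm_provider
    head, sep, rest = model.partition("/")
    if sep and head in KNOWN_PROVIDERS:
        return rest, head
    return model, None


# First pass, as in completion(): one prefix is stripped.
model, provider = resolve_provider("openai/openai/openai/gpt-5.5")

# Current bridge behavior: provider is dropped, so a second strip happens.
buggy_model, _ = resolve_provider(model)

# Proposed behavior: provider is propagated, so the model is preserved.
fixed_model, fixed_provider = resolve_provider(model, custom_llm_provider=provider)

print(buggy_model)  # openai/gpt-5.5
print(fixed_model)  # openai/openai/gpt-5.5
```

Passing the provider explicitly makes the second resolution a no-op for the model string, so the upstream would receive the same value as in the non-bridge flow.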

Related (open, not a duplicate): #27975 patches doubled-prefix cost lookup but does not touch the outgoing request payload.

Steps to Reproduce

Self-contained SDK repro (no proxy, no live upstream needed — uses an HTTPX mock to capture the outgoing model field):

import litellm
import httpx
from unittest.mock import patch

litellm.set_verbose = True

captured = {}

def capture_post(self, url, **kwargs):
    captured["url"] = str(url)
    body = kwargs.get("json") or kwargs.get("content")
    captured["body"] = body
    # Short-circuit the call; we only care about the outgoing payload
    raise httpx.HTTPError("intercepted")

with patch.object(httpx.AsyncClient, "post", capture_post), \
     patch.object(httpx.Client, "post", capture_post):
    try:
        litellm.completion(
            model="openai/openai/openai/gpt-5.5",
            messages=[{"role": "user", "content": "hi"}],
            tools=[{
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "parameters": {"type": "object", "properties": {}},
                },
            }],
            reasoning_effort="low",
            api_base="https://example.invalid",
            api_key="sk-fake",
        )
    except Exception:
        pass

print("URL sent to:", captured.get("url"))
print("Model field in body:", (captured.get("body") or {}).get("model"))
# Expected: openai/openai/gpt-5.5
# Actual:   openai/gpt-5.5

To reproduce against a real proxy:

  1. Add to model_list in litellm_config.yaml:
    - model_name: "gpt-5.5"
      litellm_params:
        model: "openai/openai/openai/gpt-5.5"
        api_base: "<any OpenAI-compatible endpoint>"
        api_key: "<key>"
  2. Start the proxy and call:
    curl -X POST http://localhost:4000/v1/chat/completions \
      -H "Authorization: Bearer <key>" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "gpt-5.5",
        "messages": [{"role":"user","content":"hi"}],
        "reasoning_effort": "low",
        "tools": [{"type":"function","function":{"name":"f","parameters":{"type":"object"}}}]
      }'
  3. With set_verbose: True or --detailed_debug, observe the outgoing request shows "model": "openai/gpt-5.5" instead of "openai/openai/gpt-5.5".

Removing either reasoning_effort or tools (so only one is present) causes the bridge to not fire, and the upstream correctly receives openai/openai/gpt-5.5.

Relevant log output

Error from the upstream rejecting the corrupted model name with `key_model_access_denied`:


openai.AuthenticationError: Error code: 401 - {'error': {'message': 'litellm.AuthenticationError: AuthenticationError: OpenAIException - {"error":{"message":"key not allowed to access model. This key can only access models=[\'default-models\']. Tried to access openai/gpt-5.5","type":"key_model_access_denied","param":"model","code":"401"}}. Received Model Group=gpt-5.5\nAvailable Model Group Fallbacks=None', 'type': None, 'param': None, 'code': '401'}}


Note `Tried to access openai/gpt-5.5`: this is one `openai/` prefix short of the expected upstream value `openai/openai/gpt-5.5`, and two short of the configured `openai/openai/openai/gpt-5.5`.
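When triaging similar errors, the attempted model name can be pulled out of the message and compared with the name the upstream was expected to receive. The helpers below are hypothetical diagnostics, and `error_text` is an abbreviated excerpt of the log above rather than the full payload.

```python
import re
from typing import Optional


def attempted_model(error_text: str) -> Optional[str]:
    """Extract the model name following 'Tried to access' in an error message."""
    match = re.search(r'Tried to access ([^\s",]+)', error_text)
    return match.group(1) if match else None


def dropped_segments(expected: str, attempted: str) -> Optional[int]:
    """Count leading path segments missing from `attempted` relative to `expected`.

    Returns None if `attempted` is not a trailing part of `expected`.
    """
    if expected == attempted:
        return 0
    if not expected.endswith("/" + attempted):
        return None
    return expected.count("/") - attempted.count("/")


# Abbreviated excerpt of the upstream error shown above.
error_text = (
    'key not allowed to access model. Tried to access openai/gpt-5.5",'
    '"type":"key_model_access_denied"'
)

attempted = attempted_model(error_text)
print(attempted)
print(dropped_segments("openai/openai/gpt-5.5", attempted))
```

A result of 1 against the expected upstream name matches the double-strip described in this issue; 0 would mean the upstream rejected the correctly stripped name for some other reason.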

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

v1.83.7-stable.patch.1

Twitter / LinkedIn details

No response
