litellm - [Feature]: Azure OpenAI: decouple deployment ID from model name via base_model [1 pull request]

litellm · 2026-05-21 16:14:44


Root Cause

The custom deployment name "my-gpt5-deployment" doesn't match any known model pattern, so LiteLLM falls back to the default Azure config. As a result, max_tokens is not mapped to max_completion_tokens, and gpt-5-specific behavior is lost.

Fix Action

Fixed

Fixed by PR: [Feature][Bug Fix] Decouple Azure OpenAI Deployment ID from model name via base_model to fix gpt5 model routing (https://github.com/BerriAI/litellm/pull/28490)

Code Example

import litellm

# ===========================================================================
# BEFORE: Custom deployment name — LiteLLM can't detect model type
# ===========================================================================

# ❌ This uses default Azure config because "my-gpt5-deployment" doesn't match
#    any known model pattern. max_tokens won't be mapped to max_completion_tokens,
#    and gpt-5-specific behavior is lost.

# --- SDK direct call (before) ---
response = litellm.completion(
    model="azure/gpt5_series/my-gpt5-deployment",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=100,
    api_key="your-azure-api-key",
    api_base="https://my-resource.openai.azure.com",
    api_version="2024-12-01-preview",
)
# Result: sends max_tokens=100 (incorrect for gpt-5, which expects max_completion_tokens)


# ===========================================================================
# AFTER: Using base_model — LiteLLM detects the correct model type
# ===========================================================================

# ✅ With base_model, LiteLLM knows this is a gpt-5 deployment.
#    max_tokens is automatically mapped to max_completion_tokens,
#    unsupported params (like temperature) are correctly rejected,
#    and the right config class (AzureOpenAIGPT5Config) is used.

# --- SDK direct call (after) ---
response = litellm.completion(
    model="azure/gpt5_series/my-gpt5-deployment",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=100,
    base_model="azure/gpt-5",  # <-- tells LiteLLM the underlying model
    api_key="your-azure-api-key",
    api_base="https://my-resource.openai.azure.com",
    api_version="2024-12-01-preview",
)
# Result: sends max_completion_tokens=100 (correct for gpt-5)


Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

The Feature

Creating this issue to link to the PR that will be submitted shortly.

An Azure OpenAI deployment ID is not the underlying model name. By passing base_model, we can tell LiteLLM exactly which model is being used. This fixes an existing issue in the Azure OpenAI model detection logic, which breaks when deployment names are non-standard. The feature is a backward-compatible solution that unblocks Azure OpenAI customers.
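To make the backward-compatibility claim concrete, here is a minimal sketch of how an optional base_model could take precedence over the deployment name when mapping parameters. The helper names (`is_gpt5`, `map_params`) and the single "gpt-5" substring check are illustrative assumptions, not LiteLLM's actual internals.

```python
from typing import Any, Dict, Optional


def is_gpt5(deployment: str, base_model: Optional[str] = None) -> bool:
    """Hypothetical check: prefer the explicit base_model, else the deployment name."""
    name = (base_model or deployment).lower()
    return "gpt-5" in name


def map_params(
    deployment: str,
    params: Dict[str, Any],
    base_model: Optional[str] = None,
) -> Dict[str, Any]:
    """Return a copy of params, renaming max_tokens for gpt-5 style models."""
    mapped = dict(params)  # never mutate the caller's dict
    if is_gpt5(deployment, base_model) and "max_tokens" in mapped:
        mapped["max_completion_tokens"] = mapped.pop("max_tokens")
    return mapped


# Without base_model, a custom deployment name is treated as before.
print(map_params("my-deployment-id", {"max_tokens": 100}))
# With base_model, the gpt-5 mapping applies regardless of the deployment name.
print(map_params("my-deployment-id", {"max_tokens": 100}, base_model="azure/gpt-5"))
```

Because base_model defaults to None, callers that never pass it see the same name-based behavior as before, which is the sense in which the proposal is backward compatible.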

Motivation, pitch

We can better support the Azure ecosystem by supporting Azure OpenAI deployments with arbitrary deployment IDs that may not match LiteLLM's model detection logic. Currently, model-type detection (o-series, gpt-5, etc.) relies on substring matching against the deployment name, which causes misrouted configs and rejected params when deployment names are non-standard (e.g. 'my-deployment-id' for gpt-5.2). We need a more reliable way to tell LiteLLM exactly which model is being used, so that Azure OpenAI models work with non-standard deployment names.
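The failure mode described above can be sketched with a toy detector. This is an illustration under assumptions: the function `detect_family`, its family labels, and its matching rules are hypothetical and only mimic the idea of name-based detection; they are not taken from the LiteLLM codebase.

```python
from typing import Optional


def detect_family(deployment_name: str, base_model: Optional[str] = None) -> str:
    """Hypothetical name-based family detection with an explicit override."""
    source = base_model if base_model is not None else deployment_name
    # Drop any provider prefix such as "azure/".
    name = source.lower().split("/")[-1]
    if name.startswith(("o1", "o3", "o4")):
        return "o_series"
    if "gpt-5" in name:
        return "gpt5"
    return "default"


# False negative: a gpt-5.2 deployment with an arbitrary ID is not recognized.
print(detect_family("my-deployment-id"))
# False positive: a name that merely starts with "o1" is treated as o-series.
print(detect_family("o1-migration-test"))
# An explicit base_model removes the guesswork in both cases.
print(detect_family("my-deployment-id", base_model="azure/gpt-5.2"))
print(detect_family("o1-migration-test", base_model="azure/gpt-4o"))
```

The point of the sketch is that the deployment name carries no reliable information about the model, so any name-based rule can fail in both directions, while an explicit base_model cannot.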

What part of LiteLLM is this about?

SDK (litellm Python package)

