litellm - 💡(How to fix) Fix [Bug]: Azure gpt-4o rejects `max_tokens` — needs translation to `max_completion_tokens` like o-series/gpt-5 [1 comments, 2 participants]

litellm, 2026-03-30 09:34:47


BerriAI/litellm#24779 • Fetched 2026-04-08 01:54:04


Author

hvdlinden

Participants

hvdlinden

kkauy

Timeline (top)

labeled ×2, commented ×1

Error Message

AzureException BadRequestError - Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.


What happened?

Azure OpenAI has deprecated the max_tokens parameter for gpt-4o models. Requests that include max_tokens now fail with:

AzureException BadRequestError - Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.

LiteLLM already handles this translation for o-series models (in o_series_transformation.py) and gpt-5 models (in gpt_5_transformation.py), where max_tokens is automatically mapped to max_completion_tokens:

if "max_tokens" in non_default_params:
    optional_params["max_completion_tokens"] = non_default_params.pop("max_tokens")

However, this translation is missing for the gpt-4o family in gpt_transformation.py. When a client (e.g. Open WebUI) sends max_tokens, LiteLLM passes it through unchanged to Azure, which rejects it.

Steps to Reproduce

Configure an Azure gpt-4o model in LiteLLM proxy:

model_list:
  - model_name: azure-se/gpt-4o
    litellm_params:
      model: azure/gpt-4o
      api_base: https://your-endpoint.openai.azure.com/
      api_key: os.environ/AZURE_API_KEY

Send a request with max_tokens set (e.g. from Open WebUI, which sends this parameter by default):

curl -X POST http://localhost:4000/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure-se/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 4096
  }'

Request fails with:

litellm.BadRequestError: AzureException BadRequestError - Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.

Expected Behavior

LiteLLM should automatically translate max_tokens → max_completion_tokens for Azure gpt-4o models, the same way it already does for o-series and gpt-5.

Suggested Fix

Add the same max_tokens → max_completion_tokens mapping to the gpt-4o code path (or to the Azure-specific transformation for models that no longer accept max_tokens).
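A minimal sketch of the Azure-specific variant of this suggestion, assuming the decision can be made from the model name. The function name `map_max_tokens`, the prefix tuple `MODELS_REQUIRING_MAX_COMPLETION_TOKENS`, and the rule for requests that carry both parameters are illustrative assumptions, not LiteLLM's actual API or behavior.

```python
from typing import Any, Dict, Tuple

# Hypothetical list of model-name prefixes that reject max_tokens on Azure.
# The real set would have to come from LiteLLM's own model metadata.
MODELS_REQUIRING_MAX_COMPLETION_TOKENS: Tuple[str, ...] = ("gpt-4o",)


def map_max_tokens(
    model: str,
    non_default_params: Dict[str, Any],
    optional_params: Dict[str, Any],
) -> Dict[str, Any]:
    """Move max_tokens to max_completion_tokens for models that require it."""
    # Drop a provider prefix such as "azure/" before matching the model family.
    bare_model = model.split("/", 1)[-1]
    needs_translation = bare_model.startswith(MODELS_REQUIRING_MAX_COMPLETION_TOKENS)

    if needs_translation and "max_tokens" in non_default_params:
        value = non_default_params.pop("max_tokens")
        # Assumption: if the caller also sent max_completion_tokens, that
        # explicit value is left to win and the legacy value is discarded.
        if "max_completion_tokens" not in non_default_params:
            optional_params.setdefault("max_completion_tokens", value)
    return optional_params
```

With this guard, a request for azure/gpt-4o carrying max_tokens=4096 ends up with max_completion_tokens=4096, while a model outside the prefix list keeps max_tokens untouched.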

Relevant log output

litellm.BadRequestError: AzureException BadRequestError - Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.
No fallback model group found for original model_group=azure-se/gpt-4o. Fallbacks=[].

What part of LiteLLM is this about?

Proxy, LLM Translation

What LiteLLM version are you on?

main-stable (latest as of 2026-03-30)

Twitter / LinkedIn details

No response

extent analysis

Fix Plan

To fix the issue, we need to add the max_tokens to max_completion_tokens mapping for Azure gpt-4o models in gpt_transformation.py. Here are the steps:

1. Open gpt_transformation.py and locate the parameter-mapping code used for Azure gpt-4o models.
2. Add the following mapping from max_tokens to max_completion_tokens:

if "max_tokens" in non_default_params:
    optional_params["max_completion_tokens"] = non_default_params.pop("max_tokens")

3. Make sure the mapping runs only on the Azure gpt-4o code path, so that models which still accept max_tokens are unaffected.

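Example code snippet (the function name is illustrative; the actual hook in gpt_transformation.py may differ):

def transform_gpt_4o_params(non_default_params: dict, optional_params: dict) -> dict:
    # Move max_tokens to the parameter name that Azure gpt-4o accepts.
    if "max_tokens" in non_default_params:
        optional_params["max_completion_tokens"] = non_default_params.pop("max_tokens")
    return optional_params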

Verification

To verify the fix, send a request with max_tokens set to an Azure gpt-4o model through the updated LiteLLM proxy. The request should return a normal completion, and neither the response nor the proxy log should contain the BadRequestError message.

Example verification command:

curl -X POST http://localhost:4000/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure-se/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 4096
  }'
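The same check can be scripted. The sketch below uses only the Python standard library and assumes the proxy URL, key, and model alias from the curl command above. The helper names (`build_chat_request`, `is_max_tokens_rejection`, `proxy_accepts_max_tokens`) are hypothetical and exist only for this example.

```python
import json
import urllib.error
import urllib.request

# Substring of the Azure error quoted in this issue.
REJECTION_MARKER = "Unsupported parameter: 'max_tokens'"


def build_chat_request(
    base_url: str, api_key: str, model: str, max_tokens: int
) -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def is_max_tokens_rejection(response_body: str) -> bool:
    """Return True if the body contains the max_tokens rejection from Azure."""
    return REJECTION_MARKER in response_body


def proxy_accepts_max_tokens(request: urllib.request.Request) -> bool:
    """Send the request; False means the proxy still forwards max_tokens."""
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        body = err.read().decode("utf-8", errors="replace")
        if is_max_tokens_rejection(body):
            return False
        raise  # Some other failure (auth, routing); surface it unchanged.
```

Calling `proxy_accepts_max_tokens(build_chat_request("http://localhost:4000", "sk-...", "azure-se/gpt-4o", 4096))` against a running proxy should return True once the translation is in place.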

Extra Tips

Test the fix with several different max_tokens values to confirm that each value is carried over to max_completion_tokens unchanged.
Consider logging a debug message whenever the max_tokens to max_completion_tokens translation occurs, so the behavior is easy to trace.
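Both tips can be combined in one small sketch: a translation helper that logs when it fires, plus a sweep over several values. The logger name and the function names are assumptions made for this example and are not part of LiteLLM.

```python
import logging
from typing import Any, Dict, Iterable, List

# Hypothetical logger name; a real patch would reuse LiteLLM's own logger.
logger = logging.getLogger("max_tokens_translation")


def translate_with_logging(
    non_default_params: Dict[str, Any], optional_params: Dict[str, Any]
) -> Dict[str, Any]:
    """Apply the max_tokens mapping and log each time it is used."""
    if "max_tokens" in non_default_params:
        value = non_default_params.pop("max_tokens")
        optional_params["max_completion_tokens"] = value
        logger.debug("Translated max_tokens=%s to max_completion_tokens", value)
    return optional_params


def find_translation_failures(values: Iterable[int]) -> List[int]:
    """Return the values that were not carried over unchanged."""
    failures = []
    for value in values:
        result = translate_with_logging({"max_tokens": value}, {})
        if result != {"max_completion_tokens": value}:
            failures.append(value)
    return failures
```

An empty list from `find_translation_failures([1, 256, 4096, 16384])` indicates every value was carried over. Setting the logger to DEBUG level shows one message per translated request.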
