litellm - 💡(How to fix) Fix [Bug]: Azure gpt-4o rejects `max_tokens` — needs translation to `max_completion_tokens` like o-series/gpt-5 [1 comments, 2 participants]

GitHub stats
BerriAI/litellm#24779 · Fetched 2026-04-08 01:54:04
Comments: 1 · Participants: 2 · Timeline events: 3 (labeled ×2, commented ×1) · Reactions: 0

Error Message

AzureException BadRequestError - Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.


What happened?

Azure OpenAI has deprecated the max_tokens parameter for gpt-4o models. Requests that include max_tokens now fail with:

AzureException BadRequestError - Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.

LiteLLM already handles this translation for o-series models (in o_series_transformation.py) and gpt-5 models (in gpt_5_transformation.py), where max_tokens is automatically mapped to max_completion_tokens:

if "max_tokens" in non_default_params:
    optional_params["max_completion_tokens"] = non_default_params.pop("max_tokens")

However, this translation is missing for the gpt-4o family in gpt_transformation.py. When a client (e.g. Open WebUI) sends max_tokens, LiteLLM passes it through unchanged to Azure, which rejects it.

Steps to Reproduce

  1. Configure an Azure gpt-4o model in the LiteLLM proxy:
model_list:
  - model_name: azure-se/gpt-4o
    litellm_params:
      model: azure/gpt-4o
      api_base: https://your-endpoint.openai.azure.com/
      api_key: os.environ/AZURE_API_KEY
  2. Send a request with max_tokens set (e.g. from Open WebUI, which sends this parameter by default):
curl -X POST http://localhost:4000/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure-se/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 4096
  }'
  3. The request fails with:
litellm.BadRequestError: AzureException BadRequestError - Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.
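
For anyone reproducing this from a script rather than curl, the request in step 2 could be sketched in Python as below. This is a minimal sketch using only the standard library. The URL, key placeholder, and model name are copied from the curl example. The helper names build_request and send are made up for illustration and are not part of LiteLLM.

```python
import json
import urllib.error
import urllib.request
from typing import Tuple

# Values taken from the curl example; adjust for your own proxy deployment.
PROXY_URL = "http://localhost:4000/chat/completions"
API_KEY = "sk-..."  # placeholder, as in the curl example


def build_request(max_tokens: int = 4096) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl command."""
    payload = {
        "model": "azure-se/gpt-4o",
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        PROXY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> Tuple[int, str]:
    """Send the request and return (HTTP status code, response body)."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Error responses still carry a body; read it to see the error text.
        return err.code, err.read().decode("utf-8")
```

Calling send(build_request()) against a running proxy returns the status code and body, so the error text from step 3 can be inspected directly.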

Expected Behavior

LiteLLM should automatically translate max_tokens → max_completion_tokens for Azure gpt-4o models, the same way it already does for o-series and gpt-5.

Suggested Fix

Add the same max_tokens → max_completion_tokens mapping to the gpt-4o code path (or to the Azure-specific transformation for models that no longer accept max_tokens).
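
One way the suggested mapping might be restricted to affected models is sketched below. This is not LiteLLM's actual code: the prefix list, requires_max_completion_tokens, and map_max_tokens are hypothetical names. The sketch also assumes that the part of the model string after "azure/" starts with "gpt-4o", as in the config above, which may not hold for custom Azure deployment names.

```python
from typing import Any, Dict, Tuple

# Hypothetical list of model-name prefixes that reject `max_tokens`.
# Illustration only; this is not LiteLLM's actual configuration.
MODELS_REQUIRING_MAX_COMPLETION_TOKENS: Tuple[str, ...] = ("gpt-4o",)


def requires_max_completion_tokens(model: str) -> bool:
    """Return True if the model name matches the hypothetical prefix list."""
    # Strip a provider prefix such as "azure/" before matching.
    base_name = model.split("/")[-1]
    return base_name.startswith(MODELS_REQUIRING_MAX_COMPLETION_TOKENS)


def map_max_tokens(
    model: str,
    non_default_params: Dict[str, Any],
    optional_params: Dict[str, Any],
) -> Dict[str, Any]:
    """Move `max_tokens` to `max_completion_tokens` for matching models only."""
    if requires_max_completion_tokens(model) and "max_tokens" in non_default_params:
        # Same mapping as the o-series/gpt-5 snippet quoted in the issue.
        optional_params["max_completion_tokens"] = non_default_params.pop("max_tokens")
    return optional_params
```

Gating on the model name leaves max_tokens untouched for models that are not in the list, so other deployments are not affected by the mapping.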

Relevant log output

litellm.BadRequestError: AzureException BadRequestError - Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.
No fallback model group found for original model_group=azure-se/gpt-4o. Fallbacks=[].

What part of LiteLLM is this about?

Proxy, LLM Translation

What LiteLLM version are you on?

main-stable (latest as of 2026-03-30)

Twitter / LinkedIn details

No response

extent analysis

Fix Plan

To fix the issue, add the max_tokens to max_completion_tokens mapping for Azure gpt-4o models in gpt_transformation.py:

  • Open gpt_transformation.py and locate the code path that maps request parameters for Azure gpt-4o models.
  • Add the following code to translate max_tokens to max_completion_tokens:
if "max_tokens" in non_default_params:
    optional_params["max_completion_tokens"] = non_default_params.pop("max_tokens")
  • Ensure this code is executed for Azure gpt-4o models, and check that models that still expect max_tokens are not affected.

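Example code snippet (transform_gpt_4o_params is a hypothetical name used for illustration, not an existing LiteLLM function):

def transform_gpt_4o_params(non_default_params: dict, optional_params: dict) -> dict:
    # ... existing code ...
    # Move max_tokens to the parameter name that Azure gpt-4o accepts.
    if "max_tokens" in non_default_params:
        optional_params["max_completion_tokens"] = non_default_params.pop("max_tokens")
    # ... existing code ...
    return optional_params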

Verification

To verify the fix, send a request that includes max_tokens through the updated LiteLLM proxy to an Azure gpt-4o model. The request should succeed, and the response should not contain the BadRequestError message.

Example verification command:

curl -X POST http://localhost:4000/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure-se/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 4096
  }'

Extra Tips

  • Test the fix with several different max_tokens values to confirm the translation works correctly.
  • Consider adding a log message to track when the max_tokens to max_completion_tokens translation occurs for debugging purposes.
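
Both tips could be combined in a small sketch like the one below. It uses Python's standard logging module. The logger name and the function names translate_with_logging and check_translation are hypothetical and are not taken from LiteLLM.

```python
import logging
from typing import Any, Dict, Iterable

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("max_tokens_translation")


def translate_with_logging(
    non_default_params: Dict[str, Any],
    optional_params: Dict[str, Any],
) -> Dict[str, Any]:
    """Apply the max_tokens -> max_completion_tokens mapping and log when it fires."""
    if "max_tokens" in non_default_params:
        value = non_default_params.pop("max_tokens")
        optional_params["max_completion_tokens"] = value
        logger.debug("Translated max_tokens=%s to max_completion_tokens", value)
    return optional_params


def check_translation(values: Iterable[int] = (1, 256, 4096)) -> None:
    """Assert that each value is carried over unchanged and the old key is removed."""
    for value in values:
        params = {"max_tokens": value}
        result = translate_with_logging(params, {})
        assert result == {"max_completion_tokens": value}
        assert "max_tokens" not in params
```

Calling logging.basicConfig(level=logging.DEBUG) before check_translation() makes the debug messages visible while the checks run.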
