litellm - ✅(Solved) Fix [Bug]: azure_ai/mistral-large-3 does not support max_completion_tokens [1 pull request, 1 comment, 2 participants]

litellm · 2026-04-23 09:41:15

GitHub stats

BerriAI/litellm#26322 • Fetched 2026-04-24 05:52:31

Author

Slyvred

Participants

jeanibarz

Slyvred

Timeline (top)

labeled ×3, referenced ×2, commented ×1, cross-referenced ×1

Error Message

{ "error": { "message": "litellm.APIConnectionError: APIConnectionError: Azure_aiException - Extra data: line 2 column 1 (char 209). Received Model Group=mistral-large-3\nAvailable Model Group Fallbacks=None", "type": null, "param": null, "code": "500" } }

Fix Action

Fix proposed (the pull request was still open and unmerged when this page was fetched)

Proposed fix, PR: fix(azure_ai): map max_completion_tokens to max_tokens for Model Inference endpoint (https://github.com/BerriAI/litellm/pull/26344)

PR fix notes

PR #26344: fix(azure_ai): map max_completion_tokens to max_tokens for Model Inference endpoint

Repository: BerriAI/litellm
Author: jeanibarz
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/26344

Description (problem / solution / changelog)

Relevant issues

Fixes #26322

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Screenshots / Proof of Fix

Backend parameter-mapping fix — no UI surface. Proof is in the unit tests and the before/after trace below.

Before (upstream main at e9e86ed956):

>>> from litellm.llms.azure_ai.chat.transformation import AzureAIStudioConfig
>>> AzureAIStudioConfig().map_openai_params(
...     {"max_completion_tokens": 1000}, {}, "mistral-large-3", drop_params=False
... )
{'max_completion_tokens': 1000}   # Azure AI Foundry rejects this with 422

After this PR:

>>> AzureAIStudioConfig().map_openai_params(
...     {"max_completion_tokens": 1000}, {}, "mistral-large-3", drop_params=False
... )
{'max_tokens': 1000}              # accepted by /models/chat/completions

Targeted unit tests (9 in total for this override, 5 general + 4 parametrized for OpenAI-family models) all pass:

tests/test_litellm/llms/azure_ai/chat/test_azure_ai_transformation.py
  test_azure_ai_map_openai_params_renames_max_completion_tokens PASSED
  test_azure_ai_map_openai_params_preserves_max_tokens PASSED
  test_azure_ai_map_openai_params_max_completion_tokens_takes_priority PASSED
  test_azure_ai_map_openai_params_does_not_mutate_caller_dict PASSED
  test_azure_ai_map_openai_params_handles_openai_family_models[o3-mini-...] PASSED  (x4)
  test_azure_model_router_inherits_max_completion_tokens_rewrite PASSED

Broader tests/test_litellm/llms/azure_ai/ suite: 105/105 passing locally. Ruff + MyPy + Black all clean on the touched files.

Type

Bug Fix

Changes

Root cause. AzureAIStudioConfig in litellm/llms/azure_ai/chat/transformation.py inherits map_openai_params from OpenAIConfig unchanged. The parent dispatches by model name — the default path copies max_completion_tokens through as-is, and the o-series and gpt-5 sub-configs actively rename max_tokens -> max_completion_tokens. Either way, the value headed for Azure AI Foundry's Model Inference endpoint (/models/chat/completions) ends up as max_completion_tokens, which that endpoint's request schema does not accept (see Microsoft Learn - the schema lists only max_tokens). The endpoint responds with 422 Unprocessable Entity, which is what the issue reporter hit on azure_ai/mistral-large-3.

Fix. Override map_openai_params in AzureAIStudioConfig and normalize AFTER the parent mapping: delegate to super().map_openai_params(...) first, then rename max_completion_tokens -> max_tokens on the resulting dict. Normalizing at the end (rather than before the super() call) keeps the fix robust for every parent dispatch path — Mistral/Llama-family (default copy), o-series (o3-mini, o1-*, ...), and gpt-5-* — all end up with max_tokens on the wire.

Precedence. If a caller passes both max_tokens and max_completion_tokens, max_completion_tokens wins. This matches the existing priority in MistralConfig.map_openai_params (litellm/llms/mistral/chat/transformation.py:154-167).

Scope note. The override lives on AzureAIStudioConfig, so it transparently also covers AzureModelRouterConfig, which subclasses it and targets the same Model Inference endpoint. Azure OpenAI Service (azure/...) is unaffected — _get_openai_compatible_provider_info reroutes known OpenAI-catalog models to the azure provider, which uses AzureConfig with its own separate parameter handling.
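
As a rough illustration of the fix, precedence rule, and subclass coverage described above, the sketch below uses a hypothetical stand-in parent class rather than litellm's real OpenAIConfig. The class names ParentConfigStandIn, AzureInferenceConfigSketch, and ModelRouterConfigSketch are invented for this example, and the parent's dispatch logic is a simplification; the actual PR code may differ.

```python
from typing import Any, Dict


class ParentConfigStandIn:
    """Hypothetical stand-in for the inherited OpenAI-style mapping (not litellm code)."""

    def map_openai_params(
        self,
        non_default_params: Dict[str, Any],
        optional_params: Dict[str, Any],
        model: str,
        drop_params: bool,
    ) -> Dict[str, Any]:
        result = dict(optional_params)
        for key, value in non_default_params.items():
            if key == "max_tokens" and model.startswith("o3"):
                # Simplified o-series path: max_tokens is renamed by the parent.
                result["max_completion_tokens"] = value
            else:
                # Default path: parameters are copied through unchanged.
                result[key] = value
        return result


class AzureInferenceConfigSketch(ParentConfigStandIn):
    """Normalizes the token-limit key after the parent mapping has run."""

    def map_openai_params(
        self,
        non_default_params: Dict[str, Any],
        optional_params: Dict[str, Any],
        model: str,
        drop_params: bool,
    ) -> Dict[str, Any]:
        # Pass copies so the caller's dicts are never mutated.
        mapped = super().map_openai_params(
            dict(non_default_params), dict(optional_params), model, drop_params
        )
        if "max_completion_tokens" in mapped:
            # If both keys are present, max_completion_tokens overwrites max_tokens.
            mapped["max_tokens"] = mapped.pop("max_completion_tokens")
        return mapped


class ModelRouterConfigSketch(AzureInferenceConfigSketch):
    """A subclass inherits the override without any extra code."""


config = AzureInferenceConfigSketch()
renamed = config.map_openai_params(
    {"max_completion_tokens": 1000}, {}, "mistral-large-3", drop_params=False
)
print(renamed)
```

Renaming after the super() call, rather than before it, is what lets a single rename cover both the pass-through path and the path where the parent itself introduces max_completion_tokens.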

Files changed:

litellm/llms/azure_ai/chat/transformation.py — new map_openai_params override (+25 lines)
tests/test_litellm/llms/azure_ai/chat/test_azure_ai_transformation.py — 6 new mocked unit tests, one parametrized across 4 model names (+115 lines)

Changed files

litellm/llms/azure_ai/chat/transformation.py (modified, +25/-0)
tests/test_litellm/llms/azure_ai/chat/test_azure_ai_transformation.py (modified, +115/-0)

Code Example

{
  "error": {
    "message": "litellm.APIConnectionError: APIConnectionError: Azure_aiException - Extra data: line 2 column 1 (char 209). Received Model Group=mistral-large-3\nAvailable Model Group Fallbacks=None",
    "type": null,
    "param": null,
    "code": "500"
  }
}

---

curl --location 'http://localhost:4000/v1/chat/completions' --header 'Content-Type: application/json' --header 'Authorization: Bearer sk-1234' --data '{
    "model": "mistral-large-3",
    "messages": [
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "Faux contexte"
                }
            ]
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "bonjour"
                }
            ]
        }
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    },
    "max_completion_tokens": 1000,
    "litellm_request_debug": true
}'

---

07:41:02 - LiteLLM:WARNING: litellm_logging.py:1156 -

POST Request Sent from LiteLLM:
curl -X POST \
https://redacted.services.ai.azure.com/models/chat/completions?api-version=2025-04-01-preview \
-H 'REDACTED' -H 'Content-Type: application/json' \
-d '{'model': 'mistral-large-3', 'messages': [{'role': 'system', 'content': 'Faux contexte'}, {'role': 'user', 'content': 'bonjour'}], 'stream': True, 'stream_options': {'include_usage': True}, 'max_completion_tokens': 1000}

---

08:36:16 - LiteLLM:WARNING: litellm_logging.py:1156 -

POST Request Sent from LiteLLM:
curl -X POST \
https://api.mistral.ai/v1/chat/completions \
-H 'Authorization: Be****Df' -H 'Content-Type: application/json' \
-d '{'model': 'mistral-large-2512', 'messages': [{'role': 'system', 'content': 'Faux contexte'}, {'role': 'user', 'content': 'bonjour'}], 'stream': True, 'max_tokens': 1000}'

---

07:41:02 - LiteLLM:WARNING: litellm_logging.py:1156 -

POST Request Sent from LiteLLM:
curl -X POST \
https://{REDACTED}.services.ai.azure.com/models/chat/completions?api-version=2025-04-01-preview \
-H 'REDACTED' -H 'Content-Type: application/json' \
-d '{'model': 'mistral-large-3', 'messages': [{'role': 'system', 'content': 'Faux contexte'}, {'role': 'user', 'content': 'bonjour'}], 'stream': True, 'stream_options': {'include_usage': True}, 'max_completion_tokens': 1000}
[...]
07:41:02 - LiteLLM Router:INFO: router.py:2233 - litellm.acompletion(model=azure_ai/mistral-large-3) Exception litellm.APIConnectionError: APIConnectionError: Azure_aiException - Extra data: line 2 column 1 (char 209)
07:41:02 - LiteLLM Router:DEBUG: router.py:5332 - TracebackTraceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 176, in _make_common_async_call
    response = await async_httpx_client.post(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/logging_utils.py", line 297, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 513, in post
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 469, in post
    response.raise_for_status()
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/lib/python3.13/site-packages/httpx/_models.py", line 829, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '422 Unprocessable Entity' for url 'https://REDACTED.services.ai.azure.com/models/chat/completions?api-version=2025-04-01-preview'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/422

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/main.py", line 620, in acompletion
    response = await init_response
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 694, in acompletion_stream_function
    completion_stream, _response_headers = await self.make_async_call_stream_helper(
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<15 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 753, in make_async_call_stream_helper
    response = await self._make_common_async_call(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 195, in _make_common_async_call
    provider_config.transform_request_on_unprocessable_entity_error(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        e=e, request_data=data
        ^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/azure_ai/chat/transformation.py", line 315, in transform_request_on_unprocessable_entity_error
    data = drop_params_from_unprocessable_entity_error(e=e, data=request_data)
  File "/usr/lib/python3.13/site-packages/litellm/llms/openai/common_utils.py", line 98, in drop_params_from_unprocessable_entity_error
    error_json = e.response.json()
  File "/usr/lib/python3.13/site-packages/httpx/_models.py", line 832, in json
    return jsonlib.loads(self.content, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/json/__init__.py", line 352, in loads
    return _default_decoder.decode(s)
           ~~~~~~~~~~~~~~~~~~~~~~~^^^
  File "/usr/lib/python3.13/json/decoder.py", line 348, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 209)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5604, in async_function_with_fallbacks
    response = await self.async_function_with_retries(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5759, in async_function_with_retries
    self.should_retry_this_error(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        error=e,
        ^^^^^^^^
    ...<4 lines>...
        content_policy_fallbacks=content_policy_fallbacks,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5989, in should_retry_this_error
    raise error
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5710, in async_function_with_retries
    response = await self.make_call(original_function, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5878, in make_call
    response = await response
               ^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 2241, in _acompletion
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 2187, in _acompletion
    response = await _response
               ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/utils.py", line 2093, in wrapper_async
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/utils.py", line 1892, in wrapper_async
    result = await original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/main.py", line 639, in acompletion
    raise exception_type(
          ~~~~~~~~~~~~~~^
        model=model,
        ^^^^^^^^^^^^
    ...<3 lines>...
        extra_kwargs=kwargs,
        ^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2461, in exception_type
    raise e  # it's already mapped
    ^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 635, in exception_type
    raise APIConnectionError(
    ...<7 lines>...
    )
litellm.exceptions.APIConnectionError: litellm.APIConnectionError: APIConnectionError: Azure_aiException - Extra data: line 2 column 1 (char 209)

07:41:02 - LiteLLM Router:INFO: router.py:5416 - Trying to fallback b/w models
07:41:02 - LiteLLM Proxy:ERROR: common_request_processing.py:1570 - litellm.proxy.proxy_server._handle_llm_api_exception(): Exception occured - litellm.APIConnectionError: APIConnectionError: Azure_aiException - Extra data: line 2 column 1 (char 209). Received Model Group=mistral-large-3
Available Model Group Fallbacks=None
Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 176, in _make_common_async_call
    response = await async_httpx_client.post(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/logging_utils.py", line 297, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 513, in post
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 469, in post
    response.raise_for_status()
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/lib/python3.13/site-packages/httpx/_models.py", line 829, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '422 Unprocessable Entity' for url 'https://REDACTED.services.ai.azure.com/models/chat/completions?api-version=2025-04-01-preview'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/422

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/main.py", line 620, in acompletion
    response = await init_response
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 694, in acompletion_stream_function
    completion_stream, _response_headers = await self.make_async_call_stream_helper(
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<15 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 753, in make_async_call_stream_helper
    response = await self._make_common_async_call(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 195, in _make_common_async_call
    provider_config.transform_request_on_unprocessable_entity_error(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        e=e, request_data=data
        ^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/azure_ai/chat/transformation.py", line 315, in transform_request_on_unprocessable_entity_error
    data = drop_params_from_unprocessable_entity_error(e=e, data=request_data)
  File "/usr/lib/python3.13/site-packages/litellm/llms/openai/common_utils.py", line 98, in drop_params_from_unprocessable_entity_error
    error_json = e.response.json()
  File "/usr/lib/python3.13/site-packages/httpx/_models.py", line 832, in json
    return jsonlib.loads(self.content, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/json/__init__.py", line 352, in loads
    return _default_decoder.decode(s)
           ~~~~~~~~~~~~~~~~~~~~~~~^^^
  File "/usr/lib/python3.13/json/decoder.py", line 348, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 209)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/proxy/proxy_server.py", line 7145, in chat_completion
    result = await base_llm_response_processor.base_process_llm_request(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<16 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/proxy/common_request_processing.py", line 1065, in base_process_llm_request
    responses = await llm_responses
                ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1741, in acompletion
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1717, in acompletion
    response = await self.async_function_with_fallbacks(**kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5613, in async_function_with_fallbacks
    return await self.async_function_with_fallbacks_common_utils(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<8 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5570, in async_function_with_fallbacks_common_utils
    raise original_exception
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5604, in async_function_with_fallbacks
    response = await self.async_function_with_retries(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5759, in async_function_with_retries
    self.should_retry_this_error(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        error=e,
        ^^^^^^^^
    ...<4 lines>...
        content_policy_fallbacks=content_policy_fallbacks,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5989, in should_retry_this_error
    raise error
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5710, in async_function_with_retries
    response = await self.make_call(original_function, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5878, in make_call
    response = await response
               ^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 2241, in _acompletion
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 2187, in _acompletion
    response = await _response
               ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/utils.py", line 2093, in wrapper_async
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/utils.py", line 1892, in wrapper_async
    result = await original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/main.py", line 639, in acompletion
    raise exception_type(
          ~~~~~~~~~~~~~~^
        model=model,
        ^^^^^^^^^^^^
    ...<3 lines>...
        extra_kwargs=kwargs,
        ^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2461, in exception_type
    raise e  # it's already mapped
    ^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 635, in exception_type
    raise APIConnectionError(
    ...<7 lines>...
    )
litellm.exceptions.APIConnectionError: litellm.APIConnectionError: APIConnectionError: Azure_aiException - Extra data: line 2 column 1 (char 209). Received Model Group=mistral-large-3
Available Model Group Fallbacks=None
07:41:02 - LiteLLM Proxy:DEBUG: db_spend_update_writer.py:116 - Enters prisma db call, response_cost: 0.0, token: 88dc28d0f030c55ed4ab77ed8faf098196cb1c05df778539800c9f1243fe6b4b; user_id: default_user_id; team_id: None
07:41:02 - LiteLLM Proxy:DEBUG: spend_tracking_utils.py:116 - getting payload for SpendLogs, available keys in metadata: ['user_api_key_user_id', 'headers', 'requester_metadata', 'user_api_key_hash', 'user_api_key_alias', 'user_api_key_spend', 'user_api_key_max_budget', 'user_api_key_team_id', 'user_api_key_project_id', 'user_api_key_project_alias', 'user_api_key_org_id', 'user_api_key_org_alias', 'user_api_key_team_alias', 'user_api_key_end_user_id', 'user_api_key_user_email', 'user_api_key_request_route', 'user_api_key_budget_reset_at', 'user_api_key_auth_metadata', 'user_api_key', 'agent_id', 'user_api_end_user_max_budget', 'user_api_key_auth', 'litellm_api_version', 'global_max_parallel_requests', 'user_api_key_team_max_budget', 'user_api_key_team_spend', 'user_api_key_model_max_budget', 'user_api_key_end_user_model_max_budget', 'user_api_key_user_spend', 'user_api_key_user_max_budget', 'user_api_key_metadata', 'user_api_key_team_metadata', 'user_api_key_object_permission_id', 'user_api_key_team_object_permission_id', 'endpoint', 'litellm_parent_otel_span', 'requester_ip_address', 'user_agent', 'queue_time_seconds', 'model_group', 'model_group_alias', 'model_group_size', 'attempted_retries', 'max_retries', 'deployment', 'model_info', 'api_base', 'deployment_model_name', 'caching_groups', 'status', 'error_information']
07:41:02 - LiteLLM Proxy:DEBUG: spend_tracking_utils.py:480 - SpendTable: created payload - request_id: 42af16c0-9c64-4423-8de5-6cd6cc6e02f2, model: azure_ai/mistral-large-3, spend: 0
07:41:02 - LiteLLM Proxy:DEBUG: db_spend_update_writer.py:723 - Writing spend log to db - request_id: 42af16c0-9c64-4423-8de5-6cd6cc6e02f2, spend: 0.0
07:41:02 - LiteLLM Proxy:DEBUG: db_spend_update_writer.py:190 - Runs spend update on all tables
07:41:02 - LiteLLM Proxy:DEBUG: common_request_processing.py:1583 - An error occurred: litellm.APIConnectionError: APIConnectionError: Azure_aiException - Extra data: line 2 column 1 (char 209). Received Model Group=mistral-large-3
Available Model Group Fallbacks=None
Model: azure_ai/mistral-large-3
API Base: `https://REDACTED.services.ai.azure.com/`
Messages: `[{'role': 'system', 'content': [{'type': 'text', 'text': 'Faux contexte'}]}, {'role': 'user', 'conte`
model_group: `mistral-large-3`

deployment: `azure_ai/mistral-large-3`


 Debug this by setting `--debug`, e.g. `litellm --model gpt-3.5-turbo --debug`
INFO:     172.19.0.1:60786 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
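
The log above shows two layered failures: Azure answers with 422, and the retry helper then calls e.response.json() on the error body, which raises JSONDecodeError ("Extra data: line 2 column 1"). Python's json module raises that message when one complete JSON value is followed by further content on a second line. The log does not show the real 422 body, so the two-line body below is an assumption made purely to reproduce the message, and parse_first_json_document is a hypothetical helper, not part of litellm or of the PR.

```python
import json
from typing import Any


def parse_first_json_document(text: str) -> Any:
    """Decode only the first JSON value in text and ignore anything after it."""
    # raw_decode does not skip leading whitespace, so strip it first.
    value, _end = json.JSONDecoder().raw_decode(text.lstrip())
    return value


# Assumed body shape: two JSON documents separated by a newline.
body = '{"error": "first document"}\n{"error": "second document"}'

strict_error = ""
try:
    json.loads(body)  # strict parsing rejects the trailing second document
except json.JSONDecodeError as exc:
    strict_error = str(exc)

first = parse_first_json_document(body)
print(strict_error)
print(first)
```

This only demonstrates where the "Extra data" message comes from; the root problem remains the unsupported max_completion_tokens key that triggers the 422 in the first place.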

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

This issue is closely related to issue #9583.

Description: Calling mistral-large-3 hosted on Microsoft Foundry with the max_completion_tokens parameter consistently fails with the following error:

{
  "error": {
    "message": "litellm.APIConnectionError: APIConnectionError: Azure_aiException - Extra data: line 2 column 1 (char 209). Received Model Group=mistral-large-3\nAvailable Model Group Fallbacks=None",
    "type": null,
    "param": null,
    "code": "500"
  }
}

This error occurs only with this model. With any other model, the following request works flawlessly:

curl --location 'http://localhost:4000/v1/chat/completions' --header 'Content-Type: application/json' --header 'Authorization: Bearer sk-1234' --data '{
    "model": "mistral-large-3",
    "messages": [
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "Faux contexte"
                }
            ]
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "bonjour"
                }
            ]
        }
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    },
    "max_completion_tokens": 1000,
    "litellm_request_debug": true
}'

In the logs, LiteLLM sends the following request to Azure:

07:41:02 - LiteLLM:WARNING: litellm_logging.py:1156 -

POST Request Sent from LiteLLM:
curl -X POST \
https://redacted.services.ai.azure.com/models/chat/completions?api-version=2025-04-01-preview \
-H 'REDACTED' -H 'Content-Type: application/json' \
-d '{'model': 'mistral-large-3', 'messages': [{'role': 'system', 'content': 'Faux contexte'}, {'role': 'user', 'content': 'bonjour'}], 'stream': True, 'stream_options': {'include_usage': True}, 'max_completion_tokens': 1000}

We can see that LiteLLM forwards max_completion_tokens to the model, but:

The Azure API only supports max_tokens, according to its documentation.
The Mistral API likewise only supports max_tokens, according to its documentation.

When switching to the same model served by Mistral directly (mistral/mistral-large-2512), LiteLLM sends the following request:

08:36:16 - LiteLLM:WARNING: litellm_logging.py:1156 -

POST Request Sent from LiteLLM:
curl -X POST \
https://api.mistral.ai/v1/chat/completions \
-H 'Authorization: Be****Df' -H 'Content-Type: application/json' \
-d '{'model': 'mistral-large-2512', 'messages': [{'role': 'system', 'content': 'Faux contexte'}, {'role': 'user', 'content': 'bonjour'}], 'stream': True, 'max_tokens': 1000}'

Here max_completion_tokens is correctly translated to max_tokens, which is not the case with the azure_ai provider.

Steps to Reproduce

Deploy a Mistral chat model on Microsoft Foundry.
Paste the request above into a terminal, adapting it to your deployment as needed.
Watch the logs of your LiteLLM instance.
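
For readers who prefer a script over curl, the steps above can be sketched in Python with only the standard library. The URL, key, model name, and payload are copied from the curl command in this issue; the helper names build_payload, build_request, and send are invented for this example. The limit_key argument makes it easy to compare max_completion_tokens with max_tokens, although whether the latter succeeds against a given deployment is an assumption based on the PR description.

```python
import json
import urllib.error
import urllib.request
from typing import Any, Dict

PROXY_URL = "http://localhost:4000/v1/chat/completions"  # from the curl command
API_KEY = "sk-1234"  # placeholder key from the curl command


def build_payload(limit_key: str = "max_completion_tokens", limit: int = 1000) -> Dict[str, Any]:
    """Build the request body from the issue, with a configurable token-limit key."""
    return {
        "model": "mistral-large-3",
        "messages": [
            {"role": "system", "content": [{"type": "text", "text": "Faux contexte"}]},
            {"role": "user", "content": [{"type": "text", "text": "bonjour"}]},
        ],
        "stream": True,
        "stream_options": {"include_usage": True},
        limit_key: limit,
        "litellm_request_debug": True,
    }


def build_request(payload: Dict[str, Any]) -> urllib.request.Request:
    """Wrap the payload in a POST request aimed at the local proxy."""
    return urllib.request.Request(
        PROXY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


def send(payload: Dict[str, Any]) -> str:
    """Send the request and return the raw body (requires a running proxy)."""
    try:
        with urllib.request.urlopen(build_request(payload)) as response:
            return response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Error responses, such as the 500 shown in this issue, still carry a body.
        return err.read().decode("utf-8")


# Usage, with the proxy running locally:
#   print(send(build_payload()))              # reproduces the reported request
#   print(send(build_payload("max_tokens")))  # same request using max_tokens
```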

Relevant log output

07:41:02 - LiteLLM:WARNING: litellm_logging.py:1156 -

POST Request Sent from LiteLLM:
curl -X POST \
https://{REDACTED}.services.ai.azure.com/models/chat/completions?api-version=2025-04-01-preview \
-H 'REDACTED' -H 'Content-Type: application/json' \
-d '{'model': 'mistral-large-3', 'messages': [{'role': 'system', 'content': 'Faux contexte'}, {'role': 'user', 'content': 'bonjour'}], 'stream': True, 'stream_options': {'include_usage': True}, 'max_completion_tokens': 1000}
[...]
07:41:02 - LiteLLM Router:INFO: router.py:2233 - litellm.acompletion(model=azure_ai/mistral-large-3) Exception litellm.APIConnectionError: APIConnectionError: Azure_aiException - Extra data: line 2 column 1 (char 209)
07:41:02 - LiteLLM Router:DEBUG: router.py:5332 - TracebackTraceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 176, in _make_common_async_call
    response = await async_httpx_client.post(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/logging_utils.py", line 297, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 513, in post
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 469, in post
    response.raise_for_status()
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/lib/python3.13/site-packages/httpx/_models.py", line 829, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '422 Unprocessable Entity' for url 'https://REDACTED.services.ai.azure.com/models/chat/completions?api-version=2025-04-01-preview'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/422

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/main.py", line 620, in acompletion
    response = await init_response
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 694, in acompletion_stream_function
    completion_stream, _response_headers = await self.make_async_call_stream_helper(
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<15 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 753, in make_async_call_stream_helper
    response = await self._make_common_async_call(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 195, in _make_common_async_call
    provider_config.transform_request_on_unprocessable_entity_error(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        e=e, request_data=data
        ^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/azure_ai/chat/transformation.py", line 315, in transform_request_on_unprocessable_entity_error
    data = drop_params_from_unprocessable_entity_error(e=e, data=request_data)
  File "/usr/lib/python3.13/site-packages/litellm/llms/openai/common_utils.py", line 98, in drop_params_from_unprocessable_entity_error
    error_json = e.response.json()
  File "/usr/lib/python3.13/site-packages/httpx/_models.py", line 832, in json
    return jsonlib.loads(self.content, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/json/__init__.py", line 352, in loads
    return _default_decoder.decode(s)
           ~~~~~~~~~~~~~~~~~~~~~~~^^^
  File "/usr/lib/python3.13/json/decoder.py", line 348, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 209)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5604, in async_function_with_fallbacks
    response = await self.async_function_with_retries(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5759, in async_function_with_retries
    self.should_retry_this_error(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        error=e,
        ^^^^^^^^
    ...<4 lines>...
        content_policy_fallbacks=content_policy_fallbacks,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5989, in should_retry_this_error
    raise error
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5710, in async_function_with_retries
    response = await self.make_call(original_function, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5878, in make_call
    response = await response
               ^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 2241, in _acompletion
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 2187, in _acompletion
    response = await _response
               ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/utils.py", line 2093, in wrapper_async
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/utils.py", line 1892, in wrapper_async
    result = await original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/main.py", line 639, in acompletion
    raise exception_type(
          ~~~~~~~~~~~~~~^
        model=model,
        ^^^^^^^^^^^^
    ...<3 lines>...
        extra_kwargs=kwargs,
        ^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2461, in exception_type
    raise e  # it's already mapped
    ^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 635, in exception_type
    raise APIConnectionError(
    ...<7 lines>...
    )
litellm.exceptions.APIConnectionError: litellm.APIConnectionError: APIConnectionError: Azure_aiException - Extra data: line 2 column 1 (char 209)

07:41:02 - LiteLLM Router:INFO: router.py:5416 - Trying to fallback b/w models
07:41:02 - LiteLLM Proxy:ERROR: common_request_processing.py:1570 - litellm.proxy.proxy_server._handle_llm_api_exception(): Exception occured - litellm.APIConnectionError: APIConnectionError: Azure_aiException - Extra data: line 2 column 1 (char 209). Received Model Group=mistral-large-3
Available Model Group Fallbacks=None
Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 176, in _make_common_async_call
    response = await async_httpx_client.post(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/logging_utils.py", line 297, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 513, in post
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 469, in post
    response.raise_for_status()
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/lib/python3.13/site-packages/httpx/_models.py", line 829, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '422 Unprocessable Entity' for url 'https://REDACTED.services.ai.azure.com/models/chat/completions?api-version=2025-04-01-preview'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/422

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/main.py", line 620, in acompletion
    response = await init_response
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 694, in acompletion_stream_function
    completion_stream, _response_headers = await self.make_async_call_stream_helper(
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<15 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 753, in make_async_call_stream_helper
    response = await self._make_common_async_call(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 195, in _make_common_async_call
    provider_config.transform_request_on_unprocessable_entity_error(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        e=e, request_data=data
        ^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/llms/azure_ai/chat/transformation.py", line 315, in transform_request_on_unprocessable_entity_error
    data = drop_params_from_unprocessable_entity_error(e=e, data=request_data)
  File "/usr/lib/python3.13/site-packages/litellm/llms/openai/common_utils.py", line 98, in drop_params_from_unprocessable_entity_error
    error_json = e.response.json()
  File "/usr/lib/python3.13/site-packages/httpx/_models.py", line 832, in json
    return jsonlib.loads(self.content, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/json/__init__.py", line 352, in loads
    return _default_decoder.decode(s)
           ~~~~~~~~~~~~~~~~~~~~~~~^^^
  File "/usr/lib/python3.13/json/decoder.py", line 348, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 209)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/proxy/proxy_server.py", line 7145, in chat_completion
    result = await base_llm_response_processor.base_process_llm_request(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<16 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/proxy/common_request_processing.py", line 1065, in base_process_llm_request
    responses = await llm_responses
                ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1741, in acompletion
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1717, in acompletion
    response = await self.async_function_with_fallbacks(**kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5613, in async_function_with_fallbacks
    return await self.async_function_with_fallbacks_common_utils(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<8 lines>...
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5570, in async_function_with_fallbacks_common_utils
    raise original_exception
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5604, in async_function_with_fallbacks
    response = await self.async_function_with_retries(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5759, in async_function_with_retries
    self.should_retry_this_error(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        error=e,
        ^^^^^^^^
    ...<4 lines>...
        content_policy_fallbacks=content_policy_fallbacks,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5989, in should_retry_this_error
    raise error
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5710, in async_function_with_retries
    response = await self.make_call(original_function, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 5878, in make_call
    response = await response
               ^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 2241, in _acompletion
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 2187, in _acompletion
    response = await _response
               ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/utils.py", line 2093, in wrapper_async
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/utils.py", line 1892, in wrapper_async
    result = await original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/main.py", line 639, in acompletion
    raise exception_type(
          ~~~~~~~~~~~~~~^
        model=model,
        ^^^^^^^^^^^^
    ...<3 lines>...
        extra_kwargs=kwargs,
        ^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2461, in exception_type
    raise e  # it's already mapped
    ^^^^^^^
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 635, in exception_type
    raise APIConnectionError(
    ...<7 lines>...
    )
litellm.exceptions.APIConnectionError: litellm.APIConnectionError: APIConnectionError: Azure_aiException - Extra data: line 2 column 1 (char 209). Received Model Group=mistral-large-3
Available Model Group Fallbacks=None
07:41:02 - LiteLLM Proxy:DEBUG: db_spend_update_writer.py:116 - Enters prisma db call, response_cost: 0.0, token: 88dc28d0f030c55ed4ab77ed8faf098196cb1c05df778539800c9f1243fe6b4b; user_id: default_user_id; team_id: None
07:41:02 - LiteLLM Proxy:DEBUG: spend_tracking_utils.py:116 - getting payload for SpendLogs, available keys in metadata: ['user_api_key_user_id', 'headers', 'requester_metadata', 'user_api_key_hash', 'user_api_key_alias', 'user_api_key_spend', 'user_api_key_max_budget', 'user_api_key_team_id', 'user_api_key_project_id', 'user_api_key_project_alias', 'user_api_key_org_id', 'user_api_key_org_alias', 'user_api_key_team_alias', 'user_api_key_end_user_id', 'user_api_key_user_email', 'user_api_key_request_route', 'user_api_key_budget_reset_at', 'user_api_key_auth_metadata', 'user_api_key', 'agent_id', 'user_api_end_user_max_budget', 'user_api_key_auth', 'litellm_api_version', 'global_max_parallel_requests', 'user_api_key_team_max_budget', 'user_api_key_team_spend', 'user_api_key_model_max_budget', 'user_api_key_end_user_model_max_budget', 'user_api_key_user_spend', 'user_api_key_user_max_budget', 'user_api_key_metadata', 'user_api_key_team_metadata', 'user_api_key_object_permission_id', 'user_api_key_team_object_permission_id', 'endpoint', 'litellm_parent_otel_span', 'requester_ip_address', 'user_agent', 'queue_time_seconds', 'model_group', 'model_group_alias', 'model_group_size', 'attempted_retries', 'max_retries', 'deployment', 'model_info', 'api_base', 'deployment_model_name', 'caching_groups', 'status', 'error_information']
07:41:02 - LiteLLM Proxy:DEBUG: spend_tracking_utils.py:480 - SpendTable: created payload - request_id: 42af16c0-9c64-4423-8de5-6cd6cc6e02f2, model: azure_ai/mistral-large-3, spend: 0
07:41:02 - LiteLLM Proxy:DEBUG: db_spend_update_writer.py:723 - Writing spend log to db - request_id: 42af16c0-9c64-4423-8de5-6cd6cc6e02f2, spend: 0.0
07:41:02 - LiteLLM Proxy:DEBUG: db_spend_update_writer.py:190 - Runs spend update on all tables
07:41:02 - LiteLLM Proxy:DEBUG: common_request_processing.py:1583 - An error occurred: litellm.APIConnectionError: APIConnectionError: Azure_aiException - Extra data: line 2 column 1 (char 209). Received Model Group=mistral-large-3
Available Model Group Fallbacks=None
Model: azure_ai/mistral-large-3
API Base: `https://REDACTED.services.ai.azure.com/`
Messages: `[{'role': 'system', 'content': [{'type': 'text', 'text': 'Faux contexte'}]}, {'role': 'user', 'conte`
model_group: `mistral-large-3`

deployment: `azure_ai/mistral-large-3`


 Debug this by setting `--debug`, e.g. `litellm --model gpt-3.5-turbo --debug`
INFO:     172.19.0.1:60786 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

v1.82.3, v1.81.0, v1.83.7

Twitter / LinkedIn details

No response

extent analysis

TL;DR

The fix is to have the azure_ai provider translate max_completion_tokens to max_tokens before it sends the request to the Azure AI Model Inference endpoint.

Guidance

The azure_ai provider does not translate max_completion_tokens to max_tokens, so the Azure endpoint rejects the request with 422 Unprocessable Entity.
LiteLLM then tries to parse the 422 response body in drop_params_from_unprocessable_entity_error, and that parse fails with JSONDecodeError (Extra data: line 2 column 1 (char 209)). This second failure is what surfaces as the APIConnectionError and the 500 returned by the proxy.
The Mistral API and the Azure API both support max_tokens, so the azure_ai provider should send that parameter instead of max_completion_tokens.
To verify the fix, inspect the request body sent by the azure_ai provider and confirm that it contains max_tokens and no longer contains max_completion_tokens.

Example

No code example is provided as the issue is related to the internal implementation of the azure_ai provider.
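
That said, the mapping itself can be illustrated outside LiteLLM. The following is a minimal sketch, not the provider's actual code: the helper name `map_max_completion_tokens` is hypothetical, and letting an explicit max_tokens take precedence is an assumption made for the example.

```python
from typing import Any


def map_max_completion_tokens(params: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of params with max_completion_tokens renamed to max_tokens.

    Hypothetical helper for illustration only; not LiteLLM's implementation.
    """
    mapped = dict(params)  # copy so the caller's dict is not mutated
    if "max_completion_tokens" in mapped:
        value = mapped.pop("max_completion_tokens")
        # Assumption: if the caller already set max_tokens, keep that value.
        mapped.setdefault("max_tokens", value)
    return mapped


request = {"model": "mistral-large-3", "max_completion_tokens": 256}
print(map_max_completion_tokens(request))
```

Checking the resulting dict for max_tokens (and for the absence of max_completion_tokens) mirrors the verification step described in the guidance.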

Notes

The issue is specific to the azure_ai provider and the mistral-large-3 model, and may not affect other providers or models.
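
The traceback ends in `json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 209)`, raised while parsing the 422 response body. Python's `json.loads` raises this error when a complete JSON value is followed by more non-whitespace content. The actual Azure response body is not shown in the issue, so the body below is made up. The sketch only demonstrates how the error class arises and one tolerant way to read the first JSON value; `parse_first_json` is a hypothetical helper, not part of LiteLLM or httpx.

```python
import json


def parse_first_json(text: str):
    """Decode only the first JSON value in text and ignore whatever follows."""
    # raw_decode does not skip leading whitespace, so strip it first.
    obj, _end = json.JSONDecoder().raw_decode(text.lstrip())
    return obj


# Made-up body: two JSON documents separated by a newline.
body = '{"error": {"code": "Invalid input"}}\n{"detail": "extra"}'

try:
    json.loads(body)
except json.JSONDecodeError as exc:
    print(exc.msg)  # the same "Extra data" message seen in the traceback

print(parse_first_json(body))
```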

Recommendation

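Fix this in the azure_ai provider by mapping max_completion_tokens to max_tokens in the outgoing request; PR #26344 (https://github.com/BerriAI/litellm/pull/26344) proposes exactly this change. On versions without the fix, sending max_tokens instead of max_completion_tokens from the client may avoid the 422, since the Azure endpoint supports max_tokens.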


#api #runtime error #dependency conflict #environment setup #docker error
