litellm - 💡(How to fix) Fix [Bug]: /health/test_connection ignores model_info — health_check_supports_max_tokens: false never respected [2 pull requests]

Fix Action

Fixed

Description

In test_model_connection (litellm/proxy/health_endpoints/_health_endpoints.py), _update_litellm_params_for_health_check is called with a hardcoded model_info={} instead of the actual model_info from the request. As a result, health_check_supports_max_tokens: false set in model_info is silently ignored when using the UI "Test Connection" button: max_tokens: 5 is always injected regardless of the setting.

Steps to Reproduce

  1. Add a model to config.yaml with model_info: { health_check_supports_max_tokens: false }:
    model_list:
      - model_name: my-model
        litellm_params:
          model: openai/my-deployment
          api_base: https://...
          api_key: ...
          custom_llm_provider: openai
        model_info:
          health_check_supports_max_tokens: false
  2. Start LiteLLM proxy and go to the UI → Models tab
  3. Click "Test Connection" for that model
  4. Observe the outgoing request to the backend — it includes max_tokens: 5

Expected Behavior

max_tokens should not be sent when health_check_supports_max_tokens: false is configured, matching the behavior of the background /health endpoint which correctly respects this flag.

Actual Behavior

max_tokens: 5 is always sent from the UI Test Connection, regardless of the health_check_supports_max_tokens setting. The background health check (/health) works correctly; only the UI Test Connection button is broken.

Root Cause

In test_model_connection, model_info is extracted from the request but then discarded when calling the helper:

# model_info is correctly received from the request above...

# Bug: hardcoded empty dict ignores the received model_info
litellm_params = _update_litellm_params_for_health_check(
    model_info={},          # <-- should be: model_info=model_info or {}
    litellm_params=litellm_params,
)
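
To make the root cause concrete, here is a minimal sketch of the failure mode. The function update_params_for_health_check_sketch is a hypothetical stand-in, not LiteLLM's actual helper (whose body is not shown in this issue); it assumes only what the report describes: max_tokens: 5 is injected unless model_info sets health_check_supports_max_tokens to false.

```python
from typing import Any, Dict

HEALTH_CHECK_MAX_TOKENS = 5  # value reported in the issue


def update_params_for_health_check_sketch(
    model_info: Dict[str, Any], litellm_params: Dict[str, Any]
) -> Dict[str, Any]:
    """Hypothetical stand-in for _update_litellm_params_for_health_check."""
    params = dict(litellm_params)  # copy so the caller's dict is untouched
    # Inject max_tokens unless the deployment explicitly opts out.
    if model_info.get("health_check_supports_max_tokens") is not False:
        params["max_tokens"] = HEALTH_CHECK_MAX_TOKENS
    return params


request_model_info = {"health_check_supports_max_tokens": False}
request_params = {"model": "openai/my-deployment"}

# Buggy call site: the request's model_info is replaced by an empty dict.
buggy = update_params_for_health_check_sketch({}, request_params)
# Fixed call site: the request's model_info reaches the helper.
fixed = update_params_for_health_check_sketch(request_model_info or {}, request_params)

print("max_tokens" in buggy)  # True
print("max_tokens" in fixed)  # False
```

With the empty dict the opt-out flag can never be seen, so the helper behaves as if it had not been configured.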

One-Line Fix

# Before (bug):
litellm_params = _update_litellm_params_for_health_check(
    model_info={},
    litellm_params=litellm_params,
)

# After (fix):
litellm_params = _update_litellm_params_for_health_check(
    model_info=model_info or {},
    litellm_params=litellm_params,
)
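
The `or {}` in the fix is presumably there so the helper still receives a dict when the request carries no model_info. The sketch below illustrates that guard in isolation; normalize_model_info is a hypothetical name, and the assumption that the request value may be None is not confirmed by the issue.

```python
from typing import Any, Dict, Optional


def normalize_model_info(model_info: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return the caller's model_info, or an empty dict when it is missing."""
    return model_info or {}


# A request that omits model_info (assumed to arrive as None).
print(normalize_model_info(None))  # {}

# A request that sets the flag keeps it.
kept = normalize_model_info({"health_check_supports_max_tokens": False})
print(kept["health_check_supports_max_tokens"])  # False

# Without the guard, a dict-style lookup on None raises.
missing = None
try:
    missing.get("health_check_supports_max_tokens")
except AttributeError as exc:
    print(type(exc).__name__)  # AttributeError
```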

Impact

Azure AI Foundry deployments (and similar providers) that accept only max_completion_tokens and reject max_tokens will always fail the UI Test Connection check, even when the background health check passes. This is confusing: the model is healthy, but the UI reports it as failing.
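
The divergence between the two paths can be simulated without a real provider. In this sketch, fake_backend and build_health_payload are hypothetical: the backend simply rejects any payload containing max_tokens, and the status codes are illustrative rather than taken from a real deployment.

```python
from typing import Any, Dict


def fake_backend(payload: Dict[str, Any]) -> int:
    """Hypothetical backend that rejects max_tokens; returns an HTTP-like status."""
    return 400 if "max_tokens" in payload else 200


def build_health_payload(model_info: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical payload builder that honors the opt-out flag."""
    payload: Dict[str, Any] = {"model": "openai/my-deployment"}
    if model_info.get("health_check_supports_max_tokens") is not False:
        payload["max_tokens"] = 5
    return payload


configured = {"health_check_supports_max_tokens": False}

# Background /health path: the configured model_info is honored.
background_status = fake_backend(build_health_payload(configured))
# UI Test Connection path before the fix: model_info is replaced by {}.
test_connection_status = fake_backend(build_health_payload({}))

print(background_status, test_connection_status)  # 200 400
```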

LiteLLM Version

1.86.2 (bug also present in current main as of 2026-05-29)
