litellm - 💡(How to fix) Fix [Bug]: Health check fails for vLLM classify models — always reports unhealthy [1 comments, 2 participants]

BerriAI/litellm#24265 · Fetched 2026-04-08 01:08:59

When a vLLM classification model is registered in model_list using the hosted_vllm/ prefix, the LiteLLM health check always reports it as unhealthy, even though the model works correctly in production via the /vllm/classify endpoint.

The health check sends a request to /chat/completions but classification models only expose a /classify endpoint — causing a 404 Not Found error that incorrectly marks the deployment as unhealthy.

Error Message

litellm.NotFoundError: Hosted_vllmException - {"detail":"Not Found"}
httpx.HTTPStatusError: Client error '404 Not Found' for url 'http://localhost:8095/chat/completions'

Root Cause

The health check sends a request to /chat/completions but classification models only expose a /classify endpoint — causing a 404 Not Found error that incorrectly marks the deployment as unhealthy.

Fix Action

Workaround

Setting disable_health_check: true in general_settings does not resolve the issue in v1.82.3.

The model works correctly in production — this is a false negative in the health check only.

Code Example

model_list:
  - model_name: model/xlm-roberta-base-mention-classifier
    litellm_params:
      model: hosted_vllm/model/xlm-roberta-base-mention-classifier
      api_base: http://localhost:8095


LiteLLM Version

v1.82.3-stable

Steps to Reproduce

  1. Register a vLLM classification model in model_list (e.g. XLM-RoBERTa)
  2. Call GET /health — the model appears in unhealthy_endpoints
  3. Call POST /vllm/classify with the same model — it works correctly

Expected Behavior

Health check should detect that the model's mode is classify and use the /classify endpoint instead of /chat/completions, OR skip the health check for classify models entirely.
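
The two options above (route classify models to /classify, or skip them) can be sketched as a small routing function. This is a minimal illustration, not LiteLLM's actual health check code: the function name, the mode strings, and the skip_classify flag are all hypothetical, and only the two paths named in this issue are assumed to exist.

```python
from typing import Optional

# Hypothetical mapping from a deployment's mode to the backend path a health
# probe would call. Only the two paths named in the issue are listed.
HEALTH_CHECK_PATHS = {
    "chat": "/chat/completions",
    "classify": "/classify",
}


def health_check_url(api_base: str, mode: str = "chat",
                     skip_classify: bool = False) -> Optional[str]:
    """Return the URL to probe, or None when the check should be skipped."""
    if mode == "classify" and skip_classify:
        # Second option from the issue: do not probe classify models at all.
        return None
    # Unknown modes fall back to the chat path, mirroring the current behavior.
    path = HEALTH_CHECK_PATHS.get(mode, HEALTH_CHECK_PATHS["chat"])
    return api_base.rstrip("/") + path
```

Returning None for the skip case keeps the decision in one place, so a caller could treat "no URL" as "not checked" rather than "unhealthy".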

Actual Behavior

Health check always calls /chat/completions, gets a 404 Not Found from vLLM, and marks the model as unhealthy:

litellm.NotFoundError: Hosted_vllmException - {"detail":"Not Found"}
httpx.HTTPStatusError: Client error '404 Not Found' for url 'http://localhost:8095/chat/completions'

Related Issues

  • #11205 — Feature request for vLLM classify endpoint support (implemented in production via /vllm/classify, but health check not updated accordingly)

extent analysis

Fix Plan

To resolve the issue, the health check must use the correct endpoint for classification models. One way to do this is to add a conditional that checks the model type and selects the corresponding endpoint. This requires a change to the health check code itself.

Step-by-Step Solution

  • Update the health check code to include a conditional statement that checks the model type.
  • If the model is a classification model, use the /classify endpoint instead of /chat/completions.

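Example code snippet:

# model_type and api_base are assumed to come from the deployment's configuration.
if model_type == "classification":
    # The vLLM server exposes /classify; /vllm/classify is the route on the LiteLLM proxy.
    health_check_url = f"{api_base}/classify"
else:
    health_check_url = f"{api_base}/chat/completions"

  • Update the model_list configuration to include the model type. The model_type key below is hypothetical; the issue does not confirm that such a setting exists.

model_list:
  - model_name: model/xlm-roberta-base-mention-classifier
    model_type: classification
    litellm_params:
      model: hosted_vllm/model/xlm-roberta-base-mention-classifier
      api_base: http://localhost:8095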

Verification

To verify that the fix worked, call the GET /health endpoint and check that the model is no longer reported as unhealthy. Additionally, test the POST /vllm/classify endpoint to ensure that it still works correctly.
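
The first verification step can be scripted against the JSON body returned by GET /health. The sketch below assumes the response contains an unhealthy_endpoints list (the field named in this issue) whose entries are dicts with a "model" key; that entry shape and the helper name are assumptions, and the sample payload is written by hand rather than captured from a server.

```python
from typing import Any, Dict


def is_reported_unhealthy(health: Dict[str, Any], model: str) -> bool:
    """Return True if `model` appears in the response's unhealthy_endpoints."""
    for entry in health.get("unhealthy_endpoints", []):
        # Assumed entry shape: a dict carrying the deployment's "model" value.
        if isinstance(entry, dict) and entry.get("model") == model:
            return True
    return False


# Hand-written sample payload for illustration only.
sample = {
    "healthy_endpoints": [],
    "unhealthy_endpoints": [
        {
            "model": "hosted_vllm/model/xlm-roberta-base-mention-classifier",
            "api_base": "http://localhost:8095",
        },
    ],
}
print(is_reported_unhealthy(
    sample, "hosted_vllm/model/xlm-roberta-base-mention-classifier"))
```

After a fix, the same helper should return False for the classifier when fed the live /health response.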

Extra Tips

  • Make sure to update the documentation to reflect the changes to the health check and model configuration.
  • Consider adding a feature flag to enable or disable the health check for classification models.
  • Review related issues, such as #11205, to ensure that the fix is consistent with the overall architecture and design of the system.
