litellm - ✅(Solved) Fix [Bug]: There is a bug in least_busy.py that causes traffic to some interfaces to be suppressed to zero. [1 pull requests, 1 participants]

GitHub stats
BerriAI/litellm#25323 · Fetched 2026-04-09 07:52:50
Comments: 0 · Participants: 1 · Timeline: 6 · Reactions: 0
Timeline (top): cross-referenced ×2, labeled ×2, referenced ×2

Fix Action

Fixed

PR fix notes

PR #25325: fix(router): clamp least_busy request counter to prevent negative drift

Description (problem / solution / changelog)

Summary

Fixes the least_busy routing strategy silently suppressing traffic to certain deployments by clamping the per-deployment request counter to prevent negative values.

Root Cause

The request counter is incremented in log_pre_api_call and decremented in success/failure callbacks. Under race conditions (callback fires before pre-call, or fires twice), the counter goes negative.

A negative count is always less than the 0 assigned to fresh/unused deployments in _get_available_deployments, so the negative-count deployment wins every comparison and attracts ALL traffic, while others gradually starve to zero.
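
The starvation effect can be reproduced in isolation. The following is a minimal sketch, not the LiteLLM code: `pick_least_busy` and `simulate` are hypothetical helpers that mimic the selection rule (lowest count wins, ties broken randomly), and it assumes every request completes before the next one is routed.

```python
import random


def pick_least_busy(healthy_ids, counts):
    """Hypothetical stand-in for least-busy selection: lowest count wins, ties are random."""
    for dep_id in healthy_ids:
        counts.setdefault(dep_id, 0)  # unused deployments start at 0
    min_traffic = min(counts.values())
    tied = [k for k, v in counts.items() if v == min_traffic]
    chosen = random.choice(tied)
    return chosen if chosen in healthy_ids else random.choice(healthy_ids)


def simulate(counts, healthy_ids, n_requests):
    """Route n_requests one at a time; each request finishes before the next starts."""
    served = {dep_id: 0 for dep_id in healthy_ids}
    for _ in range(n_requests):
        chosen = pick_least_busy(healthy_ids, counts)
        counts[chosen] += 1  # pre-call increment
        served[chosen] += 1
        counts[chosen] -= 1  # success/failure callback decrement
    return served


random.seed(0)
ids = ["a", "b", "c", "d"]
# Deployment "a" has drifted to -3, e.g. after duplicate callbacks.
drifted = simulate({"a": -3, "b": 0, "c": 0, "d": 0}, ids, 1000)
# All counters start at 0, so ties are broken randomly.
balanced = simulate({dep_id: 0 for dep_id in ids}, ids, 1000)
print("drifted:", drifted)
print("balanced:", balanced)
```

In the drifted run, deployment "a" wins every comparison and the other three receive nothing, which matches the reported symptom. In the balanced run, traffic spreads across all four.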

Fix

Added max(value - 1, 0) on all 4 decrement paths:

  • log_success_event (sync)
  • log_failure_event (sync)
  • async_log_success_event
  • async_log_failure_event

This ensures the counter never goes below 0, preventing any single deployment from monopolizing traffic.
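
A minimal sketch of the floor guard, assuming the counters live in a plain dict keyed by deployment id. The PR notes give only the expression max(value - 1, 0), so `decrement_clamped` and `request_counts` are illustrative names, not the names used in least_busy.py.

```python
def decrement_clamped(request_counts: dict, deployment_id: str) -> int:
    """Decrement a deployment's in-flight counter without letting it drop below 0."""
    current = request_counts.get(deployment_id, 0)
    request_counts[deployment_id] = max(current - 1, 0)
    return request_counts[deployment_id]


request_counts = {"deployment-1": 1}

# First callback: the normal decrement, 1 -> 0.
after_first = decrement_clamped(request_counts, "deployment-1")

# Duplicate callback: an unclamped counter would now be -1; the clamp keeps it at 0.
after_second = decrement_clamped(request_counts, "deployment-1")

# Callback for a deployment that was never incremented also stays at 0.
after_unknown = decrement_clamped(request_counts, "deployment-2")
```

The clamp trades exactness for safety. If a decrement arrives before its matching increment, the counter ends up one too high until a later decrement corrects it. That makes a deployment look slightly busier for a while instead of letting it monopolize traffic.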

Testing

The fix is a defensive floor guard on an integer counter. Existing routing tests validate the least_busy selection logic.

Disclaimer

AI agents (Claude Code) assisted with this contribution.

Fixes #25323

Changed files

  • litellm/router_strategy/least_busy.py (modified, +4/-4)
  • tests/test_litellm/test_least_busy_counter_clamp.py (added, +140/-0)

Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

There are four model services under my LiteLLM instance, but traffic to certain endpoints gradually drops to zero and eventually receives almost no further requests.

Steps to Reproduce

def _get_available_deployments(
    self,
    healthy_deployments: list,
    all_deployments: dict,
):
    """
    Helper to get deployments using least busy strategy
    """
    for d in healthy_deployments:
        ## if healthy deployment not yet used
        if d["model_info"]["id"] not in all_deployments:
            all_deployments[d["model_info"]["id"]] = 0
    # map deployment to id
    # pick least busy deployment, with random jitter on ties
    min_traffic = float("inf")
    for k, v in all_deployments.items():
        if v < min_traffic:
            min_traffic = v
    # collect all deployments tied at the minimum
    min_deployment_ids = [k for k, v in all_deployments.items() if v == min_traffic]
    if min_deployment_ids:
        chosen_id = random.choice(min_deployment_ids)
        for m in healthy_deployments:
            if m["model_info"]["id"] == chosen_id:
                return m
        # chosen_id not in healthy list, fall back
        return random.choice(healthy_deployments)
    else:
        return random.choice(healthy_deployments)

Relevant log output

What part of LiteLLM is this about?

SDK (litellm Python package)

What LiteLLM version are you on ?

1.82.6

Twitter / LinkedIn details

No response

extent analysis

TL;DR

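Traffic to some endpoints drops to zero because the per-deployment request counter used by the least busy strategy can go negative. _get_available_deployments assigns 0 to unused deployments and picks the minimum, so a deployment with a negative count wins every comparison and the others receive no requests. PR #25325 addresses this by clamping the counter at 0.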

Guidance

  • Review the _get_available_deployments method to ensure it's correctly implementing the least busy strategy, considering the potential for deployments to be tied at the minimum traffic level.
  • Verify that the all_deployments dictionary is being updated correctly, as the issue may be related to the way deployments are being mapped to their respective traffic levels.
  • Check if the healthy_deployments list is being populated correctly, as the method relies on this list to determine the available deployments.
  • Consider adding logging or monitoring to track the traffic levels for each deployment, to better understand how the traffic is being distributed.

Example

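# Example of how to add logging to track traffic levels
import logging

logger = logging.getLogger(__name__)

def _get_available_deployments(
    self,
    healthy_deployments: list,
    all_deployments: dict,
):
    # Log the inputs before selection; a negative value in all_deployments
    # points to the counter drift described in the PR fix notes.
    logger.info("Healthy deployments: %s", healthy_deployments)
    logger.info("All deployments: %s", all_deployments)
    # ... rest of the method ...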

Notes

The provided code snippet seems to be a part of the LiteLLM SDK, and the issue may be related to the specific implementation of the least busy strategy. Without more information about the traffic patterns and the deployment configurations, it's difficult to provide a more specific solution.

Recommendation

Apply workaround: add logging and monitoring to the _get_available_deployments method to track the traffic level of each deployment. A count below 0 points to the counter drift addressed by PR #25325; counts that stay at or above 0 suggest the cause lies elsewhere.
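
A check like the one below could be called on the all_deployments mapping while monitoring. It is a sketch under assumptions: `find_negative_counters` and the logger name are hypothetical, and the mapping is taken to be a plain dict of deployment id to request count, as in the snippet from the issue.

```python
import logging

logger = logging.getLogger("least_busy_monitor")  # hypothetical logger name


def find_negative_counters(all_deployments: dict) -> dict:
    """Return the deployments whose request counter is below zero, logging each one."""
    negative = {
        dep_id: count for dep_id, count in all_deployments.items() if count < 0
    }
    for dep_id, count in negative.items():
        logger.warning(
            "deployment %s has a negative request count: %d", dep_id, count
        )
    return negative


# Example: only "a" has drifted below zero.
flagged = find_negative_counters({"a": -2, "b": 0, "c": 3})
```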
