litellm - ✅(Solved) Fix [Bug]: `_set_model_group_info` fallback ignores DB `model_info` cost [1 pull request, 1 participant]

litellm · 2026-04-16 16:55:57

GitHub stats

BerriAI/litellm#25874 • Fetched 2026-04-17 08:28:25

Author

Zerohertz

Participants

Zerohertz

Timeline (top)

labeled ×2, cross-referenced ×1, renamed ×1

Root Cause

In litellm/router.py, _set_model_group_info() (line ~7650):

get_deployment_model_info(model_id, model_name) looks up the model in LiteLLM's built-in cost map (model_prices_and_context_window.json).
For custom/self-hosted models (e.g., hosted_vllm/Qwen/Qwen3-8B), it's not in the built-in map → returns None.
The fallback (line ~7684) creates ModelMapInfo with hardcoded input_cost_per_token=0, output_cost_per_token=0 — ignoring the DB model_info values that are already loaded in model_info_dict (line ~7629).
_is_model_cost_zero() in auth_checks.py sees cost=0 → returns True → skip_budget_checks = True.
_virtual_key_max_budget_check is never called.

The DB model_info cost is used for spend calculation (after the request) but not for the zero-cost check (before the request). This creates an inconsistency where spend accumulates correctly but budget enforcement never triggers.

# router.py line ~7629
model_info_dict = model.get("model_info", {})  # DB values available here (e.g., 0.5)

# router.py line ~7684 — fallback ignores model_info_dict
if model_info is None:
    model_info = ModelMapInfo(
        input_cost_per_token=0,   # Should be: model_info_dict.get("input_cost_per_token", 0)
        output_cost_per_token=0,  # Should be: model_info_dict.get("output_cost_per_token", 0)
        ...
    )
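
The inconsistency can be reproduced in isolation. The sketch below is not LiteLLM's code: `BUILT_IN_COST_MAP`, `group_info_with_fallback`, `is_cost_zero`, and `spend_for_request` are hypothetical stand-ins, the map entry and the token counts are made-up values, and only the 0.5 per-token prices and the model name come from the issue.

```python
from typing import Optional

# Hypothetical stand-in for the built-in cost map; the entry and its prices are made up.
BUILT_IN_COST_MAP = {
    "known-model": {"input_cost_per_token": 1e-06, "output_cost_per_token": 2e-06},
}


def group_info_with_fallback(model: str, db_model_info: dict) -> dict:
    """Mimic the pre-fix fallback: unknown models get hardcoded zero costs."""
    info: Optional[dict] = BUILT_IN_COST_MAP.get(model)
    if info is None:
        # db_model_info is available here but is never consulted.
        info = {"input_cost_per_token": 0, "output_cost_per_token": 0}
    return info


def is_cost_zero(info: dict) -> bool:
    """Simplified zero-cost check: True when both per-token costs are 0 or missing."""
    return not info.get("input_cost_per_token") and not info.get("output_cost_per_token")


def spend_for_request(db_model_info: dict, prompt_tokens: int, completion_tokens: int) -> float:
    """Simplified post-request spend, priced from the DB model_info."""
    return (
        prompt_tokens * db_model_info.get("input_cost_per_token", 0)
        + completion_tokens * db_model_info.get("output_cost_per_token", 0)
    )


db_model_info = {"input_cost_per_token": 0.5, "output_cost_per_token": 0.5}
group_info = group_info_with_fallback("hosted_vllm/Qwen/Qwen3-8B", db_model_info)

treated_as_free = is_cost_zero(group_info)        # pre-request view of the model
spend = spend_for_request(db_model_info, 10, 10)  # post-request view of the model
print(treated_as_free, spend)
```

With these inputs the model is treated as free before the request (`True`) while the same request is billed at `10.0` afterwards, which is the mismatch described above.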

Fix Action

Fix proposed, not yet merged

Proposed in PR: fix(router): propagate custom cost_per_token from db model_info in fallback path (https://github.com/BerriAI/litellm/pull/25888)

PR fix notes

PR #25888: fix(router): propagate `custom cost_per_token` from db `model_info` in fallback path

Repository: BerriAI/litellm
Author: Zerohertz
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/25888

Description (problem / solution / changelog)

Relevant issues

Fixes #25874

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Screenshots / Proof of Fix

Before: screenshot, "Before fix - cost_per_token is None" (https://github.com/user-attachments/assets/162771f0-f124-4f71-8797-c7e1fa4ec64f)
After: screenshot, "After fix - cost_per_token propagated from db model_info" (https://github.com/user-attachments/assets/68f6cf44-90c9-42f1-a923-5c7efef49799)

Unit tests — 2 new tests added and passing:

test_model_group_info_cost_from_db_model_info — verifies input_cost_per_token / output_cost_per_token are read from db_model_info when get_deployment_model_info fails
test_model_group_info_cost_none_when_db_model_info_has_no_cost — verifies values remain None when db_model_info has no cost fields

tests/test_litellm/test_router.py::test_model_group_info_cost_from_db_model_info PASSED
tests/test_litellm/test_router.py::test_model_group_info_cost_none_when_db_model_info_has_no_cost PASSED

Type

🐛 Bug Fix

Changes

When a model is not found in LiteLLM's built-in model cost map (model_info is None in get_model_group_info()), the fallback ModelMapInfo was hardcoding input_cost_per_token and output_cost_per_token to None, ignoring any custom cost values the user had configured in the database's model_info.

This fix reads input_cost_per_token and output_cost_per_token from db_model_info (the user's database/config model_info), consistent with how mode is already read from the same source. If the keys are not present, .get() returns None, preserving backward compatibility.

Before

model_info = ModelMapInfo(
    ...
    input_cost_per_token=None,
    output_cost_per_token=None,
    ...
)

After

input_cost_per_token = db_model_info.get("input_cost_per_token")
output_cost_per_token = db_model_info.get("output_cost_per_token")

model_info = ModelMapInfo(
    ...
    input_cost_per_token=input_cost_per_token,
    output_cost_per_token=output_cost_per_token,
    ...
)
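
The effect of the change can be checked without a router instance. The following is a minimal sketch, not the PR's test code: `FallbackInfo` is a hypothetical stand-in for ModelMapInfo limited to the two cost fields, and `build_fallback_info` is an invented helper that mirrors the After snippet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FallbackInfo:
    """Hypothetical stand-in for ModelMapInfo, limited to the two cost fields."""

    input_cost_per_token: Optional[float] = None
    output_cost_per_token: Optional[float] = None


def build_fallback_info(db_model_info: dict) -> FallbackInfo:
    """Mirror the After snippet: read costs from db_model_info, defaulting to None."""
    return FallbackInfo(
        input_cost_per_token=db_model_info.get("input_cost_per_token"),
        output_cost_per_token=db_model_info.get("output_cost_per_token"),
    )


# Case 1: the DB model_info carries custom pricing, so it is propagated.
with_cost = build_fallback_info({"input_cost_per_token": 0.5, "output_cost_per_token": 0.5})

# Case 2: no cost keys are present, so both fields stay None as before the change.
without_cost = build_fallback_info({"mode": "chat"})

print(with_cost)
print(without_cost)
```

The two cases correspond to the scenarios that the two new unit tests are described as covering.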

Changed files

litellm/router.py (modified, +4/-2)
tests/test_litellm/test_router.py (modified, +63/-0)

Original issue

What happened?

When using custom-hosted models (e.g., hosted_vllm) with custom pricing set via model_info.input_cost_per_token / output_cost_per_token (either in config.yaml or via /model/new API), the budget check (_virtual_key_max_budget_check) is completely skipped because _is_model_cost_zero incorrectly returns True.

Expected behavior: Budget enforcement should work for custom-priced models. A virtual key with max_budget=5 and spend=10 should be blocked.

Actual behavior: Requests always succeed (200) regardless of budget, because the model is treated as "zero cost".

Suggested Fix

-                input_cost_per_token=0,
-                output_cost_per_token=0,
+                input_cost_per_token=db_model_info.get("input_cost_per_token", 0),
+                output_cost_per_token=db_model_info.get("output_cost_per_token", 0),

db_model_info is already defined at line ~7681 as model.get("model_info", {}).

Steps to Reproduce

Add a custom model with custom pricing via /model/new:

curl -X POST "${LITELLM_URL}/model/new" \
  -H "Authorization: Bearer ${MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "my-custom-model",
    "litellm_params": {
      "model": "hosted_vllm/Qwen/Qwen3-8B",
      "api_base": "http://my-vllm-server/v1",
      "api_key": "dummy"
    },
    "model_info": {
      "input_cost_per_token": 0.5,
      "output_cost_per_token": 0.5
    }
  }'

Create a virtual key with a small budget:

curl -X POST "${LITELLM_URL}/key/generate" \
  -H "Authorization: Bearer ${MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"max_budget": 0.001}'

Send requests using the key — spend accumulates correctly (e.g., $10 per request).
Verify budget is exceeded:

curl "${LITELLM_URL}/key/info?key=sk-..." -H "Authorization: Bearer ${MASTER_KEY}"
# Shows: spend=10.0, max_budget=0.001

Send another request — expected 400 BudgetExceededError, got 200.
Check logs — _is_model_cost_zero=True is logged, confirming budget check was skipped.
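
The same reproduction can be scripted. This is a sketch using only the Python standard library: `build_request`, `key_info_request`, and `send` are hypothetical helpers, `LITELLM_URL` and `MASTER_KEY` hold placeholder values, and the endpoints and payloads are copied from the curl commands above. Nothing is sent until `send()` is called against a running proxy.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Placeholder values (assumptions); replace with a real proxy URL and master key.
LITELLM_URL = "http://localhost:4000"
MASTER_KEY = "sk-placeholder"


def build_request(path: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a proxy request: POST when a payload is given, else GET."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(LITELLM_URL.rstrip("/") + path, data=data)
    request.add_header("Authorization", f"Bearer {MASTER_KEY}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


def key_info_request(virtual_key: str) -> urllib.request.Request:
    """Build the /key/info lookup for a virtual key obtained from /key/generate."""
    return build_request("/key/info?key=" + urllib.parse.quote(virtual_key))


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response (requires a running proxy)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Step 1: register the custom-priced model.
add_model = build_request("/model/new", {
    "model_name": "my-custom-model",
    "litellm_params": {
        "model": "hosted_vllm/Qwen/Qwen3-8B",
        "api_base": "http://my-vllm-server/v1",
        "api_key": "dummy",
    },
    "model_info": {"input_cost_per_token": 0.5, "output_cost_per_token": 0.5},
})

# Step 2: create a virtual key with a small budget.
create_key = build_request("/key/generate", {"max_budget": 0.001})
```

Calling `send(add_model)` and `send(create_key)` performs the first two steps. After sending traffic with the generated key, `send(key_info_request(...))` with that key returns the spend and budget to compare.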

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

latest

extent analysis

TL;DR

The most likely fix is to update the _set_model_group_info() function in litellm/router.py to use the model_info values from the database instead of hardcoded zeros for custom-hosted models.

Guidance

The issue arises from the inconsistency in using database model_info values for spend calculation but not for the zero-cost check.
To fix this, update the fallback in _set_model_group_info() to use model_info_dict values for input_cost_per_token and output_cost_per_token.
Verify the fix by checking if budget enforcement works correctly for custom-priced models.
Test the fix using the provided steps to reproduce the issue.
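
What "budget enforcement works" means can be sketched as a simplified pre-request gate. This is an illustration rather than the proxy's implementation: `BudgetExceededError` here is a local stand-in class, `is_cost_zero` and `pre_request_check` are hypothetical helpers, and treating a missing (None) cost as zero is an assumption, since the issue does not show how the real `_is_model_cost_zero()` handles that case.

```python
from typing import Optional


class BudgetExceededError(Exception):
    """Local stand-in for the proxy's budget error."""


def is_cost_zero(input_cost: Optional[float], output_cost: Optional[float]) -> bool:
    """Simplified check: a model counts as free when both costs are 0 or unset."""
    return not input_cost and not output_cost


def pre_request_check(spend: float, max_budget: float,
                      input_cost: Optional[float], output_cost: Optional[float]) -> str:
    """Skip the budget check for zero-cost models, otherwise enforce max_budget."""
    if is_cost_zero(input_cost, output_cost):
        return "skipped"
    if spend >= max_budget:
        raise BudgetExceededError(f"spend={spend} has reached max_budget={max_budget}")
    return "allowed"


# Before the fix: the fallback reports zero cost, so the check is skipped.
before = pre_request_check(10.0, 0.001, 0, 0)

# After the fix: the DB costs are visible, so the over-budget key is rejected.
try:
    pre_request_check(10.0, 0.001, 0.5, 0.5)
    after = "allowed"
except BudgetExceededError:
    after = "blocked"

print(before, after)
```

With the spend and budget from the reproduction steps (spend=10.0, max_budget=0.001), the sketch prints `skipped blocked`: the request passes when the costs appear as zero and is rejected once the DB costs are propagated.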

Example

# Updated fallback in router.py
if model_info is None:
    model_info = ModelMapInfo(
        input_cost_per_token=model_info_dict.get("input_cost_per_token", 0),
        output_cost_per_token=model_info_dict.get("output_cost_per_token", 0),
        ...
    )

Notes

This fix assumes that the model_info values in the database are correctly set for custom-hosted models. If these values are not set or are incorrect, the fix may not work as expected.

Recommendation

Apply the suggested fix to update the _set_model_group_info() function, as it directly addresses the root cause of the issue and ensures consistency in using database model_info values for both spend calculation and zero-cost checks.
