vllm - [Bug]: --max-logprobs and --long-prefill-token-threshold silently accept negative values (config-validation gap)

vllm 2026-05-29 13:46:02

Error Message

ValueError: max_logprobs must be a non-negative integer or -1 (auto), got -5.
ValueError: long_prefill_token_threshold must be >= 0 (0 = off, > 0 = clamp), got -5.

Fix Action

Fixed

Fixed by PR: [docs] reference filed vllm-project/vllm#43985 (Findings #5/#6) + correct scheduler/sampling line refs (https://github.com/lucasccordeiro/vllm/pull/40)
Fixed by PR: [Bugfix] Reject negative values for max_logprobs and long_prefill_token_threshold (https://github.com/vllm-project/vllm/pull/44002)

Your current environment

OS: macOS 26.5 (arm64)
Python: 3.12.11
PyTorch: 2.11.0
vLLM: 0.1.dev1+g4438b6e7d
Transformers: 5.9.0

Built from source via VLLM_TARGET_DEVICE=empty pip install -e . at commit 4438b6e7d.

🐛 Describe the bug

Two CLI-settable integer parameters accept negative values that no validator rejects. Neither causes a crash or silent corruption, so this is low severity. In both cases, however, the malformed flag is either silently ineffective or surfaces a confusing error, with no signal to the user. Both have the same field-level admission shape as the batch tightened in #43794 (#43496 / #43521 / #43532): a CLI int field with no Field(gt=0) / ge= constraint whose downstream logic only special-cases specific values.

1. --max-logprobs <negative> — ModelConfig.max_logprobs is declared int = 20 (vllm/config/model.py:234) with no constraint. In _validate_logprobs (vllm/sampling_params.py:680) the sentinel rewrite only handles == -1 (auto = vocab size); every other negative survives unchanged:

- logprob-requesting traffic: the request is rejected, but the error message exposes the malformed cap to the user: "Requested sample logprobs of 3, which is greater than max allowed: -5";
- logprob-free traffic (self.logprobs is None or 0): the validator is skipped entirely, so --max-logprobs -5 is a pure no-op.

2. --long-prefill-token-threshold <negative> — SchedulerConfig.long_prefill_token_threshold is declared int = 0 (vllm/config/scheduler.py:80) with no constraint. __post_init__ only rewrites the == 0 case, and the scheduler clamp (vllm/v1/core/sched/scheduler.py:390) is guarded by 0 < threshold:

if 0 < self.scheduler_config.long_prefill_token_threshold < num_new_tokens:
    num_new_tokens = self.scheduler_config.long_prefill_token_threshold

For any negative threshold the 0 < threshold conjunct is False, so the clamp never fires. The user-set cap has zero effect on scheduling — semantically identical to the 0 = "off" sentinel, but the user's input was not 0 and nothing signals that the flag did nothing. (The existing sanity check at scheduler.py line ~295 only rejects the too-large case threshold > max_model_len.)
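
To make the equivalence with the 0 sentinel concrete, the guard can be exercised in isolation. This is only a sketch: clamp_new_tokens is a hypothetical helper that copies the comparison quoted above and nothing else from the scheduler, and it ignores the __post_init__ rewrite of the == 0 case.

```python
def clamp_new_tokens(threshold: int, num_new_tokens: int) -> int:
    """Hypothetical stand-in for the scheduler guard quoted above."""
    if 0 < threshold < num_new_tokens:
        return threshold
    return num_new_tokens


# Sweep a few thresholds against a 1024-token prefill.
results = {t: clamp_new_tokens(t, 1024) for t in (-5, -1, 0, 512, 2048)}
for threshold, scheduled in results.items():
    print(f"threshold={threshold:>5} -> num_new_tokens={scheduled}")
```

In this sweep only the threshold of 512 changes the scheduled token count; the negative values fall into the same branch as 0.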

Reproduction

Both fields are accepted verbatim at config-construction time (no Pydantic constraint), confirmed live against a source build of 4438b6e7d:

import dataclasses
from vllm.config.model import ModelConfig
from vllm.config.scheduler import SchedulerConfig

ml = next(f for f in dataclasses.fields(ModelConfig) if f.name == "max_logprobs")
assert dict(ml.metadata) == {}                      # no constraint

lp = next(f for f in dataclasses.fields(SchedulerConfig)
          if f.name == "long_prefill_token_threshold")
assert dict(lp.metadata) == {}                      # no constraint

sc = SchedulerConfig(long_prefill_token_threshold=-5,
                     max_model_len=4096, is_encoder_decoder=False)
assert sc.long_prefill_token_threshold == -5        # stored verbatim, no error

mc = ModelConfig(model="facebook/opt-125m", max_logprobs=-5)
assert mc.max_logprobs == -5                         # stored verbatim, no error

Actual output (vLLM 0.1.dev1+g4438b6e7d, torch 2.11.0, Python 3.12.11):

ModelConfig.max_logprobs: type=int, default=20, metadata={}
SchedulerConfig.long_prefill_token_threshold: type=int, default=0, metadata={}
SchedulerConfig(long_prefill_token_threshold=-5) -> -5   # accepted silently
INFO ... [model.py:1752] Using max model len 2048
ModelConfig(max_logprobs=-5) -> -5                       # accepted silently

How this was found — ESBMC-Python counterexamples

These were surfaced by an ESBMC-Python formal-verification harness that symbolically explores vLLM's config-to-arithmetic chain over every CLI-accepted input (PoC repo). Each harness asserts the engine's implicit precondition; ESBMC returns a concrete counterexample (VERIFICATION FAILED):

# max_logprobs path — assert the post-sentinel cap is non-negative (effective >= 0)
max_logprobs = -2, vocab_size = 1073741824   ->   effective = -2        # FAILED

# long_prefill path — assert the user's cap actually clamps (num_new_tokens < original)
threshold = -1, num_new_tokens = 1073741824  ->   num_new_tokens = 1073741824  # unchanged, FAILED

The sandbox reproductions above are these same counterexamples reproduced end-to-end on a real vLLM build.
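
For readers without ESBMC, the max_logprobs precondition can be approximated with a bounded enumeration. This is a brute-force stand-in, not symbolic verification: effective_cap and find_counterexample are hypothetical names, the sentinel rewrite is modelled only as described in this issue, and the search range and vocab size are arbitrary choices.

```python
from itertools import product


def effective_cap(max_logprobs: int, vocab_size: int) -> int:
    """Sentinel rewrite as described in this issue: only -1 maps to vocab size."""
    return vocab_size if max_logprobs == -1 else max_logprobs


def find_counterexample(caps, vocab_sizes):
    """Return the first (cap, vocab_size) pair violating effective >= 0, or None."""
    for cap, vocab in product(caps, vocab_sizes):
        if effective_cap(cap, vocab) < 0:
            return cap, vocab
    return None


# Any cap below -1 in the searched range violates the precondition.
counterexample = find_counterexample(range(-3, 4), (32000,))
print(counterexample)
```

Restricting the searched caps to -1 and above yields no counterexample, which matches the constraint the proposed validator enforces.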

Downstream behaviour (--max-logprobs -5):

def validate_logprobs(cap, user_logprobs, vocab_size=32000):
    if cap == -1:
        cap = vocab_size
    if num := user_logprobs:
        if num == -1:
            num = vocab_size
        if num > cap:
            raise ValueError(f"Requested sample logprobs of {num}, "
                             f"which is greater than max allowed: {cap}")

try:
    validate_logprobs(cap=-5, user_logprobs=3)
except ValueError as e:
    assert "max allowed: -5" in str(e)              # A: confusing message
validate_logprobs(cap=-5, user_logprobs=None)       # B: silent no-op

Downstream behaviour (--long-prefill-token-threshold -5):

threshold, num_new_tokens = -5, 1024
original = num_new_tokens
if 0 < threshold < num_new_tokens:                  # 0 < -5 is False
    num_new_tokens = threshold
assert num_new_tokens == original                   # cap silently ignored

Expected

A clean rejection at config-construction time for both, e.g.:

ValueError: max_logprobs must be a non-negative integer or -1 (auto), got -5.
ValueError: long_prefill_token_threshold must be >= 0 (0 = off, > 0 = clamp), got -5.

Why the validation chain misses negatives

Neither field carries a Field constraint, and the downstream logic only special-cases particular values — max_logprobs rewrites only -1, and the long_prefill_token_threshold clamp/sanity-check are guarded by 0 < … / > max_model_len. Negatives match neither the rewrite nor the existing guards, so they pass through unvalidated. As with #43532, a field-level ge=/gt= constraint cannot express "non-negative or exactly -1", so a small field_validator is the natural fix.

Proposed fix

Mirror the field_validator pattern landed in #43794:

# vllm/config/model.py
@field_validator("max_logprobs", mode="after")
@classmethod
def _check_max_logprobs(cls, v):
    if v == -1 or v >= 0:
        return v
    raise ValueError(
        f"max_logprobs must be a non-negative integer or -1 "
        f"(auto-derive to vocab size), got {v}."
    )

# vllm/config/scheduler.py
@field_validator("long_prefill_token_threshold", mode="after")
@classmethod
def _check_long_prefill_token_threshold(cls, v):
    if v < 0:
        raise ValueError(
            f"long_prefill_token_threshold must be >= 0 "
            f"(0 = off, > 0 = clamp), got {v}."
        )
    return v
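
The intended accept/reject behaviour of the two checks can be tried without a vLLM or Pydantic install. The sketch below is an assumption-laden stand-in: SketchConfig and try_build are hypothetical names, and a stdlib dataclass __post_init__ replaces the field_validator hooks, so it illustrates the conditions rather than the actual patch.

```python
from dataclasses import dataclass


@dataclass
class SketchConfig:
    """Hypothetical stand-in; not vLLM's ModelConfig or SchedulerConfig."""

    max_logprobs: int = 20
    long_prefill_token_threshold: int = 0

    def __post_init__(self) -> None:
        # -1 is the "auto" sentinel; any other negative value is rejected.
        if not (self.max_logprobs == -1 or self.max_logprobs >= 0):
            raise ValueError(
                "max_logprobs must be a non-negative integer or -1 "
                f"(auto-derive to vocab size), got {self.max_logprobs}."
            )
        # 0 means "off"; positive values clamp; negatives are rejected.
        if self.long_prefill_token_threshold < 0:
            raise ValueError(
                "long_prefill_token_threshold must be >= 0 "
                f"(0 = off, > 0 = clamp), got {self.long_prefill_token_threshold}."
            )


def try_build(**kwargs) -> str:
    """Return 'accepted' or the rejection message for the given fields."""
    try:
        SketchConfig(**kwargs)
        return "accepted"
    except ValueError as exc:
        return str(exc)


for kwargs in ({"max_logprobs": -1}, {"max_logprobs": -5},
               {"long_prefill_token_threshold": 0},
               {"long_prefill_token_threshold": -5}):
    print(kwargs, "->", try_build(**kwargs))
```

The sentinel values -1 and 0 still construct normally, while -5 is rejected at construction time for either field.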

These were surfaced by the same ESBMC-Python SkipValidation/unconstrained-int audit that produced #43496 / #43521 / #43532 / #43842; happy to open a PR if the shape is acceptable.
