vllm - ✅(Solved) Fix [Bug]: When using the Sonnet dataset for benchmark testing, if the input length is too small, the CPU usage becomes abnormally high with no error logs, making it impossible to run the benchmark properly. [3 pull requests, 1 participant]

GitHub stats

vllm-project/vllm#38470 · Fetched 2026-04-08 01:49:10

Comments: 0 · Participants: 1 · Timeline events: 9 · Reactions: 0

Timeline (top): cross-referenced ×5, referenced ×3, labeled ×1

Error Message

When using the Sonnet dataset for benchmark testing, if the input length is around 200 or less, the CPU usage becomes abnormally high with no error logs, and the benchmark cannot be started properly.

Fix Action

Fixed

PR fix notes

PR #38471: Fix potential infinite loop in SonnetDataset.sample

Description (problem / solution / changelog)

Add DEFAULT_MAX_SAMPLE_ATTEMPTS constant to prevent infinite loop when randomly selected poem lines exceed input_len. If generation fails after max attempts, raise RuntimeError with helpful message.
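The bounded-retry idea described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the value of `DEFAULT_MAX_SAMPLE_ATTEMPTS`, the `sample_prompt` and `count_tokens` helpers, and the whitespace-based token count are all assumptions made for the example.

```python
import random

# Assumed value for illustration; the PR names the constant but this page
# does not state its value.
DEFAULT_MAX_SAMPLE_ATTEMPTS = 1000


def count_tokens(text):
    """Hypothetical stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def sample_prompt(lines, input_len, num_lines, rng,
                  max_attempts=DEFAULT_MAX_SAMPLE_ATTEMPTS):
    """Draw `num_lines` random lines until the joined prompt fits `input_len`.

    The loop is capped at `max_attempts` so that an unsatisfiable request
    fails with an error instead of retrying forever.
    """
    for _ in range(max_attempts):
        prompt = "\n".join(rng.choices(lines, k=num_lines))
        if count_tokens(prompt) <= input_len:
            return prompt
    raise RuntimeError(
        f"Could not build a prompt of at most {input_len} tokens after "
        f"{max_attempts} attempts; try a larger input length."
    )
```

With an unbounded `while True` loop in place of the `for` loop, a request that can never be satisfied would keep one CPU core busy indefinitely and print nothing, which matches the symptom reported in the issue.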

Co-authored-by: Claude

Purpose

Fix bug https://github.com/vllm-project/vllm/issues/38470

Test Plan

Test Result


<details> <summary> Essential Elements of an Effective PR Description Checklist </summary>
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.
</details>

Changed files

  • vllm/benchmarks/datasets.py (modified, +11/-1)

PR #38480: Add DEFAULT_MAX_SAMPLE_ATTEMPTS constant to prevent infinite loop when randomly selected poem lines exceed input_len. If generation fails after max attempts, raise RuntimeError with helpful message.

Description (problem / solution / changelog)

Purpose

Fix bug https://github.com/vllm-project/vllm/issues/38470



Changed files

  • vllm/benchmarks/datasets.py (modified, +11/-1)

PR #38481: Fix potential infinite loop in SonnetDataset.sample when using short input-len

Description (problem / solution / changelog)

Purpose

Add DEFAULT_MAX_SAMPLE_ATTEMPTS constant to prevent infinite loop when randomly selected poem lines exceed input_len. If generation fails after max attempts, raise RuntimeError with helpful message.

Fixes https://github.com/vllm-project/vllm/issues/38470



Changed files

  • vllm/benchmarks/datasets.py (modified, +11/-1)


Your current environment

<details> <summary>The output of <code>python collect_env.py</code></summary>
Your output of `python collect_env.py` here
</details>

🐛 Describe the bug

When using the Sonnet dataset for benchmark testing, if the input length is around 200 or less, the CPU usage becomes abnormally high with no error logs, and the benchmark cannot be started properly. For example:

    vllm bench serve \
      --backend vllm \
      --model Qwen/Qwen2-VL-7B-Instruct \
      --dataset-name sonnet \
      --dataset-path benchmarks/sonnet_4x.txt \
      --sonnet-input-len 220 \
      --sonnet-output-len 500

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

extent analysis

Fix Plan

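The linked PRs fix the root cause in SonnetDataset.sample (vllm/benchmarks/datasets.py): sampling attempts are capped by DEFAULT_MAX_SAMPLE_ATTEMPTS, and a RuntimeError is raised when no suitable prompt can be built. The steps below are supplementary suggestions for keeping dataset loading and preprocessing cheap; they do not replace that fix.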

  • Step 1: Optimize Dataset Loading
    • Use a more efficient data loading method, such as a buffered reader or a library like dask, to load the dataset in chunks.
    • Example code:

```python
import csv
import dask.dataframe as dd

# Load the dataset in chunks, one line of text per row in a "text" column
df = dd.read_csv('benchmarks/sonnet_4x.txt', sep='\t', header=None, names=['text'],
                 quoting=csv.QUOTE_NONE, blocksize=25 * 1024 * 1024)
```

  • Step 2: Preprocess Dataset
    • Preprocess the dataset to reduce the input length and improve performance.
    • Example code:

```python
from transformers import AutoTokenizer

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained('Qwen/Qwen2-VL-7B-Instruct')

# Tokenize one line of text, truncating it to at most 200 tokens
def preprocess_text(text):
    return tokenizer(text, max_length=200, truncation=True)['input_ids']

# Apply the preprocessing function to the dataset (evaluated lazily by dask)
df['input_ids'] = df['text'].apply(preprocess_text, meta=('input_ids', 'object'))
```

  • Step 3: Run the Benchmark
    • Run the benchmark through the vllm bench serve command, using the flags from the bug report.
    • Example code:

```python
import subprocess

# Flags copied from the command in the bug report
cmd = [
    'vllm', 'bench', 'serve',
    '--backend', 'vllm',
    '--model', 'Qwen/Qwen2-VL-7B-Instruct',
    '--dataset-name', 'sonnet',
    '--dataset-path', 'benchmarks/sonnet_4x.txt',
    '--sonnet-input-len', '220',
    '--sonnet-output-len', '500',
]

# Run the benchmark; raises CalledProcessError on a non-zero exit code
subprocess.run(cmd, check=True)
```


Verification

To verify the fix, rerun the benchmark command from the bug report and monitor CPU usage. The benchmark should either start normally or, if the input length is too small to sample from, stop with a RuntimeError instead of spinning silently.
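One way to check for a hang automatically is to run the command under a time limit. The sketch below is an assumption-laden helper, not part of vLLM: `finishes_within` is a hypothetical name, and the time limit is something you would pick for your own setup. It only reports whether the process exited in time, not whether it succeeded.

```python
import subprocess
import sys


def finishes_within(cmd, timeout_s):
    """Return True if `cmd` exits within `timeout_s` seconds, else False.

    subprocess.run kills the child process when the timeout expires, so a
    hung command does not keep consuming CPU after the check.
    """
    try:
        subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return True
    except subprocess.TimeoutExpired:
        return False


if __name__ == "__main__":
    # Stand-in commands: one that exits immediately and one that spins forever.
    print(finishes_within([sys.executable, "-c", "pass"], 30))
    print(finishes_within([sys.executable, "-c", "while True: pass"], 1))
```

To apply it to the issue, pass the `vllm bench serve` argument list from the bug report as `cmd`, with a timeout long enough for dataset sampling to complete on your machine.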

Extra Tips

  • Use a profiling tool to identify performance bottlenecks in the code.
  • Consider using a more efficient model or optimizing the model architecture for better performance.
  • Refer to the documentation for the `dask` library and the `transformers` library for more information on optimizing dataset loading and preprocessing.
