vllm - 💡(How to fix) Fix [Feature]: Allow passing `images` to CompletionRequest [3 comments, 2 participants]

vllm · 2026-03-18 12:20:10


GitHub stats

vllm-project/vllm#37423 • Fetched 2026-04-08 00:57:33

Timeline (top): commented ×3, assigned ×1, labeled ×1

Root Cause

At the moment, it is not possible to evaluate pretrained multi-modal models such as https://huggingface.co/Qwen/Qwen3.5-9B-Base on tasks that combine text and images, because the CompletionRequest class has no way to accept images:

Code Example

class CompletionRequest(OpenAIBaseModel):
    # Ordered by official OpenAI API documentation
    # https://platform.openai.com/docs/api-reference/completions/create
    model: str | None = None
    prompt: (
        list[Annotated[int, Field(ge=0)]]
        | list[list[Annotated[int, Field(ge=0)]]]
        | str
        | list[str]
        | None
    ) = None



🚀 The feature, motivation and pitch

class CompletionRequest(OpenAIBaseModel):
    # Ordered by official OpenAI API documentation
    # https://platform.openai.com/docs/api-reference/completions/create
    model: str | None = None
    prompt: (
        list[Annotated[int, Field(ge=0)]]
        | list[list[Annotated[int, Field(ge=0)]]]
        | str
        | list[str]
        | None
    ) = None

See here: https://github.com/vllm-project/vllm/blob/17c47fb8691f2efd7948659952c44ef167462534/vllm/entrypoints/openai/completion/protocol.py#L42-L52

This significantly limits the usage of pretrained models in vLLM.

Proposal:

Let's add: images: np.ndarray | None = None to CompletionRequest. When images is not None, we validate that prompt is a list[int], meaning the prompt has already been pre-processed by the tokenizer. This keeps the change minimal: no extra pre-processing is needed, and the images can be passed straight down into the model definitions.

class _NDArrayPydanticAnnotation:
    @classmethod
    def __get_pydantic_core_schema__(
        cls,
        _source_type: Any,
        _handler: GetCoreSchemaHandler,
    ) -> core_schema.CoreSchema:
        from_serialized_schema = core_schema.no_info_plain_validator_function(
            _deserialize_ndarray
        )

        return core_schema.json_or_python_schema(
            json_schema=from_serialized_schema,
            python_schema=from_serialized_schema,
            serialization=core_schema.plain_serializer_function_ser_schema(
                _serialize_ndarray
            ),
        )

SerializableNDArray = Annotated[np.ndarray, _NDArrayPydanticAnnotation]

class CompletionRequest(OpenAIBaseModel):
    # Ordered by official OpenAI API documentation
    # https://platform.openai.com/docs/api-reference/completions/create
    model: str | None = None
    images: list[SerializableNDArray] = Field(default_factory=list)
    prompt: (
        list[Annotated[int, Field(ge=0)]]
        | list[list[Annotated[int, Field(ge=0)]]]
        | str
        | list[str]
        | None
    ) = None

We assume (and enforce) that when images are passed, both the images and the prompt are already fully pre-processed, so they can be forwarded directly to generate.

Alternatives

We could also pre-process the images and the prompt in serving.py, but that could be done in a follow-up PR.

Additional context

No response


extent analysis

Fix Plan

To add support for passing images to the CompletionRequest class, we need to make the following changes:

Add an images field to the CompletionRequest class that accepts a list of SerializableNDArray objects.
Update the validation logic to ensure that when images is non-empty, the prompt field must be a list of integers (i.e., already pre-processed by the tokenizer).

Here's the updated code:

from typing import Annotated, Any

import numpy as np
from pydantic import Field, GetCoreSchemaHandler, model_validator
from pydantic_core import core_schema

# _serialize_ndarray and _deserialize_ndarray are not shown in the issue; they
# are assumed to convert an np.ndarray to and from a JSON-safe value.


class _NDArrayPydanticAnnotation:
    @classmethod
    def __get_pydantic_core_schema__(
        cls,
        _source_type: Any,
        _handler: GetCoreSchemaHandler,
    ) -> core_schema.CoreSchema:
        # Use the same validator for JSON input and Python input.
        from_serialized_schema = core_schema.no_info_plain_validator_function(
            _deserialize_ndarray
        )

        return core_schema.json_or_python_schema(
            json_schema=from_serialized_schema,
            python_schema=from_serialized_schema,
            serialization=core_schema.plain_serializer_function_ser_schema(
                _serialize_ndarray
            ),
        )

SerializableNDArray = Annotated[np.ndarray, _NDArrayPydanticAnnotation]

class CompletionRequest(OpenAIBaseModel):
    # Ordered by official OpenAI API documentation
    # https://platform.openai.com/docs/api-reference/completions/create
    model: str | None = None
    images: list[SerializableNDArray] = Field(default_factory=list)
    prompt: (
        list[Annotated[int, Field(ge=0)]]
        | list[list[Annotated[int, Field(ge=0)]]]
        | str
        | list[str]
        | None
    ) = None

    # Pydantic v2 style validator, matching the pydantic_core schema above.
    @model_validator(mode="after")
    def validate_images_and_prompt(self):
        # Compute the token-ID check first so that a None or str prompt is
        # never iterated and requests without images are never rejected.
        prompt_is_token_ids = isinstance(self.prompt, list) and all(
            isinstance(x, int) for x in self.prompt
        )
        if self.images and not prompt_is_token_ids:
            raise ValueError("When images are provided, prompt must be a list of integers")
        return self
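The rule the validator is meant to enforce can also be exercised on its own, without pydantic or vLLM. The sketch below uses two hypothetical helper names, prompt_is_token_ids and check_images_and_prompt, that do not appear in the issue. Note that Python parses `a and not b or not c` as `(a and not b) or (not c)`, so the condition is easiest to get right when the token-ID check is computed first and then combined with the images check.

```python
from typing import Any, Sequence


def prompt_is_token_ids(prompt: Any) -> bool:
    """True when prompt is a flat list of non-negative ints (bools excluded)."""
    return isinstance(prompt, list) and all(
        isinstance(tok, int) and not isinstance(tok, bool) and tok >= 0
        for tok in prompt
    )


def check_images_and_prompt(images: Sequence[Any], prompt: Any) -> None:
    """Raise ValueError if images are given without a pre-tokenized prompt."""
    if len(images) > 0 and not prompt_is_token_ids(prompt):
        raise ValueError(
            "When images are provided, prompt must be a list of integers"
        )


if __name__ == "__main__":
    # No images: any prompt type is accepted, including None.
    check_images_and_prompt([], "plain text")
    check_images_and_prompt([], None)

    # Images with a tokenized prompt: accepted.
    check_images_and_prompt(["image-0"], [1, 2, 3])

    # Images with a text prompt: rejected.
    try:
        check_images_and_prompt(["image-0"], "plain text")
    except ValueError as exc:
        print(f"rejected: {exc}")
```

Excluding bool is a design choice in this sketch, since bool is a subclass of int in Python; the schema in the issue only constrains token IDs to be non-negative integers.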

Verification

To verify that the fix worked, construct a CompletionRequest with an images field and a prompt field that is a list of integers. Pydantic runs validators at construction time, so an invalid combination raises immediately and no separate validate() call is needed:

try:
    request = CompletionRequest(
        model="Qwen3.5-9B-Base",
        images=[np.array([1, 2, 3])],
        prompt=[1, 2, 3],
    )
    print("Request is valid")
except ValueError as e:  # pydantic's ValidationError subclasses ValueError
    print(f"Request is invalid: {e}")

Extra Tips

Make sure to update the documentation to reflect the new images field and the updated validation logic.
Consider adding additional validation or error handling to ensure that the images field is properly formatted and can
