vllm - ✅(Solved) Fix [Bug]: NVIDIA-Nemotron-Nano-12B-v2-VL-BF16 offline execution fails [1 pull request, 1 comment, 2 participants]

vllm · 2026-04-03 19:14:51

GitHub stats

vllm-project/vllm#38936 • Fetched 2026-04-08 02:44:50

Author

shilpa-ananth

Participants

shilpa-ananth

sphinxkkkbc

Timeline (top)

commented ×1, labeled ×1

Error Message

AttributeError                            Traceback (most recent call last)
File ~/miniconda3/envs/vllm-env-1/lib/python3.10/site-packages/vllm/multimodal/processing/context.py:269, in InputProcessingContext.call_hf_processor(self, hf_processor, data, kwargs, num_tries, max_tries)
    268 try:
--> 269     output = hf_processor(**data, **allowed_kwargs)
    270 except Exception as exc:
    271     # See https://github.com/huggingface/tokenizers/issues/537

File ~/miniconda3/envs/vllm-env-1/lib/python3.10/site-packages/vllm/transformers_utils/processor.py:538, in call_hf_processor_mm_only(processor, images, videos, audio, **kwargs)
    531 def call_hf_processor_mm_only(
    532     processor: ProcessorMixin,
    533     images: ImageInput | None = None,
   (...)
    536     **kwargs,
    537 ) -> BatchFeature:
--> 538     output_kwargs = processor._merge_kwargs(
    539         get_processor_kwargs_type(processor),
    540         **kwargs,
    541     )
    543     if audio is not None and (
    544         feature_extractor := getattr(processor, "feature_extractor", None)
    545     ):

AttributeError: 'NanoNemotronVLProcessor' object has no attribute '_merge_kwargs'

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
Cell In[7], line 1
----> 1 llm = LLM(
      2     model=model_path,
      3     trust_remote_code=True,
      4     max_model_len=16384
      5 )
...
...
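The two-part traceback is the shape Python prints when one exception is raised from another. The sketch below reproduces that shape without vLLM installed. All names in it (`StandInProcessor`, `merge_then_call`, `call_processor`) are hypothetical stand-ins that only mimic the frames in the traceback; this is not the vLLM source.

```python
class StandInProcessor:
    """Hypothetical processor that, like the one in the traceback,
    does not define `_merge_kwargs`."""

    def __call__(self, **kwargs):
        return {"ok": True}


def merge_then_call(processor, **kwargs):
    # Mirrors the failing frame: the attribute lookup raises AttributeError
    # before the processor itself is ever called.
    merged = processor._merge_kwargs(dict, **kwargs)
    return processor(**merged)


def call_processor(processor, **kwargs):
    try:
        return merge_then_call(processor, **kwargs)
    except Exception as exc:
        # `raise ... from exc` is what makes Python print
        # "The above exception was the direct cause of the following exception".
        raise ValueError(f"Failed to apply {type(processor).__name__}") from exc


try:
    call_processor(StandInProcessor(), images=None)
except ValueError as err:
    # The original AttributeError stays reachable through __cause__.
    cause = err.__cause__
    print(type(cause).__name__, "->", type(err).__name__)
```

When debugging a report like this one, the first exception in the chain (the `AttributeError`) is usually the one that names the missing piece; the outer `ValueError` only says that processing failed.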

PR fix notes

PR #39561: [Bugfix]Fix issue #38936 NVIDIA-Nemotron-Nano-12B-v2-VL-BF16 offline execution

Repository: vllm-project/vllm
Author: sphinxkkkbc
State: closed | merged: False
Link: https://github.com/vllm-project/vllm/pull/39561

Description (problem / solution / changelog)

Purpose

Fix issue #38936: NanoNemotronVLProcessor is a custom processor implementation that does not inherit from Hugging Face's ProcessorMixin, so it lacks standard attributes such as image_processor and methods such as _merge_kwargs. To support it in call_hf_processor_mm_only, I added a fallback path that treats it as a monolithic multimodal processor.

In the fallback path I construct the dummy text manually and call the processor directly. The dummy text is built the same way as in the function "get_dummy_text" in model_executor/models/nano_nemotron_vl.py.
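The fallback idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the PR: `call_mm_only`, both stand-in processor classes, and the `IMAGE_PLACEHOLDER` token are hypothetical, and the real dummy text comes from the model code rather than from a fixed string.

```python
from typing import Any

# Hypothetical placeholder token, used only for this illustration.
IMAGE_PLACEHOLDER = "<image>"


class MonolithicProcessor:
    """Stand-in for a custom processor with no `_merge_kwargs` and no
    `image_processor`: it only accepts text and images together."""

    def __call__(self, text: str, images: list[Any], **kwargs: Any) -> dict[str, Any]:
        return {"text": text, "num_images": len(images)}


class MixinStyleProcessor:
    """Stand-in for a processor that exposes the pieces the standard path uses."""

    def _merge_kwargs(self, kwargs_type: type, **kwargs: Any) -> dict[str, Any]:
        return {"images_kwargs": kwargs}

    def image_processor(self, images: list[Any], **kwargs: Any) -> dict[str, Any]:
        return {"num_images": len(images)}


def call_mm_only(processor: Any, images: list[Any], **kwargs: Any) -> dict[str, Any]:
    if hasattr(processor, "_merge_kwargs"):
        # Standard path: split the kwargs, then call the image sub-processor only.
        merged = processor._merge_kwargs(dict, **kwargs)
        return processor.image_processor(images, **merged["images_kwargs"])
    # Fallback path: build one placeholder per image and call the processor whole.
    dummy_text = IMAGE_PLACEHOLDER * len(images)
    return processor(text=dummy_text, images=images, **kwargs)


print(call_mm_only(MonolithicProcessor(), images=["img0", "img1"]))
print(call_mm_only(MixinStyleProcessor(), images=["img0"]))
```

Checking for the attribute instead of the class keeps the standard path unchanged for processors that already follow the ProcessorMixin conventions.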

Test Plan

Environment and scripts:

conda create -n vllm-env python=3.12
conda activate vllm-env
pip install uv
uv pip install vllm --torch-backend=cu128

from vllm import LLM, SamplingParams
model_path = "nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16"
llm = LLM(
    model=model_path,
    trust_remote_code=True,
    max_model_len=131072
)

Test Result

Before:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
File ~/miniconda3/envs/vllm-env-1/lib/python3.10/site-packages/vllm/multimodal/processing/context.py:269, in InputProcessingContext.call_hf_processor(self, hf_processor, data, kwargs, num_tries, max_tries)
    268 try:
--> 269     output = hf_processor(**data, **allowed_kwargs)
    270 except Exception as exc:
    271     # See https://github.com/huggingface/tokenizers/issues/537

File ~/miniconda3/envs/vllm-env-1/lib/python3.10/site-packages/vllm/transformers_utils/processor.py:538, in call_hf_processor_mm_only(processor, images, videos, audio, **kwargs)
    531 def call_hf_processor_mm_only(
    532     processor: ProcessorMixin,
    533     images: ImageInput | None = None,
   (...)
    536     **kwargs,
    537 ) -> BatchFeature:
--> 538     output_kwargs = processor._merge_kwargs(
    539         get_processor_kwargs_type(processor),
    540         **kwargs,
    541     )
    543     if audio is not None and (
    544         feature_extractor := getattr(processor, "feature_extractor", None)
    545     ):

AttributeError: 'NanoNemotronVLProcessor' object has no attribute '_merge_kwargs'

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
Cell In[7], line 1
----> 1 llm = LLM(
      2     model=model_path,
      3     trust_remote_code=True,
      4     max_model_len=16384
      5 )
...
...

After: On my RTX 4090 I enabled FP8 quantization to save GPU memory. The test requests cover text, image, and video inputs:

vllm serve ./NVIDIA-Nemotron-Nano-12B-VL-BF16 \
    --host 0.0.0.0 \
    --port 8000 \
    --trust-remote-code \
    --max-model-len 12288 \
    --enforce-eager \
    --quantization fp8

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "./NVIDIA-Nemotron-Nano-12B-VL-BF16",
    "messages": [
      {"role": "user", "content": "请用中文简要解释什么是深度学习？"}
    ],
    "max_tokens": 200,
    "temperature": 0.7
  }'
{"id":"chatcmpl-91f47bba3640b3c7","object":"chat.completion","created":1775890208,"model":"./NVIDIA-Nemotron-Nano-12B-VL-BF16","choices":[{"index":0,"message":{"role":"assistant","content":"深度学习是一种机器学习方法，它使用由多层神经网络构成的人工智能模型，能够自动从数据中学习到复杂的特征和模式。这些模型通过多层次的计算，逐步提取越来越高级的特征，从而能够完成图像识别、语音识别、自然语言处理等任务。深度学习的核心在于其“深度”的网络结构，使其能够处理大量数据并发现人类难以察觉的规律。\n","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning":null},"logprobs":null,"finish_reason":"stop","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":29,"total_tokens":171,"completion_tokens":142,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "./NVIDIA-Nemotron-Nano-12B-VL-BF16",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {"url": "https://modelscope.oss-cn-beijing.aliyuncs.com/resource/qwen.png"}
          },
          {"type": "text", "text": "请描述这张图片中的文字和内容。"}
        ]
      }
    ],
    "max_tokens": 200
  }'
{"id":"chatcmpl-92b8467ed9076de4","object":"chat.completion","created":1775890224,"model":"./NVIDIA-Nemotron-Nano-12B-VL-BF16","choices":[{"index":0,"message":{"role":"assistant","content":"TONGYI\nQwen\n","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning":null},"logprobs":null,"finish_reason":"stop","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":2847,"total_tokens":2856,"completion_tokens":9,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "./NVIDIA-Nemotron-Nano-12B-VL-BF16",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {"url": "https://copyright.bdstatic.com/vcg/creative/cc9c744cf9f7c864889c563cbdeddce6.jpg"}
          },
          {"type": "text", "text": "这张图片里有什么物体？请详细描述。"}
        ]
      }
    ],
    "max_tokens": 300
  }'
{"id":"chatcmpl-aad91db9491627f8","object":"chat.completion","created":1775890271,"model":"./NVIDIA-Nemotron-Nano-12B-VL-BF16","choices":[{"index":0,"message":{"role":"assistant","content":"### 图片中的物体\n\n图片中主要包含两种主要物体，水果和植物。\n\n#### 水果：樱桃\n\n- **颜色**：鲜艳的红色，表明它们非常成熟。\n- **形状**：球形，表面光滑。\n- **排列**：莲花状地聚集在枝条上，有多个果簇。\n- **大小**：大小一致，每个大约有标准樱桃的大小。\n- **状态**：完整无损，成熟且准备好采摘。\n\n#### 植物：樱桃树\n\n- **枝条**：呈现绿色，支撑着樱桃的簇。\n- **叶**：绿色，呈椭圆形，边缘有锯齿，在枝条上交替排列。\n- **背景**：绿色的，暗示着树木所在的鲜艳环境。\n\n这是一幅典型的樱桃树果实图像，展示了树木在授粉后的肥沃状态。\n","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning":null},"logprobs":null,"finish_reason":"stop","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":1829,"total_tokens":2104,"completion_tokens":275,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "./NVIDIA-Nemotron-Nano-12B-VL-BF16",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "video_url",
            "video_url": {"url": "https://media.w3.org/2010/05/sintel/trailer.mp4"}
          },
          {"type": "text", "text": "请描述这个视频的主要内容。"}
        ]
      }
    ],
    "max_tokens": 300
  }'
{"id":"chatcmpl-973f157cc83e13ff","object":"chat.completion","created":1775890297,"model":"./NVIDIA-Nemotron-Nano-12B-VL-BF16","choices":[{"index":0,"message":{"role":"assistant","content":"The video showcased an animated movie titled \"Sintel.\" The story unfolds in a cold climate, where the lead character disappears mysteriously. A group of individuals humorously send emails to Chuck because they had a shared dream—Chuck contacted the Queen, who used a rat to find the missing lead character via video call. The community decided to create an animated movie to help locate the person. Along the journey, the lead character is seen with red hair and yellow wings. The video concluded with the lead character being identified and she thanks the community.\n","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning":null},"logprobs":null,"finish_reason":"stop","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":8779,"total_tokens":8892,"completion_tokens":113,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}
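The four requests above share one payload shape and differ only in whether the user message carries an `image_url` or a `video_url` part. A small helper can build the same JSON bodies programmatically. This is a sketch: `build_chat_payload` and its parameters are hypothetical names, and the field layout is copied from the curl examples above.

```python
import json
from typing import Optional


def build_chat_payload(
    model: str,
    prompt: str,
    media_url: Optional[str] = None,
    media_type: str = "image_url",
    max_tokens: int = 200,
) -> dict:
    """Build a chat-completions request body shaped like the curl examples."""
    if media_url is None:
        # Text-only request: content is a plain string.
        content = prompt
    else:
        # Multimodal request: one media part followed by one text part.
        content = [
            {"type": media_type, media_type: {"url": media_url}},
            {"type": "text", "text": prompt},
        ]
    return {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "max_tokens": max_tokens,
    }


payload = build_chat_payload(
    model="./NVIDIA-Nemotron-Nano-12B-VL-BF16",
    prompt="Describe the main content of this video.",
    media_url="https://media.w3.org/2010/05/sintel/trailer.mp4",
    media_type="video_url",
    max_tokens=300,
)
print(json.dumps(payload, indent=2))
```

The printed JSON can be passed to `curl -d` against the same `/v1/chat/completions` endpoint used in the tests above.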

<details> <summary> Essential Elements of an Effective PR Description Checklist </summary>

[✅] The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
[✅] The test plan, such as providing test command.
[✅] The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

</details>

Changed files

vllm/transformers_utils/processor.py (modified, +39/-5)

Code Example

Collecting environment information...
==============================
        System Info
==============================
OS                           : Ubuntu 22.04.5 LTS (x86_64)
GCC version                  : (Ubuntu 11.4.0-1ubuntu1~22.04.3) 11.4.0
Clang version                : Could not collect
CMake version                : version 3.22.1
Libc version                 : glibc-2.35

==============================
       PyTorch Info
==============================
PyTorch version              : 2.10.0+cu128
Is debug build               : False
CUDA used to build PyTorch   : 12.8
ROCM used to build PyTorch   : N/A

==============================
      Python Environment
==============================
Python version               : 3.12.13 | packaged by Anaconda, Inc. | (main, Mar 19 2026, 20:20:58) [GCC 14.3.0] (64-bit runtime)
Python platform              : Linux-6.8.0-1038-oracle-x86_64-with-glibc2.35

==============================
       CUDA / GPU Info
==============================
Is CUDA available            : True
CUDA runtime version         : Could not collect
CUDA_MODULE_LOADING set to   : 
GPU models and configuration : 
GPU 0: NVIDIA A100-SXM4-80GB
GPU 1: NVIDIA A100-SXM4-80GB
GPU 2: NVIDIA A100-SXM4-80GB
GPU 3: NVIDIA A100-SXM4-80GB

Nvidia driver version        : 590.48.01
cuDNN version                : Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.9.19.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv.so.9.19.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn.so.9.19.1
/usr/lib/x86_64-linux-gnu/libcudnn_engines_precompiled.so.9.19.1
/usr/lib/x86_64-linux-gnu/libcudnn_engines_runtime_compiled.so.9.19.1
/usr/lib/x86_64-linux-gnu/libcudnn_graph.so.9.19.1
/usr/lib/x86_64-linux-gnu/libcudnn_heuristic.so.9.19.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops.so.9.19.1
HIP runtime version          : N/A
MIOpen runtime version       : N/A
Is XNNPACK available         : True

==============================
          CPU Info
==============================
Architecture:                         x86_64
CPU op-mode(s):                       32-bit, 64-bit
Address sizes:                        48 bits physical, 48 bits virtual
Byte Order:                           Little Endian
CPU(s):                               256
On-line CPU(s) list:                  0-127
Off-line CPU(s) list:                 128-255
Vendor ID:                            AuthenticAMD
Model name:                           AMD EPYC 7J13 64-Core Processor
CPU family:                           25
Model:                                1
Thread(s) per core:                   1
Core(s) per socket:                   64
Socket(s):                            2
Stepping:                             1
Frequency boost:                      enabled
CPU max MHz:                          2550.0000
CPU min MHz:                          0.0000
BogoMIPS:                             4900.16
Flags:                                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk clzero irperf xsaveerptr rdpru wbnoinvd amd_ppin brs arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm debug_swap
L1d cache:                            4 MiB (128 instances)
L1i cache:                            4 MiB (128 instances)
L2 cache:                             64 MiB (128 instances)
L3 cache:                             512 MiB (16 instances)
NUMA node(s):                         8
NUMA node0 CPU(s):                    0-15
NUMA node1 CPU(s):                    16-31
NUMA node2 CPU(s):                    32-47
NUMA node3 CPU(s):                    48-63
NUMA node4 CPU(s):                    64-79
NUMA node5 CPU(s):                    80-95
NUMA node6 CPU(s):                    96-111
NUMA node7 CPU(s):                    112-127
Vulnerability Gather data sampling:   Not affected
Vulnerability Itlb multihit:          Not affected
Vulnerability L1tf:                   Not affected
Vulnerability Mds:                    Not affected
Vulnerability Meltdown:               Not affected
Vulnerability Mmio stale data:        Not affected
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed:               Not affected
Vulnerability Spec rstack overflow:   Vulnerable
Vulnerability Spec store bypass:      Vulnerable
Vulnerability Spectre v1:             Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
Vulnerability Spectre v2:             Vulnerable; IBPB: disabled; STIBP: disabled; PBRSB-eIBRS: Not affected; BHI: Not affected
Vulnerability Srbds:                  Not affected
Vulnerability Tsx async abort:        Not affected

==============================
Versions of relevant libraries
==============================
[pip3] flashinfer-python==0.6.6
[pip3] numpy==2.2.6
[pip3] nvidia-cublas-cu12==12.8.4.1
[pip3] nvidia-cuda-cupti-cu12==12.8.90
[pip3] nvidia-cuda-nvrtc-cu12==12.8.93
[pip3] nvidia-cuda-runtime-cu12==12.8.90
[pip3] nvidia-cudnn-cu12==9.10.2.21
[pip3] nvidia-cudnn-frontend==1.18.0
[pip3] nvidia-cufft-cu12==11.3.3.83
[pip3] nvidia-cufile-cu12==1.13.1.3
[pip3] nvidia-curand-cu12==10.3.9.90
[pip3] nvidia-cusolver-cu12==11.7.3.90
[pip3] nvidia-cusparse-cu12==12.5.8.93
[pip3] nvidia-cusparselt-cu12==0.7.1
[pip3] nvidia-cutlass-dsl==4.4.2
[pip3] nvidia-cutlass-dsl-libs-base==4.4.2
[pip3] nvidia-ml-py==13.595.45
[pip3] nvidia-nccl-cu12==2.27.5
[pip3] nvidia-nvjitlink-cu12==12.8.93
[pip3] nvidia-nvshmem-cu12==3.4.5
[pip3] nvidia-nvtx-cu12==12.8.90
[pip3] pyzmq==27.1.0
[pip3] torch==2.10.0+cu128
[pip3] torch_c_dlpack_ext==0.1.5
[pip3] torchaudio==2.10.0+cu128
[pip3] torchvision==0.25.0+cu128
[pip3] transformers==4.57.6
[pip3] triton==3.6.0
[conda] flashinfer-python                           0.6.6            pypi_0           pypi
[conda] numpy                                       2.2.6            pypi_0           pypi
[conda] nvidia-cublas-cu12                          12.8.4.1         pypi_0           pypi
[conda] nvidia-cuda-cupti-cu12                      12.8.90          pypi_0           pypi
[conda] nvidia-cuda-nvrtc-cu12                      12.8.93          pypi_0           pypi
[conda] nvidia-cuda-runtime-cu12                    12.8.90          pypi_0           pypi
[conda] nvidia-cudnn-cu12                           9.10.2.21        pypi_0           pypi
[conda] nvidia-cudnn-frontend                       1.18.0           pypi_0           pypi
[conda] nvidia-cufft-cu12                           11.3.3.83        pypi_0           pypi
[conda] nvidia-cufile-cu12                          1.13.1.3         pypi_0           pypi
[conda] nvidia-curand-cu12                          10.3.9.90        pypi_0           pypi
[conda] nvidia-cusolver-cu12                        11.7.3.90        pypi_0           pypi
[conda] nvidia-cusparse-cu12                        12.5.8.93        pypi_0           pypi
[conda] nvidia-cusparselt-cu12                      0.7.1            pypi_0           pypi
[conda] nvidia-cutlass-dsl                          4.4.2            pypi_0           pypi
[conda] nvidia-cutlass-dsl-libs-base                4.4.2            pypi_0           pypi
[conda] nvidia-ml-py                                13.595.45        pypi_0           pypi
[conda] nvidia-nccl-cu12                            2.27.5           pypi_0           pypi
[conda] nvidia-nvjitlink-cu12                       12.8.93          pypi_0           pypi
[conda] nvidia-nvshmem-cu12                         3.4.5            pypi_0           pypi
[conda] nvidia-nvtx-cu12                            12.8.90          pypi_0           pypi
[conda] pyzmq                                       27.1.0           py312hcf8288c_1
[conda] torch                                       2.10.0+cu128     pypi_0           pypi
[conda] torch-c-dlpack-ext                          0.1.5            pypi_0           pypi
[conda] torchaudio                                  2.10.0+cu128     pypi_0           pypi
[conda] torchvision                                 0.25.0+cu128     pypi_0           pypi
[conda] transformers                                4.57.6           pypi_0           pypi
[conda] triton                                      3.6.0            pypi_0           pypi

==============================
         vLLM Info
==============================
ROCM Version                 : Could not collect
vLLM Version                 : 0.19.0
vLLM Build Flags:
  CUDA Archs: 7.5;8.0;8.6;8.9;9.0;10.0;10.3;11.0;12.0;12.1+PTX; ROCm: Disabled
GPU Topology:
        GPU0    GPU1    GPU2    GPU3    NIC0    NIC1    NIC2    NIC3    NIC4    NIC5    NIC6    NIC7    NIC8    NIC9    NIC10   NIC11   NIC12   NIC13   NIC14   NIC15   NIC16   NIC17   CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      NV12    NV12    NV12    SYS     SYS     SYS     SYS     SYS     PXB     PXB     PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     48-55   3              N/A
GPU1    NV12     X      NV12    NV12    SYS     SYS     SYS     SYS     SYS     PXB     PXB     PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     48-55   3              N/A
GPU2    NV12    NV12     X      NV12    SYS     PXB     PXB     PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     16-23   1              N/A
GPU3    NV12    NV12    NV12     X      SYS     PXB     PXB     PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     16-23   1              N/A
NIC0    SYS     SYS     SYS     SYS      X      SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC1    SYS     SYS     PXB     PXB     SYS      X      PIX     PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC2    SYS     SYS     PXB     PXB     SYS     PIX      X      PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC3    SYS     SYS     PXB     PXB     SYS     PXB     PXB      X      PIX     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC4    SYS     SYS     PXB     PXB     SYS     PXB     PXB     PIX      X      SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC5    PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS      X      PIX     PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC6    PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PIX      X      PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC7    PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PXB     PXB      X      PIX     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC8    PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PXB     PXB     PIX      X      SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC9    SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS      X      PIX     PXB     PXB     SYS     SYS     SYS     SYS     SYS
NIC10   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PIX      X      PXB     PXB     SYS     SYS     SYS     SYS     SYS
NIC11   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PXB     PXB      X      PIX     SYS     SYS     SYS     SYS     SYS
NIC12   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PXB     PXB     PIX      X      SYS     SYS     SYS     SYS     SYS
NIC13   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS      X      SYS     SYS     SYS     SYS
NIC14   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS      X      PIX     PXB     PXB
NIC15   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PIX      X      PXB     PXB
NIC16   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PXB     PXB      X      PIX
NIC17   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PXB     PXB     PIX      X 

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks

NIC Legend:

  NIC0: mlx5_0
  NIC1: mlx5_1
  NIC2: mlx5_2
  NIC3: mlx5_3
  NIC4: mlx5_4
  NIC5: mlx5_5
  NIC6: mlx5_6
  NIC7: mlx5_7
  NIC8: mlx5_8
  NIC9: mlx5_9
  NIC10: mlx5_10
  NIC11: mlx5_11
  NIC12: mlx5_12
  NIC13: mlx5_13
  NIC14: mlx5_14
  NIC15: mlx5_15
  NIC16: mlx5_16
  NIC17: mlx5_17

==============================
     Environment Variables
==============================
TORCH_CUDA_ARCH_LIST=7.5;8.0;8.6;8.9;9.0;10.0;10.3;11.0;12.0;12.1+PTX
CUDAARCHS=75-real;80-real;86-real;89-real;90-real;100-real;103-real;110-real;120-real;121
CUDA_VERSION=130
CUDA_VISIBLE_DEVICES=0,1,2,3
CUDA_VISIBLE_DEVICES=0,1,2,3
PYTORCH_NVML_BASED_CUDA_CHECK=1
TORCHINDUCTOR_COMPILE_THREADS=1
TORCHINDUCTOR_CACHE_DIR=/tmp/torchinductor_[redacted]


RAW_BUFFER

Your current environment

<details> <summary>The output of <code>python collect_env.py</code></summary>

Collecting environment information...
==============================
        System Info
==============================
OS                           : Ubuntu 22.04.5 LTS (x86_64)
GCC version                  : (Ubuntu 11.4.0-1ubuntu1~22.04.3) 11.4.0
Clang version                : Could not collect
CMake version                : version 3.22.1
Libc version                 : glibc-2.35

==============================
       PyTorch Info
==============================
PyTorch version              : 2.10.0+cu128
Is debug build               : False
CUDA used to build PyTorch   : 12.8
ROCM used to build PyTorch   : N/A

==============================
      Python Environment
==============================
Python version               : 3.12.13 | packaged by Anaconda, Inc. | (main, Mar 19 2026, 20:20:58) [GCC 14.3.0] (64-bit runtime)
Python platform              : Linux-6.8.0-1038-oracle-x86_64-with-glibc2.35

==============================
       CUDA / GPU Info
==============================
Is CUDA available            : True
CUDA runtime version         : Could not collect
CUDA_MODULE_LOADING set to   : 
GPU models and configuration : 
GPU 0: NVIDIA A100-SXM4-80GB
GPU 1: NVIDIA A100-SXM4-80GB
GPU 2: NVIDIA A100-SXM4-80GB
GPU 3: NVIDIA A100-SXM4-80GB

Nvidia driver version        : 590.48.01
cuDNN version                : Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.9.19.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv.so.9.19.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn.so.9.19.1
/usr/lib/x86_64-linux-gnu/libcudnn_engines_precompiled.so.9.19.1
/usr/lib/x86_64-linux-gnu/libcudnn_engines_runtime_compiled.so.9.19.1
/usr/lib/x86_64-linux-gnu/libcudnn_graph.so.9.19.1
/usr/lib/x86_64-linux-gnu/libcudnn_heuristic.so.9.19.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops.so.9.19.1
HIP runtime version          : N/A
MIOpen runtime version       : N/A
Is XNNPACK available         : True

==============================
          CPU Info
==============================
Architecture:                         x86_64
CPU op-mode(s):                       32-bit, 64-bit
Address sizes:                        48 bits physical, 48 bits virtual
Byte Order:                           Little Endian
CPU(s):                               256
On-line CPU(s) list:                  0-127
Off-line CPU(s) list:                 128-255
Vendor ID:                            AuthenticAMD
Model name:                           AMD EPYC 7J13 64-Core Processor
CPU family:                           25
Model:                                1
Thread(s) per core:                   1
Core(s) per socket:                   64
Socket(s):                            2
Stepping:                             1
Frequency boost:                      enabled
CPU max MHz:                          2550.0000
CPU min MHz:                          0.0000
BogoMIPS:                             4900.16
Flags:                                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk clzero irperf xsaveerptr rdpru wbnoinvd amd_ppin brs arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm debug_swap
L1d cache:                            4 MiB (128 instances)
L1i cache:                            4 MiB (128 instances)
L2 cache:                             64 MiB (128 instances)
L3 cache:                             512 MiB (16 instances)
NUMA node(s):                         8
NUMA node0 CPU(s):                    0-15
NUMA node1 CPU(s):                    16-31
NUMA node2 CPU(s):                    32-47
NUMA node3 CPU(s):                    48-63
NUMA node4 CPU(s):                    64-79
NUMA node5 CPU(s):                    80-95
NUMA node6 CPU(s):                    96-111
NUMA node7 CPU(s):                    112-127
Vulnerability Gather data sampling:   Not affected
Vulnerability Itlb multihit:          Not affected
Vulnerability L1tf:                   Not affected
Vulnerability Mds:                    Not affected
Vulnerability Meltdown:               Not affected
Vulnerability Mmio stale data:        Not affected
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed:               Not affected
Vulnerability Spec rstack overflow:   Vulnerable
Vulnerability Spec store bypass:      Vulnerable
Vulnerability Spectre v1:             Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
Vulnerability Spectre v2:             Vulnerable; IBPB: disabled; STIBP: disabled; PBRSB-eIBRS: Not affected; BHI: Not affected
Vulnerability Srbds:                  Not affected
Vulnerability Tsx async abort:        Not affected

==============================
Versions of relevant libraries
==============================
[pip3] flashinfer-python==0.6.6
[pip3] numpy==2.2.6
[pip3] nvidia-cublas-cu12==12.8.4.1
[pip3] nvidia-cuda-cupti-cu12==12.8.90
[pip3] nvidia-cuda-nvrtc-cu12==12.8.93
[pip3] nvidia-cuda-runtime-cu12==12.8.90
[pip3] nvidia-cudnn-cu12==9.10.2.21
[pip3] nvidia-cudnn-frontend==1.18.0
[pip3] nvidia-cufft-cu12==11.3.3.83
[pip3] nvidia-cufile-cu12==1.13.1.3
[pip3] nvidia-curand-cu12==10.3.9.90
[pip3] nvidia-cusolver-cu12==11.7.3.90
[pip3] nvidia-cusparse-cu12==12.5.8.93
[pip3] nvidia-cusparselt-cu12==0.7.1
[pip3] nvidia-cutlass-dsl==4.4.2
[pip3] nvidia-cutlass-dsl-libs-base==4.4.2
[pip3] nvidia-ml-py==13.595.45
[pip3] nvidia-nccl-cu12==2.27.5
[pip3] nvidia-nvjitlink-cu12==12.8.93
[pip3] nvidia-nvshmem-cu12==3.4.5
[pip3] nvidia-nvtx-cu12==12.8.90
[pip3] pyzmq==27.1.0
[pip3] torch==2.10.0+cu128
[pip3] torch_c_dlpack_ext==0.1.5
[pip3] torchaudio==2.10.0+cu128
[pip3] torchvision==0.25.0+cu128
[pip3] transformers==4.57.6
[pip3] triton==3.6.0
[conda] flashinfer-python                           0.6.6            pypi_0           pypi
[conda] numpy                                       2.2.6            pypi_0           pypi
[conda] nvidia-cublas-cu12                          12.8.4.1         pypi_0           pypi
[conda] nvidia-cuda-cupti-cu12                      12.8.90          pypi_0           pypi
[conda] nvidia-cuda-nvrtc-cu12                      12.8.93          pypi_0           pypi
[conda] nvidia-cuda-runtime-cu12                    12.8.90          pypi_0           pypi
[conda] nvidia-cudnn-cu12                           9.10.2.21        pypi_0           pypi
[conda] nvidia-cudnn-frontend                       1.18.0           pypi_0           pypi
[conda] nvidia-cufft-cu12                           11.3.3.83        pypi_0           pypi
[conda] nvidia-cufile-cu12                          1.13.1.3         pypi_0           pypi
[conda] nvidia-curand-cu12                          10.3.9.90        pypi_0           pypi
[conda] nvidia-cusolver-cu12                        11.7.3.90        pypi_0           pypi
[conda] nvidia-cusparse-cu12                        12.5.8.93        pypi_0           pypi
[conda] nvidia-cusparselt-cu12                      0.7.1            pypi_0           pypi
[conda] nvidia-cutlass-dsl                          4.4.2            pypi_0           pypi
[conda] nvidia-cutlass-dsl-libs-base                4.4.2            pypi_0           pypi
[conda] nvidia-ml-py                                13.595.45        pypi_0           pypi
[conda] nvidia-nccl-cu12                            2.27.5           pypi_0           pypi
[conda] nvidia-nvjitlink-cu12                       12.8.93          pypi_0           pypi
[conda] nvidia-nvshmem-cu12                         3.4.5            pypi_0           pypi
[conda] nvidia-nvtx-cu12                            12.8.90          pypi_0           pypi
[conda] pyzmq                                       27.1.0           py312hcf8288c_1
[conda] torch                                       2.10.0+cu128     pypi_0           pypi
[conda] torch-c-dlpack-ext                          0.1.5            pypi_0           pypi
[conda] torchaudio                                  2.10.0+cu128     pypi_0           pypi
[conda] torchvision                                 0.25.0+cu128     pypi_0           pypi
[conda] transformers                                4.57.6           pypi_0           pypi
[conda] triton                                      3.6.0            pypi_0           pypi

==============================
         vLLM Info
==============================
ROCM Version                 : Could not collect
vLLM Version                 : 0.19.0
vLLM Build Flags:
  CUDA Archs: 7.5;8.0;8.6;8.9;9.0;10.0;10.3;11.0;12.0;12.1+PTX; ROCm: Disabled
GPU Topology:
        GPU0    GPU1    GPU2    GPU3    NIC0    NIC1    NIC2    NIC3    NIC4    NIC5    NIC6    NIC7    NIC8    NIC9    NIC10   NIC11   NIC12   NIC13   NIC14   NIC15   NIC16   NIC17   CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      NV12    NV12    NV12    SYS     SYS     SYS     SYS     SYS     PXB     PXB     PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     48-55   3              N/A
GPU1    NV12     X      NV12    NV12    SYS     SYS     SYS     SYS     SYS     PXB     PXB     PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     48-55   3              N/A
GPU2    NV12    NV12     X      NV12    SYS     PXB     PXB     PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     16-23   1              N/A
GPU3    NV12    NV12    NV12     X      SYS     PXB     PXB     PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     16-23   1              N/A
NIC0    SYS     SYS     SYS     SYS      X      SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC1    SYS     SYS     PXB     PXB     SYS      X      PIX     PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC2    SYS     SYS     PXB     PXB     SYS     PIX      X      PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC3    SYS     SYS     PXB     PXB     SYS     PXB     PXB      X      PIX     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC4    SYS     SYS     PXB     PXB     SYS     PXB     PXB     PIX      X      SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC5    PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS      X      PIX     PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC6    PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PIX      X      PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC7    PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PXB     PXB      X      PIX     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC8    PXB     PXB     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PXB     PXB     PIX      X      SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS
NIC9    SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS      X      PIX     PXB     PXB     SYS     SYS     SYS     SYS     SYS
NIC10   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PIX      X      PXB     PXB     SYS     SYS     SYS     SYS     SYS
NIC11   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PXB     PXB      X      PIX     SYS     SYS     SYS     SYS     SYS
NIC12   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PXB     PXB     PIX      X      SYS     SYS     SYS     SYS     SYS
NIC13   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS      X      SYS     SYS     SYS     SYS
NIC14   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS      X      PIX     PXB     PXB
NIC15   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PIX      X      PXB     PXB
NIC16   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PXB     PXB      X      PIX
NIC17   SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     SYS     PXB     PXB     PIX      X 

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks

NIC Legend:

  NIC0: mlx5_0
  NIC1: mlx5_1
  NIC2: mlx5_2
  NIC3: mlx5_3
  NIC4: mlx5_4
  NIC5: mlx5_5
  NIC6: mlx5_6
  NIC7: mlx5_7
  NIC8: mlx5_8
  NIC9: mlx5_9
  NIC10: mlx5_10
  NIC11: mlx5_11
  NIC12: mlx5_12
  NIC13: mlx5_13
  NIC14: mlx5_14
  NIC15: mlx5_15
  NIC16: mlx5_16
  NIC17: mlx5_17

==============================
     Environment Variables
==============================
TORCH_CUDA_ARCH_LIST=7.5;8.0;8.6;8.9;9.0;10.0;10.3;11.0;12.0;12.1+PTX
CUDAARCHS=75-real;80-real;86-real;89-real;90-real;100-real;103-real;110-real;120-real;121
CUDA_VERSION=130
CUDA_VISIBLE_DEVICES=0,1,2,3
PYTORCH_NVML_BASED_CUDA_CHECK=1
TORCHINDUCTOR_COMPILE_THREADS=1
TORCHINDUCTOR_CACHE_DIR=/tmp/torchinductor_[redacted]

</details>

🐛 Describe the bug

I created a new conda environment and installed vllm into it:

conda create -n vllm-env python=3.12
pip install uv
uv pip install vllm --torch-backend=cu128

I am trying to get the following model, Nemotron-Nano-12B-v2-VL, to work:

from vllm import LLM, SamplingParams
model_path = "nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16"
llm = LLM(
    model=model_path,
    trust_remote_code=True,
    max_model_len=131072
)

This fails with

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
File ~/miniconda3/envs/vllm-env-1/lib/python3.10/site-packages/vllm/multimodal/processing/context.py:269, in InputProcessingContext.call_hf_processor(self, hf_processor, data, kwargs, num_tries, max_tries)
    268 try:
--> 269     output = hf_processor(**data, **allowed_kwargs)
    270 except Exception as exc:
    271     # See https://github.com/huggingface/tokenizers/issues/537

File ~/miniconda3/envs/vllm-env-1/lib/python3.10/site-packages/vllm/transformers_utils/processor.py:538, in call_hf_processor_mm_only(processor, images, videos, audio, **kwargs)
    531 def call_hf_processor_mm_only(
    532     processor: ProcessorMixin,
    533     images: ImageInput | None = None,
   (...)
    536     **kwargs,
    537 ) -> BatchFeature:
--> 538     output_kwargs = processor._merge_kwargs(
    539         get_processor_kwargs_type(processor),
    540         **kwargs,
    541     )
    543     if audio is not None and (
    544         feature_extractor := getattr(processor, "feature_extractor", None)
    545     ):

AttributeError: 'NanoNemotronVLProcessor' object has no attribute '_merge_kwargs'

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
Cell In[7], line 1
----> 1 llm = LLM(
      2     model=model_path,
      3     trust_remote_code=True,
      4     max_model_len=16384
      5 )
...
...

Before submitting a new issue...

Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

extent analysis

TL;DR

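The traceback shows call_hf_processor_mm_only calling processor._merge_kwargs(...) on a NanoNemotronVLProcessor object that has no such attribute. This raises an AttributeError, which is then re-raised as a ValueError while LLM(...) is being constructed. A likely cause is a mismatch between the installed vllm (0.19.0) and transformers (4.57.6) versions, although the traceback alone does not confirm this.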

Guidance

Check the transformers version: Confirm that the installed transformers release (4.57.6 in the reported environment) is one that the installed vllm release (0.19.0) supports.
Update transformers: If the installed release is older than vllm expects, try pip install --upgrade transformers.
Verify the vllm version: Check which vllm release is installed in the environment that actually runs the code, and upgrade it if a newer release is available.
Check for known issues: Search the vllm documentation and issue tracker for reports that mention the _merge_kwargs AttributeError.

Example

No code example is provided as the issue is related to a specific library version compatibility.
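
That said, the version check from the guidance can be sketched in a few lines. This is a minimal diagnostic, not part of the original report: the helper name installed_versions is made up for illustration, and the package list is an assumption based on the environment dump above. It should be run with the same interpreter or kernel that raises the error.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is missing from this interpreter's environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # Package names taken from the environment dump; adjust as needed.
    for name, ver in installed_versions(["vllm", "transformers", "torch"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

Comparing this output with the pip and conda listings in the report shows whether the notebook is using the environment that was actually set up.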

Notes

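The error message states that the NanoNemotronVLProcessor object has no _merge_kwargs attribute. The traceback annotates the processor argument as a transformers ProcessorMixin, so vllm expects that interface here, and a version mismatch between vllm and transformers is one plausible explanation. The traceback paths also point at an environment named vllm-env-1 running Python 3.10, while the reported setup commands create vllm-env with Python 3.12, so versions should be checked in the environment the notebook actually uses.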
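
Because the final ValueError is truncated in the report, it can help to walk the exception chain and print every link down to the root cause. The sketch below is illustrative only: FakeProcessor, load_model, and the ValueError message are hypothetical stand-ins that mimic the AttributeError to ValueError chain in the traceback, not vllm code.

```python
def exception_chain(exc):
    """Return the exceptions from `exc` down to its root cause, outermost first."""
    chain = []
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        chain.append(exc)
        # Prefer the explicit cause (`raise ... from ...`), else the implicit context.
        exc = exc.__cause__ or exc.__context__
    return chain


class FakeProcessor:
    """Hypothetical stand-in for a processor object without `_merge_kwargs`."""


def load_model():
    # Stand-in for the failing call; the message below is invented for the demo.
    try:
        FakeProcessor()._merge_kwargs()
    except AttributeError as exc:
        raise ValueError("failed to apply processor") from exc


if __name__ == "__main__":
    try:
        load_model()
    except ValueError as err:
        for link in exception_chain(err):
            print(f"{type(link).__name__}: {link}")
```

In a real session, the same helper could be applied to the exception caught around the LLM(...) call to surface the underlying AttributeError without scrolling through the full traceback.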

Recommendation

As a workaround, switch transformers to a version known to be compatible with the installed vllm, or move to a vllm release that addresses the compatibility issue. This page lists PR #39561 as the bugfix for this issue (#38936).
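
One way to find which transformers versions the installed vllm declares support for is to read its package metadata. The following is a rough sketch under that assumption: requirement_for is a hypothetical helper, and the name matching is a simplification rather than a full requirement-string parser.

```python
import re
from importlib.metadata import PackageNotFoundError, requires


def requirement_for(requirements, dependency):
    """Return the requirement strings in `requirements` that name `dependency`."""
    wanted = dependency.lower().replace("_", "-")
    matches = []
    for req in requirements or []:
        # The distribution name is the leading run of name characters.
        name = re.match(r"[A-Za-z0-9._-]+", req.strip())
        if name and name.group(0).lower().replace("_", "-") == wanted:
            matches.append(req)
    return matches


if __name__ == "__main__":
    try:
        declared = requires("vllm")  # requirement strings from vllm's metadata
    except PackageNotFoundError:
        declared = None
    print(requirement_for(declared, "transformers") or "no declared constraint found")
```

If a constraint is printed, it gives a concrete range to pin transformers to instead of guessing at a compatible version.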
