Fix Action

Fixed

Fixed by PR: Fix: Skip meta device initialization for remote code models (https://github.com/huggingface/transformers/pull/45096)
Fixed by PR: Add old InternVL2-1B/2B support to the InternVL conversion script #45092 (https://github.com/huggingface/transformers/pull/45097)

PR fix notes

PR #45096: Fix: Skip meta device initialization for remote code models

Repository: huggingface/transformers
Author: hkc5
State: open | merged: False
Link: https://github.com/huggingface/transformers/pull/45096

Description (problem / solution / changelog)

Problem

Old remote-code checkpoints (like InternVL2) perform real-tensor operations during model construction (e.g., calling .item() on tensors). This causes RuntimeError: Tensor.item() cannot be called on meta tensors when models are initialized on the meta device, which is the default behavior in Transformers v5+.
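
The failure mode can be reproduced in isolation. The sketch below assumes only PyTorch 2.x; `LegacyBlock` and `try_build` are hypothetical names, and the module is a stand-in for remote modeling code that reads a tensor value in its constructor, not code copied from the InternVL2 checkpoints.

```python
import torch
from torch import nn


class LegacyBlock(nn.Module):
    """Hypothetical module that reads a tensor value while being constructed."""

    def __init__(self, depth: int = 4):
        super().__init__()
        # Under a meta-device context this tensor has a shape but no data.
        rates = torch.linspace(0.0, 0.1, depth)
        # .item() needs real data, so it raises on a meta tensor.
        self.last_rate = rates[-1].item()


def try_build(device: str) -> str:
    """Return "ok", or the error message raised while building on `device`."""
    try:
        with torch.device(device):
            LegacyBlock()
        return "ok"
    except (RuntimeError, NotImplementedError) as err:
        return str(err)


if __name__ == "__main__":
    print("cpu :", try_build("cpu"))
    print("meta:", try_build("meta"))
```

Construction succeeds on the CPU and fails under the meta device, which is the same split the issue describes between the old and new loading paths.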

This was reported in #45092 and blocks vLLM's Transformers v5 upgrade work.

Solution

Skip meta device initialization for models loaded with trust_remote_code=True. This allows remote code models to work correctly while preserving the memory-efficient initialization for standard models.

Changes

- Modified get_init_context() in modeling_utils.py to check cls.is_remote_code() before adding the meta device context
- Regular models still use meta device initialization for memory efficiency
- Remote code models skip meta device initialization to avoid compatibility issues
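
The change amounts to one extra condition when the initialization contexts are assembled. The following is a simplified sketch of that decision, not the actual body of get_init_context(): `build_init_contexts` and its string labels are hypothetical and stand in for the real context-manager objects.

```python
def build_init_contexts(is_remote_code: bool) -> list[str]:
    """Return labels for the contexts a model would be constructed under.

    Hypothetical helper: the real code collects context-manager objects;
    plain strings are used here only to keep the sketch dependency-free.
    """
    # Placeholder for whatever other contexts are always applied.
    contexts = ["other_init_contexts"]
    # Remote-code models may read real tensor values in __init__,
    # so they are kept off the meta device.
    if not is_remote_code:
        contexts.append("meta_device")
    return contexts


if __name__ == "__main__":
    print(build_init_contexts(is_remote_code=False))
    print(build_init_contexts(is_remote_code=True))
```

The trade-off is that remote-code models lose the memory savings of meta initialization, since their parameters are allocated for real during construction.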

Testing

- Verified that regular models still use meta device initialization
- Verified that remote code models skip meta device initialization
- The fix is minimal and targeted, reducing risk of regression

Fixes #45092

Changed files

src/transformers/modeling_utils.py (modified, +5/-1)

PR #45097: Add old InternVL2-1B/2B support to the InternVL conversion script #45092

Repository: huggingface/transformers
Author: baonudesifeizhai
State: open | merged: False
Link: https://github.com/huggingface/transformers/pull/45097

Description (problem / solution / changelog)

What does this PR do?

This PR extends the InternVL conversion script to support the old OpenGVLab/InternVL2-1B and OpenGVLab/InternVL2-2B checkpoints. These checkpoints currently rely on remote code and are problematic for downstream users on Transformers v5. Instead of instantiating the original remote-code models, the converter now reads the original config and weights directly and emits HF-native InternVLForConditionalGeneration checkpoints.
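
Reading the weights directly means the remote modeling code is never executed. Conversions of this kind usually come down to renaming parameters from the original layout to the target one. Below is a minimal sketch of regex-based key renaming; the patterns in `KEY_RENAMES` and both function names are hypothetical illustrations and are not taken from convert_internvl_weights_to_hf.py.

```python
import re

# Hypothetical (pattern, replacement) pairs; the real script defines its own mapping.
KEY_RENAMES = [
    (r"^vision_model\.", "model.vision_tower."),
    (r"^language_model\.model\.", "model.language_model."),
]


def rename_key(key: str) -> str:
    """Apply the first matching rename rule; leave unmatched keys unchanged."""
    for pattern, replacement in KEY_RENAMES:
        new_key, count = re.subn(pattern, replacement, key)
        if count:
            return new_key
    return key


def convert_state_dict(state_dict: dict) -> dict:
    """Return a new dict with every key renamed; values pass through untouched."""
    return {rename_key(key): value for key, value in state_dict.items()}


if __name__ == "__main__":
    original = {"vision_model.embeddings.weight": 0, "mlp1.0.weight": 1}
    print(convert_state_dict(original))
```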

Fixes https://github.com/huggingface/transformers/issues/45092 (tracked downstream as vLLM issue 38425)

Before this change, on the vLLM main branch, the following test is broken:

pytest tests/models/multimodal/generation/test_common.py \
  -k 'intern_vl2-hf-local and test_multi_image_models' -vv

On this branch, the same test passes:

VLLM_TEST_INTERNVL2_HF_MODEL=/tmp/InternVL2-1B-hf /root/venv/bin/python -m pytest \
  tests/models/multimodal/generation/test_common.py \
  -k 'intern_vl2-hf-local and test_multi_image_models' -vv

Validation

Ran the following with source ~/venv/bin/activate:

python src/transformers/models/internvl/convert_internvl_weights_to_hf.py \
  --input_dir OpenGVLab/InternVL2-1B \
  --output_dir /tmp/InternVL2-1B-hf

python src/transformers/models/internvl/convert_internvl_weights_to_hf.py \
  --input_dir OpenGVLab/InternVL2-2B \
  --output_dir /tmp/InternVL2-2B-hf
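
Once a converted directory exists, it can be loaded without trust_remote_code. The sketch below assumes the installed transformers exposes InternVLForConditionalGeneration (the class named in the PR description) and that the output directory contains a config.json; `load_converted` is a hypothetical helper name.

```python
from pathlib import Path


def load_converted(checkpoint_dir: str):
    """Load a converted InternVL checkpoint from a local directory."""
    path = Path(checkpoint_dir)
    # Fail early with a clear message if the converter has not been run.
    if not (path / "config.json").is_file():
        raise FileNotFoundError(f"no config.json in {path}; run the converter first")

    # Imported lazily so the path check above works without transformers installed.
    from transformers import InternVLForConditionalGeneration

    # No trust_remote_code here: the converted checkpoint uses the native class.
    return InternVLForConditionalGeneration.from_pretrained(str(path))


if __name__ == "__main__":
    model = load_converted("/tmp/InternVL2-1B-hf")
    print(type(model).__name__)
```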

## Changed files

- `src/transformers/models/internvl/convert_internvl_weights_to_hf.py` (modified, +124/-61)
- `src/transformers/models/internvl/processing_internvl.py` (modified, +14/-8)

System Info

This is relevant to Transformers because the failure is triggered by the Transformers v5 loading path itself.

In v5, from_pretrained() initializes models on the meta device before loading weights. Old OpenGVLab/InternVL2-* remote-code checkpoints perform real-tensor operations during model construction (for example calling .item()), so they fail under the v5 loading mechanism.

From a downstream user's perspective, this happens directly inside AutoModel.from_pretrained(..., trust_remote_code=True), so it is effectively a Transformers compatibility / migration issue, not only a checkpoint-local issue.

This currently blocks vLLM's Transformers v5 upgrade work because the HF reference model for OpenGVLab/InternVL2-1B/2B cannot be instantiated:

https://github.com/vllm-project/vllm/issues/38425

Reproduction

https://github.com/vllm-project/vllm/issues/38425

Expected behavior

https://github.com/vllm-project/vllm/issues/38425

extent analysis

Fix Plan

To resolve the compatibility issue with Transformers v5, we need to modify the model loading process to handle old checkpoints that perform real-tensor operations during model construction.

Step-by-Step Solution

1. Skip meta initialization for remote code: in get_init_context() (modeling_utils.py), do not add the meta device context when the model class comes from remote code, as PR #45096 does. from_pretrained() then constructs such models with real tensors on the CPU instead of on the meta device.
2. Load on the CPU, then move: with that patch in place, a small wrapper can load the model weights on the CPU and then move the model to the desired device.

Example Code

import torch
from transformers import AutoModel


def custom_load_pretrained(model_name, device):
    # Load without device_map so the weights are materialized on the CPU first.
    # On an unpatched Transformers v5 this call can still raise the meta-tensor
    # error, because meta initialization happens inside from_pretrained() itself.
    model = AutoModel.from_pretrained(model_name, trust_remote_code=True, torch_dtype="auto")
    # Move the fully loaded model to the desired device.
    return model.to(device)


# Usage (OpenGVLab/InternVL2-2B is loaded the same way)
model_name = "OpenGVLab/InternVL2-1B"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = custom_load_pretrained(model_name, device)

Verification

To verify that the fix worked, try loading the model using the custom loading function and check if the model is successfully instantiated and moved to the desired device.
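
One concrete check is that no parameter or buffer is left on the meta device after loading. The helper below is a small sketch that assumes only PyTorch; `meta_tensor_names` is a hypothetical name and works on any `nn.Module`.

```python
from torch import nn


def meta_tensor_names(model: nn.Module) -> list[str]:
    """Names of parameters and buffers that are still on the meta device."""
    named = list(model.named_parameters()) + list(model.named_buffers())
    return [name for name, tensor in named if tensor.is_meta]


if __name__ == "__main__":
    # A regular layer holds real data; a meta layer holds only shapes.
    print(meta_tensor_names(nn.Linear(2, 2)))
    print(meta_tensor_names(nn.Linear(2, 2, device="meta")))
```

After a successful load, `meta_tensor_names(model)` should return an empty list; any names it reports point at weights that were never materialized.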

Extra Tips

Both PRs are listed above as open and unmerged, so a released version of transformers may not contain either fix yet. Check the PR status before relying on an upgrade alone.
If patching the loading path is not an option, convert the checkpoint with the script from PR #45097 and load the resulting HF-native model, which should not need trust_remote_code=True.

transformers - ✅(Solved) Fix [Bug] Old InternVL2 remote-code checkpoints are incompatible with Transformers v5 meta initialization [2 pull requests, 3 comments, 3 participants]
