transformers - 💡(How to fix) Fix Qwen3-vl-embedding Video Error "StopIteration" in transformers 5.3.0 [3 comments, 2 participants]

huggingface/transformers#44560 (fetched 2026-04-08 00:27:39)

Code Example

import torch
from Qwen3_VL_Embedding.src.models.qwen3_vl_embedding import Qwen3VLEmbedder

model = Qwen3VLEmbedder(
    model_name_or_path="./models/Qwen3-VL-Embedding-2B",
    # flash_attention_2 for better acceleration and memory saving
    # torch_dtype=torch.bfloat16, 
    # attn_implementation="flash_attention_2"
)

inputs = [
    # Official example (commented out)
    # {
    #     "text": "A woman playing with her dog on a beach at sunset.",
    #     "instruction": "Retrieve images or text relevant to the user's query.",
    # }, {
    #     "text": "A woman shares a joyful moment with her golden retriever on a sun-drenched beach at sunset, as the dog offers its paw in a heartwarming display of companionship and trust."
    # }, {
    #     "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg"
    # }, {
    #     "text": "A woman shares a joyful moment with her golden retriever on a sun-drenched beach at sunset, as the dog offers its paw in a heartwarming display of companionship and trust.",
    #     "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg"
    # }

    # What I tried: two local videos
    {
        "video": "./data/video/1602562103_1491434826.mp4"  # 30 min
    }, {
        "video": "./data/video/115814632003756_35105146583.mp4"  # 3 min
    }
]

embeddings = model.process(inputs)
print(embeddings)
print(embeddings @ embeddings.T)

---

Loading weights: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 625/625 [00:00<00:00, 1343.66it/s]
qwen-vl-utils using decord to read video.
Traceback (most recent call last):
  File "/root/graduation_project/test_video_embedding_local.py", line 23, in <module>
    embeddings = model.process(inputs)
                 ^^^^^^^^^^^^^^^^^^^^^
  File "/root/graduation_project/Qwen3_VL_Embedding/src/models/qwen3_vl_embedding.py", line 387, in process
    outputs = self.forward(processed_inputs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/graduation_project/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/root/graduation_project/Qwen3_VL_Embedding/src/models/qwen3_vl_embedding.py", line 193, in forward
    outputs = self.model(**inputs)
              ^^^^^^^^^^^^^^^^^^^^
  File "/root/graduation_project/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/graduation_project/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/graduation_project/Qwen3_VL_Embedding/src/models/qwen3_vl_embedding.py", line 99, in forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "/root/graduation_project/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/graduation_project/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/graduation_project/.venv/lib/python3.12/site-packages/transformers/utils/generic.py", line 843, in wrapper
    output = func(self, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/graduation_project/.venv/lib/python3.12/site-packages/transformers/models/qwen3_vl/modeling_qwen3_vl.py", line 1358, in forward
    position_ids = self.compute_3d_position_ids(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/graduation_project/.venv/lib/python3.12/site-packages/transformers/models/qwen3_vl/modeling_qwen3_vl.py", line 1252, in compute_3d_position_ids
    position_ids, rope_deltas = self.get_rope_index(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/root/graduation_project/.venv/lib/python3.12/site-packages/transformers/models/qwen3_vl/modeling_qwen3_vl.py", line 1135, in get_rope_index
    grid_thw = next(grid_iters[modality_type])
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
StopIteration

System Info

  • transformers version: 5.3.0
  • Platform: Windows 11, WSL2, uv, VS Code
  • Python 3.12.13 (main, Mar 3 2026, 14:59:34) [Clang 21.1.4 ] on linux

Who can help?

@zucchini-nlp

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Follow the README of https://github.com/QwenLM/Qwen3-VL-Embedding/tree/main.
  2. Pass video inputs to the embedder instead of the README's text and image inputs (the script and the full traceback are listed under "Code Example").

Expected behavior

The same script works with v4.57.6, v5.1.0, and v5.2.0. It fails only with v5.3.0.

Extended analysis

Problem Summary

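With transformers 5.3.0, Qwen3VLEmbedder.process raises StopIteration when the inputs are videos. The exception comes from next(grid_iters[modality_type]) in get_rope_index (transformers/models/qwen3_vl/modeling_qwen3_vl.py, line 1135), which is reached through compute_3d_position_ids during the model's forward pass.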

Root Cause Analysis

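get_rope_index belongs to the transformers library (modeling_qwen3_vl.py), not to the Qwen3-VL-Embedding repository, so the failure sits in library code that the embedder calls. A StopIteration from next() means the iterator was already exhausted: the loop asked for one more grid entry for a modality than it had been given. Because the same script runs on 5.2.0, the likely cause is a change in 5.3.0 to how position IDs are computed for video inputs, or to which inputs that computation expects from the caller. The issue does not identify the exact change.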
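
The failing line calls next() on a per-modality iterator, so StopIteration means that iterator ran out before the input did. The sketch below reproduces that mechanism in plain Python. It is not the transformers implementation: assign_grids and its arguments are hypothetical, and only the names grid_iters, modality_type, and grid_thw are taken from the traceback.

```python
def assign_grids(segment_modalities, grids_by_modality):
    """Pair each visual segment with the next grid entry of its modality.

    Hypothetical stand-in for the loop around
    `next(grid_iters[modality_type])`; not the real library code.
    """
    grid_iters = {name: iter(grids) for name, grids in grids_by_modality.items()}
    assigned = []
    for modality_type in segment_modalities:
        # Raises StopIteration when this modality has more segments
        # than grid entries.
        grid_thw = next(grid_iters[modality_type])
        assigned.append((modality_type, grid_thw))
    return assigned


# Two video segments and two grid entries: every segment gets a grid.
matched = assign_grids(["video", "video"], {"video": [(4, 16, 16), (2, 16, 16)]})

# Two video segments but only one grid entry: the second next() fails.
try:
    assign_grids(["video", "video"], {"video": [(4, 16, 16)]})
    mismatch_raised = False
except StopIteration:
    mismatch_raised = True
```

Any mismatch between the number of visual segments detected in the input and the number of grid rows supplied for that modality would produce the same symptom, which is why the traceback alone does not say which side changed.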

Fix Plan

Step 1: Downgrade transformers to version 5.2.0

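Pin transformers to 5.2.0, the most recent version the reporter confirmed working (4.57.6 and 5.1.0 also work). In a uv-managed environment like the reporter's, the equivalent command is uv pip install transformers==5.2.0.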

pip install transformers==5.2.0
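
A script that must not run on the failing release could check the installed version before loading the model. This is a minimal sketch using only the standard library; the helper names and the KNOWN_BAD_VERSIONS set are assumptions made for illustration, and the set lists only the version the issue reports as failing.

```python
from importlib import metadata

# Assumption: 5.3.0 is the only version the issue reports as failing.
KNOWN_BAD_VERSIONS = {"5.3.0"}


def is_known_bad(version):
    """Return True if `version` is one the issue reports as failing."""
    return version in KNOWN_BAD_VERSIONS


def installed_transformers_version():
    """Return the installed transformers version, or None if not installed."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


def require_working_transformers():
    """Raise before model loading if the known-bad version is installed."""
    version = installed_transformers_version()
    if version is not None and is_known_bad(version):
        raise RuntimeError(
            f"transformers {version} fails on video inputs with StopIteration; "
            "install transformers==5.2.0"
        )
    return version
```

Calling require_working_transformers() at the top of the reproduction script would stop it before the weights are loaded, rather than after the video has been decoded.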

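Step 2: Fail with a clear message instead of patching get_rope_index

Do not wrap next(grid_iters[modality_type]) in a try/except inside the installed library. Swallowing the exception there leaves grid_thw unassigned, so the following lines fail or use a stale value. If the script may still run on 5.3.0, catch the exception around the call you own and re-raise it with an actionable message.

try:
    embeddings = model.process(inputs)
except StopIteration as exc:
    # Raised from get_rope_index on transformers 5.3.0 with video inputs
    raise RuntimeError(
        "Video embedding failed in get_rope_index; "
        "install transformers==5.2.0"
    ) from exc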

Step 3: Verify the fix

Rerun the reproduction script with transformers 5.2.0 installed. It should print the embeddings and the similarity matrix instead of the traceback.

Verification

Before rerunning, confirm that the environment actually resolved to 5.2.0 (for example, with pip show transformers), since a stale virtual environment can keep 5.3.0 installed. If the script still fails, set a debugger breakpoint in get_rope_index to inspect the grid values it receives for the video inputs.
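
Beyond "no traceback", a quick sanity check on the result can catch a silently wrong output. The sketch below is a hypothetical helper, not part of Qwen3-VL-Embedding: it works on a nested list so that it has no dependency on torch, and it checks only the row count, finiteness, and that no row is all zeros.

```python
import math


def check_embeddings(rows, expected_count):
    """Return a list of problems found in a nested-list embedding matrix."""
    problems = []
    if len(rows) != expected_count:
        problems.append(f"expected {expected_count} rows, got {len(rows)}")
    for index, row in enumerate(rows):
        if not all(math.isfinite(value) for value in row):
            problems.append(f"row {index} contains NaN or infinity")
        elif not any(value != 0.0 for value in row):
            problems.append(f"row {index} is all zeros")
    return problems


# A well-formed two-row matrix produces no problems.
good = check_embeddings([[0.6, 0.8], [1.0, 0.0]], expected_count=2)

# A missing row and a NaN value are both reported.
bad = check_embeddings([[float("nan"), 0.1]], expected_count=2)
```

With the reproduction script this could be called as check_embeddings(embeddings.float().cpu().tolist(), len(inputs)), assuming model.process returns a 2-D torch tensor, as the script's embeddings @ embeddings.T suggests.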

Extra Tips

  • Always check the release notes and changelogs for the library you are using to see if there are any known issues or changes that may affect your code.
  • When downgrading a library, make sure to also downgrade any dependencies that may have changed.
  • Consider using a virtual environment to isolate your project's dependencies and avoid conflicts with other projects.
