transformers - ✅ (Solved) Fix: Unpacking Qwen3.5 input embeddings fails with trl SFT trainer [1 pull request, 3 comments, 2 participants]

GitHub stats
huggingface/transformers#44918 · Fetched 2026-04-08 01:12:31
Comments: 3
Participants: 2
Timeline: 8
Reactions: 0
Timeline (top): commented ×3, closed ×1, cross-referenced ×1, labeled ×1


Fix Action

Fixed

PR fix notes

PR #44921: fix: use shape index access in compute_3d_position_ids for Qwen VL models

Description (problem / solution / changelog)

What does this PR do?

Fixes #44918.

compute_3d_position_ids in the Qwen2.5-VL / Qwen3-VL / Qwen3.5 model families destructures inputs_embeds.shape into exactly three variables:

batch_size, seq_length, _ = inputs_embeds.shape

This raises ValueError: too many values to unpack (expected 3) when inputs_embeds has more than three dimensions, which can happen when:

  • The TRL SFT Trainer passes inputs_embeds directly (without input_ids) after processing, producing a batch with an extra dimension
  • Stale rope_deltas from a preceding generation step (e.g. during evaluation) cause the elif branch to fire on a subsequent training forward pass
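
The failure itself is ordinary Python tuple unpacking, so it can be reproduced without loading a model. The following is a minimal sketch that uses plain tuples as stand-ins for torch.Size (which is a tuple subclass). The helper name unpack_three and the example shapes are made up for illustration and do not come from the issue.

```python
def unpack_three(shape):
    """Mimic the original line: unpack a shape into exactly three names."""
    batch_size, seq_length, _ = shape
    return batch_size, seq_length


# Standard 3-D case (batch, sequence, hidden): unpacking succeeds.
print(unpack_three((2, 16, 64)))

# Hypothetical 4-D case: one extra dimension is enough to break unpacking.
try:
    unpack_three((1, 2, 16, 64))
except ValueError as err:
    error_message = str(err)
    print(error_message)
```

The message printed in the except branch starts with the same text as the last line of the reported traceback.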

Fix

Replace destructuring with explicit index access:

batch_size, seq_length = inputs_embeds.shape[0], inputs_embeds.shape[1]

This is robust to any tensor with ≥ 2 dimensions and does not change behaviour for the standard 3-D case.
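
As a rough illustration of why index access is the safer form, the sketch below compares it with unpacking on plain tuples. The function name leading_dims and the shapes are hypothetical; only the shape[0], shape[1] pattern comes from the PR.

```python
def leading_dims(shape):
    """Read the first two dimensions by index, as the fix does."""
    return shape[0], shape[1]


shape_3d = (2, 16, 64)
shape_4d = (2, 16, 4, 64)  # hypothetical shape with one extra dimension

# For the standard 3-D case the result matches the old unpacking.
batch_size, seq_length, _ = shape_3d
same_as_unpacking = leading_dims(shape_3d) == (batch_size, seq_length)
print(same_as_unpacking)

# A 4-D shape no longer raises.
print(leading_dims(shape_4d))

# Fewer than two dimensions still fails, now with an IndexError.
try:
    leading_dims((5,))
except IndexError:
    one_dim_fails = True
    print("1-D shape is rejected")
```

Index access only removes the exception. Whether the first two dimensions of a higher-rank tensor really are the batch size and sequence length depends on where the extra dimension was introduced.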

Affected files

  • models/qwen2_5_vl/modular_qwen2_5_vl.py — source modular definition
  • models/qwen2_5_vl/modeling_qwen2_5_vl.py — generated file
  • models/qwen3_vl/modeling_qwen3_vl.py — generated file (inherits the method)
  • models/qwen3_5/modeling_qwen3_5.py — generated file (inherits the method)

Changed files

  • src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py (modified, +1/-1)
  • src/transformers/models/qwen2_5_vl/modular_qwen2_5_vl.py (modified, +1/-1)
  • src/transformers/models/qwen3_5/modeling_qwen3_5.py (modified, +1/-1)
  • src/transformers/models/qwen3_vl/modeling_qwen3_vl.py (modified, +1/-1)


System Info

My environment:

datasets==4.6.1
faiss-cpu==1.13.2
numpy==2.4.2
pyserini==1.5.0
sentence-transformers==5.2.3
torch==2.10.0
torchvision==0.25.0
tqdm==4.67.3
trl==0.29.1
wandb==0.25.1

Who can help?

@zucchini-nlp

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

This is the code I am using:

from trl import SFTConfig, SFTTrainer

# checkpoints_path, num_steps, checkpoint_frequency, run_name, lm, train_ds and
# eval_ds are defined elsewhere in the reporter's script.
config = SFTConfig(
    output_dir=checkpoints_path,
    max_steps=num_steps,
    dataset_text_field="text",
    max_length=2048,
    logging_strategy="steps",
    logging_steps=max(1, checkpoint_frequency // 5),
    disable_tqdm=False,
    save_strategy="steps",
    save_steps=checkpoint_frequency,
    eval_strategy="steps",
    eval_steps=checkpoint_frequency,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    report_to="wandb",
    run_name=run_name,
)

trainer = SFTTrainer(
    model=lm.model,
    args=config,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
)
trainer.train()
trainer.save_model()

Expected behavior

Successful SFT. Instead, I get the following error with a simple Qwen3.5 SFT setup. Running inference alone worked fine for me.

Traceback (most recent call last):
  File "/workspace/writeable/open-ended-csp/tmp.py", line 41, in <module>
    train_sft(
  File "/workspace/writeable/open-ended-csp/utils/sft_trainer.py", line 52, in train_sft
    trainer.train()
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 1412, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 1742, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/trl/trainer/sft_trainer.py", line 1338, in training_step
    return super().training_step(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 1951, in training_step
    loss = self.compute_loss(model, inputs, num_items_in_batch=num_items_in_batch)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/trl/trainer/sft_trainer.py", line 1234, in compute_loss
    (loss, outputs) = super().compute_loss(
                      ^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 2022, in compute_loss
    outputs = model(**inputs)
              ^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/accelerate/utils/operations.py", line 823, in forward
    return model_forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/accelerate/utils/operations.py", line 811, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/torch/amp/autocast_mode.py", line 44, in decorate_autocast
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/transformers/utils/generic.py", line 841, in wrapper
    output = func(self, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/transformers/models/qwen3_5/modeling_qwen3_5.py", line 1938, in forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/transformers/utils/generic.py", line 841, in wrapper
    output = func(self, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/transformers/models/qwen3_5/modeling_qwen3_5.py", line 1685, in forward
    position_ids = self.compute_3d_position_ids(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/writeable/open-ended-csp/.venv/lib/python3.12/site-packages/transformers/models/qwen3_5/modeling_qwen3_5.py", line 1619, in compute_3d_position_ids
    batch_size, seq_length, _ = inputs_embeds.shape
    ^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: too many values to unpack (expected 3)
wandb: 
wandb: 🚀 View run test-qwen3.5 at: https://wandb.ai/jj-uoe/open-ended-csp/runs/1sercncq
wandb: Find logs at: wandb/run-20260321_232655-1sercncq/logs

Extent analysis

Fix Plan

The error occurs because compute_3d_position_ids unpacks inputs_embeds.shape into exactly three names, so any inputs_embeds with more than three dimensions raises the ValueError. The fix in PR #44921 reads the first two dimensions by index instead.

Here are the steps to fix the issue:

  • Print the shape of inputs_embeds just before compute_3d_position_ids is called to confirm it has more than three dimensions.
  • Change compute_3d_position_ids to read batch_size and seq_length by index rather than unpacking the full shape.

Example code:

# Inspect the shape that reaches compute_3d_position_ids
print(inputs_embeds.shape)

# Read the leading dimensions by index instead of unpacking the whole shape
# (signature simplified for illustration)
def compute_3d_position_ids(self, inputs_embeds):
    batch_size, seq_length = inputs_embeds.shape[0], inputs_embeds.shape[1]

    # Rest of the method remains the same
    # ...

If the printed shape is not (batch_size, seq_length, embed_dim), also check the code that produces inputs_embeds to find where the extra dimension comes from.

Verification

To verify the fix, rerun training and confirm that the ValueError no longer appears. Printing inputs_embeds.shape or stepping through with a debugger shows which shape actually reaches the method.

Extra Tips

  • Read the source of compute_3d_position_ids in your installed transformers version to see which shape of inputs_embeds it expects.
  • If you are using a pre-trained model, check the model's documentation to ensure you are using it correctly.
  • Consider adding error handling to your code to catch and handle any shape mismatches that may occur.
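
The last tip could be sketched as a small guard that fails early with a readable message. This is only an illustration on plain tuples: check_embeds_shape and its expected_ndim parameter are hypothetical names and are not part of transformers or trl.

```python
def check_embeds_shape(shape, expected_ndim=3):
    """Raise a descriptive error if the shape has an unexpected rank."""
    if len(shape) != expected_ndim:
        raise ValueError(
            f"inputs_embeds has {len(shape)} dimensions {tuple(shape)}, "
            f"expected {expected_ndim} (batch_size, seq_length, embed_dim)"
        )
    return shape[0], shape[1]


# The standard 3-D case passes and returns the two leading dimensions.
print(check_embeds_shape((2, 16, 64)))

# A hypothetical 4-D shape is reported with its full shape in the message.
try:
    check_embeds_shape((1, 2, 16, 64))
except ValueError as err:
    message = str(err)
    print(message)
```

With a real tensor the same helper could be called as check_embeds_shape(inputs_embeds.shape) before the forward pass, so a mismatch is reported in your own code rather than deep inside the model.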
