transformers - ✅(Solved) Fix LwDetrImageLoss crashes when using float16 AMP and Cuda [1 pull requests, 3 comments, 2 participants]

transformers · 2026-03-19 12:56:22

GitHub stats

huggingface/transformers#44857•Fetched 2026-04-08 01:03:28

Author

m-matthias

Participants

m-matthias

sbucaille

Timeline (top)

mentioned ×5, subscribed ×5, commented ×3, closed ×1

Error Message

".../transformers/loss/loss_lw_detr.py", line 139, in loss_labels
    pos_weights[pos_ind] = t
    ~~~~~~~~~~~^^^^^^^^^
RuntimeError: Index put requires the source and destination dtypes match, got Half for the destination and Float for the source.

Fix Action

Fixed

Fixed by PR: LwDetrImageLoss: Fix dtype casting to prevent crash when using amp on cuda device (https://github.com/huggingface/transformers/pull/44886)

PR fix notes

PR #44886: LwDetrImageLoss: Fix dtype casting to prevent crash when using amp on cuda device

Repository: huggingface/transformers
Author: m-matthias
State: closed | merged: True
Link: https://github.com/huggingface/transformers/pull/44886

Description (problem / solution / changelog)

What does this PR do?

Prevents a crash in LwDetrImageLoss when it is used with float16 automatic mixed precision (AMP) on a CUDA device. Under CUDA autocast, torch.pow returns float32, which caused a dtype mismatch at pos_weights[pos_ind] = t. The fix casts the result back to the original dtype after the torch.pow operations (see also the torch.amp documentation). The PR also adds pos_ind = tuple(pos_ind) to fix a deprecation warning.
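The cast-back idea described above can be sketched in isolation. This is a minimal illustration, not the code from the PR: pow_keep_dtype is a hypothetical helper, the tensor values are made up, and the explicit .float() call only mimics the upcast that CUDA autocast applies to torch.pow so the sketch also runs on CPU.

```python
import torch


def pow_keep_dtype(x: torch.Tensor, exponent: float) -> torch.Tensor:
    """Hypothetical helper: raise x to a power, then restore x's original dtype."""
    original_dtype = x.dtype
    # Upcast explicitly to mimic what CUDA autocast does for torch.pow.
    result = torch.pow(x.float(), exponent)
    # Cast back so the result matches tensors created in the original dtype.
    return result.to(original_dtype)


pos_weights = torch.zeros(4, dtype=torch.float16)
prob = torch.tensor([0.25, 0.5], dtype=torch.float16)

# A tuple of index tensors, mirroring the pos_ind = tuple(pos_ind) change.
pos_ind = (torch.tensor([0, 2]),)
pos_weights[pos_ind] = pow_keep_dtype(prob, 2.0)
```

Because the helper returns a tensor in the caller's dtype, the indexed assignment no longer depends on whether autocast is active.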

Fixes #44857

Who can review?

@sbucaille @yonigozlan

Changed files

src/transformers/loss/loss_lw_detr.py (modified, +9/-9)
src/transformers/loss/loss_rt_detr.py (modified, +5/-3)

System Info

transformers version: 5.3.0.dev0
Platform: Linux-6.6.87.2-microsoft-standard-WSL2-x86_64-with-glibc2.39
Python version: 3.12.3
Huggingface_hub version: 1.7.1
Safetensors version: 0.7.0
Accelerate version: 1.13.0
Accelerate config: not found
DeepSpeed version: not installed
PyTorch version (accelerator?): 2.8.0+cu128 (CUDA)
Using distributed or parallel set-up in script?: no
Using GPU in script?: yes
GPU type: NVIDIA RTX 500 Ada Generation Laptop GPU

Who can help?

@yonigozlan @molbap @sbucaille

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

import torch

from transformers import LwDetrForObjectDetection


model = LwDetrForObjectDetection.from_pretrained("AnnaZhang/lwdetr_small_60e_coco")
model.train()
model.cuda()

inputs = {
    "pixel_values": torch.rand((1, 3, 640, 640), device="cuda"),
    "pixel_mask": torch.ones((1, 640, 640), device="cuda"),
    "labels": [
        {
            "class_labels": torch.tensor([1], device="cuda"),
            "boxes": torch.tensor([[0.5, 0.5, 0.1, 0.1]], device="cuda"),
        }
    ],
}

with torch.autocast(device_type="cuda", dtype=torch.float16):
    outputs = model(**inputs)

This leads to the following error:

".../transformers/loss/loss_lw_detr.py", line 139, in loss_labels
    pos_weights[pos_ind] = t
    ~~~~~~~~~~~^^^^^^^^^
RuntimeError: Index put requires the source and destination dtypes match, got Half for the destination and Float for the source.

It seems the reason for this is that the torch.pow operations inside loss_labels() cause an upcast to float32 on the GPU, leading to a type mismatch.
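One way to check this hypothesis is to look at the dtype that torch.pow returns under autocast. The sketch below is an illustration only: pow_dtype_under_autocast is a hypothetical helper, and it returns None on machines without a CUDA device rather than guessing. According to the torch.amp documentation, pow is among the CUDA ops that autocast to float32, so float32 is the expected result on a GPU.

```python
import torch


def pow_dtype_under_autocast():
    """Return the dtype torch.pow yields for a float16 CUDA tensor under
    autocast, or None when no CUDA device is available."""
    if not torch.cuda.is_available():
        return None
    x = torch.rand(4, device="cuda", dtype=torch.float16)
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        return torch.pow(x, 2.0).dtype


observed = pow_dtype_under_autocast()
print(observed)
```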

Expected behavior

Correct handling of float16 AMP on Cuda GPU, without crashing.

extent analysis

Fix Plan

The traceback reports Half for the destination (pos_weights) and Float for the source (t): under CUDA autocast, torch.pow returns float32, so t no longer matches pos_weights. The dtypes must match before the indexed assignment, which means casting t, not pos_weights.

Here are the options:

Cast t to the dtype of pos_weights before the assignment. This follows the approach described in the merged PR, which casts back to the original dtype after the torch.pow operations.
Alternatively, cast t to torch.float16 explicitly. This works for float16 AMP but hard-codes the dtype.

Example code snippet:

# In loss_lw_detr.py, line 139
# pos_weights is float16 under autocast; t is float32 after torch.pow
pos_weights[pos_ind] = t.to(pos_weights.dtype)

# Alternatively, hard-code the dtype
# pos_weights[pos_ind] = t.to(torch.float16)
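The same dtype mismatch can be explored without a GPU by building the two tensors by hand. This is a sketch with made-up values, not the library's code; the variable names only mirror the traceback. On the PyTorch version reported in this issue the first assignment is expected to raise, but the snippet records what actually happens instead of assuming it.

```python
import torch

pos_weights = torch.zeros(4, dtype=torch.float16)   # destination: Half
pos_ind = (torch.tensor([1, 3]),)
t = torch.tensor([0.5, 0.75], dtype=torch.float32)  # source: Float, as after an upcast

try:
    # Indexed assignment with mismatched dtypes.
    pos_weights[pos_ind] = t
    mismatch_raised = False
except (RuntimeError, IndexError) as err:
    mismatch_raised = True
    print(err)

# Casting the source to the destination dtype makes the assignment valid.
pos_weights[pos_ind] = t.to(pos_weights.dtype)
```

Using pos_weights.dtype rather than a fixed torch.float16 keeps the assignment correct for float32 and bfloat16 runs as well.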

If you cannot patch the installed library, the more reliable route is to upgrade to a transformers build that includes PR #44886. Casting the inputs to torch.float16 is unlikely to help: the upcast happens inside loss_labels regardless of the input dtype, and class_labels must remain an integer tensor. Keep the inputs as in the reproduction:

inputs = {
    "pixel_values": torch.rand((1, 3, 640, 640), device="cuda"),  # float32; autocast handles the downcast
    "pixel_mask": torch.ones((1, 640, 640), device="cuda"),
    "labels": [
        {
            "class_labels": torch.tensor([1], device="cuda"),  # integer class ids
            "boxes": torch.tensor([[0.5, 0.5, 0.1, 0.1]], device="cuda"),
        }
    ],
}

Verification

To verify the fix, rerun the reproduction script against the patched or upgraded library. The forward pass should complete without the RuntimeError about mismatched source and destination dtypes.

Extra Tips

Under torch.autocast, some CUDA ops (torch.pow among them) run in float32 and return float32 even when their inputs are float16. Indexed assignment does not promote or convert dtypes, so a float32 result cannot be written into a float16 tensor directly. Cast the source to the destination's dtype with to() before assigning, and prefer dest.dtype over a hard-coded dtype so the code also works without AMP.
