transformers - 💡(How to fix) Fix RT-DETR models do not release memory when deleted / garbage-collected [5 comments, 2 participants]

transformers · 2026-04-13 15:46:43

GitHub stats

huggingface/transformers#45412 • Fetched 2026-04-15 06:19:39

Author

dhdaines

Participants

dhdaines

yonigozlan

Timeline (top)

commented ×5, subscribed ×5, mentioned ×3, labeled ×1

System Info

Transformers: 5.5.3
PyTorch: 2.8.0+cu126
TorchVision: 0.23.0+cu126
System: Debian 13 (trixie)
Python: 3.13.5

Who can help?

@yonigozlan @molbap

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

Using slightly modified example scripts from the documentation (e.g. https://huggingface.co/docs/transformers/model_doc/detr?usage=AutoModel), we see that while Deformable DETR releases (nearly all) memory after deleting the model and running GC, RT-DETRv2 (and also RT-DETR) hold onto a significant amount of GPU memory which can never be released. Here is the code snippet in question:

import gc
import torch
import requests

from PIL import Image
from transformers import AutoModelForObjectDetection, AutoImageProcessor

url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

device = torch.device("cuda")
image_processor = AutoImageProcessor.from_pretrained("PekingU/rtdetr_v2_r18vd",
                                                     backend="pil")
model = AutoModelForObjectDetection.from_pretrained("PekingU/rtdetr_v2_r18vd").to(device)
inputs = image_processor(images=image, return_tensors="pt").to(device)

print("before: %d" % (torch.cuda.memory_allocated() / 1024 / 1024))
with torch.inference_mode():
    outputs = model(**inputs)
print("after: %d" % (torch.cuda.memory_allocated() / 1024 / 1024))

del inputs
del outputs
del model
del image_processor
gc.collect()
torch.cuda.empty_cache()
print("gc: %d" % (torch.cuda.memory_allocated() / 1024 / 1024))

Expected behavior

Expect the final number to be small, reflecting only the unfreeable (thanks a lot, NVidia!) CUDA context, as it is when using DETR or Deformable-DETR models, for example facebook/detr-resnet-50:

before: 178
after: 188
gc: 8

What actually happens (using the RT-DETR model in the code snippet above):

before: 81
after: 103
gc: 85
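
To make the comparison concrete, the two runs above can be reduced to the amount of memory released between the "before" and "gc" readings. This is plain arithmetic on the reported numbers (in MiB, since the snippet divides the byte count by 1024 twice); the helper name `released_mib` is made up for illustration.

```python
def released_mib(before, after_gc):
    """MiB freed between the 'before' reading and the post-gc reading."""
    return before - after_gc


# Values copied from the two reported runs.
runs = {
    "facebook/detr-resnet-50": (178, 8),
    "PekingU/rtdetr_v2_r18vd": (81, 85),
}

for name, (before, after_gc) in runs.items():
    print(f"{name}: released {released_mib(before, after_gc)} MiB")
```

A negative result for the RT-DETRv2 run means more memory is allocated after cleanup than before inference, even though the model itself has been deleted.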

extent analysis

TL;DR

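After the model is deleted and garbage collection has run, RT-DETRv2 still holds about as much GPU memory as before inference (85 MiB versus 81 MiB in the report), which suggests that something keeps the model's tensors alive. The workaround suggested on this page is unverified.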

Guidance

The snippet shows GPU memory being retained with RT-DETRv2: torch.cuda.memory_allocated() reports 85 MiB after the model is deleted and garbage collection has run, slightly more than the 81 MiB reported before inference.
To verify the issue, run the snippet and compare the "before" and "gc" values; with facebook/detr-resnet-50 the same steps drop from 178 MiB to 8 MiB.
As a possible mitigation, the original suggestion is to call model.zero_grad() before deleting the model. This is unlikely to help here: zero_grad() only clears gradients, and the reproduction runs under torch.inference_mode(), so no gradients exist. model.reset_parameters() is not a mitigation either, since it is defined on individual layers such as torch.nn.Linear rather than on the top-level model, and it re-initializes weights instead of freeing them.
The issue is already filed as huggingface/transformers#45412, with @yonigozlan and @molbap mentioned; follow it there for a permanent fix.
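
One way to check whether the model object itself is being kept alive is sketched below. It uses only the standard library: a weak reference shows whether the object was freed, and gc.get_referrers lists what still points at it. `Dummy`, `registry` and `surviving_referrers` are hypothetical names for illustration; with the reproduction above, the idea would be to take weakref.ref(model) before `del model` and pass it to the helper afterwards.

```python
import gc
import inspect
import weakref


class Dummy:
    """Hypothetical stand-in for a model; any weak-referenceable object works."""


def surviving_referrers(ref):
    """Return type names of objects still holding ref's target ([] if it was freed)."""
    gc.collect()
    target = ref()
    if target is None:
        return []
    # Skip stack frames, which may reference `target` through local variables.
    return [
        type(r).__name__
        for r in gc.get_referrers(target)
        if not inspect.isframe(r)
    ]


registry = {}                    # hypothetical long-lived container
model = Dummy()
registry["last"] = model         # a stray strong reference
ref = weakref.ref(model)
del model
print(surviving_referrers(ref))  # the dict behind `registry` is reported
registry.clear()
print(surviving_referrers(ref))  # nothing holds the object any more
```

This only tells you that a Python-level reference exists and what type holds it. It does not identify the cause in RT-DETR, which remains unconfirmed here.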

Example

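# After the model is used. Assumes gc, torch and model are defined as in the
# reproduction. Unverified workaround: zero_grad() only clears gradients, so
# it may change nothing when inference ran under torch.inference_mode().
model.zero_grad()
del model
gc.collect()
torch.cuda.empty_cache()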

Notes

The cause of the retained memory is not identified on this page and needs further investigation. The workaround above has not been tested against this issue and may have no effect.
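
For background, one common way a Python object can outlive `del` and `gc.collect()` is a cache that stores a strong reference to it. The sketch below illustrates that general mechanism with functools.lru_cache on a method. It is an assumption-laden illustration only: `CachedModel`, `anchors` and `alive_after_delete` are invented names, and nothing here confirms that RT-DETR is affected by this pattern.

```python
import functools
import gc
import weakref


class CachedModel:
    """Hypothetical stand-in for a model with a cached helper method."""

    @functools.lru_cache(maxsize=None)
    def anchors(self, size):
        # The cache key includes `self`, so the cache keeps the instance alive.
        return tuple(range(size))


def alive_after_delete(clear_cache):
    """Create an instance, use the cached method, delete it, and report survival."""
    obj = CachedModel()
    obj.anchors(4)
    ref = weakref.ref(obj)
    del obj
    if clear_cache:
        # Dropping the cache entries removes the last strong reference.
        CachedModel.anchors.cache_clear()
    gc.collect()
    return ref() is not None


print(alive_after_delete(clear_cache=False))
print(alive_after_delete(clear_cache=True))
```

In this toy case the instance survives until the cache is cleared. If a similar holder existed in a real model, the tensors it owns would stay allocated for the same reason.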

Recommendation

Treat the workaround as an experiment only, since it is unverified, and track huggingface/transformers#45412 for a permanent fix from the maintainers.
