transformers: RT-DETR models do not release memory when deleted / garbage-collected [5 comments, 2 participants]

huggingface/transformers#45412 · Fetched 2026-04-15 06:19:39
Comments: 5 · Participants: 2 · Timeline: 15 · Reactions: 0


System Info

  • Transformers: 5.5.3
  • PyTorch: 2.8.0+cu126
  • TorchVision: 0.23.0+cu126
  • System: Debian 13 (trixie)
  • Python: 3.13.5

Who can help?

@yonigozlan @molbap

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Using slightly modified example scripts from the documentation (e.g. https://huggingface.co/docs/transformers/model_doc/detr?usage=AutoModel), we see that while Deformable DETR releases nearly all of its memory after the model is deleted and GC runs, RT-DETRv2 (and also RT-DETR) holds on to a significant amount of GPU memory that is never released. Here is the code snippet in question:

import gc
import torch
import requests

from PIL import Image
from transformers import AutoModelForObjectDetection, AutoImageProcessor

url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

device = torch.device("cuda")
image_processor = AutoImageProcessor.from_pretrained("PekingU/rtdetr_v2_r18vd",
                                                     backend="pil")
model = AutoModelForObjectDetection.from_pretrained("PekingU/rtdetr_v2_r18vd").to(device)
inputs = image_processor(images=image, return_tensors="pt").to(device)

print("before: %d" % (torch.cuda.memory_allocated() / 1024 / 1024))
with torch.inference_mode():
    outputs = model(**inputs)
print("after: %d" % (torch.cuda.memory_allocated() / 1024 / 1024))

del inputs
del outputs
del model
del image_processor
gc.collect()
torch.cuda.empty_cache()
print("gc: %d" % (torch.cuda.memory_allocated() / 1024 / 1024))

Expected behavior

Expect the final number to be small, reflecting only the unfreeable (thanks a lot, NVidia!) CUDA context, as it is when using DETR or Deformable-DETR models, for example facebook/detr-resnet-50:

before: 178
after: 188
gc: 8

What actually happens (using the RT-DETR model in the code snippet above):

before: 81
after: 103
gc: 85
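To make the comparison concrete, the two sets of readings above can be reduced to the share of the pre-inference allocation that is still held after cleanup. This is only arithmetic on the reported numbers (MiB, since the snippet divides bytes by 1024 twice); the helper name `retained` is made up for illustration.

```python
def retained(before_mib, gc_mib):
    """Fraction of the pre-inference allocation still held after cleanup."""
    return gc_mib / before_mib


# Readings copied from the two runs reported in this issue (values in MiB).
runs = {
    "facebook/detr-resnet-50": {"before": 178, "after": 188, "gc": 8},
    "PekingU/rtdetr_v2_r18vd": {"before": 81, "after": 103, "gc": 85},
}

for name, r in runs.items():
    freed = r["before"] - r["gc"]
    print(f"{name}: {retained(r['before'], r['gc']):.0%} retained, {freed} MiB freed")
```

For DETR the cleanup frees 170 MiB, while for RT-DETRv2 the final reading is 4 MiB above the pre-inference level, so nothing is released at all.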

extent analysis

TL;DR

RT-DETR and RT-DETRv2 appear to keep GPU memory allocated after the model is deleted and garbage-collected. The cause is not yet identified, and no workaround is confirmed.

Guidance

  • The provided code snippet shows the symptom: with RT-DETRv2, torch.cuda.memory_allocated() reports 85 MiB after the model is deleted and garbage collection runs, compared with 81 MiB before inference. With facebook/detr-resnet-50 the same steps drop it from 178 MiB to 8 MiB.
  • To verify the issue, run the snippet once per model and compare the "before" and "gc" readings.
  • Calling model.zero_grad() before deleting the model is unlikely to help: it only clears gradient tensors, and the forward pass here runs under torch.inference_mode(), which records none. model.reset_parameters() is not defined by torch.nn.Module (individual layers such as torch.nn.Linear define it), and it re-initializes weights rather than freeing them.
  • The issue is already filed with the maintainers as huggingface/transformers#45412, with @yonigozlan and @molbap tagged; follow it for a permanent fix.
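The verification step can be taken further by listing which objects are still alive after cleanup. The sketch below is a generic scan over Python's garbage collector; `scan_live` is a hypothetical helper, not part of transformers or PyTorch, and it returns descriptions rather than the objects themselves so that the scan does not keep anything alive.

```python
import gc


def scan_live(predicate, describe=repr):
    """Describe every gc-tracked object for which predicate(obj) is true.

    Returns strings (or whatever `describe` produces) instead of the
    objects, so the result holds no references to them.
    """
    gc.collect()
    descriptions = []
    for obj in gc.get_objects():
        try:
            if predicate(obj):
                descriptions.append(describe(obj))
        except Exception:
            # Some objects raise on attribute access; skip them.
            continue
    return descriptions


# Possible use after the cleanup in the reproduction (requires torch):
#   import torch
#   scan_live(
#       lambda o: torch.is_tensor(o) and o.is_cuda,
#       lambda t: (tuple(t.shape), t.dtype, t.numel() * t.element_size()),
#   )
```

If the scan finds CUDA tensors after `del model`, something in Python still references them. If it finds none while `torch.cuda.memory_allocated()` stays high, the memory is probably held outside the objects this scan can see.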

Example

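# After the model is used (assumes `model` from the reproduction snippet).
import gc
import torch

model.zero_grad()         # clears gradients only; none exist after inference_mode
del model                 # drop this reference to the model
gc.collect()              # collect any reference cycles
torch.cuda.empty_cache()  # return cached, unused blocks to the driver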

Notes

The cause of the memory leak is unclear and requires further investigation. The cleanup sequence above is not confirmed to release the retained memory.
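One way to start that investigation is to check whether the Python objects themselves are collected. The sketch below uses weak references for this; `track` and `still_alive` are hypothetical helper names. It assumes the objects support weak references, which ordinary class instances do.

```python
import gc
import weakref


def track(**objects):
    """Return a weak reference for each named object."""
    return {name: weakref.ref(obj) for name, obj in objects.items()}


def still_alive(refs):
    """Names whose objects survive a full garbage collection."""
    gc.collect()
    return sorted(name for name, ref in refs.items() if ref() is not None)


# Possible use with the reproduction snippet:
#   refs = track(model=model, outputs=outputs)
#   del inputs, outputs, model, image_processor
#   print(still_alive(refs))   # an empty list means both objects were collected
```

An empty result alongside a high `torch.cuda.memory_allocated()` reading would suggest the retained memory belongs to something other than the model and output objects, for example tensors cached at module or class level. A non-empty result points to a lingering reference to the model itself.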

Recommendation

No confirmed workaround: the cleanup in the example is essentially the same del / gc.collect() / torch.cuda.empty_cache() sequence that already leaves 85 MiB allocated in the reproduction, so treat it as unverified. Follow huggingface/transformers#45412 for a permanent fix.
