pytorch - 💡(How to fix) Fix Can't allocate 15+GB on 24GB GPU (RTX 3090) (reason: fragmented usage from other apps) [3 comments, 2 participants]

pytorch • 2026-03-21 16:15:57

pytorch/pytorch#178057•Fetched 2026-04-08 01:12:24

Author

MrMarvel

Participants

eqy

MrMarvel

Timeline (top)

mentioned ×12, subscribed ×12, labeled ×6, commented ×3

Error Message

torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 14.90 GiB. GPU 0 has a total capacity of 24.00 GiB of which 22.79 GiB is free. Of the allocated memory 0 bytes is allocated by PyTorch, and 0 bytes is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
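The numbers in the message are worth reading side by side: the request is smaller than the reported free memory, so this is not a simple capacity shortfall. The sketch below pulls the three sizes out of the message text and prints the apparent headroom. The helper name `parse_oom_message` is hypothetical, and the regular expressions assume the sizes are reported in GiB as they are here.

```python
import re

# Excerpt of the error message quoted above.
MESSAGE = (
    "CUDA out of memory. Tried to allocate 14.90 GiB. "
    "GPU 0 has a total capacity of 24.00 GiB of which 22.79 GiB is free. "
    "Of the allocated memory 0 bytes is allocated by PyTorch, "
    "and 0 bytes is reserved by PyTorch but unallocated."
)


def parse_oom_message(message):
    """Extract requested, total and free sizes (in GiB) from the message text."""
    requested = re.search(r"Tried to allocate ([\d.]+) GiB", message)
    total = re.search(r"total capacity of ([\d.]+) GiB", message)
    free = re.search(r"of which ([\d.]+) GiB is free", message)
    return {
        "requested_gib": float(requested.group(1)),
        "total_gib": float(total.group(1)),
        "free_gib": float(free.group(1)),
    }


info = parse_oom_message(MESSAGE)
headroom = info["free_gib"] - info["requested_gib"]
print(f"requested={info['requested_gib']} free={info['free_gib']} headroom={headroom:.2f} GiB")
```

A positive headroom together with "0 bytes reserved by PyTorch" suggests the failure happens below PyTorch's caching allocator, which is consistent with the reason given in the issue title.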

Code Example

import time
import torch


def main():
    device = torch.accelerator.current_accelerator(True)
    # Request ~14.90 GiB of float32 on the GPU; this is the line that raises OutOfMemoryError
    big_tensor = torch.randn((round(14.9 * 1024) * (1024 // 4) * 1024), device=device, dtype=torch.float32)
    # Get the current GPU memory usage
    current_memory = torch.accelerator.memory_allocated()
    print(f"Current GPU memory allocated: {current_memory / (1024 ** 3):.2f} GiB")

    # Get the maximum GPU memory allocated during the program's execution
    max_memory = torch.accelerator.max_memory_allocated()
    print(f"Maximum GPU memory allocated: {max_memory / (1024 ** 3):.2f} GiB")
    time.sleep(10)


if __name__ == "__main__":
    main()
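The size expression in the reproducer is easy to misread, so here is the arithmetic on its own, with no GPU required. It confirms that the request matches the 14.90 GiB in the error message. The constant name `BYTES_PER_FLOAT32` is introduced only for this illustration.

```python
BYTES_PER_FLOAT32 = 4

# Same expression as the reproducer:
#   round(14.9 * 1024) -> size in MiB (15258)
#   (1024 // 4) * 1024 -> float32 elements per MiB (262144)
num_elements = round(14.9 * 1024) * (1024 // 4) * 1024
num_bytes = num_elements * BYTES_PER_FLOAT32
size_gib = num_bytes / (1024 ** 3)

print(f"{num_elements:,} elements, {num_bytes:,} bytes, {size_gib:.2f} GiB")
# prints: 3,999,793,152 elements, 15,999,172,608 bytes, 14.90 GiB
```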

Versions

PyTorch version: 2.10.0+cu130
Is debug build: False
CUDA used to build PyTorch: 13.0
ROCM used to build PyTorch: N/A

OS: D:\Users\Sergey\Documents\WindowsPowerShell\Microsoft.PowerShell_profile.ps1
{
    "Caption":  "Microsoft Windows 11 Enterprise",
    "OSArchitecture":  "64-bit",
    "Version":  "10.0.26200"
}
Expecting value: line 1 column 1 (char 0)
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A

Python version: 3.11.15 (main, Mar  3 2026, 14:55:34) [MSC v.1944 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.26200-SP0
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: 
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3090
Nvidia driver version: 595.79
cuDNN version: Could not collect
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Caching allocator config: N/A

CPU:
D:\Users\Sergey\Documents\WindowsPowerShell\Microsoft.PowerShell_profile.ps1
{
    "Name":  "AMD Ryzen 9 9900X 12-Core Processor            ",
    "Manufacturer":  "AuthenticAMD",
    "Family":  107,
    "Architecture":  9,
    "ProcessorType":  3,
    "DeviceID":  "CPU0",
    "CurrentClockSpeed":  4400,
    "MaxClockSpeed":  4400,
    "L2CacheSize":  12288,
    "L2CacheSpeed":  null,
    "Revision":  17408
}
Expecting value: line 1 column 1 (char 0)

Versions of relevant libraries:
[pip3] Could not collect
[conda] Could not collect

cc @peterjc123 @mszhanyi @skyline75489 @nbcsm @iremyux @Blackhex @ptrblck @msaroufim @eqy @jerryzh168 @tinglvv @nWEIdia

extent analysis

Fix Plan

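The error reports that PyTorch holds no memory (0 bytes allocated, 0 bytes reserved) and that 22.79 GiB of the 24.00 GiB is free, yet a single 14.90 GiB request fails. As the issue title states, the reported reason is that other applications have fragmented GPU memory, so changes to the PyTorch code alone may not resolve it.

Steps to try:

Close or restart other applications that use the GPU, then rerun the script, since the title attributes the failure to fragmentation caused by other apps.
Set the PYTORCH_ALLOC_CONF environment variable to expandable_segments:True, as the error message suggests. This targets fragmentation inside PyTorch's caching allocator, so it may have no effect here, where PyTorch has reserved 0 bytes.
Call torch.cuda.empty_cache() to return cached but unused blocks to the driver. In this reproducer nothing is cached before the failing allocation, so this is unlikely to help on its own.
Consider reducing the size of the tensor or using a more memory-efficient data type if the workload allows it.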
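To tell a single-large-block failure apart from a total-capacity failure, it can help to probe for the largest single request that succeeds. The sketch below is one way to do that, not a method taken from the issue. `largest_successful_size` and `try_alloc_cuda` are hypothetical helper names, and the search assumes that if a given size fits, every smaller size fits too.

```python
def largest_successful_size(try_alloc, upper_mib, step_mib=256):
    """Binary-search the largest multiple of step_mib for which try_alloc succeeds.

    Assumes success is monotonic: if a size fits, every smaller size fits too.
    """
    lo, hi = 0, upper_mib // step_mib
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if try_alloc(mid * step_mib):
            lo = mid
        else:
            hi = mid - 1
    return lo * step_mib


def try_alloc_cuda(size_mib):
    """Attempt one uninitialized allocation of size_mib MiB on the GPU."""
    import torch  # imported lazily so the search itself has no GPU dependency

    try:
        buffer = torch.empty(size_mib * 1024 * 1024, dtype=torch.uint8, device="cuda")
    except torch.OutOfMemoryError:
        return False
    del buffer
    torch.cuda.empty_cache()  # hand the block back before the next probe
    return True


# Stand-in for a GPU where single requests above 10000 MiB fail (hypothetical).
fake_gpu = lambda size_mib: size_mib <= 10000
print(largest_successful_size(fake_gpu, upper_mib=24 * 1024))
```

On a real machine the call would be `largest_successful_size(try_alloc_cuda, upper_mib=24 * 1024)`. A result well below the free memory reported in the error would support the fragmentation explanation.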

Code Changes

import os
import time

# Set the allocator option before importing torch so it is in place when the CUDA allocator initializes
os.environ['PYTORCH_ALLOC_CONF'] = 'expandable_segments:True'

import torch


def main():
    device = torch.accelerator.current_accelerator(True)
    # Workaround, not a fix: request ~10 GiB of float32 instead of ~14.90 GiB
    tensor = torch.randn((round(10 * 1024) * (1024 // 4) * 1024), device=device, dtype=torch.float32)

    # Get the current GPU memory usage
    current_memory = torch.accelerator.memory_allocated()
    print(f"Current GPU memory allocated: {current_memory / (1024 ** 3):.2f} GiB")

    # Get the maximum GPU memory allocated during the program's execution
    max_memory = torch.accelerator.max_memory_allocated()
    print(f"Maximum GPU memory allocated: {max_memory / (1024 ** 3):.2f} GiB")

    # Keep the tensor alive for a while so usage can be inspected with an external tool
    time.sleep(10)

    # Drop the tensor, then return the now-unused cached block to the driver
    del tensor
    torch.cuda.empty_cache()


if __name__ == "__main__":
    main()
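Assigning `PYTORCH_ALLOC_CONF` directly overwrites any allocator options that are already set in the environment. The sketch below merges one option into the existing value instead. `merge_alloc_conf` is a hypothetical helper, and it assumes the simple comma-separated `key:value` form, so it does not handle bracketed list values.

```python
import os


def merge_alloc_conf(existing, key, value):
    """Return a comma-separated 'key:value' string with `key` set to `value`.

    Assumes plain 'key:value' items; bracketed list values are not supported.
    """
    options = {}
    for item in filter(None, (part.strip() for part in existing.split(","))):
        name, _, current = item.partition(":")
        options[name] = current
    options[key] = value
    return ",".join(f"{name}:{val}" for name, val in options.items())


# Doing this before `import torch` is the conservative choice.
os.environ["PYTORCH_ALLOC_CONF"] = merge_alloc_conf(
    os.environ.get("PYTORCH_ALLOC_CONF", ""), "expandable_segments", "True"
)
print(os.environ["PYTORCH_ALLOC_CONF"])
```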

Verification

To verify, run the modified code and check that it no longer raises torch.OutOfMemoryError. Monitoring GPU memory with a tool such as nvidia-smi while the script sleeps shows whether the allocation appears and is released as expected. Success at 10 GiB only shows that a smaller request fits; rerun the original 14.90 GiB request to confirm that the underlying problem is gone.
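The nvidia-smi check can be scripted so that usage is captured before and after the run. This is a minimal sketch: the helper names are hypothetical, the sample row is made up to show the parsed shape, and the parser assumes every field is numeric.

```python
import shutil
import subprocess


def parse_memory_csv(text):
    """Parse 'used, total' MiB rows as printed by nvidia-smi with nounits."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append({"used_mib": used, "total_mib": total})
    return rows


def query_gpu_memory():
    """Run nvidia-smi if it is on PATH; return None if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return None
    return parse_memory_csv(result.stdout)


# Hypothetical sample output, used only to show the parsed shape.
sample = "1240, 24576\n"
print(parse_memory_csv(sample))
```

Calling `query_gpu_memory()` before starting the script and again during its sleep gives two snapshots to compare.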

Extra Tips

Setting PYTORCH_ALLOC_CONF to expandable_segments:True can reduce fragmentation inside PyTorch's caching allocator when working with large tensors; it does not affect memory held by other processes.
torch.cuda.empty_cache() returns cached, unused blocks to the driver. It does not free tensors that are still referenced, so it does not prevent memory leaks.
Consider using more memory-efficient data types, such as torch.float16 or torch.bfloat16, which use 2 bytes per element instead of the 4 used by torch.float32.
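To put a number on the data-type tip, the sketch below computes the footprint of the reproducer's element count under each dtype mentioned above. It is plain arithmetic; the table `BYTES_PER_ELEMENT` and the helper `size_gib` are names introduced only for this illustration.

```python
# Bytes per element for the dtypes mentioned in the tips.
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2, "bfloat16": 2}

# Element count from the reproducer's size expression.
NUM_ELEMENTS = round(14.9 * 1024) * (1024 // 4) * 1024


def size_gib(num_elements, dtype_name):
    """Footprint in GiB of num_elements values of the given dtype."""
    return num_elements * BYTES_PER_ELEMENT[dtype_name] / (1024 ** 3)


for name in BYTES_PER_ELEMENT:
    print(f"{name}: {size_gib(NUM_ELEMENTS, name):.2f} GiB")
```

Halving the element size halves the request to about 7.45 GiB. Whether a request of that size succeeds on the affected machine is not something the issue reports.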
