pytorch - 💡(How to fix) Fix DISABLED test_cudagraph_indexing_ops_scatter_cuda_float32 (__main__.TestCudagraphIndexingOpsCUDA) [1 comment, 1 participant]

pytorch • 2026-04-14 19:03:50


GitHub stats

pytorch/pytorch#180365 • Fetched 2026-04-16 06:35:02

Author: pytorch-bot[bot]

Participants: pytorch-bot[bot]

Assignees: BoyuanFeng

Timeline (top): mentioned ×36, subscribed ×36, labeled ×5, assigned ×1

Error Message

Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3444, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3444, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3443, in wrapper
    with policy():
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2820, in __exit__
    raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __main__.TestCudagraphIndexingOpsCUDA.test_cudagraph_indexing_ops_scatter_cuda_float32! Caching allocator allocated memory was 2048 and is now reported as 4096 on device 0. CUDA driver allocated memory was 373489664 and is now 375586816.

To execute this test, run the following from the base repo dir:
    PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 PYTORCH_TEST_WITH_SLOW_GRADCHECK=1 python test/inductor/test_cudagraph_trees.py TestCudagraphIndexingOpsCUDA.test_cudagraph_indexing_ops_scatter_cuda_float32

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

Root Cause

The issue does not identify a root cause. This test was disabled because it is failing in CI; see the recent examples and the most recent trunk workflow logs linked from the issue.


RAW_BUFFER

Platforms: linux, slow

This test was disabled because it is failing in CI. See recent examples and the most recent trunk workflow logs.

Over the past 6 hours, it has been determined flaky in 5 workflow(s) with 5 failures and 5 successes.

Debugging instructions (after clicking on the recent samples link): DO NOT ASSUME THINGS ARE OKAY IF THE CI IS GREEN. We now shield flaky tests from developers so CI will thus be green but it will be harder to parse the logs. To find relevant log snippets:

1. Click on the workflow logs linked above.
2. Click on the Test step of the job so that it is expanded. Otherwise, the grepping will not work.
3. Grep for test_cudagraph_indexing_ops_scatter_cuda_float32.
4. There should be several instances run (as flaky tests are rerun in CI) from which you can study the logs.
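
If the Test step log has been saved to a local file, the grep step above can also be done offline. The following is a minimal sketch, not part of the original issue; the file name `test_step.log` and the helper `find_test_lines` are hypothetical.

```python
from typing import List, Tuple

# Test name taken from the issue title.
TEST_NAME = "test_cudagraph_indexing_ops_scatter_cuda_float32"


def find_test_lines(log_text: str, test_name: str = TEST_NAME) -> List[Tuple[int, str]]:
    """Return (1-based line number, line) pairs for lines that mention test_name."""
    hits = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        if test_name in line:
            hits.append((lineno, line))
    return hits


# Usage (assumes the log was saved by hand as "test_step.log"):
#     with open("test_step.log", encoding="utf-8", errors="replace") as f:
#         for lineno, line in find_test_lines(f.read()):
#             print(f"{lineno}: {line}")
```

Because flaky tests are rerun in CI, several hits are expected; comparing the passing and failing reruns is usually the interesting part.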

<details><summary>Sample error message</summary>

Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3444, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3444, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3443, in wrapper
    with policy():
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2820, in __exit__
    raise RuntimeError(msg)
RuntimeError: CUDA driver API confirmed a leak in __main__.TestCudagraphIndexingOpsCUDA.test_cudagraph_indexing_ops_scatter_cuda_float32! Caching allocator allocated memory was 2048 and is now reported as 4096 on device 0. CUDA driver allocated memory was 373489664 and is now 375586816.

To execute this test, run the following from the base repo dir:
    PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 PYTORCH_TEST_WITH_SLOW_GRADCHECK=1 python test/inductor/test_cudagraph_trees.py TestCudagraphIndexingOpsCUDA.test_cudagraph_indexing_ops_scatter_cuda_float32

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

</details>

Test file path: inductor/test_cudagraph_trees.py

For all disabled tests (by GitHub issue), see https://hud.pytorch.org/disabled.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo

extent analysis

TL;DR

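The failure is raised by the CUDA memory leak check, not by the test body itself: after test_cudagraph_indexing_ops_scatter_cuda_float32 runs, the caching allocator reports 4096 bytes allocated instead of 2048, and the driver-level figure grows from 373489664 to 375586816 bytes (a 2097152-byte, or 2 MiB, increase). The test is flaky (5 failures and 5 successes over 6 hours), so the most likely fix involves finding the allocation that is sometimes left live when the test exits.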

Guidance

Investigate the CUDA memory allocation and deallocation in the test_cudagraph_indexing_ops_scatter_cuda_float32 test to identify the source of the leak.
Run the test with the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 and PYTORCH_TEST_WITH_SLOW_GRADCHECK=1 environment variables to enable memory leak checking and slow gradient checking.
Study the logs from the workflow runs to understand the pattern of the memory leak and identify potential fixes.
Note that PYTORCH_PRINT_REPRO_ON_FAILURE=0 only suppresses the printed "To execute this test" repro message; it does not silence the RuntimeError or disable the leak check, so leave it at its default while debugging.
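
Since the test is reported as flaky, a single local run may pass. The sketch below wraps the repro command from the error message so it can be repeated. The environment variables, test file path, and test id come from the issue; the helper names, the repeat count, and the use of `sys.executable` in place of `python` are assumptions made for illustration. It has to be run from the base repo dir.

```python
import os
import subprocess
import sys
from typing import Dict, List, Optional, Tuple

# Path and test id copied from the repro command in the error message.
TEST_FILE = "test/inductor/test_cudagraph_trees.py"
TEST_ID = "TestCudagraphIndexingOpsCUDA.test_cudagraph_indexing_ops_scatter_cuda_float32"


def build_repro(base_env: Optional[Dict[str, str]] = None) -> Tuple[Dict[str, str], List[str]]:
    """Build the environment and argv for the repro command without running it."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYTORCH_TEST_CUDA_MEM_LEAK_CHECK"] = "1"
    env["PYTORCH_TEST_WITH_SLOW_GRADCHECK"] = "1"
    # Assumption: use the current interpreter instead of whatever "python" resolves to.
    argv = [sys.executable, TEST_FILE, TEST_ID]
    return env, argv


def run_repro(repeats: int = 5) -> List[int]:
    """Run the test `repeats` times (hypothetical count) and collect exit codes."""
    env, argv = build_repro()
    return [subprocess.run(argv, env=env).returncode for _ in range(repeats)]


# Usage, from the base repo dir on a machine with a CUDA device:
#     print(run_repro())
```

A mix of zero and non-zero exit codes would be consistent with the flakiness reported in CI.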

Example

No specific code snippet is provided, but the test file path inductor/test_cudagraph_trees.py and the test class TestCudagraphIndexingOpsCUDA can be used as a starting point for debugging.

Notes

The issue lists the platforms linux and slow, and the failure concerns CUDA memory management. The error message points to a memory leak, since both the caching allocator figure and the driver-level figure grow across the test, but the issue does not identify the root cause.
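
To compare the size of the reported leak across several failing runs, the numbers can be pulled out of the RuntimeError text. This is a minimal sketch that assumes the message keeps exactly the wording shown in the error message above; the regular expression and the helper `leak_deltas` are illustrative and not part of PyTorch.

```python
import re
from typing import Dict

# Pattern written against the message wording quoted in this issue (an assumption:
# other PyTorch versions may phrase it differently).
LEAK_RE = re.compile(
    r"Caching allocator allocated memory was (?P<alloc_before>\d+) "
    r"and is now reported as (?P<alloc_after>\d+) on device (?P<device>\d+)\. "
    r"CUDA driver allocated memory was (?P<drv_before>\d+) "
    r"and is now (?P<drv_after>\d+)\."
)


def leak_deltas(message: str) -> Dict[str, int]:
    """Return the device index and the byte growth reported by the leak check."""
    match = LEAK_RE.search(message)
    if match is None:
        raise ValueError("no leak report found in message")
    values = {name: int(text) for name, text in match.groupdict().items()}
    return {
        "device": values["device"],
        "allocator_delta": values["alloc_after"] - values["alloc_before"],
        "driver_delta": values["drv_after"] - values["drv_before"],
    }


MESSAGE = (
    "Caching allocator allocated memory was 2048 and is now reported as 4096 "
    "on device 0. CUDA driver allocated memory was 373489664 and is now 375586816."
)
print(leak_deltas(MESSAGE))
# → {'device': 0, 'allocator_delta': 2048, 'driver_delta': 2097152}
```

For the sample in this issue, the allocator figure grows by 2048 bytes and the driver figure by 2097152 bytes (2 MiB).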

Recommendation

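Run the test locally with the repro command from the error message (PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 and PYTORCH_TEST_WITH_SLOW_GRADCHECK=1 set) to gather more information about the memory leak. Because the failure is intermittent, repeat the run several times. This is a diagnostic step rather than a fix; it allows further debugging and identification of the root cause.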

