pytorch - ✅(Solved) Fix [inductor] cudagraph skip reason not logged for non-CUDA GPU backends [1 pull requests]

pytorch · 2026-04-21 02:37:33

Error Message

Error logs

N/A — this is a code-inspection issue, not a runtime failure. No traceback is produced because the affected code path silently bumps a counter without logging.

Root Cause

In torch/_inductor/output_code.py, two locations gate the cudagraph-skip diagnostic log on a hardcoded "cuda" in ...device_types check. When compilation targets a non-CUDA GPU backend (XPU, MPS, MTIA), cudagraphs are disabled and counters["inductor"]["cudagraph_skips"] is still incremented, but log_cudagraph_skip_and_bump_counter(...) is never called — the skip reason is dropped on the floor.
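The behavior described above can be modeled without PyTorch. This is a minimal sketch, not the code in output_code.py: `inductor_counters`, `logged_reasons`, and `handle_skip_hardcoded` are hypothetical stand-ins for the real counter, the log output, and the gated call site.

```python
from collections import Counter

# Hypothetical stand-ins: a local counter instead of counters["inductor"],
# and a plain list instead of the real log output.
inductor_counters = Counter()
logged_reasons = []


def handle_skip_hardcoded(device_types, reason):
    """Model of the gate as described in the issue: log only for "cuda"."""
    if "cuda" in device_types:
        logged_reasons.append(reason)
    # The skip is counted whether or not the reason was logged.
    inductor_counters["cudagraph_skips"] += 1


handle_skip_hardcoded({"cuda"}, "mutated inputs")
handle_skip_hardcoded({"xpu"}, "mutated inputs")

print(inductor_counters["cudagraph_skips"])  # 2
print(logged_reasons)  # ['mutated inputs']
```

Two skips are counted but only one reason is recorded, which is the visibility gap the issue reports for XPU, MPS, and MTIA targets.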

Fix Action

Fix / Workaround

On any non-CUDA accelerator (XPU, MPS, MTIA, and also out-of-tree backends registered via the PrivateUse1 dispatch key), when disabled_cudagraphs_reason is set (or cudagraph_fail_reasons is non-empty) and the compiled graph targets such a device, the skip counter increments silently. Users cannot see why cudagraph wrapping was skipped, which hampers diagnosing torch.compile performance on those accelerators. The fix is to gate the log on is_gpu() instead of the hardcoded "cuda" check; per the issue's scope note, that covers the GPU_TYPES devices (cuda, mps, xpu, mtia) but not PrivateUse1 backends.

PR fix notes

PR #180971: [inductor] Using is_gpu() instead of hardcoded "cuda" check

Repository: pytorch/pytorch
Author: Bhuvan1527
State: open | merged: False
Link: https://github.com/pytorch/pytorch/pull/180971

Description (problem / solution / changelog)

Resolves: https://github.com/pytorch/pytorch/issues/180951

Issue:

In pytorch/torch/_inductor/output_code.py, cudagraph skip reasons are logged only when device_types contains "cuda".
This check occurs in three locations in the file, all of which are fixed by this PR.
The hardcoded "cuda" check is inconsistent with the equivalent check in pytorch/torch/_inductor/scheduler.py.

Fix:

This PR replaces each hardcoded "cuda" check with the is_gpu() API, which returns true for any of cuda, mps, xpu, and mtia.
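To make the before/after difference concrete, here is a small sketch that compares the two gating conditions. `GPU_TYPES` and `is_gpu_sketch` are local assumptions that mirror what the issue says about torch/_inductor/utils.py; they are not imported from PyTorch, and the real definitions may differ in detail.

```python
# Assumed mirror of GPU_TYPES in torch/_inductor/utils.py, per the issue text.
GPU_TYPES = ("cuda", "mps", "xpu", "mtia")


def is_gpu_sketch(device):
    """Local stand-in for is_gpu(): True when the device type is in GPU_TYPES."""
    return device in GPU_TYPES


def should_log_before(device_types):
    """Gate as described in the issue: hardcoded "cuda" membership test."""
    return "cuda" in device_types


def should_log_after(device_types):
    """Gate after the proposed change: any GPU device type qualifies."""
    return any(is_gpu_sketch(d) for d in device_types)


# Print the device set, then the old and new gate results side by side.
for types in ({"cuda"}, {"xpu"}, {"mps"}, {"mtia"}, {"cpu"}):
    print(sorted(types), should_log_before(types), should_log_after(types))
```

The two gates agree for CUDA and CPU and differ only for the non-CUDA GPU types, so the change should not alter existing CUDA behavior.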

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo

Changed files

torch/_inductor/output_code.py (modified, +7/-4)

Code Example

PyTorch version: 2.10.0+cu128
Is debug build: False
CUDA used to build PyTorch: 12.8
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.5 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04.3) 11.4.0
Clang version: Could not collect
CMake version: version 3.31.2
Libc version: glibc-2.35

Python version: 3.10.12 (main, Mar  3 2026, 11:56:32) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-6.8.0-106-generic-x86_64-with-glibc2.35
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU: 13th Gen Intel(R) Core(TM) i9-13900H (20 cores)

Versions of relevant libraries:
[pip3] numpy==2.2.6
[pip3] torch==2.10.0+cpu
[pip3] triton==3.6.0

Note: reporter's machine is CPU-only; this is a code-inspection report against
current main (SHA c4ec73b4b52e7c878e3c2522cac61e035ee72520), not a live runtime
repro. Applies to any compile target where `device_types` contains a non-CUDA
GPU (mps/xpu/mtia).

🐛 Describe the bug

The sibling file torch/_inductor/scheduler.py makes the equivalent GPU-or-not decision correctly by using is_gpu() from torch/_inductor/utils.py, which covers cuda | mps | xpu | mtia. The two sites in output_code.py are inconsistent with that pattern.

Inconsistent locations (hardcoded "cuda"):

Consistent counter-example (uses is_gpu()):

https://github.com/pytorch/pytorch/blob/c4ec73b4b52e7c878e3c2522cac61e035ee72520/torch/_inductor/scheduler.py#L7521-L7528

is_gpu / GPU_TYPES definitions:

Effect:

Expected Behavior:

log_cudagraph_skip_and_bump_counter(...) should fire for any GPU device (as determined by is_gpu()), matching the behavior already established in _log_graph_partitions in scheduler.py.

Scope note: This report is intentionally limited to the devices covered by GPU_TYPES in utils.py (cuda | mps | xpu | mtia). Extending cudagraph diagnostics to PrivateUse1-registered accelerators requires broader changes to is_gpu/GPU_TYPES and is a separate concern, not in scope here.
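The scope limit can be illustrated with a membership test. In this sketch, `"my_npu"` is a hypothetical PrivateUse1 backend name and `would_log` is a hypothetical helper; the default tuple assumes the GPU_TYPES contents listed in the scope note.

```python
def would_log(device_types, gpu_types=("cuda", "mps", "xpu", "mtia")):
    """Return True when any device type is in the given GPU type set."""
    return any(d in gpu_types for d in device_types)


custom_backend = "my_npu"  # hypothetical out-of-tree backend name

# With the assumed GPU_TYPES, the custom backend is still not recognized.
print(would_log({custom_backend}))

# Covering it would require the set itself to change, which is the
# "broader change" the scope note defers.
print(would_log({custom_backend}, gpu_types=("cuda", "mps", "xpu", "mtia", custom_backend)))
```

This is why swapping in is_gpu() alone does not bring PrivateUse1 backends into scope.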

cc @mcarilli @ezyang @eellison @penguinwu @BoyuanFeng @chauhang @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @aakhundov @coconutruben @jataylo

extent analysis

TL;DR

Replace the hardcoded "cuda" in ...device_types checks in torch/_inductor/output_code.py with a call to is_gpu() from torch/_inductor/utils.py to ensure consistent handling of non-CUDA GPU backends.

Guidance

Identify the locations in output_code.py where the hardcoded "cuda" check is used (the issue cites two, at lines 277-287 and 593-600; the PR description mentions three) and replace each with a check based on is_gpu().
Verify that is_gpu() recognizes the non-CUDA GPU devices (XPU, MPS, MTIA) by checking the GPU_TYPES definition in utils.py.
Leave log_cudagraph_skip_and_bump_counter() itself unchanged; the change belongs in the condition that gates its call, so that it fires for any GPU device, not just CUDA.
Test the changes with different GPU backends (e.g., XPU, MPS, MTIA) to confirm that the skip counter increments and the skip reason is logged.
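The last step can be sketched as a log-capture check using only the standard library. Everything here is a stand-in: the logger name, the message text, and `log_skip_if_gpu` are hypothetical and only model a call site gated on an is_gpu()-style check. A real test would run against PyTorch on the actual backend.

```python
import logging

log = logging.getLogger("cudagraph_skip_sketch")  # hypothetical logger name
log.setLevel(logging.WARNING)


class ListHandler(logging.Handler):
    """Collects formatted log messages in a list for later assertions."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def log_skip_if_gpu(device_types, reason, gpu_types=("cuda", "mps", "xpu", "mtia")):
    """Stand-in for a call site gated on an is_gpu()-style check."""
    if any(d in gpu_types for d in device_types):
        log.warning("skipping cudagraphs due to %s", reason)


def captured_messages(device_type):
    """Run the gated call for one device type and return what was logged."""
    handler = ListHandler()
    log.addHandler(handler)
    try:
        log_skip_if_gpu({device_type}, "example reason")
    finally:
        log.removeHandler(handler)
    return handler.messages


for device_type in ("cuda", "xpu", "mps", "mtia"):
    assert captured_messages(device_type) == ["skipping cudagraphs due to example reason"]
assert captured_messages("cpu") == []
```

Attaching a handler per call keeps each device type's messages isolated, so a missing log line for one backend cannot be masked by another.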

Example

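from torch._inductor.utils import is_gpu

# Before: the log was gated on a hardcoded check, e.g. `"cuda" in device_types`.
# After: fire for any device type that is_gpu() recognizes (cuda, mps, xpu, mtia).
# `device_types` is the set of device type strings for the compiled graph.
if any(is_gpu(device_type) for device_type in device_types):
    log_cudagraph_skip_and_bump_counter(...)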

Notes

This fix only addresses the inconsistent handling of non-CUDA GPU backends and does not extend cudagraph diagnostics to PrivateUse1-registered accelerators, which requires broader changes to is_gpu() and GPU_TYPES.

Recommendation

Apply the workaround by replacing the hardcoded "cuda" checks with a call to is_gpu() to ensure consistent handling of non-CUDA GPU backends.
