pytorch - torch.compile skips alpha=0 validation for in-place torch.celu_

Root Cause

Eager execution raises a RuntimeError because alpha cannot be zero for CELU. The compiled version, however, skips this check and returns a tensor (containing nan) instead of raising the same error.

Code Example

import torch

def f(x, alpha):
    return torch.celu_(x, alpha)

x = torch.tensor([[-2.0, -0.5, 0.0], [1.0, 3.0, -4.0]])

try:
    f(x.clone(), 0.0)
    eager = "ok"
except Exception as exc:
    eager = f"{type(exc).__name__} {str(exc).splitlines()[0]}"

try:
    compiled_out = torch.compile(f, backend="inductor", dynamic=True)(x.clone(), 0.0)
    compiled = str(compiled_out)
except Exception as exc:
    compiled = f"{type(exc).__name__} {str(exc).splitlines()[0]}"

print("x:", x)
print("alpha:", 0.0)
print("eager:", eager)
print("compiled:", compiled)

if "alpha cannot be 0" in eager and "tensor" in compiled:
    raise SystemExit(0)

raise SystemExit(1)
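
Until the compiled and eager paths agree, a caller can check alpha in plain Python before the compiled function is entered. The following is a minimal sketch of such a guard, not an upstream fix: `validate_celu_alpha` and `guarded_celu_` are hypothetical names introduced here, and the error text simply mirrors the eager message quoted in this issue.

```python
def validate_celu_alpha(alpha: float) -> float:
    """Hypothetical helper: reject alpha == 0 before CELU is called."""
    if alpha == 0:
        # Message text mirrors the eager error quoted in this issue's logs.
        raise RuntimeError("ZeroDivisionError: alpha cannot be 0 for CELU")
    return alpha


def guarded_celu_(celu_fn, x, alpha):
    """Check alpha in plain Python, then delegate to celu_fn.

    celu_fn is any callable taking (x, alpha), for example the
    torch.compile-wrapped f from the reproduction.
    """
    return celu_fn(x, validate_celu_alpha(alpha))
```

With the reproduction's `f`, this would be called as `guarded_celu_(torch.compile(f, backend="inductor", dynamic=True), x.clone(), 0.0)`, which raises before the compiled function runs. Keeping the check outside the compiled function means it does not depend on how the compiler traces alpha.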

🐛 Describe the bug

torch.compile with the Inductor backend does not preserve eager error behavior for torch.celu_ when alpha=0.0.

Error logs

x: tensor([[-2.0000, -0.5000, 0.0000], [ 1.0000, 3.0000, -4.0000]])
alpha: 0.0
eager: RuntimeError ZeroDivisionError: alpha cannot be 0 for CELU
compiled: tensor([[-0., -0., nan], [1., 3., -0.]])
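
The compiled values are what plain IEEE 754 arithmetic would give if the elementwise formula ran with no alpha check. The sketch below assumes the form `x if x > 0 else alpha * expm1(x / alpha)`; that form is an assumption chosen because it matches the logged output, not something the issue confirms about Inductor's generated code. `ieee_div` and `celu_unchecked` are hypothetical helpers, and `ieee_div` exists only because Python raises on float division by zero.

```python
import math


def ieee_div(a: float, b: float) -> float:
    """Float division with IEEE 754 results for b == 0 (Python would raise)."""
    if b != 0.0:
        return a / b
    if a == 0.0 or math.isnan(a):
        return math.nan  # 0 / 0 is nan
    # Nonzero / zero is an infinity whose sign combines both operands' signs.
    return math.copysign(math.inf, a) * math.copysign(1.0, b)


def celu_unchecked(x: float, alpha: float) -> float:
    """Assumed elementwise CELU with no alpha validation."""
    if x > 0:
        return x
    return alpha * math.expm1(ieee_div(x, alpha))


# Same inputs as the reproduction, flattened to a list.
values = [-2.0, -0.5, 0.0, 1.0, 3.0, -4.0]
results = [celu_unchecked(v, 0.0) for v in values]
print(results)
```

Under this assumption, negative inputs give x / alpha = -inf, expm1(-inf) = -1, and 0.0 * -1 = -0.0. An input of exactly 0.0 gives 0 / 0 = nan, which propagates to the result. Positive inputs never touch the division at all.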

Versions

PyTorch version: 2.13.0.dev20260513+cpu
Is debug build: False
CUDA used to build PyTorch: Could not collect
ROCM used to build PyTorch: N/A

OS: Ubuntu 24.04.3 LTS (x86_64)
GCC version: (Ubuntu 13.3.0-6ubuntu2~24.04.1) 13.3.0
Clang version: 18.1.3 (1ubuntu1)
CMake version: Could not collect
Libc version: glibc-2.39

Python version: 3.11.15 (main, Mar 11 2026, 17:20:07) [GCC 14.3.0] (64-bit runtime)
Python platform: Linux-6.17.0-20-generic-x86_64-with-glibc2.39
Is CUDA available: False
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration:
GPU 0: NVIDIA RTX 6000 Ada Generation
GPU 1: NVIDIA RTX 6000 Ada Generation

Nvidia driver version: 570.211.01
cuDNN version: Could not collect
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Caching allocator config: N/A

Versions of relevant libraries:
[pip3] numpy==2.4.4
[pip3] torch==2.13.0.dev20260513+cpu
[conda] numpy 2.4.4 pypi_0 pypi
[conda] torch 2.13.0.dev20260513+cpu pypi_0 pypi

cc @malfet @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @kadeng @amjames @jataylo @azahed98
