pytorch - torch.nn.functional.gelu returns NaN for +inf input with approximate="none"

pytorch, 2026-05-31 20:50:00



🐛 Describe the bug

Describe the issue

torch.nn.functional.gelu returns NaN for positive infinity input when approximate="none".

For GELU with the exact formulation, gelu(+inf) should produce +inf, but PyTorch returns NaN.

The issue is reported only for the +inf element. The finite values match the expected output.
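To make the expectation concrete, here is a minimal pure-Python sketch of the exact formulation, GELU(x) = 0.5 * x * (1 + erf(x / sqrt(2))), evaluated in double precision with the standard library's math.erf. The helper name gelu_exact_ref is hypothetical and this is not PyTorch's kernel; it only illustrates what a direct evaluation of the textbook formula yields for the inputs in this report.

```python
import math


def gelu_exact_ref(x: float) -> float:
    """Reference exact GELU: 0.5 * x * (1 + erf(x / sqrt(2))).

    Hypothetical helper for illustration only, not PyTorch's implementation.
    """
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Special values from the report, followed by its finite control values.
for value in (float("inf"), -float("inf"), float("nan"),
              -0.026216749101877213, 0.11177391558885574, -0.04212268441915512):
    print(value, "->", gelu_exact_ref(value))
```

Under this formula +inf maps to +inf because erf(+inf) is 1. For -inf the product of -inf and 0 is NaN in floating point even though the mathematical limit is 0, so a direct evaluation also gives NaN at that position.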

Minimal reproducible example

import torch
import torch.nn.functional as F

# Row 0 holds the special values (NaN, +inf, -inf); row 1 holds finite controls.
x = torch.tensor(
    [
        [float("nan"), float("inf"), -float("inf")],
        [-0.026216749101877213, 0.11177391558885574, -0.04212268441915512],
    ],
    dtype=torch.float32,
)

# approximate="none" selects the exact (erf-based) GELU.
out = F.gelu(x, approximate="none")

print(out)
# out[0, 1] is the result for the +inf input.
print("out[0, 1]:", out[0, 1])
print("is inf:", torch.isinf(out[0, 1]).item())
print("is nan:", torch.isnan(out[0, 1]).item())

Actual result

tensor([[    nan,     nan,     nan],
        [-0.0128,  0.0609, -0.0204]])

out[0, 1]: tensor(nan)
is inf: False
is nan: True

Expected: out[0, 1] is +inf. Actual: out[0, 1] is NaN.
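Until the behavior changes, one possible workaround is to pass +inf inputs through unchanged and use F.gelu everywhere else. This is only a sketch under assumptions: the helper name gelu_posinf_safe is hypothetical, it covers the forward value only, and gradients at non-finite inputs are not addressed.

```python
import torch
import torch.nn.functional as F


def gelu_posinf_safe(x: torch.Tensor) -> torch.Tensor:
    """Exact GELU that returns +inf wherever the input is +inf.

    Hypothetical workaround helper; every other element comes from F.gelu.
    """
    out = F.gelu(x, approximate="none")
    # Where the input is +inf, take the input itself instead of the GELU output.
    return torch.where(torch.isposinf(x), x, out)


x = torch.tensor([float("inf"), 0.11177391558885574, -0.04212268441915512])
print(gelu_posinf_safe(x))
```

torch.where selects elementwise, so finite inputs keep exactly the values that F.gelu produced.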

Versions

Environment

PyTorch version: 2.12.0.dev20260407+cu128
Is debug build: False
CUDA used to build PyTorch: 12.8
ROCM used to build PyTorch: N/A

OS: Ubuntu 24.04.3 LTS (x86_64)
Python version: 3.11.15
Is CUDA available: True
GPU models and configuration:
GPU 0: NVIDIA RTX 6000 Ada Generation
GPU 1: NVIDIA RTX 6000 Ada Generation

Nvidia driver version: 570.211.01
numpy==2.4.4
torch==2.12.0.dev20260407+cu128
torchvision==0.27.0.dev20260407+cu128
torchaudio==2.11.0.dev20260407+cu128
triton==3.7.0+git9c288bc5

cc @albanD @mruberry @jbschlosser @walterddr @mikaylagawarecki
