pytorch: Fix CPU/CUDA inconsistency in signed-zero preservation for ReLU and reduction/comparison operators

pytorch · 2026-04-27 00:54:59

GitHub stats

pytorch/pytorch#181533•Fetched 2026-04-27 05:28:47

Author

beanduan22

Participants

beanduan22

🐛 Describe the bug

import torch
import torch.nn.functional as F

x_cpu = torch.tensor([-0.0])
x_gpu = x_cpu.cuda()

y_cpu = F.relu(x_cpu)
y_gpu = F.relu(x_gpu)

print(y_cpu, torch.signbit(y_cpu))
print(y_gpu, torch.signbit(y_gpu))

Output:

tensor([-0.])                  tensor([True])
tensor([0.], device='cuda:0')  tensor([False], device='cuda:0')

Versions

The divergence is observable at the sign-bit level, not merely in tensor printing:

CPU returns -0.0 and preserves the sign bit.
CUDA returns +0.0 and clears the sign bit.

Although -0.0 and +0.0 compare numerically equal, the sign bit is observable through torch.signbit(), so this is a real cross-backend semantic inconsistency.
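The distinction between value equality and the sign bit can be seen on CPU alone. The following is a minimal sketch that only illustrates the IEEE 754 behavior described above; the variable names are illustrative and not taken from the issue.

```python
import torch

neg = torch.tensor([-0.0])
pos = torch.tensor([0.0])

# Value comparison treats the two zeros as equal.
values_equal = bool((neg == pos).all())

# The sign bit still tells them apart.
neg_sign = bool(torch.signbit(neg).item())
pos_sign = bool(torch.signbit(pos).item())

# The difference propagates: dividing by each zero yields
# infinities of opposite sign.
recip_neg = (1.0 / neg).item()
recip_pos = (1.0 / pos).item()

print(values_equal, neg_sign, pos_sign, recip_neg, recip_pos)
```

Because the sign can resurface through operations such as division, two results that compare equal may still lead to different downstream values.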

PyTorch 2.11.0+cu128, CUDA 12.8

cc @albanD @mruberry @jbschlosser @walterddr @mikaylagawarecki @ptrblck @msaroufim @eqy @jerryzh168 @tinglvv @nWEIdia

Extent analysis

TL;DR

For a -0.0 input, ReLU returns -0.0 on CPU and +0.0 on CUDA. Until the backends agree, code that depends on the sign bit of zero can mitigate the divergence by normalizing zeros explicitly after the operation.

Guidance

Run the reproduction on other PyTorch versions to check whether the behavior has changed in a later release.
Consider a workaround that explicitly sets the sign of zero values wherever results computed on CPU and CUDA are compared or mixed.
Check whether the use case actually depends on the sign bit; if it does not, the numerical equality of -0.0 and +0.0 may be sufficient.
Look for existing PyTorch functions (for example torch.signbit() and torch.where()) that can detect or normalize signed zeros consistently across backends.
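One way to apply the workaround suggested above is to force every zero to +0.0 after the operation. This is a hedged sketch, not an official PyTorch fix: the helper name normalize_signed_zero is hypothetical, and it assumes that losing the sign of zero is acceptable for the use case.

```python
import torch
import torch.nn.functional as F


def normalize_signed_zero(t: torch.Tensor) -> torch.Tensor:
    """Return a tensor in which every zero, of either sign, is +0.0.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    # -0.0 == 0 evaluates to True, so both zeros are selected and
    # replaced by the +0.0 produced by zeros_like. NaNs and all
    # non-zero values pass through unchanged.
    return torch.where(t == 0, torch.zeros_like(t), t)


x = torch.tensor([-0.0, 0.0, -1.5, 2.0])
y = normalize_signed_zero(F.relu(x))

print(y, torch.signbit(y))
```

Applying the same helper to the output on every device gives results whose sign bits agree regardless of which zero the backend kernel produced, at the cost of one extra elementwise pass.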

Example

The divergence comes from PyTorch's internal behavior, so no small change to the reproduction script removes it; the reproduction under "Describe the bug" remains the reference example.

Notes

The report was made against PyTorch 2.11.0+cu128 with CUDA 12.8, and the behavior may differ in other versions of either library. Whether a workaround is needed depends on whether the project requires the sign bit of zero to be preserved.
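Since the behavior may vary by version, it can be probed at runtime rather than assumed. The sketch below is an illustration only: the function name relu_keeps_negative_zero is hypothetical, and the CUDA probe runs only when a CUDA device is available.

```python
import torch
import torch.nn.functional as F


def relu_keeps_negative_zero(device: str) -> bool:
    """Report whether F.relu(-0.0) keeps the sign bit on `device`.

    Hypothetical probe for illustration; not part of PyTorch.
    """
    x = torch.tensor([-0.0], device=device)
    return bool(torch.signbit(F.relu(x)).item())


# Probe the CPU always, and CUDA only if it is present.
devices = ["cpu"] + (["cuda"] if torch.cuda.is_available() else [])
report = {d: relu_keeps_negative_zero(d) for d in devices}

print(torch.__version__, report)
```

If the reported values differ between devices on the installed version, the inconsistency described in this issue is present in that environment.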

Recommendation

Apply a workaround: where the sign of zero matters, check or normalize it explicitly, because CPU and CUDA disagree on ReLU of -0.0 in the reported version.
