pytorch - 💡(How to fix) Fix CUDA torch.copysign ignores the sign bit of negative float16 NaN [1 participants]

GitHub stats
pytorch/pytorch#181804 (fetched 2026-04-29 06:10:56)
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0

🐛 Describe the bug

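import numpy as np
import torch

# Magnitudes: all +1.0 in float16.
mag = torch.tensor([1.0, 1.0, 1.0], dtype=torch.float16)
# Sign operands as raw float16 bit patterns:
#   0x7e00 = NaN with sign bit clear, 0xfe00 = NaN with sign bit set, 0x3c00 = +1.0
sgn = torch.from_numpy(np.array([0x7e00, 0xfe00, 0x3c00], dtype=np.uint16).view(np.float16))

# Run the same call on CPU and on CUDA, then compare the results.
cpu = torch.copysign(mag, sgn)
gpu = torch.copysign(mag.cuda(), sgn.cuda()).cpu()
print("cpu:", cpu)
print("gpu:", gpu)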

cpu: tensor([ 1., -1.,  1.], dtype=torch.float16)
gpu: tensor([1., 1., 1.], dtype=torch.float16)

Versions

PyTorch 2.11.0+cu128, CUDA 12.8

extent analysis

TL;DR

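On CUDA, torch.copysign with torch.float16 inputs returns +1 when the sign operand is a NaN with its sign bit set (bit pattern 0xfe00), while the CPU returns -1 for the same input. The CPU result matches the IEEE 754 copySign operation, which copies the sign bit even when the operand is a NaN, so the CUDA float16 path is the one that deviates. The issue is resolved once torch.copysign behaves the same on CPU and GPU for torch.float16.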
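
To see which input triggers the mismatch, the three sign-operand bit patterns from the reproduction can be decoded with NumPy alone. This is a minimal sketch; the helper name describe_sign_bits is hypothetical and not part of the original report.

```python
import numpy as np


def describe_sign_bits(patterns):
    """Return (hex pattern, is NaN, sign bit set) for each float16 bit pattern."""
    bits = np.array(patterns, dtype=np.uint16)
    vals = bits.view(np.float16)  # reinterpret the same 16 bits as float16
    return [
        (hex(b), bool(np.isnan(v)), bool(np.signbit(v)))
        for b, v in zip(bits.tolist(), vals)
    ]


if __name__ == "__main__":
    for row in describe_sign_bits([0x7E00, 0xFE00, 0x3C00]):
        print(row)
```

Only the second pattern, 0xfe00, is both a NaN and has its sign bit set, which is the element where the CPU and GPU outputs in the report differ.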

Guidance

  • Verify that the issue is specific to the torch.float16 data type and does not occur with other data types like torch.float32.
  • Check the PyTorch documentation for any known issues or limitations related to torch.copysign on GPU with torch.float16.
  • Test the code on a different CUDA version or PyTorch version to see if the issue is version-specific.
  • Consider using torch.float32 instead of torch.float16 if possible, to avoid potential precision or implementation issues.
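
The first check in the list above can be sketched as a small sweep over floating-point dtypes. The helper copysign_by_dtype and the table of bit patterns are assumptions made for illustration; the report itself only tests torch.float16, so the results for other dtypes on CUDA are unknown until this is run.

```python
import numpy as np
import torch

# dtype -> (same-width unsigned type, NumPy float type, [+NaN, -NaN, +1.0] bit patterns)
CASES = {
    torch.float16: (np.uint16, np.float16, [0x7E00, 0xFE00, 0x3C00]),
    torch.float32: (np.uint32, np.float32, [0x7FC00000, 0xFFC00000, 0x3F800000]),
    torch.float64: (
        np.uint64,
        np.float64,
        [0x7FF8000000000000, 0xFFF8000000000000, 0x3FF0000000000000],
    ),
}


def copysign_by_dtype(device):
    """Run torch.copysign(1.0, sgn) for every dtype in CASES on the given device."""
    results = {}
    for dtype, (uint_type, float_type, bits) in CASES.items():
        # Build the sign operand from raw bits so the NaN sign is exactly as intended.
        sgn = torch.from_numpy(np.array(bits, dtype=uint_type).view(float_type))
        mag = torch.ones(3, dtype=dtype)
        out = torch.copysign(mag.to(device), sgn.to(device))
        results[dtype] = out.cpu().tolist()
    return results


if __name__ == "__main__":
    print("cpu: ", copysign_by_dtype("cpu"))
    if torch.cuda.is_available():
        print("cuda:", copysign_by_dtype("cuda"))
```

If the CUDA row differs from the CPU row only for torch.float16, the problem is confined to the half-precision path; if other dtypes differ as well, the scope is wider than the report suggests.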

Example

The reproduction in the bug description is the relevant example: it runs the same torch.copysign call on CPU and on CUDA, so the discrepancy lies in the library's behavior on different hardware rather than in user code that can be modified.

Notes

The CUDA implementation of torch.copysign for torch.float16 may read the sign of a NaN operand differently from the CPU implementation. The report does not identify the root cause, so confirming this requires reading PyTorch's source code or documentation.

Recommendation

Apply workaround: Consider using torch.float32 instead of torch.float16 if the precision requirements allow for it, as this might avoid the inconsistent behavior observed.
