pytorch - 💡(How to fix) Fix complex64 cumsum GPU 10–16x less accurate [1 participants]

pytorch · 2026-04-12 00:14:59

GitHub stats

pytorch/pytorch#180152 • Fetched 2026-04-12 13:23:41

Author: beanduan22

Participants: beanduan22

Timeline: 126 events (mentioned ×60, subscribed ×60, labeled ×6)

Error Message

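No exception is raised. The symptom is the accuracy gap that the reproduction script reports with this line:

print(f"GPU/CPU error ratio: {ratio:.1f}x  (GPU less accurate)")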

Code Example

import torch
import numpy as np

torch.manual_seed(0)
n = 5_000_000
# Random complex64 input with float32 real and imaginary parts.
x = (torch.randn(n, dtype=torch.float32) + 1j * torch.randn(n, dtype=torch.float32)).to(torch.complex64)

# Reference: cumulative sum accumulated in complex128 with NumPy.
ref = np.cumsum(x.numpy().astype(np.complex128))
# The same cumulative sum in complex64 on CPU and on GPU (requires a CUDA device).
cpu = torch.cumsum(x, dim=0).numpy()
gpu = torch.cumsum(x.cuda(), dim=0).cpu().numpy()

# Maximum absolute deviation of each result from the reference.
cpu_err = float(np.max(np.abs(cpu.astype(np.complex128) - ref)))
gpu_err = float(np.max(np.abs(gpu.astype(np.complex128) - ref)))
ratio = gpu_err / cpu_err if cpu_err > 0 else float('inf')

print(f"cpu_max_err = {cpu_err:.4e}")
print(f"gpu_max_err = {gpu_err:.4e}")
print(f"GPU/CPU error ratio: {ratio:.1f}x  (GPU less accurate)")

🐛 Describe the bug

torch.cumsum on a 5,000,000-element complex64 tensor shows a larger maximum error on GPU than on CPU when both results are compared against a complex128 NumPy reference. The reproduction script is the one listed under Code Example above.

Versions

version 2.9.0

cpu_max_err = 1.3636e-04

gpu_max_err = 1.5652e-03
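The two reported errors can be turned into the ratio that the script's last print statement would show. This is plain arithmetic on the numbers quoted above; nothing here touches PyTorch.

```python
# Values copied from the report for version 2.9.0.
cpu_max_err = 1.3636e-04
gpu_max_err = 1.5652e-03

# Same expression the reproduction script uses for its final line.
ratio = gpu_max_err / cpu_max_err if cpu_max_err > 0 else float("inf")
print(f"GPU/CPU error ratio: {ratio:.1f}x")  # prints "GPU/CPU error ratio: 11.5x"
```

A ratio of about 11.5x sits inside the 10–16x range given in the title.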

cc @ptrblck @msaroufim @eqy @jerryzh168 @tinglvv @nWEIdia @ezyang @anjali411 @dylanbespalko @mruberry @nikitaved @amjames

extent analysis

TL;DR

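The gap can be worked around by running the cumulative sum in a higher-precision dtype such as torch.complex128, which reduces the rounding error of each partial sum at the cost of twice the memory per element. This mitigates the symptom; it does not change how torch.cumsum behaves for complex64 on GPU.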

Guidance

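The reported errors (cpu_max_err = 1.3636e-04, gpu_max_err = 1.5652e-03) give a GPU/CPU ratio of about 11.5x, so the GPU result is roughly an order of magnitude less accurate for the same input.
Both runs use torch.complex64, so the dtype's limited precision (float32 real and imaginary parts) does not by itself explain the gap. The two backends may accumulate the partial sums differently, but the report does not identify the cause.
Consider running the cumulative sum in torch.complex128 so that each partial sum is rounded at float64 precision.
Verify the change by re-running the comparison with the new data type and checking that both errors drop.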
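The verification step can be sketched as a small helper that measures the error per device. The name max_cumsum_error is hypothetical (it is not a PyTorch function), and n is reduced from the report's 5,000,000 so the check runs quickly; the GPU measurement is skipped when no CUDA device is present.

```python
import torch

def max_cumsum_error(x: torch.Tensor, device: str) -> float:
    """Hypothetical helper: max abs error of torch.cumsum on `device`
    against a complex128 reference computed on CPU."""
    ref = torch.cumsum(x.cpu().to(torch.complex128), dim=0)
    out = torch.cumsum(x.to(device), dim=0).cpu().to(torch.complex128)
    return float((out - ref).abs().max())

torch.manual_seed(0)
n = 200_000  # assumption: smaller than the report's 5_000_000 for speed
x = torch.randn(n, dtype=torch.float32) + 1j * torch.randn(n, dtype=torch.float32)

errors = {"cpu": max_cumsum_error(x, "cpu")}
if torch.cuda.is_available():
    errors["cuda"] = max_cumsum_error(x, "cuda")

for device, err in errors.items():
    print(f"{device}: max_err = {err:.4e}")
```

Passing x.to(torch.complex128) instead of x measures the same quantity for the higher-precision dtype, which is the comparison the last guidance item asks for.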

Example

x = (torch.randn(n, dtype=torch.float64) + 1j * torch.randn(n, dtype=torch.float64)).to(torch.complex128)
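The line above draws fresh float64 data. When the data already exists as complex64, one option is to upcast only for the cumulative sum and round the result back afterwards. The sketch below assumes that pattern; cumsum_upcast is a hypothetical helper, not part of PyTorch, and n is again reduced for speed.

```python
import torch

def cumsum_upcast(x: torch.Tensor, dim: int = 0) -> torch.Tensor:
    # Hypothetical helper (not a PyTorch API): accumulate in complex128,
    # then round the result back to the input dtype.
    return torch.cumsum(x.to(torch.complex128), dim=dim).to(x.dtype)

torch.manual_seed(0)
n = 200_000  # assumption: reduced from the report's 5_000_000
x = torch.randn(n, dtype=torch.float32) + 1j * torch.randn(n, dtype=torch.float32)
device = "cuda" if torch.cuda.is_available() else "cpu"

# complex128 reference computed on CPU.
ref = torch.cumsum(x.to(torch.complex128), dim=0)

direct = torch.cumsum(x.to(device), dim=0).cpu()   # plain complex64 cumsum
upcast = cumsum_upcast(x.to(device)).cpu()         # complex128 accumulation

direct_err = float((direct.to(torch.complex128) - ref).abs().max())
upcast_err = float((upcast.to(torch.complex128) - ref).abs().max())
print(f"direct complex64 max_err = {direct_err:.4e}")
print(f"upcast complex128 max_err = {upcast_err:.4e}")
```

The output stays complex64, so downstream code sees the same dtype; only the temporary inside the helper uses the larger type.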

Notes

The issue was reported against version 2.9.0. Other versions are not covered by the report, so both the error gap and the effect of the workaround may differ there.

Recommendation

Apply the workaround: use torch.complex128 instead of torch.complex64 for the cumulative sum. The change is small and local, but it doubles the memory used by the tensor, so confirm the accuracy gain on your own data before adopting it.
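The memory cost of the workaround can be checked directly from the element sizes. This is a minimal sketch that uses the element count from the report; it allocates only empty tensors.

```python
import torch

n = 5_000_000  # element count from the report

sizes = {}
for dtype in (torch.complex64, torch.complex128):
    # element_size() returns the number of bytes one element occupies.
    sizes[dtype] = torch.empty(0, dtype=dtype).element_size()
    megabytes = n * sizes[dtype] / 1e6
    print(f"{dtype}: {sizes[dtype]} bytes/element, {megabytes:.0f} MB for n = {n}")
```

For this input the tensor grows from 40 MB to 80 MB, which is modest here but can matter for larger batches on GPU.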
