pytorch - 💡 (How to fix) Wrong complex64 lstsq result on GPU [1 comment, 2 participants]

GitHub stats
pytorch/pytorch#180158 · Fetched 2026-04-12 13:23:32
Comments: 1
Participants: 2
Timeline: 72
Reactions: 0
Timeline (top): mentioned ×32, subscribed ×32, labeled ×6, closed ×1

Error Message

CPU error vs reference: 9.8500e-07
GPU error vs reference: 1.9133e+06

Code Example

import torch
import numpy as np

torch.manual_seed(0)
# A is exactly singular (rank 2): first row - 2 * second row + third row = 0.
A = torch.tensor([
    [1.+0j, 2.+0j, 3.+0j],
    [4.+0j, 5.+0j, 6.+0j],
    [7.+0j, 8.+0j, 9.+0j],
], dtype=torch.complex64)

b = (torch.randn(3, 1) + 1j * torch.randn(3, 1)).to(torch.complex64)

ref = np.linalg.lstsq(
    A.numpy().astype(np.complex128),
    b.numpy().astype(np.complex128),
    rcond=None
)[0]

cpu = torch.linalg.lstsq(A, b).solution.numpy()
gpu = torch.linalg.lstsq(A.cuda(), b.cuda()).solution.cpu().numpy()

cpu_err = np.linalg.norm(cpu - ref)
gpu_err = np.linalg.norm(gpu - ref)

print(f"Reference (numpy complex128): {ref.flatten().tolist()}")
print(f"CPU (complex64): {cpu.flatten().tolist()}")
print(f"GPU (complex64): {gpu.flatten().tolist()}")
print(f"CPU error vs reference: {cpu_err:.4e}")
print(f"GPU error vs reference: {gpu_err:.4e}")

Versions

CPU (complex64): [(-1.6013622-0.6097595j), (-0.206654-0.1092792j), (1.1880528+0.3912009j)]
GPU (complex64): [(-29695.105+780537.875j), (59386.765-1561075.875j), (-29692.281+780537.75j)]

CPU error vs reference: 9.8500e-07
GPU error vs reference: 1.9133e+06

cc @ptrblck @msaroufim @eqy @jerryzh168 @tinglvv @nWEIdia @ezyang @anjali411 @dylanbespalko @mruberry @nikitaved @amjames @jianyuh @walterddr @xwang233 @Lezcano

extent analysis

TL;DR

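The wrong GPU result is most likely caused by rank deficiency, not by complex64 precision: A is singular, and for CUDA inputs torch.linalg.lstsq only supports the 'gels' driver, which assumes a full-rank A. Use torch.linalg.pinv or a CPU solve for rank-deficient inputs; switching to torch.complex128 alone is not expected to help.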

Guidance

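  • The large GPU error points to rank deficiency rather than complex64 precision: A is singular (rank 2, since the first row minus twice the second row plus the third row is zero), and for CUDA inputs torch.linalg.lstsq only supports the 'gels' driver, which the PyTorch documentation describes as assuming a full-rank A.
  • On the CPU the default driver is 'gelsy', which handles rank-deficient inputs. This is consistent with the CPU result matching the reference to about 1e-06.
  • Changing A and b to torch.complex128 is a reasonable sanity check, but it does not remove the rank deficiency, so the GPU result is not expected to become correct.
  • For rank-deficient systems, consider torch.linalg.pinv(A) @ b on the GPU, or run torch.linalg.lstsq on the CPU. torch.linalg.solve is not a suitable substitute here because it requires an invertible A.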
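
The torch.linalg.pinv option mentioned above can be sketched as follows. This is a minimal illustration, not a confirmed fix from the issue thread: it assumes the minimum-norm least-squares solution is what is wanted, and the rtol=1e-5 cutoff is an illustrative choice for complex64 rather than a recommended value. The script falls back to the CPU when no GPU is available.

```python
import numpy as np
import torch

torch.manual_seed(0)
A = torch.tensor([
    [1.+0j, 2.+0j, 3.+0j],
    [4.+0j, 5.+0j, 6.+0j],
    [7.+0j, 8.+0j, 9.+0j],
], dtype=torch.complex64)
b = (torch.randn(3, 1) + 1j * torch.randn(3, 1)).to(torch.complex64)

# Use the GPU when one is present; otherwise the same code runs on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# pinv is SVD-based: singular values below rtol * (largest singular value)
# are treated as zero, so a rank-deficient A is handled explicitly.
# rtol=1e-5 is an assumption made for this sketch.
A_pinv = torch.linalg.pinv(A.to(device), rtol=1e-5)
x_pinv = (A_pinv @ b.to(device)).cpu().numpy()

# Same double-precision NumPy reference as in the reproduction script.
ref = np.linalg.lstsq(
    A.numpy().astype(np.complex128),
    b.numpy().astype(np.complex128),
    rcond=None,
)[0]

pinv_err = np.linalg.norm(x_pinv - ref)
print(f"pinv ({device}) error vs reference: {pinv_err:.4e}")
```

If the pseudoinverse route behaves as expected, pinv_err should be comparable to the CPU error reported in the issue rather than the GPU error.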

Example

import torch

torch.manual_seed(0)  # same seed as the reproduction script
A = torch.tensor([
    [1.+0j, 2.+0j, 3.+0j],
    [4.+0j, 5.+0j, 6.+0j],
    [7.+0j, 8.+0j, 9.+0j],
], dtype=torch.complex128)

b = (torch.randn(3, 1) + 1j * torch.randn(3, 1)).to(torch.complex128)

Notes

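The 'gels'-only restriction for CUDA inputs is documented behavior of torch.linalg.lstsq, so the wrong result is expected to reproduce across GPU architectures whenever A is rank-deficient. Whether every PyTorch version behaves identically has not been verified here.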
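
A quick diagnostic that may help narrow down the cause is to check whether A is full rank before calling lstsq on the GPU. The sketch below is an assumption-laden illustration: the rtol=1e-5 threshold is chosen for this complex64 example and is not a library default.

```python
import torch

A = torch.tensor([
    [1.+0j, 2.+0j, 3.+0j],
    [4.+0j, 5.+0j, 6.+0j],
    [7.+0j, 8.+0j, 9.+0j],
], dtype=torch.complex64)

# Exact linear dependence between the rows: first - 2 * second + third = 0.
dependence = A[0] - 2 * A[1] + A[2]

# Singular values in descending order; for a rank-deficient matrix the
# trailing value sits at rounding-noise level.
svals = torch.linalg.svdvals(A)

# Numerical rank: count of singular values above rtol * (largest one).
# rtol=1e-5 is an illustrative threshold, not a recommended setting.
rank = int(torch.linalg.matrix_rank(A, rtol=1e-5))

print(f"row dependence: {dependence.tolist()}")
print(f"singular values: {svals.tolist()}")
print(f"numerical rank: {rank} of {min(A.shape)}")
```

A numerical rank below min(m, n) indicates that the full-rank assumption of the 'gels' driver does not hold for this input.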

Recommendation

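Apply workaround: for a rank-deficient A, compute the solution with torch.linalg.pinv(A) @ b, or move A and b to the CPU and call torch.linalg.lstsq there (default driver 'gelsy'). Changing the data type to torch.complex128 reduces rounding error but does not address the rank deficiency.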
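
An alternative workaround, sketched here under the assumption that a device round trip is acceptable for the problem size, is to run the solve on the CPU with a rank-revealing driver and move the result back. The helper name lstsq_via_cpu is hypothetical (it is not part of PyTorch), and the rcond=1e-5 value in the usage is an illustrative choice for complex64.

```python
import numpy as np
import torch


def lstsq_via_cpu(A, b, rcond=None):
    """Hypothetical helper: least-squares solve on the CPU, result on A's device."""
    # 'gelsy' uses QR with column pivoting and tolerates rank-deficient A;
    # it is only available for CPU inputs.
    result = torch.linalg.lstsq(A.cpu(), b.cpu(), rcond=rcond, driver="gelsy")
    return result.solution.to(A.device)


torch.manual_seed(0)
A = torch.tensor([
    [1.+0j, 2.+0j, 3.+0j],
    [4.+0j, 5.+0j, 6.+0j],
    [7.+0j, 8.+0j, 9.+0j],
], dtype=torch.complex64)
b = (torch.randn(3, 1) + 1j * torch.randn(3, 1)).to(torch.complex64)

# Use the GPU when one is present; otherwise everything stays on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# rcond=1e-5 is an assumption for this sketch; it sets the cutoff used to
# determine the effective rank of A.
x = lstsq_via_cpu(A.to(device), b.to(device), rcond=1e-5)

# Same double-precision NumPy reference as in the reproduction script.
ref = np.linalg.lstsq(
    A.numpy().astype(np.complex128),
    b.numpy().astype(np.complex128),
    rcond=None,
)[0]

err = np.linalg.norm(x.cpu().numpy() - ref)
print(f"CPU-fallback error vs reference: {err:.4e}")
```

The trade-off is the host/device transfer on every call, which may matter for large or batched systems.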
