pytorch - Large numerical discrepancy in torch.linalg.eigvalsh between CPU and CUDA for discrete-valued inputs

🐛 Describe the bug

import torch
import torch.nn as nn

torch.manual_seed(0)

fc1 = nn.Linear(8, 8)

def forward(f1, device):
    x = torch.randn(4, 8, device=device)
    x.requires_grad_(True)
    with torch.enable_grad():
        y = f1(x)
        z = torch.sin(y) * torch.cos(y)     # in [-0.5, 0.5]
        w = torch.ceil(z).int().float()     # {0., 1.}: ceil maps [-0.5, 0] to 0 and (0, 0.5] to 1
        v = torch.tan(w)                    # {0., 1.557}
        G = v @ v.T                         # Gram matrix (4×4)
        eig = torch.linalg.eigvalsh(G)
    return eig

fc1_g = nn.Linear(8, 8).cuda()
fc1_g.load_state_dict(fc1.state_dict())

cpu_out = forward(fc1, 'cpu')
gpu_out = forward(fc1_g, 'cuda').cpu()

diff = (cpu_out - gpu_out).abs()

print("CPU eigenvalues:", cpu_out)
print("GPU eigenvalues:", gpu_out)
print("max diff:", diff.max())
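The value sets that reach the Gram matrix can be checked without a GPU. The sketch below is an illustration, not part of the original report: it uses Python's math module instead of torch so that it runs anywhere, and the sampling grid is an arbitrary choice. It shows that ceiling a value in [-0.5, 0.5] can only give 0 or 1, so every entry of v is either 0 or tan(1).

```python
import math

# Sample y over [-pi, pi], which covers two periods of sin(y) * cos(y) = 0.5 * sin(2y).
ys = [i * math.pi / 1000 for i in range(-1000, 1001)]

w_values = set()
for y in ys:
    z = math.sin(y) * math.cos(y)      # lies in [-0.5, 0.5]
    w = float(math.ceil(z))            # ceil maps [-0.5, 0] to 0 and (0, 0.5] to 1
    w_values.add(w)

v_values = sorted(math.tan(w) for w in w_values)

print(sorted(w_values))   # the distinct values w can take
print(v_values)           # the distinct values v can take
```

In torch, torch.ceil returns -0. for small negative inputs, but the .int().float() round trip turns that into 0., so the same two values result.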

Versions

2.9.1+cu128 (PyTorch 2.9.1, CUDA 12.8)

Observed output:

CPU eigenvalues: tensor([ 1.63, 3.14, 7.48, 14.42])
GPU eigenvalues: tensor([ 1.40, 4.18, 9.84, 25.81])
max diff: ~1.1e+01

cc @ptrblck @msaroufim @eqy @jerryzh168 @tinglvv @nWEIdia @jianyuh @nikitaved @mruberry @walterddr @xwang233 @Lezcano

extent analysis

TL;DR

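The reproduction draws x with torch.randn separately on each device, and the CPU and CUDA generators do not produce the same values for a given seed, so the two eigvalsh calls most likely receive different matrices. Compare the devices on an identical input before attributing the gap to eigvalsh.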

Guidance

  • Verify where the results diverge by comparing the intermediates y, z, w, v and G between devices, in that order. Start with x: torch.randn(4, 8, device=device) draws from a different generator on each device, so the same seed does not give the same input.
  • If G matches on both devices and the eigenvalues still differ, recompute eigvalsh in float64 on the same matrix as a reference.
  • Check the PyTorch reproducibility notes, which state that results may not be reproducible between CPU and GPU executions even with identical seeds.
  • Test with several fixed inputs created on the CPU and copied to the GPU. Because torch.ceil is discontinuous, an entry of z within rounding error of 0 can map to 0 on one device and 1 on the other, so isolated mismatches in w are possible even for identical x.
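The checks in this list can be sketched as follows. This is a hedged illustration, not a verified fix: the helper name intermediates and the variables layer_cpu and x_cpu are invented here, and torch.no_grad replaces the report's enable_grad on the assumption that gradients are irrelevant to the comparison. The input is drawn once on the CPU and copied, so both devices see the same x.

```python
import torch
import torch.nn as nn

def intermediates(layer, x):
    """Run the reproduction's pipeline and return every intermediate tensor."""
    with torch.no_grad():
        y = layer(x)
        z = torch.sin(y) * torch.cos(y)
        w = torch.ceil(z).int().float()
        v = torch.tan(w)
        G = v @ v.T
        eig = torch.linalg.eigvalsh(G)
    return {"y": y, "z": z, "w": w, "v": v, "G": G, "eig": eig}

torch.manual_seed(0)
layer_cpu = nn.Linear(8, 8)
x_cpu = torch.randn(4, 8)              # drawn once, from the CPU generator

results = {"cpu": intermediates(layer_cpu, x_cpu)}

if torch.cuda.is_available():
    layer_gpu = nn.Linear(8, 8).cuda()
    layer_gpu.load_state_dict(layer_cpu.state_dict())
    # Copy the CPU input instead of calling torch.randn on the GPU.
    results["cuda"] = intermediates(layer_gpu, x_cpu.cuda())

    for name, ref in results["cpu"].items():
        diff = (ref - results["cuda"][name].cpu()).abs().max().item()
        print(f"{name:>3}: max abs diff = {diff:.3e}")
else:
    print("CUDA not available; only the CPU intermediates were computed.")
```

If w shows a difference of exactly 1 while y and z agree to rounding error, the ceil step flipped an entry near zero; if everything up to G matches and only eig differs, the eigenvalue routine itself is the place to look.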

Example

A fair comparison draws x once on the CPU, copies it to the GPU, and gives both devices the same weights and the same input. Only then does a difference in the eigenvalues point at eigvalsh.

Notes

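Floating-point differences between CPU and GPU kernels normally show up in the last few digits, not as a gap of ~1.1e+01. The reported outputs point to different inputs instead. The eigenvalues of G sum to trace(G), and since every entry of v is either 0 or tan(1) ≈ 1.557, trace(G) is a whole multiple of tan(1)^2 ≈ 2.4255. The CPU eigenvalues sum to about 26.67 ≈ 11 × 2.4255 and the GPU eigenvalues to about 41.23 ≈ 17 × 2.4255, so v appears to have 11 nonzero entries on the CPU and 17 on the GPU. The two calls therefore decomposed different matrices.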
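The reported eigenvalues can be checked against the trace identity with a few lines of arithmetic. The sketch below is an illustration that uses only the numbers printed in the report, which are rounded to two decimals. It relies on two facts: trace(G) equals the sum of the eigenvalues, and for G = v @ v.T with entries of v in {0, tan(1)}, trace(G) equals the number of nonzero entries of v times tan(1)^2.

```python
import math

tan1_sq = math.tan(1.0) ** 2          # each nonzero entry of v adds tan(1)^2 to trace(G)

# Eigenvalues as printed in the report (rounded to two decimals).
reported = {
    "cpu":  [1.63, 3.14, 7.48, 14.42],
    "cuda": [1.40, 4.18, 9.84, 25.81],
}

nonzero_counts = {}
for device, eigs in reported.items():
    trace = sum(eigs)                 # trace(G) equals the sum of the eigenvalues
    ratio = trace / tan1_sq           # implied number of nonzero entries in v
    nonzero_counts[device] = round(ratio)
    print(f"{device}: sum = {trace:.2f}, sum / tan(1)^2 = {ratio:.3f}")

print(nonzero_counts)
```

Both ratios land very close to whole numbers, and the two whole numbers differ, which is what one would expect if each device built G from a different v.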

Recommendation

Apply workaround: create the input on one device and copy it to the other before comparing results. If a discrepancy remains when G is identical on both devices, report it with the matrix G included so that eigvalsh can be tested in isolation.
