pytorch - 💡(How to fix) Fix Large numerical discrepancy in torch.linalg.eigvalsh between CPU and CUDA for discrete-valued inputs

pytorch · 2026-04-09 03:33:36

🐛 Describe the bug

import torch
import torch.nn as nn

torch.manual_seed(0)

fc1 = nn.Linear(8, 8)

def forward(f1, device):
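    # NOTE: torch.randn uses a device-specific generator, so the same seed
    # yields different x on 'cpu' and 'cuda'.
    x = torch.randn(4, 8, device=device)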
    x.requires_grad_(True)
    with torch.enable_grad():
        y = f1(x)
        z = torch.sin(y) * torch.cos(y)     # = 0.5 * sin(2y), in [-0.5, 0.5]
        w = torch.ceil(z).int().float()     # {0., 1.}: ceil maps (-1, 0] to 0
        v = torch.tan(w)                    # {0., 1.557}
        G = v @ v.T                         # Gram matrix (4×4)
        eig = torch.linalg.eigvalsh(G)
    return eig

fc1_g = nn.Linear(8, 8).cuda()
fc1_g.load_state_dict(fc1.state_dict())

cpu_out = forward(fc1, 'cpu')
gpu_out = forward(fc1_g, 'cuda').cpu()

diff = (cpu_out - gpu_out).abs()

print("CPU eigenvalues:", cpu_out)
print("GPU eigenvalues:", gpu_out)
print("max diff:", diff.max())

Versions

2.9.1+cu128 (PyTorch 2.9.1, CUDA 12.8)

Observed output:

CPU eigenvalues: tensor([ 1.63, 3.14, 7.48, 14.42])
GPU eigenvalues: tensor([ 1.40, 4.18, 9.84, 25.81])
max diff: ~1.1e+01

cc @ptrblck @msaroufim @eqy @jerryzh168 @tinglvv @nWEIdia @jianyuh @nikitaved @mruberry @walterddr @xwang233 @Lezcano

extent analysis

TL;DR

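The reproduction does not necessarily pass the same matrix to torch.linalg.eigvalsh on both devices: x is drawn with torch.randn on each device, and the CPU and CUDA generators produce different values for the same seed. Compare the Gram matrix G across devices before attributing the gap to eigvalsh.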

Guidance

Generate x once on the CPU and copy it to the GPU, because torch.randn(..., device=device) draws from a different generator on each device.
Compare the intermediate tensors y, z, w, v and G across devices; if G already differs, eigvalsh is receiving different inputs and is not the source of the discrepancy.
Check the reproducibility and numerical accuracy notes in the PyTorch documentation for the differences that are expected between CPU and GPU.
Repeat the test with several seeds, and in float64, to see whether any remaining difference in eigenvalues persists.
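The check on intermediate results listed above can be sketched as follows. This is a minimal sketch, not the reporter's code: the `pipeline` helper is a hypothetical name, and drawing `x` once on the CPU is a deliberate change from the original reproduction so that both devices see the same input. It falls back to a CPU-only run when CUDA is not available.

```python
import torch
import torch.nn as nn


def pipeline(fc, x):
    """Run the steps from the issue and return every intermediate tensor.

    `pipeline` is a hypothetical helper introduced for this sketch.
    """
    with torch.no_grad():
        y = fc(x)
        z = torch.sin(y) * torch.cos(y)
        w = torch.ceil(z).int().float()
        v = torch.tan(w)
        G = v @ v.T
        eig = torch.linalg.eigvalsh(G)
    return {"y": y, "z": z, "w": w, "v": v, "G": G, "eig": eig}


torch.manual_seed(0)
fc_cpu = nn.Linear(8, 8)
x_cpu = torch.randn(4, 8)  # drawn once, on the CPU (assumption: shared input)
cpu = pipeline(fc_cpu, x_cpu)

if torch.cuda.is_available():
    fc_gpu = nn.Linear(8, 8).cuda()
    fc_gpu.load_state_dict(fc_cpu.state_dict())
    gpu = pipeline(fc_gpu, x_cpu.cuda())  # same values, copied to the GPU
    # Report where, if anywhere, the two devices start to disagree.
    for name in cpu:
        diff = (cpu[name] - gpu[name].cpu()).abs().max().item()
        print(f"{name:>3}: max abs diff = {diff:.3e}")
else:
    print("CUDA not available; CPU eigenvalues:", cpu["eig"])
```

If `G` agrees across devices and only `eig` differs, the discrepancy belongs to eigvalsh; if `w` or `G` already differ, the problem is upstream of the eigenvalue solver.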

Example

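The most direct test is to feed both devices the same input tensor and compare the intermediate tensors, ending with G, before calling eigvalsh. A mismatch that first appears in eigvalsh points to the solver; a mismatch that appears earlier points to the inputs.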

Notes

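Floating-point differences between CPU and GPU normally show up in the last few digits of a float32 result, not as a gap of ~1.1e+01. The reported eigenvalues themselves indicate that the two Gram matrices differ. The eigenvalues of G sum to its trace, and since every entry of v is either 0 or tan(1) ≈ 1.557, each nonzero entry contributes tan(1)^2 ≈ 2.43 to that trace. The CPU eigenvalues sum to about 26.67 (11 nonzero entries), while the GPU eigenvalues sum to about 41.23 (17 nonzero entries), so eigvalsh was given different matrices on the two devices.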
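The eigenvalues reported under Versions can be checked against the trace of G with a few lines of arithmetic. The sketch below assumes that every entry of `v` is either 0 or tan(1), which follows from the ceil step in the reproduction; the helper name `nonzero_count` is hypothetical.

```python
import math

# Eigenvalues as printed in the issue (rounded to two decimals there).
cpu_eigs = [1.63, 3.14, 7.48, 14.42]
gpu_eigs = [1.40, 4.18, 9.84, 25.81]

# Each nonzero entry of v equals tan(1), so it adds tan(1)^2 to trace(G).
per_entry = math.tan(1.0) ** 2


def nonzero_count(eigs):
    """Estimate how many entries of v are nonzero from sum(eigs) = trace(G)."""
    return sum(eigs) / per_entry


cpu_count = nonzero_count(cpu_eigs)
gpu_count = nonzero_count(gpu_eigs)
print(f"CPU: about {cpu_count:.2f} nonzero entries in v")
print(f"GPU: about {gpu_count:.2f} nonzero entries in v")
```

Both estimates land close to whole numbers (11 and 17), which is what one would expect if each device computed correct eigenvalues for a different G.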

Recommendation

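Apply workaround: draw x once, copy it to both devices, and confirm that G matches before comparing eigenvalues. If G matches and the eigenvalues still differ beyond float32 tolerance, rerun in float64 and report that case. Because ceil maps every positive z to 1 and every non-positive z in (-1, 0] to 0, also check w for entries that flipped between devices.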
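A contributing factor worth ruling out is the sensitivity of the ceil step: a value of z that lands on either side of zero by a rounding error becomes a full 0-versus-1 difference in w. The following minimal sketch uses hand-picked values of z (an assumption for illustration, not taken from the issue) to show the effect in isolation.

```python
import torch

# Three values that differ only by about 1e-7 around zero.
z = torch.tensor([-1e-7, 0.0, 1e-7])

w = torch.ceil(z).int().float()  # discretization step from the reproduction
v = torch.tan(w)

print("w:", w)  # the tiny positive value jumps to 1, the others stay at 0
print("v:", v)  # so the matching entry of v jumps from 0 to tan(1)
```

A single flipped entry changes the trace of G by tan(1)^2, roughly 2.43, which is far larger than any ordinary float32 rounding difference.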
