pytorch - ✅(Solved) Fix [bug] `torch.linalg.solve_triangular(A, B, *, out=B)`: wrong result with `B.is_neg() == True` [1 pull request, 2 comments, 2 participants]

GitHub stats
pytorch/pytorch#178379 (fetched 2026-04-08 01:30:49)
Comments: 2
Participants: 2
Timeline: 78
Reactions: 0
Timeline (top): mentioned ×33, subscribed ×33, labeled ×7, commented ×2

PR fix notes

PR #178298: torch.linalg.solve_triangular: support for eligible non-dense args

Description (problem / solution / changelog)

Stack from ghstack (oldest at bottom):

  • #177649
  • -> #178298
  • #178167
  • #177648

This PR rewrites the logic behind linalg.solve_triangular to address:

  • memory issues as per https://github.com/pytorch/pytorch/issues/176274
  • Fixes https://github.com/pytorch/pytorch/issues/178379
  • conj never triggers a clone (provided the inputs have reasonable strides). conj materialization is deferred and happens only in the pre-allocated memory
  • A is never cloned unless A.is_neg() == True && unitriangular == True (provided A has reasonable strides) -- solve_triangular is not linear in the first argument.
  • out=B never clones B (conditioned on reasonable strides)
  • Similar logic can be transferred to all other solve-like methods
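
The bullets above rely on PyTorch's lazy negative and conjugate views: a tensor can carry a "neg" or "conj" bit instead of holding negated or conjugated data, and the values are only materialized on request. The following is a minimal sketch of that behaviour on CPU, not code from the PR. It uses the private `_neg_view()` helper only because the issue's repro does; the variable names are illustrative.

```python
import torch

x = torch.tensor([1.0, -2.0, 3.0])

# _neg_view() (private, taken from the repro) returns a lazy negation of x:
# no data is copied, the result just has its neg bit set.
v = x._neg_view()
assert v.is_neg() and not x.is_neg()
assert (v + x).abs().max().item() == 0.0  # v behaves like -x

# Because v aliases x, in-place changes to x are visible through v.
x[0] = 10.0
assert (v + x).abs().max().item() == 0.0

# resolve_neg() materializes the negation and clears the bit.
m = v.resolve_neg()
assert not m.is_neg()
assert (m + x).abs().max().item() == 0.0

# The repro's trick: negate eagerly, then view lazily. The values equal x,
# but the tensor now has is_neg() == True.
w = x.neg()._neg_view()
assert w.is_neg()
assert (w - x).abs().max().item() == 0.0

# Conjugation of complex tensors follows the same lazy pattern.
z = torch.tensor([1 + 2j, 3 - 4j])
zc = z.conj()
assert zc.is_conj()
assert not zc.resolve_conj().is_conj()
```

A kernel that reads the raw storage of `w` without honouring the neg bit would see `-x`, which is the kind of sign error the linked issue reports.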

cc @ptrblck @msaroufim @eqy @jerryzh168 @tinglvv @nWEIdia @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @aditew01 @csarofeen @jianyuh @mruberry @walterddr @xwang233 @Lezcano

Changed files

  • aten/src/ATen/native/BatchLinearAlgebra.cpp (modified, +213/-156)
  • test/test_linalg.py (modified, +19/-4)


🐛 Describe the bug

As per title. Will be addressed in https://github.com/pytorch/pytorch/pull/178298.

Here is the repro:

In [1]: import torch

In [2]: A = torch.tensor([[ 1.0000,  0.0000,  0.0000,  0.0000,  0.0000],
   ...:         [-0.1264,  1.0000,  0.0000,  0.0000,  0.0000],
   ...:         [ 0.5326,  0.0847,  1.0000,  0.0000,  0.0000],
   ...:         [-0.1383,  0.5537, -0.5058,  1.0000,  0.0000],
   ...:         [-0.6810,  0.5130,  0.6028, -0.8186,  1.0000]], device='cuda:0')

In [3]: B = torch.tensor([[-2.1750,  2.5057,  3.5802],
   ...:         [ 8.5027,  5.9123,  8.4869],
   ...:         [ 3.1403,  5.4633,  6.0700],
   ...:         [ 6.9942, -6.1203, -2.3431],
   ...:         [-3.2609,  8.2124,  8.1955]], device='cuda:0')

In [4]: Bc = B.clone()

In [5]: Bc = Bc.neg()._neg_view()

In [6]: (Bc - B).abs().max()
Out[6]: tensor(0., device='cuda:0')

In [7]: res1 = torch.linalg.solve_triangular(A, B, upper=False, out=B)

In [8]: res2 = torch.linalg.solve_triangular(A, Bc, upper=False, out=Bc)

In [9]: B.is_neg()
Out[9]: False

In [10]: Bc.is_neg()
Out[10]: True

In [11]: res1
Out[11]: 
tensor([[-2.1750,  2.5057,  3.5802],
        [ 8.2278,  6.2290,  8.9394],
        [ 3.6018,  3.6012,  3.4060],
        [ 3.9595, -7.4013, -5.0750],
        [-7.8929, -1.5062, -0.1598]], device='cuda:0')

In [12]: res2
Out[12]: 
tensor([[ 2.1750, -2.5057, -3.5802],
        [-8.2278, -6.2290, -8.9394],
        [-3.6018, -3.6012, -3.4060],
        [-3.9595,  7.4013,  5.0750],
        [ 7.8929,  1.5062,  0.1598]], device='cuda:0')

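Bc holds the same values as B (Out[6] is 0) and differs only in having its neg bit set, so res1 and res2 should be equal. Instead, res2 is the negation of res1. The implementation is aware of neg views, but this path is apparently not tested, at least not in the dedicated test suite.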
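
To see which of the two results is the correct one, the reported values can be checked against the defining equation A @ X = B without involving neg views at all. The sketch below copies A, B and res1 from the repro and runs on CPU; `residual` is a hypothetical helper introduced only for this illustration.

```python
import torch

# Inputs copied from the repro (CPU here so the check runs anywhere).
A = torch.tensor([[ 1.0000,  0.0000,  0.0000,  0.0000,  0.0000],
                  [-0.1264,  1.0000,  0.0000,  0.0000,  0.0000],
                  [ 0.5326,  0.0847,  1.0000,  0.0000,  0.0000],
                  [-0.1383,  0.5537, -0.5058,  1.0000,  0.0000],
                  [-0.6810,  0.5130,  0.6028, -0.8186,  1.0000]])
B = torch.tensor([[-2.1750,  2.5057,  3.5802],
                  [ 8.5027,  5.9123,  8.4869],
                  [ 3.1403,  5.4633,  6.0700],
                  [ 6.9942, -6.1203, -2.3431],
                  [-3.2609,  8.2124,  8.1955]])

# Values printed as res1 in the repro (Out[11]), rounded to 4 decimals.
res1 = torch.tensor([[-2.1750,  2.5057,  3.5802],
                     [ 8.2278,  6.2290,  8.9394],
                     [ 3.6018,  3.6012,  3.4060],
                     [ 3.9595, -7.4013, -5.0750],
                     [-7.8929, -1.5062, -0.1598]])


def residual(A, X, B):
    """Largest absolute entry of A @ X - B (hypothetical helper)."""
    return (A @ X - B).abs().max().item()


# res1 satisfies A @ X = B up to the rounding of the printed values.
assert residual(A, res1, B) < 1e-3
# The sign-flipped tensor reported as res2 does not.
assert residual(A, -res1, B) > 1.0

# A fresh solve without out= and without neg views agrees with res1.
X = torch.linalg.solve_triangular(A, B, upper=False)
assert torch.allclose(X, res1, atol=1e-3)
```

This confirms that res1 is the expected solution and that the out=Bc call returned it with the wrong sign.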

Versions

Current master.

cc @ptrblck @msaroufim @eqy @jerryzh168 @tinglvv @nWEIdia @jianyuh @mruberry @walterddr @xwang233 @Lezcano

extent analysis

Fix Plan

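To address the issue, torch.linalg.solve_triangular must handle inputs whose neg bit is set (B.is_neg() == True), in particular when the same tensor is passed as out=. The actual fix is made in the C++ implementation (aten/src/ATen/native/BatchLinearAlgebra.cpp, PR #178298); the steps below only sketch the idea as a Python-level workaround.

Here are the steps:

  • Detect a lazily negated input with B.is_neg(). Checking hasattr(B, '_neg_view') does not work, because every tensor has a _neg_view method.
  • If the neg bit is set, materialize the negation with B.resolve_neg() before solving. Calling B.neg() would flip the sign again instead of materializing it.

Example code (a workaround sketch, not the upstream patch):

import torch

def solve_triangular(A, B, *, upper, out=None):
    if B.is_neg():
        # Materialize the lazy negation into a regular tensor
        B = B.resolve_neg()
    result = torch.linalg.solve_triangular(A, B, upper=upper)
    # Copy into out (if given) instead of passing a neg view as out=
    return result if out is None else out.copy_(result)

In addition, a test case should be added to the dedicated test suite so that neg views passed as out= are actually exercised.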

Verification

To verify that the fix worked, you can run the following test on a CUDA-enabled build:

import torch

A = torch.tensor([[ 1.0000,  0.0000,  0.0000,  0.0000,  0.0000],
                  [-0.1264,  1.0000,  0.0000,  0.0000,  0.0000],
                  [ 0.5326,  0.0847,  1.0000,  0.0000,  0.0000],
                  [-0.1383,  0.5537, -0.5058,  1.0000,  0.0000],
                  [-0.6810,  0.5130,  0.6028, -0.8186,  1.0000]], device='cuda:0')

B = torch.tensor([[-2.1750,  2.5057,  3.5802],
                  [ 8.5027,  5.9123,  8.4869],
                  [ 3.1403,  5.4633,  6.0700],
                  [ 6.9942, -6.1203, -2.3431],
                  [-3.2609,  8.2124,  8.1955]], device='cuda:0')

Bc = B.clone()
Bc = Bc.neg()._neg_view()

res1 = torch.linalg.solve_triangular(A, B, upper=False, out=B)
res2 = torch.linalg.solve_triangular(A, Bc, upper=False, out=Bc)

assert Bc.is_neg()
assert torch.allclose(res1, res2)

Bc holds the same values as B but has its neg bit set, so this test checks that solve_triangular returns the same result for both. With the bug present, res2 is the negation of res1 and the last assertion fails.
