pytorch - 💡(How to fix) Fix AArch64 Unit Test Failure - test_aot_autograd_symbolic_exhaustive_linalg_pinv_hermitian_cpu_float32 [2 comments, 3 participants]

pytorch/pytorch#177264 · Fetched 2026-04-08 00:42:26

Error Message

Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/optests/aot_autograd.py", line 164, in check
    assert_equals_fn(compiled_grad, orig_grad)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 4365, in assertEqual
    raise error_metas.pop()[0].to_error(  # type: ignore[index]
AssertionError: Tensor-likes are not close!

Mismatched elements: 2 / 50 (4.0%)
Greatest absolute difference: 0.00048828125 at index (0, 2, 4) (up to 1e-05 allowed)
Greatest relative difference: 1.74224408056034e-06 at index (1, 2, 2) (up to 1.3e-06 allowed)

The failure occurred for item [0]


Root Cause

The issue does not identify a root cause. What the log establishes is that the gradient computed through AOTDispatcher differs from the eager-mode gradient in 2 of 50 elements for sample input 4 (a float32 tensor of size (2, 5, 5) with hermitian=True) on a Neoverse-V1 CPU. The greatest absolute difference is 0.00048828125 (exactly 2^-11) against an allowed 1e-05, and the greatest relative difference is 1.74e-06 against an allowed 1.3e-06. A mismatch this small is more consistent with floating-point rounding differences between the two code paths than with an incorrectly registered backward, although the generic assertion message suggests the latter; the issue itself confirms neither explanation.
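The tolerances printed in the log (1.3e-06 relative, 1e-05 absolute) can be plugged into the closeness rule documented for torch.testing.assert_close, |actual - expected| <= atol + rtol * |expected|, to see how large a gradient element must be before the reported absolute difference is tolerated. The sketch below is plain Python; the element values 300.0 and 400.0 are hypothetical and are not taken from the log.

```python
def is_close(actual: float, expected: float, rtol: float, atol: float) -> bool:
    """Closeness rule: |actual - expected| <= atol + rtol * |expected|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


RTOL = 1.3e-6              # "up to 1.3e-06 allowed" in the log
ATOL = 1e-5                # "up to 1e-05 allowed" in the log
ABS_DIFF = 0.00048828125   # greatest absolute difference in the log

# The reported difference is an exact power of two.
print("ABS_DIFF == 2**-11:", ABS_DIFF == 2**-11)

# An element with this absolute difference fails whenever |expected|
# is below this threshold.
threshold = (ABS_DIFF - ATOL) / RTOL
print("fails when |expected| is below:", threshold)

# Hypothetical element values on either side of the threshold.
for expected in (300.0, 400.0):
    ok = is_close(expected + ABS_DIFF, expected, RTOL, ATOL)
    print(expected, "passes" if ok else "fails")
```

The threshold comes out at roughly 368, so the failing element at index (0, 2, 4) must have a magnitude in the low hundreds, where a difference of 2^-11 is only a few float32 rounding steps.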



🐛 Describe the bug

Test failure functorch/test_aotdispatch.py test_aot_autograd_symbolic_exhaustive_linalg_pinv_hermitian_cpu_float32

Stacktrace

Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/optests/aot_autograd.py", line 164, in check
    assert_equals_fn(compiled_grad, orig_grad)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 4365, in assertEqual
    raise error_metas.pop()[0].to_error(  # type: ignore[index]
AssertionError: Tensor-likes are not close!

Mismatched elements: 2 / 50 (4.0%)
Greatest absolute difference: 0.00048828125 at index (0, 2, 4) (up to 1e-05 allowed)
Greatest relative difference: 1.74224408056034e-06 at index (1, 2, 2) (up to 1.3e-06 allowed)

The failure occurred for item [0]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 1165, in test_wrapper
    return test(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/unittest/mock.py", line 1379, in patched
    return func(*newargs, **newkeywargs)
  File "/builds/software-machine-learning-infra-frameworks-workspaces-robhar02/pytorch/test/functorch/test_aotdispatch.py", line 8894, in test_aot_autograd_symbolic_exhaustive
    _test_aot_autograd_helper(self, device, dtype, op, dynamic=True)
  File "/builds/software-machine-learning-infra-frameworks-workspaces-robhar02/pytorch/test/functorch/test_aotdispatch.py", line 8785, in _test_aot_autograd_helper
    aot_autograd_check(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/optests/aot_autograd.py", line 99, in aot_autograd_check
    _test_aot_autograd_forwards_backwards_helper(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/optests/aot_autograd.py", line 176, in _test_aot_autograd_forwards_backwards_helper
    check(args, ignore_failure=True)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/optests/aot_autograd.py", line 166, in check
    raise type(e)(msg) from e
AssertionError: Gradients of the operator are different in eager-mode PyTorch vs AOTDispatcher. This means the operator will have incorrect gradients underneath torch.compile. This could be because the operator's backward is incorrectly registered or not traceable.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3370, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 430, in instantiated_test
    result = test(self, **param_kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 1245, in dep_fn
    return fn(slf, *args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 1245, in dep_fn
    return fn(slf, *args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 1766, in wrapper
    fn(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 1177, in test_wrapper
    raise e_tracked from e
Exception: Gradients of the operator are different in eager-mode PyTorch vs AOTDispatcher. This means the operator will have incorrect gradients underneath torch.compile. This could be because the operator's backward is incorrectly registered or not traceable.

Caused by sample input at index 4: SampleInput(input=Tensor[size=(2, 5, 5), device="cpu", dtype=torch.float32], args=(), kwargs={'hermitian': 'True'}, broadcasts_input=False, name='')

To execute this test, run the following from the base repo dir:
    PYTORCH_OPINFO_SAMPLE_INPUT_INDEX=4 python test/functorch/test_aotdispatch.py TestEagerFusionOpInfoCPU.test_aot_autograd_symbolic_exhaustive_linalg_pinv_hermitian_cpu_float32

Affects Neoverse-V1
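To get a feel for how much float32 rounding alone can move these gradients, the following sketch compares the eager-mode gradient of torch.linalg.pinv(..., hermitian=True) in float32 with a float64 reference. It is not the comparison the test performs (eager versus AOTDispatcher), and the input is a hypothetical random symmetric batch of the same shape as the failing sample, not the OpInfo sample itself.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the failing sample: two 5x5 real symmetric matrices.
a = torch.randn(2, 5, 5, dtype=torch.float64)
sym = a + a.mT  # symmetric, so hermitian=True is a valid assumption


def pinv_grad(x: torch.Tensor) -> torch.Tensor:
    """Gradient of sum(pinv(x, hermitian=True)) with respect to x, in eager mode."""
    x = x.detach().clone().requires_grad_(True)
    torch.linalg.pinv(x, hermitian=True).sum().backward()
    return x.grad


grad64 = pinv_grad(sym)                    # float64 reference
grad32 = pinv_grad(sym.to(torch.float32))  # same input rounded to float32

abs_err = (grad32.to(torch.float64) - grad64).abs()
print("largest |gradient| (float64):", grad64.abs().max().item())
print("largest float32-vs-float64 difference:", abs_err.max().item())
```

If the printed difference is of the same order as the tolerances in the log, a platform-dependent change in operation order could plausibly tip individual elements over the limit; that would be supporting evidence only, not a diagnosis.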

Versions

Commit - https://github.com/pytorch/pytorch/commit/08b6f48d871affbc7abe9277020aed882fdf110a

cc @ezyang @albanD @gqchen @nikitaved @soulitzer @Varal7 @bobrenjc93 @jianyuh @mruberry @walterddr @xwang233 @Lezcano @snadampal @milpuz01 @aditew01 @nikhil-arm @fadara01 @nWEIdia @chauhang @penguinwu

extent analysis

Fix Plan

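The failure is a tolerance mismatch between eager-mode and AOTDispatcher gradients, so a fix most likely belongs either in the numerics of the backward path on AArch64 or in the tolerances applied to this test, not in the forward definition of torch.linalg.pinv. The issue does not record which fix was adopted.

  • torch.linalg.pinv already accepts hermitian=True (it then uses torch.linalg.eigh instead of torch.linalg.svd) and needs no new argument. torch.linalg.inv has no hermitian argument and raises an error on singular matrices that pinv handles, so it cannot replace pinv.
  • Checking that the input is Hermitian is only a precondition check; it does not change the gradients that the test compares.

Example code (a wrapper that validates the input and then defers to torch.linalg.pinv; it illustrates correct use of the hermitian argument and is not a fix for the test):

import torch

def pinv(input, hermitian=False):
    if hermitian:
        # Compare with the batched conjugate transpose (.mH);
        # .T does not transpose the last two dims of a batched input
        if not torch.allclose(input, input.mH):
            raise ValueError("Input matrix is not Hermitian")
        # torch.linalg.inv has no hermitian argument; pinv accepts it directly
        return torch.linalg.pinv(input, hermitian=True)
    # General case: SVD-based pseudoinverse
    return torch.linalg.pinv(input)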

Verification

To verify a candidate change, rerun the failing test on an AArch64 machine (the issue reports Neoverse-V1) from the base repo directory:

    PYTORCH_OPINFO_SAMPLE_INPUT_INDEX=4 python test/functorch/test_aotdispatch.py TestEagerFusionOpInfoCPU.test_aot_autograd_symbolic_exhaustive_linalg_pinv_hermitian_cpu_float32

  • Check that the test passes for sample input 4.
  • Rerun without PYTORCH_OPINFO_SAMPLE_INPUT_INDEX to check that the other sample inputs still pass.
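The reproduction command can also be driven from a script, which is convenient when checking several sample indices. This is a minimal sketch using only the standard library; the helper names build_repro and run_repro are hypothetical, and the test path and test ID are copied from the command quoted in the issue.

```python
import os
import subprocess
import sys

TEST_FILE = "test/functorch/test_aotdispatch.py"
TEST_ID = (
    "TestEagerFusionOpInfoCPU."
    "test_aot_autograd_symbolic_exhaustive_linalg_pinv_hermitian_cpu_float32"
)


def build_repro(sample_index: int = 4):
    """Return (env, argv) for the reproduction command quoted in the issue."""
    env = dict(os.environ)
    env["PYTORCH_OPINFO_SAMPLE_INPUT_INDEX"] = str(sample_index)
    argv = [sys.executable, TEST_FILE, TEST_ID]
    return env, argv


def run_repro(sample_index: int = 4, repo_dir: str = ".") -> bool:
    """Run the test from repo_dir; True means the process exited with status 0."""
    env, argv = build_repro(sample_index)
    return subprocess.run(argv, env=env, cwd=repo_dir).returncode == 0
```

Calling run_repro(repo_dir="/path/to/pytorch") from an environment where PyTorch is built returns True only if the test process exits cleanly; the repository path here is a placeholder.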

Extra Tips

  • torch.linalg.pinv in the torch.linalg Python module is only a binding; the operator and its derivative are implemented in PyTorch's C++ sources, so any numerical change has to be made there and requires rebuilding PyTorch.
  • Whatever the change, rerun the non-hermitian variant of the test as well, so that torch.linalg.pinv stays covered for both hermitian and non-hermitian matrices.
