pytorch - 💡(How to fix) Fix AArch64 Unit Test Failure - inductor/test_aot_inductor_arrayref.py test_deconv_freezing_cpu_with_stack_allocation [2 comments, 3 participants]

GitHub stats
pytorch/pytorch#177243 (fetched 2026-04-08 00:42:59)
Comments: 2 · Participants: 3 · Timeline: 188 · Reactions: 0
Timeline (top): mentioned ×80, subscribed ×80, referenced ×15, labeled ×10

Error Message

Traceback (most recent call last):
  File "/builds/software-machine-learning-infra-frameworks-workspaces-robhar02/pytorch/test/inductor/test_aot_inductor.py", line 910, in test_deconv_freezing
    self.check_model(Model(self.device), example_inputs)
  File "/builds/software-machine-learning-infra-frameworks-workspaces-robhar02/pytorch/test/inductor/test_aot_inductor_utils.py", line 259, in check_model
    self.assertEqual(actual, expected, atol=atol, rtol=rtol)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/test_case.py", line 113, in assertEqual
    return super().assertEqual(x, y, *args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 4365, in assertEqual
    raise error_metas.pop()[0].to_error(  # type: ignore[index]
AssertionError: Tensor-likes are not close!

Mismatched elements: 9 / 32 (28.1%)
Greatest absolute difference: 4.3125 at index (0, 1, 1, 1) (up to 1e-05 allowed)
Greatest relative difference: 3.390625 at index (0, 1, 3, 0) (up to 0.016 allowed)

To execute this test, run the following from the base repo dir:
    python test/inductor/test_aot_inductor_arrayref.py AOTInductorTestABICompatibleCpuWithStackAllocation.test_deconv_freezing_cpu_with_stack_allocation


🐛 Describe the bug

Creating this issue so that the test can be marked as XFAIL.

Traceback below:

Traceback (most recent call last):
  File "/builds/software-machine-learning-infra-frameworks-workspaces-robhar02/pytorch/test/inductor/test_aot_inductor.py", line 910, in test_deconv_freezing
    self.check_model(Model(self.device), example_inputs)
  File "/builds/software-machine-learning-infra-frameworks-workspaces-robhar02/pytorch/test/inductor/test_aot_inductor_utils.py", line 259, in check_model
    self.assertEqual(actual, expected, atol=atol, rtol=rtol)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/test_case.py", line 113, in assertEqual
    return super().assertEqual(x, y, *args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 4365, in assertEqual
    raise error_metas.pop()[0].to_error(  # type: ignore[index]
AssertionError: Tensor-likes are not close!

Mismatched elements: 9 / 32 (28.1%)
Greatest absolute difference: 4.3125 at index (0, 1, 1, 1) (up to 1e-05 allowed)
Greatest relative difference: 3.390625 at index (0, 1, 3, 0) (up to 0.016 allowed)

To execute this test, run the following from the base repo dir:
    python test/inductor/test_aot_inductor_arrayref.py AOTInductorTestABICompatibleCpuWithStackAllocation.test_deconv_freezing_cpu_with_stack_allocation

Occurs on Neoverse-V1

Versions

Commit - 08b6f48d871affbc7abe9277020aed882fdf110a

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @jerryzh168 @aditew01 @snadampal @milpuz01 @nikhil-arm @fadara01 @nWEIdia @chauhang @penguinwu @avikchaudhuri @zhxchen17 @tugsbayasgalan @angelayi @suo @ydwu4 @desertfire @yushangdi @benjaminglass1 @jataylo @iupaikov-amd

extent analysis

Fix Plan

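The reported mismatch is too large to be floating-point rounding noise on Neoverse-V1: 9 of 32 elements differ, and the greatest absolute difference is 4.3125 against an allowed 1e-05. Loosening the tolerances passed to assertEqual in check_model is therefore unlikely to make the test pass, and the commonly suggested rtol=1e-3 is in fact stricter than the 0.016 the test already allows.

  • As the issue states, the immediate step is to mark the test as XFAIL on AArch64, referencing this issue, until the root cause is identified.
  • Changing the atol and rtol values in the check_model function does not address a difference of this size:
self.assertEqual(actual, expected, atol=1e-4, rtol=1e-3)
  • Switching to torch.allclose is not more robust either; it applies a check of the same form, |actual - expected| <= atol + rtol * |expected|, and would flag the same elements:
self.assertTrue(torch.allclose(actual, expected, atol=1e-4, rtol=1e-3))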
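The issue was opened so that the test can be marked XFAIL. What such a marker does can be sketched with the standard library alone. This is a minimal illustration, not PyTorch's own mechanism: the helper `xfail_if`, the constant `IS_AARCH64`, and the stand-in test class are hypothetical names, and the assumption that AArch64 hosts report "aarch64" or "arm64" from `platform.machine()` should be checked on the target machine.

```python
import io
import platform
import unittest

# Assumption: AArch64 hosts report "aarch64" (Linux) or "arm64" (macOS).
IS_AARCH64 = platform.machine().lower() in ("aarch64", "arm64")


def xfail_if(condition):
    """Hypothetical helper: expect the test to fail only when `condition` is true."""
    def decorator(test_fn):
        return unittest.expectedFailure(test_fn) if condition else test_fn
    return decorator


class DeconvFreezingExample(unittest.TestCase):
    # Stand-in for the real test. It always fails, mimicking the reported mismatch.
    # A real marker would pass IS_AARCH64 instead of True.
    @xfail_if(True)
    def test_known_mismatch(self):
        self.assertAlmostEqual(0.0, 4.3125, delta=1e-5)


def run_example():
    """Run the stand-in test quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DeconvFreezingExample)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


if __name__ == "__main__":
    result = run_example()
    print("expected failures:", len(result.expectedFailures))
    print("run successful:", result.wasSuccessful())
```

With the marker in place the failing assertion is recorded as an expected failure, so the overall run still counts as successful. Guarding the marker with a platform condition keeps the test active on other architectures.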

Verification

To verify, re-run the test on an AArch64 machine (the failure was observed on Neoverse-V1):

python test/inductor/test_aot_inductor_arrayref.py AOTInductorTestABICompatibleCpuWithStackAllocation.test_deconv_freezing_cpu_with_stack_allocation

Once the test is marked XFAIL, the run should report it as an expected failure rather than an error. If the test passes unexpectedly, the underlying mismatch has likely been fixed and the XFAIL marker can be removed.

Extra Tips

  • Architecture-specific differences in rounding and operation order usually produce small discrepancies. A relative difference above 3, as reported here, more likely points to a functional bug than to precision loss.
  • torch.allclose and assertEqual both compare against atol + rtol * |expected|, so switching between them does not change which elements are flagged as mismatched.
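To see why tolerance changes alone are unlikely to help, the closeness criterion |actual - expected| <= atol + rtol * |expected| can be inverted to ask how large |expected| would need to be for the reported absolute difference to pass. The sketch below is plain-Python arithmetic on the numbers from the error message. The function name is hypothetical, and the actual expected value at that index is not given in the issue.

```python
def min_expected_magnitude(abs_diff, atol, rtol):
    """Smallest |expected| for which abs_diff <= atol + rtol * |expected| holds."""
    return max(0.0, (abs_diff - atol) / rtol)


REPORTED_ABS_DIFF = 4.3125  # greatest absolute difference from the error message

# Tolerances the failing run reported as allowed.
current = min_expected_magnitude(REPORTED_ABS_DIFF, atol=1e-05, rtol=0.016)

# Tolerances sometimes proposed as a fix (atol=1e-4, rtol=1e-3).
proposed = min_expected_magnitude(REPORTED_ABS_DIFF, atol=1e-4, rtol=1e-3)

print(f"current tolerances:  |expected| >= {current:.2f}")
print(f"proposed tolerances: |expected| >= {proposed:.2f}")
```

The proposed tolerances require a larger expected magnitude than the current ones, so they make the check harder to satisfy for this element, not easier.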
