pytorch - 💡(How to fix) Fix AArch64 Unit Test Failures - Multiple failures in test/nn/test_convolution.py TestConvolutionNNDeviceTypeCPU for oneDNN [1 participants]

pytorch · 2026-03-12 10:55:37

GitHub stats

pytorch/pytorch#177245•Fetched 2026-04-08 00:42:55

Author

robert-hardwick

Error Message

Traceback (most recent call last):
  File "/builds/software-machine-learning-infra-frameworks-workspaces-robhar02/pytorch/test/nn/test_convolution.py", line 3224, in test_conv_contiguous_for_oneDNN
    self.assertEqual(y, y_)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 4365, in assertEqual
    raise error_metas.pop()[0].to_error(  # type: ignore[index]
AssertionError: Tensor-likes are not close!

Mismatched elements: 765360 / 4111104 (18.6%)
Greatest absolute difference: 0.001953125 at index (0, 12, 78, 147) (up to 1e-05 allowed)
Greatest relative difference: 362.0 at index (0, 13, 62, 151) (up to 0.001 allowed)

To execute this test, run the following from the base repo dir:
    python test/nn/test_convolution.py TestConvolutionNNDeviceTypeCPU.test_conv_contiguous_for_oneDNN_cpu

🐛 Describe the bug

Failing tests (suspected to share the same cause)

python test/nn/test_convolution.py TestConvolutionNNDeviceTypeCPU.test_conv_contiguous_for_oneDNN_cpu
python test/nn/test_convolution.py TestConvolutionNNDeviceTypeCPU.test_conv_ic1_channels_last_for_oneDNN_cpu

Example Traceback

Traceback (most recent call last):
  File "/builds/software-machine-learning-infra-frameworks-workspaces-robhar02/pytorch/test/nn/test_convolution.py", line 3224, in test_conv_contiguous_for_oneDNN
    self.assertEqual(y, y_)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 4365, in assertEqual
    raise error_metas.pop()[0].to_error(  # type: ignore[index]
AssertionError: Tensor-likes are not close!

Mismatched elements: 765360 / 4111104 (18.6%)
Greatest absolute difference: 0.001953125 at index (0, 12, 78, 147) (up to 1e-05 allowed)
Greatest relative difference: 362.0 at index (0, 13, 62, 151) (up to 0.001 allowed)

To execute this test, run the following from the base repo dir:
    python test/nn/test_convolution.py TestConvolutionNNDeviceTypeCPU.test_conv_contiguous_for_oneDNN_cpu

Affects Neoverse-V1 and Neoverse-V2
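
To read the numbers in the traceback, it helps to recall the closeness rule documented for torch.testing.assert_close: an element passes when |actual - expected| <= atol + rtol * |expected|. The sketch below applies that rule in plain Python. The expected and actual values are made up for illustration (the issue reports only the differences, not the tensor values); only the tolerances and the 0.001953125 figure come from the traceback.

```python
# Tolerances reported as "allowed" in the traceback.
RTOL = 1e-3
ATOL = 1e-5


def is_close(actual: float, expected: float, rtol: float = RTOL, atol: float = ATOL) -> bool:
    """Element-wise rule: |actual - expected| <= atol + rtol * |expected|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


def relative_difference(actual: float, expected: float) -> float:
    """Absolute difference divided by the magnitude of the expected value."""
    return abs(actual - expected) / abs(expected)


# Largest absolute difference reported in the traceback.
MAX_ABS_DIFF = 0.001953125

# Hypothetical expected values: the same absolute error passes for a large
# expected value and fails for a small one.
large_expected = 2.5
small_expected = 0.5
passes_when_large = is_close(large_expected + MAX_ABS_DIFF, large_expected)
passes_when_small = is_close(small_expected + MAX_ABS_DIFF, small_expected)

# Hypothetical near-zero expected value: a small absolute error turns into a
# relative difference in the hundreds.
near_zero_expected = 5e-6
near_zero_rel_diff = relative_difference(near_zero_expected + 1.8e-3, near_zero_expected)
```

Under these assumptions, a relative difference of several hundred does not require a large absolute error; it is what a small absolute error looks like on an output close to zero.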

Versions

Commit - https://github.com/pytorch/pytorch/commit/08b6f48d871affbc7abe9277020aed882fdf110a

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @jerryzh168 @aditew01 @gujinghui @PenghuiCheng @jianyuh @min-jean-cho @yanbing-j @Guobing-Chen @Xia-Weiwen @snadampal

extent analysis

Fix Plan

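The proposed fix relaxes the comparison in the two failing tests so that it tolerates the numerical differences between the oneDNN convolution path and the reference result. This is only appropriate once the differences have been confirmed to be expected rounding behaviour on Neoverse-V1 and Neoverse-V2 rather than a oneDNN regression.

Update test_conv_contiguous_for_oneDNN and test_conv_ic1_channels_last_for_oneDNN to compare with a larger tolerance. The failing run already allowed atol=1e-05 and rtol=0.001, so the new values must be looser than those; an absolute tolerance at or above the observed maximum difference of 0.001953125 is sufficient for the reported failure.
Pass the tolerance explicitly, either through the atol and rtol keyword arguments of assertEqual or by calling torch.testing.assert_close, which accepts the same two arguments.

Example code changes (the tolerance values are illustrative and not taken from the issue):

import torch

# ...

def test_conv_contiguous_for_oneDNN(self):
    # ...
    # Illustrative tolerances: atol sits just above the largest absolute
    # difference reported in the traceback (0.001953125).
    torch.testing.assert_close(y, y_, rtol=1e-3, atol=2e-3)

def test_conv_ic1_channels_last_for_oneDNN(self):
    # ...
    # Same illustrative tolerances; adjust once this test's own differences are known.
    torch.testing.assert_close(y, y_, rtol=1e-3, atol=2e-3)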

Verification

To verify the fix, run the affected tests again:

python test/nn/test_convolution.py TestConvolutionNNDeviceTypeCPU.test_conv_contiguous_for_oneDNN_cpu
python test/nn/test_convolution.py TestConvolutionNNDeviceTypeCPU.test_conv_ic1_channels_last_for_oneDNN_cpu

If both tests pass on the affected hardware (Neoverse-V1 and Neoverse-V2), the tolerance change resolves the reported failures.
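
For repeated runs, the two commands can be wrapped in a small script. This is a minimal sketch, assuming it is launched from the base repo dir with the same Python interpreter used to build PyTorch; build_commands and run_all are hypothetical helper names, not part of the PyTorch test tooling.

```python
import subprocess
import sys
from typing import Dict, List

TEST_FILE = "test/nn/test_convolution.py"
TEST_CLASS = "TestConvolutionNNDeviceTypeCPU"
TEST_NAMES = [
    "test_conv_contiguous_for_oneDNN_cpu",
    "test_conv_ic1_channels_last_for_oneDNN_cpu",
]


def build_commands(python: str = sys.executable) -> List[List[str]]:
    """Return one argv list per failing test, in the form used by the issue."""
    return [[python, TEST_FILE, f"{TEST_CLASS}.{name}"] for name in TEST_NAMES]


def run_all(commands: List[List[str]]) -> Dict[str, int]:
    """Run each command and map the test identifier to its exit code."""
    results = {}
    for cmd in commands:
        completed = subprocess.run(cmd)
        results[cmd[-1]] = completed.returncode
    return results
```

Calling run_all(build_commands()) returns a dictionary of exit codes; an exit code of 0 for both entries means both tests passed.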

Extra Tips

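Two correct implementations of the same convolution can differ slightly because of reduced-precision arithmetic or a different order of floating-point operations, so tests that compare backends need tolerances that reflect this.
assertEqual in PyTorch's test suite already performs a tolerance-based comparison (the failure message lists the allowed absolute and relative tolerances), so switching to torch.testing.assert_close does not change the outcome by itself; what matters is the tolerance that is passed.
Loosen tolerances as little as possible: a tolerance that is too large can hide genuine regressions.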
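
One observation that may help when investigating: the reported greatest absolute difference, 0.001953125, is exactly 2**-9, and the allowed tolerances in the traceback (rtol 0.001, atol 1e-05) match the defaults that torch.testing documents for float16. The issue does not state the dtype, so treating this as a float16 comparison is an assumption. Under that assumption, the sketch below uses Python's struct module (format "e" is IEEE 754 half precision) to show that 2**-9 is the spacing between adjacent float16 values in [2, 4).

```python
import struct


def float16_spacing(x: float) -> float:
    """Gap between x (rounded to float16) and the next larger float16 value.

    Intended for positive, finite x well below the float16 maximum.
    """
    # Reinterpret the float16 encoding of x as an unsigned 16-bit integer.
    (bits,) = struct.unpack("<H", struct.pack("<e", x))
    (current,) = struct.unpack("<e", struct.pack("<H", bits))
    # For positive values, adding 1 to the bit pattern gives the next float16.
    (following,) = struct.unpack("<e", struct.pack("<H", bits + 1))
    return following - current


spacing_at_1 = float16_spacing(1.0)    # 2**-10
spacing_at_2 = float16_spacing(2.0)    # 2**-9
spacing_at_3_5 = float16_spacing(3.5)  # still 2**-9 inside [2, 4)
```

This does not identify the root cause. It only suggests that the largest difference may correspond to a single float16 rounding step on an output between 2 and 4, which is worth checking before choosing a new tolerance.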
