pytorch - ✅(Solved) Fix DISABLED test_jvp_triangular_solve_cpu_float32 (__main__.TestOperatorsCPU) [4 pull requests, 4 comments, 4 participants]

GitHub stats
pytorch/pytorch#182091 (fetched 2026-05-02 05:27:28)
Comments: 4
Participants: 4
Timeline: 70
Reactions: 0
Timeline (top): mentioned ×24, subscribed ×24, labeled ×10, referenced ×5

Root Cause

This test was disabled because it is failing on the main branch (recent examples).

Fix Action

Fixed

PR fix notes

PR #177012: Accelerate SDPA on Arm CPUs: Update OpenBLAS to v0.3.33

Description (problem / solution / changelog)

Stack from ghstack (oldest at bottom):

  • -> #177012

Fixes #182091 Fixes SVE128 part of #182091

OpenBLAS v0.3.31 adds support for BGEMM on SVE128, SVE256 machines and general optimizations for SBGEMM/BGEMM: https://github.com/OpenMathLib/OpenBLAS/pull/5419, https://github.com/OpenMathLib/OpenBLAS/pull/5399 ... among other things.

OpenBLAS v0.3.32 accelerates SBGEMM/BGEMM on SVE128 machines by ~20% through https://github.com/OpenMathLib/OpenBLAS/pull/5667.

OpenBLAS v0.3.33 contains an SBGEMM fix for non-SVE machines and adds detection logic for Neoverse-V3.

This accelerates SDPA, and #172945 will capitalize on it to further accelerate linear, mm, bmm, etc.

Performance

Using this SDPA benchmark, here are the scaled dot-product attention speedups achieved with 16 Neoverse-V2 cores:

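| B | Hq | Hkv | Lq | Lk | D | causal | gqa | Speedup from #176881 vs current | Speedup from #176881 and this PR vs current | Speedup from #176881, #177009 and this PR vs current |
|---|----|-----|------|------|-----|--------|-------|---------|---------|---------|
| 1 | 32 | 8 | 2048 | 2048 | 128 | True | True | +9.48% | +14.91% | +35.60% |
| 1 | 32 | 8 | 1 | 2048 | 128 | False | True | -1.42% | -2.79% | -0.95% |
| 1 | 16 | 16 | 6400 | 6400 | 80 | False | False | +5.18% | +11.60% | +27.95% |
| 1 | 20 | 20 | 1500 | 1500 | 64 | False | False | +6.63% | +11.80% | +24.86% |
| 8 | 20 | 20 | 1500 | 1500 | 64 | False | False | +9.31% | +17.12% | +31.82% |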

PS: BGEMM means bf16 x bf16 -> bf16 and SBGEMM means: bf16 x bf16 -> fp32
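
To make the BGEMM / SBGEMM distinction concrete, here is a minimal pure-Python sketch. It is not how OpenBLAS implements these kernels: the helper names (`to_bf16`, `to_f32`, `bgemm`, `sbgemm`) are hypothetical, and accumulating in fp32 before the final rounding is an assumption made for illustration. Only the input and output types come from the note above.

```python
import struct

def to_f32(x: float) -> float:
    """Round a finite Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_bf16(x: float) -> float:
    """Round a finite float to the nearest bfloat16 value (ties to even).

    bfloat16 is the top 16 bits of a float32, so we round away the low 16 bits.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bias = 0x7FFF + ((bits >> 16) & 1)  # round-to-nearest-even
    bits = (bits + bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def _matmul(a, b, round_output):
    """Multiply bf16-rounded inputs; accumulate in fp32 (assumption for this sketch)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            acc = 0.0
            for p in range(inner):
                acc = to_f32(acc + to_bf16(a[i][p]) * to_bf16(b[p][j]))
            row.append(round_output(acc))
        out.append(row)
    return out

def bgemm(a, b):
    """bf16 x bf16 -> bf16: the result is rounded back to bfloat16."""
    return _matmul(a, b, to_bf16)

def sbgemm(a, b):
    """bf16 x bf16 -> fp32: the result keeps float32 precision."""
    return _matmul(a, b, to_f32)

a = [[1.0, 2.0 ** -8]]   # both entries are exactly representable in bf16
b = [[1.0], [1.0]]
bgemm_result = bgemm(a, b)    # 1 + 2**-8 is not representable in bf16
sbgemm_result = sbgemm(a, b)  # 1 + 2**-8 is representable in fp32
```

With these inputs the exact sum 1 + 2^-8 survives in the fp32 output but is rounded to 1.0 in the bf16 output, which is the practical difference between the two output types.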

Changed files

  • .ci/docker/common/install_openblas.sh (modified, +1/-1)
  • torch/testing/_internal/common_methods_invocations.py (modified, +2/-2)
RAW_BUFFER

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @jerryzh168 @aditew01 @mruberry @jianyuh @nikitaved @walterddr @xwang233 @Lezcano @snadampal @milpuz01 @nikhil-arm @fadara01 @robert-hardwick @nWEIdia @Chillee @samdow @kshitij12345

extent analysis

TL;DR

The test test_jvp_triangular_solve_cpu_float32 in test_ops.py needs to be investigated and fixed to resolve the failure on the main branch.

Guidance

  • Investigate the recent failures on torch-ci to understand the error messages and stack traces.
  • Review the test case test_jvp_triangular_solve_cpu_float32 in test_ops.py to identify potential issues with the test or the underlying implementation.
  • Check if there are any known issues or fixes related to the jvp and triangular_solve functions.
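
As background for the last point, the following pure-Python sketch shows what a JVP (forward-mode derivative) of a triangular solve computes and how it can be sanity-checked against finite differences. It is an illustration under stated assumptions, not the PyTorch test itself: the actual test runs in float32 through PyTorch, while this sketch uses Python floats, a lower-triangular matrix, a vector right-hand side, and hypothetical helper names. It relies on the standard identity that for X = A^{-1} B, the tangent is dX = A^{-1} (dB - dA X).

```python
def solve_lower(a, b):
    """Forward substitution: solve a @ x = b for lower-triangular a and vector b."""
    x = [0.0] * len(b)
    for i in range(len(b)):
        partial = sum(a[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - partial) / a[i][i]
    return x

def jvp_solve_lower(a, b, da, db):
    """Return (x, dx) where dx = a^{-1} (db - da @ x).

    Only the lower triangle of da is read, matching what solve_lower reads from a.
    """
    x = solve_lower(a, b)
    rhs = [
        db[i] - sum(da[i][j] * x[j] for j in range(i + 1))
        for i in range(len(x))
    ]
    return x, solve_lower(a, rhs)

def finite_difference(a, b, da, db, eps=1e-6):
    """Central-difference estimate of the same directional derivative."""
    n = len(b)
    def shifted(sign):
        a_s = [[a[i][j] + sign * eps * da[i][j] for j in range(n)] for i in range(n)]
        b_s = [b[i] + sign * eps * db[i] for i in range(n)]
        return solve_lower(a_s, b_s)
    plus, minus = shifted(+1.0), shifted(-1.0)
    return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]

# Illustrative values only; they do not come from the failing test.
a = [[2.0, 0.0], [1.0, 3.0]]
b = [4.0, 5.0]
da = [[0.5, 0.0], [0.2, -0.1]]
db = [1.0, -1.0]

x, dx = jvp_solve_lower(a, b, da, db)
dx_fd = finite_difference(a, b, da, db)
max_error = max(abs(t - f) for t, f in zip(dx, dx_fd))
```

A small `max_error` indicates the analytic tangent and the numerical estimate agree. A float32 comparison of this kind is sensitive to small changes in the underlying BLAS results, so comparing outputs before and after a BLAS library update may be a useful first check.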

Notes

The provided information lacks specific details about the error messages or the expected behavior of the test, making it challenging to provide a more targeted fix.

Recommendation

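PR #177012 is listed as fixing this issue ("Fixes #182091"); it updates OpenBLAS and modifies torch/testing/_internal/common_methods_invocations.py. The page does not state which release contains the fix, so if test_jvp_triangular_solve_cpu_float32 still fails on your branch, investigate the failure using the guidance above.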
