pytorch - ✅ (Solved) Fix DISABLED test_sdpa_rewriter_10_gpu (__main__.SDPAPatternRewriterGpuTests) [2 pull requests, 1 comment, 2 participants]

GitHub stats

pytorch/pytorch#178984 (fetched 2026-04-08 02:24:45)

  • Comments: 1
  • Participants: 2
  • Timeline: 63
  • Reactions: 0
  • Timeline (top): mentioned ×24, subscribed ×24, labeled ×8, referenced ×3

Root Cause

This test was disabled because it is failing on main branch (recent examples).

Fix Action

Fixed

PR fix notes

PR #178986: [xpu][fix] Fix meta kernel for _scaled_dot_product_fused_attention_overrideable to preserve query layout

Description (problem / solution / changelog)

Stack from ghstack (oldest at bottom):

  • -> #178986
  • #178959

Motivation

The XPU kernel _scaled_dot_product_fused_attention_overrideable allocates its output using alloc_with_matching_layout, which assigns the output the same stride ordering as the query tensor. When query is non-contiguous (e.g., after permute(0, 2, 1, 3) in SDPA fusion patterns), this produces a non-contiguous output with strides matching the permuted layout.
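The stride arithmetic behind that paragraph can be sketched in plain Python, without PyTorch. This is only an illustrative model: the helper names (`contiguous_strides`, `permute`, `is_contiguous`) and the tensor sizes are assumptions made for this example, not PyTorch APIs or values taken from the test.

```python
def contiguous_strides(shape):
    """Row-major (default) strides, in elements, for a dense tensor of `shape`."""
    strides = []
    running = 1
    for size in reversed(shape):
        strides.append(running)
        running *= size
    return tuple(reversed(strides))


def permute(shape, strides, dims):
    """Reorder dimensions without moving data: sizes and strides are permuted together."""
    return tuple(shape[d] for d in dims), tuple(strides[d] for d in dims)


def is_contiguous(shape, strides):
    """Simplified check: strides equal the default row-major strides for the shape."""
    return strides == contiguous_strides(shape)


# Hypothetical sizes: batch=2, seq_len=8, heads=4, head_dim=16.
base_shape = (2, 8, 4, 16)
base_strides = contiguous_strides(base_shape)

# The permute(0, 2, 1, 3) from the SDPA fusion pattern swaps seq_len and heads.
q_shape, q_strides = permute(base_shape, base_strides, (0, 2, 1, 3))

print(base_strides)                       # strides of the original dense tensor
print(q_shape, q_strides)                 # permuted view: same data, reordered strides
print(is_contiguous(q_shape, q_strides))  # False: the view is no longer row-major
```

An output allocated with the same stride ordering as this query would inherit the permuted strides rather than the default row-major ones.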

The behavior changed in https://github.com/pytorch/pytorch/pull/178494. The meta kernel, however, allocates its outputs using torch.empty, which always returns a contiguous tensor with default strides. The strides predicted by the meta kernel therefore no longer match the strides produced by the XPU kernel, and this mismatch can cause Inductor to raise an AssertionError during stride validation at runtime on XPU CI.
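The mismatch can be modeled as two allocation strategies disagreeing on strides. The sketch below is a plain-Python approximation under stated assumptions: `strides_like` stands in for the idea of "same stride ordering as the query", `check_strides` stands in for a stride validation step, and the sizes are made up. None of these are the actual implementations of alloc_with_matching_layout, the meta kernel, or Inductor's check.

```python
def contiguous_strides(shape):
    """Default row-major strides, as a freshly allocated dense tensor would have."""
    strides = []
    running = 1
    for size in reversed(shape):
        strides.append(running)
        running *= size
    return tuple(reversed(strides))


def strides_like(shape, ref_strides):
    """Dense strides for `shape` that keep the dimension ordering of `ref_strides`."""
    # Visit dimensions from innermost (smallest reference stride) to outermost.
    order = sorted(range(len(shape)), key=lambda d: ref_strides[d])
    strides = [0] * len(shape)
    running = 1
    for d in order:
        strides[d] = running
        running *= shape[d]
    return tuple(strides)


def check_strides(expected, actual):
    """Hypothetical validation: compile-time prediction must match the runtime tensor."""
    if expected != actual:
        raise AssertionError(f"expected stride {expected}, got {actual}")


# Hypothetical permuted query: shape (2, 4, 8, 16) with strides (512, 16, 64, 1).
q_shape, q_strides = (2, 4, 8, 16), (512, 16, 64, 1)

runtime_out = strides_like(q_shape, q_strides)  # layout-matching allocation
meta_before = contiguous_strides(q_shape)       # contiguous prediction
meta_after = strides_like(q_shape, q_strides)   # prediction that preserves query layout

check_strides(meta_after, runtime_out)          # passes: both follow the query layout
try:
    check_strides(meta_before, runtime_out)
    mismatch_detected = False
except AssertionError as exc:
    mismatch_detected = True
    print(exc)
```

In this model the contiguous prediction fails validation while the layout-preserving one passes, which is the shape of the problem the PR title describes. The real change is the one-line edit to torch/_meta_registrations.py listed under Changed files.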

Additional Context

Fixes https://github.com/pytorch/pytorch/issues/178984
Fixes https://github.com/pytorch/pytorch/issues/178974

Changed files

  • torch/_meta_registrations.py (modified, +1/-1)

Platforms: xpu

This test was disabled because it is failing on main branch (recent examples).

cc @mruberry @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @aakhundov @coconutruben @jataylo @gujinghui @fengyuan14 @drisspg @liangel-02 @howardzhang-cv

extent analysis

TL;DR

The failure of test_sdpa_rewriter_10_gpu (SDPAPatternRewriterGpuTests in test_fused_attention.py) on the main branch is addressed by PR #178986, which fixes the meta kernel for _scaled_dot_product_fused_attention_overrideable so that it preserves the query layout.

Guidance

  • Investigate the recent examples of test failures on torch-ci.com to identify patterns or commonalities in the failures.
  • Review the test_fused_attention.py file, specifically the SDPAPatternRewriterGpuTests class, to understand the test case and potential causes of failure.
  • Consider re-enabling the test and running it locally to gather more information about the failure.

Notes

The issue body itself gives no technical details about the failure; the only stated cause is the stride mismatch described in the notes for PR #178986.

Recommendation

Pick up the fix from PR #178986. If the test still fails afterwards, investigate further, since the issue body does not state the root cause.
