pytorch - ✅(Solved) Fix DISABLED test_sdpa_prev_15_gpu (__main__.SDPAPatternRewriterGpuTests) [2 pull requests, 1 comment, 2 participants]

GitHub stats
pytorch/pytorch#178974 (fetched 2026-04-08 02:22:08)
Comments: 1 · Participants: 2 · Timeline events: 63 · Reactions: 0
Timeline (top): mentioned ×24, subscribed ×24, labeled ×8, referenced ×3

Root Cause

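This test was disabled because it was failing on the main branch on XPU. According to PR #178986, which closes this issue, the likely cause is a stride mismatch: the XPU kernel _scaled_dot_product_fused_attention_overrideable returns an output whose layout follows the query tensor, while its meta kernel predicted a contiguous output.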

Fix Action

Fixed

PR fix notes

PR #178986: [xpu][fix] Fix meta kernel for _scaled_dot_product_fused_attention_overrideable to preserve query layout

Description (problem / solution / changelog)

Stack from ghstack (oldest at bottom):

  • -> #178986
  • #178959

Motivation

The XPU kernel _scaled_dot_product_fused_attention_overrideable allocates its output using alloc_with_matching_layout, which assigns the output the same stride ordering as the query tensor. When query is non-contiguous (e.g., after permute(0, 2, 1, 3) in SDPA fusion patterns), this produces a non-contiguous output with strides matching the permuted layout.
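The stride arithmetic behind this paragraph can be sketched in plain Python, without PyTorch. The helper names below (contiguous_strides, permute, strides_like) and the example shape are hypothetical and chosen only for illustration; strides_like is a rough stand-in for the idea of "same stride ordering as the query", not the actual implementation of alloc_with_matching_layout.

```python
def contiguous_strides(shape):
    """Row-major (default) strides for a dense tensor of the given shape."""
    strides, step = [], 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def permute(shape, strides, dims):
    """Reorder dimensions without moving data: sizes and strides move together."""
    return tuple(shape[d] for d in dims), tuple(strides[d] for d in dims)


def strides_like(shape, ref_strides):
    """Dense strides for `shape` that keep the dimension ordering of `ref_strides`.

    Illustrative assumption only: dimensions are laid out innermost-first
    in order of increasing reference stride.
    """
    order = sorted(range(len(shape)), key=lambda d: ref_strides[d])
    out, step = [0] * len(shape), 1
    for d in order:
        out[d] = step
        step *= shape[d]
    return tuple(out)


# Hypothetical query: a dense (batch, seq, heads, head_dim) tensor,
# then permute(0, 2, 1, 3) as in the SDPA fusion patterns.
base_shape = (2, 8, 4, 16)
q_shape, q_strides = permute(base_shape, contiguous_strides(base_shape), (0, 2, 1, 3))

# Output allocated to follow the query's stride ordering.
out_strides = strides_like(q_shape, q_strides)

print("query shape:       ", q_shape)
print("query strides:     ", q_strides)
print("contiguous strides:", contiguous_strides(q_shape))
print("matching strides:  ", out_strides)
```

In this sketch the permuted query and the layout-matching output share the same strides, and both differ from the default contiguous strides for that shape, which is the non-contiguous output the paragraph describes.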

PR https://github.com/pytorch/pytorch/pull/178494 changed the behavior. The meta kernel, however, allocates its outputs using torch.empty, which always returns a contiguous tensor with default strides. The strides predicted at compile time therefore differ from the strides the XPU kernel produces at runtime, and this mismatch can cause Inductor to raise an AssertionError during stride validation on XPU CI.
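A minimal, PyTorch-free sketch of why such a mismatch surfaces as an assertion failure follows. The function check_strides is a hypothetical stand-in for a compiler-side stride check, not Inductor's actual code, and the shape and runtime strides are assumed example values for a query that went through permute(0, 2, 1, 3).

```python
def contiguous_strides(shape):
    """Default row-major strides, as a torch.empty-style allocation would have."""
    strides, step = [], 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def check_strides(expected, actual):
    """Hypothetical stride validation: compile-time prediction vs. runtime result."""
    assert expected == actual, f"expected stride {expected}, got {actual}"


shape = (2, 4, 8, 16)

# What a meta kernel that allocates a plain contiguous output would predict.
meta_strides = contiguous_strides(shape)

# Assumed runtime strides of an output that follows a permuted query's layout.
runtime_strides = (512, 16, 64, 1)

try:
    check_strides(meta_strides, runtime_strides)
    mismatch = None
except AssertionError as exc:
    mismatch = str(exc)

print(mismatch)
```

Under these assumptions the check fails because the second and third strides are swapped relative to the contiguous prediction; a meta kernel that reports the query-following layout would make the two sides agree.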

Additional Context

Fixes https://github.com/pytorch/pytorch/issues/178984 and https://github.com/pytorch/pytorch/issues/178974

Changed files

  • torch/_meta_registrations.py (modified, +1/-1)

Raw issue body

Platforms: xpu

This test was disabled because it is failing on main branch (recent examples).

cc @mruberry @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @aakhundov @coconutruben @jataylo @gujinghui @fengyuan14 @drisspg @liangel-02 @howardzhang-cv

extent analysis

TL;DR

The test failure in test_fused_attention.py on the main branch may be resolved by re-enabling and re-running the test after investigating the recent failures.

Guidance

  • Investigate the recent test failures on torch-ci.com to identify the root cause.
  • Re-enable the disabled test and re-run it to see if the issue persists.
  • Collaborate with the listed team members (@mruberry, @chauhang, etc.) to discuss potential fixes or workarounds.

Notes

The provided information lacks technical details about the test failure, making it challenging to provide a specific solution.

Recommendation

Apply workaround: Re-enable and re-run the test after investigating recent failures, as this may help identify and potentially resolve the issue.
