pytorch - ✅(Solved) Fix DISABLED test_scaled_mm (__main__.TestFlopCounter) [2 pull requests, 1 comments, 2 participants]

GitHub stats

pytorch/pytorch#179960 (fetched 2026-04-11 06:11:06)

  • Comments: 1
  • Participants: 2
  • Timeline events: 29
  • Reactions: 0
  • Timeline (top): mentioned ×11, subscribed ×11, labeled ×6, commented ×1

Root Cause

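This test was disabled because it was failing on the main branch on ROCm. The PRs linked to this issue attribute the failure to evaluate_platform_supports_fp8() falsely reporting FP8 support on ROCm devices whose architecture is not FP8-capable.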

PR fix notes

PR #180518: [ROCm] Fix evaluate_platform_supports_fp8 false-positive

Description (problem / solution / changelog)

On ROCm, evaluate_platform_supports_fp8() did not return False when the device architecture did not match any supported FP8 GPU, causing the function to fall through and report support unconditionally.

Add an explicit False return when no supported ROCm architecture is matched, ensuring correct FP8 capability detection on unsupported devices.

Fixes #179960 Fixes #179949

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @jataylo @hongxiayang @naromero77amd @pragupta @jerrymannil @xinyazhang

Changed files

  • torch/testing/_internal/common_cuda.py (modified, +1/-0)
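
The fall-through described above can be illustrated with a minimal sketch. This is not the code in torch/testing/_internal/common_cuda.py: the function names, parameters, and architecture list below are hypothetical stand-ins, and the device is passed in as plain values so the example runs without a GPU. It only shows how a missing return in a platform branch can let an unmatched device reach a later, unrelated result.

```python
# Hypothetical list of FP8-capable ROCm architecture prefixes (illustrative only).
FP8_ROCM_ARCHS = ("gfx94", "gfx95")


def supports_fp8_buggy(is_rocm: bool, arch_name: str, generic_result: bool) -> bool:
    """Sketch of the bug: an unmatched ROCm device falls out of the branch."""
    if is_rocm:
        for arch in FP8_ROCM_ARCHS:
            if arch in arch_name:
                return True
        # No return here, so execution continues past the ROCm branch.
    return generic_result


def supports_fp8_fixed(is_rocm: bool, arch_name: str, generic_result: bool) -> bool:
    """Sketch of the fix: one added line ends the ROCm branch explicitly."""
    if is_rocm:
        for arch in FP8_ROCM_ARCHS:
            if arch in arch_name:
                return True
        return False  # explicit answer for unsupported ROCm architectures
    return generic_result


if __name__ == "__main__":
    # "gfx90a" stands in for an architecture that is not in the list above.
    print(supports_fp8_buggy(True, "gfx90a", True))
    print(supports_fp8_fixed(True, "gfx90a", True))
```

In this sketch the fixed variant differs from the buggy one by a single added line, which matches the +1/-0 size of the change reported for the PR.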

PR #180715: [ROCm] Fix evaluate_platform_supports_fp8 false-positive

Description (problem / solution / changelog)

On ROCm, evaluate_platform_supports_fp8() did not return False when the device architecture did not match any supported FP8 GPU, causing the function to fall through and report support unconditionally.

Add an explicit False return when no supported ROCm architecture is matched, ensuring correct FP8 capability detection on unsupported devices.

Fixes #179960 Fixes #179949

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @jataylo @hongxiayang @naromero77amd @pragupta @jerrymannil @xinyazhang

Changed files

  • torch/testing/_internal/common_cuda.py (modified, +1/-0)
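
A false positive in a capability check matters when tests are gated on it. The sketch below assumes such gating, using the standard library's unittest.skipIf; platform_supports_fp8 and TestFp8Gate are hypothetical names, and the check is hard-coded to False to stand in for an unsupported device. If the check wrongly returned True, the test body would run and fail instead of being skipped.

```python
import unittest


def platform_supports_fp8() -> bool:
    """Hypothetical capability check, hard-coded for an unsupported device."""
    return False


class TestFp8Gate(unittest.TestCase):
    @unittest.skipIf(not platform_supports_fp8(), "FP8 is not supported on this device")
    def test_fp8_path(self):
        # Placeholder for work that needs FP8 hardware support.
        self.fail("FP8 path executed on a device without FP8 support")


def run_suite() -> unittest.TestResult:
    """Run the gated test and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFp8Gate)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    outcome = run_suite()
    print(len(outcome.skipped), len(outcome.failures))
```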

Raw issue body

Platforms: rocm

This test was disabled because it is failing on main branch (recent examples).

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @jataylo @hongxiayang @naromero77amd @jerrymannil @xinyazhang @mruberry

extent analysis

TL;DR

  • Re-enable and investigate the failing test test_scaled_mm in test_flop_counter.py to identify the root cause of the failure on the main branch.

Guidance

  • Review the recent failure examples on torch-ci.com to understand the failure patterns and potential error messages.
  • Investigate the test_scaled_mm test case in test_flop_counter.py to determine the conditions under which it fails.
  • Check if there are any recent changes in the code or dependencies that could be causing the test to fail.

Notes

  • The issue lacks specific technical details about the failure, so a thorough investigation of the test case and failure examples is necessary.

Recommendation

  • Apply workaround: Re-enable the test and add additional logging or debugging statements to help identify the root cause of the failure.

FAIL-SAFE: Given the limited information, the best course of action is to investigate the test case and failure examples to gather more information before attempting a fix.
