pytorch: Extend nvMatmulHeuristics config key tuple for stages/split_k

pytorch/pytorch#177578 (fetched 2026-04-08 00:47:26)

Root Cause

Note: tile_k is currently excluded from the key because nvMatmulHeuristics and cutlass_api use it to mean different things.

When cutlass_api adds support for stages/split_k, the kernel config key tuple in torch/_inductor/template_heuristics/nv_universal_gemm.py should be extended to include those parameters.

cc @ptrblck @msaroufim @eqy @jerryzh168 @tinglvv @nWEIdia @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @aakhundov @coconutruben @jataylo

Fix Plan

The fix extends the kernel config key tuple in torch/_inductor/template_heuristics/nv_universal_gemm.py to include the stages and split_k parameters, once cutlass_api supports them.

Steps

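  • Update the function that builds the kernel config key so that the tuple includes stages and split_k. The name kernel_config_key and the signature shown here are illustrative; check them against the actual code in nv_universal_gemm.py:
def kernel_config_key(operator, operand_shapes, operand_dtypes,
                      operator_type, tile_n, tile_m, tile_k,
                      stages, split_k):
    # ...
    # tile_k is deliberately left out of the key: nvMatmulHeuristics and
    # cutlass_api use it to mean different things.
    key = (operator_type, tile_n, tile_m, stages, split_k)
    # ...
    return key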
  • Before adding stages and split_k to the key, confirm that nvMatmulHeuristics and cutlass_api mean the same thing by each of them. tile_k stays excluded precisely because the two libraries define it differently.
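
The steps above can be sketched as a standalone key builder. This is a minimal illustration under stated assumptions, not the code in nv_universal_gemm.py: HeuristicConfig, config_key, and the supports_stages_split_k flag are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HeuristicConfig:
    """Hypothetical stand-in for one kernel configuration."""
    tile_m: int
    tile_n: int
    tile_k: int
    stages: int
    split_k: int


def config_key(cfg: HeuristicConfig, supports_stages_split_k: bool) -> Tuple[int, ...]:
    """Build the tuple used to match configs between the two libraries.

    tile_k is never part of the key, because the two sides interpret it
    differently. stages and split_k are appended only when the (hypothetical)
    flag says both sides support them.
    """
    key: Tuple[int, ...] = (cfg.tile_m, cfg.tile_n)
    if supports_stages_split_k:
        key += (cfg.stages, cfg.split_k)
    return key


a = HeuristicConfig(tile_m=128, tile_n=256, tile_k=64, stages=3, split_k=1)
b = HeuristicConfig(tile_m=128, tile_n=256, tile_k=32, stages=4, split_k=2)

# With the short key the two configs collide; with the extended key they do not.
print(config_key(a, False) == config_key(b, False))  # True
print(config_key(a, True) == config_key(b, True))    # False
```

Gating the extra fields behind a flag is one possible design: it would let the key stay unchanged until cutlass_api actually exposes stages and split_k.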

Verification

  • Test the updated key function with several stages and split_k values and check that configs differing only in stages or split_k produce distinct key tuples.
  • Verify that every place in nv_universal_gemm.py that builds or compares the key uses the extended tuple.
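
A check along these lines might look like the following. It is only a sketch: extended_key is a hypothetical stand-in for the real key builder, and the tile sizes, stages, and split_k values are arbitrary.

```python
from itertools import product


def extended_key(tile_m, tile_n, tile_k, stages, split_k):
    # Hypothetical key builder: tile_k is accepted but intentionally ignored.
    return (tile_m, tile_n, stages, split_k)


def check_keys():
    stages_values = (2, 3, 4)
    split_k_values = (1, 2, 4)
    keys = {
        extended_key(128, 128, 64, s, k)
        for s, k in product(stages_values, split_k_values)
    }
    # Every (stages, split_k) combination should yield its own key.
    assert len(keys) == len(stages_values) * len(split_k_values)
    # Changing only tile_k must not change the key.
    assert extended_key(128, 128, 32, 3, 1) == extended_key(128, 128, 64, 3, 1)
    return len(keys)


print(check_keys())  # 9
```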

Extra Tips

  • Be cautious when updating the kernel_config_key function to avoid introducing regressions.
  • Consider adding tests to cover different scenarios and ensure the fix works as expected.
