pytorch - 💡(How to fix) Fix FlexAttention: extend AuxRequest to return min_scores and key indices for row-wise max/min [1 participants]

pytorch • 2026-03-08 18:43:52

GitHub stats

pytorch/pytorch#176837•Fetched 2026-04-08 00:24:12

Author

sirluk

Root Cause

FlexAttention is a great API for prototyping new attention variants, especially for researchers who are not familiar with CUDA or Triton. Because of that, I’d like to extend FlexAttention’s AuxOutput API so users can optionally request more row-wise information from the attention score reduction. Since flex_attention is still a prototype feature in PyTorch, this seems like a good time to add this functionality.

Code Example

class AuxRequest(NamedTuple):
    lse: bool = False
    max_scores: bool = False
    min_scores: bool = False
    max_indices: bool = False
    min_indices: bool = False

class AuxOutput(NamedTuple):
    lse: Tensor | None = None
    max_scores: Tensor | None = None
    min_scores: Tensor | None = None
    max_indices: Tensor | None = None
    min_indices: Tensor | None = None

🚀 The feature, motivation and pitch

Currently, AuxRequest supports returning lse and max_scores. I’d like to also make it possible to request:

min_scores
key index for max scores (e.g. max_indices)
key index for min scores (e.g. min_indices)

There are many use cases in sparse attention and, more generally, in KV cache management where access to the row-wise maximum and minimum QK dot products and their key positions would be useful.

Concretely, I imagine the API looking something like this:

class AuxRequest(NamedTuple):
    lse: bool = False
    max_scores: bool = False
    min_scores: bool = False
    max_indices: bool = False
    min_indices: bool = False

class AuxOutput(NamedTuple):
    lse: Tensor | None = None
    max_scores: Tensor | None = None
    min_scores: Tensor | None = None
    max_indices: Tensor | None = None
    min_indices: Tensor | None = None
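To make the requested quantities concrete, here is a minimal pure-Python reference for a single head. It is only a sketch: `RowStats` and `row_stats` are hypothetical names, not part of PyTorch, and it assumes the statistics are taken over the scaled QK dot products before softmax, with ties resolved to the lowest key index. The issue does not specify either point.

```python
from typing import List, NamedTuple, Sequence


class RowStats(NamedTuple):
    """Hypothetical container for the row-wise statistics (not a PyTorch type)."""
    max_scores: List[float]
    min_scores: List[float]
    max_indices: List[int]
    min_indices: List[int]


def row_stats(q: Sequence[Sequence[float]],
              k: Sequence[Sequence[float]],
              scale: float = 1.0) -> RowStats:
    """Compute per-query max/min of scale * (q . k) and the key index of each."""
    max_s, min_s, max_i, min_i = [], [], [], []
    for q_row in q:
        # One score per key: the scaled dot product of this query with that key.
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        hi = max(range(len(scores)), key=scores.__getitem__)  # first max on ties
        lo = min(range(len(scores)), key=scores.__getitem__)  # first min on ties
        max_s.append(scores[hi])
        min_s.append(scores[lo])
        max_i.append(hi)
        min_i.append(lo)
    return RowStats(max_s, min_s, max_i, min_i)


# Two queries, three keys, head dimension 2.
q = [[1.0, 0.0], [0.0, 2.0]]
k = [[1.0, 1.0], [3.0, -1.0], [-2.0, 0.5]]
stats = row_stats(q, k)
print(stats)
```

A fused kernel would track these values during the same pass that accumulates the softmax, rather than materializing the full score matrix as this reference does.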

Alternatives

No response

Additional context

No response

cc @chauhang @penguinwu @Chillee @drisspg @yanboliang @BoyuanFeng @liangel-02 @howardzhang-cv

extent analysis

Fix Plan

Update AuxRequest and AuxOutput Classes

To add the new functionality, we need to update the AuxRequest and AuxOutput classes to include the new fields.

from typing import NamedTuple, Optional

from torch import Tensor  # needed for the AuxOutput annotations

class AuxRequest(NamedTuple):
    lse: bool = False
    max_scores: bool = False
    min_scores: bool = False
    max_indices: bool = False
    min_indices: bool = False

class AuxOutput(NamedTuple):
    lse: Optional[Tensor] = None
    max_scores: Optional[Tensor] = None
    min_scores: Optional[Tensor] = None
    max_indices: Optional[Tensor] = None
    min_indices: Optional[Tensor] = None
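One design point worth checking is backward compatibility. The sketch below redefines only `AuxRequest` locally (so it runs without PyTorch) and shows that, because the new fields are appended after the existing ones and default to `False`, existing keyword and positional call sites would keep their meaning. This assumes the current field order is `lse`, `max_scores`, as the issue describes.

```python
from typing import NamedTuple


class AuxRequest(NamedTuple):
    # Existing fields stay first so positional construction is unchanged.
    lse: bool = False
    max_scores: bool = False
    # Proposed fields, appended with defaults.
    min_scores: bool = False
    max_indices: bool = False
    min_indices: bool = False


# Call sites written against the two-field version still behave the same.
old_keyword = AuxRequest(lse=True, max_scores=True)
old_positional = AuxRequest(True, True)

# List which outputs a request actually asks for.
requested = [name for name, flag in old_keyword._asdict().items() if flag]
print(requested)
```

Inserting a new field between `lse` and `max_scores` instead would silently change what `AuxRequest(True, True)` requests.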

Update FlexAttention API

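FlexAttention would also need to compute the new values during the score reduction and return them in AuxOutput, populating only the fields the caller requested. The helper below is a hypothetical sketch of that last step, not existing PyTorch code; the kernel changes themselves are not shown.

def build_aux_output(request, lse, max_scores, min_scores, max_indices, min_indices):
    # Hypothetical helper: fields that were not requested stay None.
    return AuxOutput(
        lse=lse if request.lse else None,
        max_scores=max_scores if request.max_scores else None,
        min_scores=min_scores if request.min_scores else None,
        max_indices=max_indices if request.max_indices else None,
        min_indices=min_indices if request.min_indices else None,
    )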

Update Usage Example

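The usage example should request the new fields. In current PyTorch the request is passed through the keyword-only return_aux argument, and flex_attention returns the attention output together with an AuxOutput. The three new fields below only exist once the proposal is implemented.

aux_request = AuxRequest(
    lse=True,
    max_scores=True,
    min_scores=True,
    max_indices=True,
    min_indices=True,
)

# query, key, value: tensors of shape (batch, heads, seq_len, head_dim)
output, aux_output = flex_attention(query, key, value, return_aux=aux_request)
print(aux_output)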

Verification

To verify the change, check that the returned AuxOutput contains the new fields and that each requested field is populated (not None) with plausible values.

print(aux_output.lse)
print(aux_output.max_scores)
print(aux_output.min_scores)
print(aux_output.max_indices)
print(aux_output.min_indices)
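Printing the fields only shows that they exist. A stronger check compares each row against a reference computed from the raw scores. The sketch below does this for one query row in pure Python; `check_aux_row` is a hypothetical helper, and it assumes all statistics are taken over the same score row. Real kernel outputs would need a tolerance on the score comparisons as well, which this sketch omits.

```python
import math
from typing import List, Sequence


def check_aux_row(scores: Sequence[float], lse: float,
                  max_score: float, min_score: float,
                  max_index: int, min_index: int,
                  tol: float = 1e-6) -> List[str]:
    """Return a list of problems found for one query row (empty means OK)."""
    n = len(scores)
    if not (0 <= max_index < n and 0 <= min_index < n):
        return ["index out of range"]
    problems = []
    # The reported index must point at the reported value, which must be the extreme.
    if scores[max_index] != max_score or max_score != max(scores):
        problems.append("max_scores/max_indices mismatch")
    if scores[min_index] != min_score or min_score != min(scores):
        problems.append("min_scores/min_indices mismatch")
    # Numerically stable reference logsumexp.
    top = max(scores)
    ref_lse = top + math.log(sum(math.exp(s - top) for s in scores))
    if abs(lse - ref_lse) > tol:
        problems.append("lse mismatch")
    return problems


row = [0.5, -1.0, 2.0]
row_lse = math.log(sum(math.exp(s) for s in row))
ok = check_aux_row(row, row_lse, 2.0, -1.0, max_index=2, min_index=1)
bad = check_aux_row(row, row_lse, 2.0, -1.0, max_index=2, min_index=0)
print(ok, bad)
```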

Extra Tips

Make sure to update the
