pytorch - 💡(How to fix) Fix linear attention / mamba2 OP [2 comments, 3 participants]

pytorch · 2026-03-10 12:28:33


pytorch/pytorch#177021•Fetched 2026-04-08 00:22:43


🚀 The feature, motivation and pitch

Linear attention / Mamba2 is becoming popular, and it requires a high-performance kernel for prefill, as described in https://pytorch.org/blog/accelerating-mamba2-with-kernel-fusion/. Is there any plan for PyTorch to add a new op for it? Thanks.

Alternatives

No response

Additional context

No response

extent analysis

Fix Plan

Add a New Kernel Fusion OP for Linear Attention

To address the prefill performance requirements of linear attention / Mamba2, this plan proposes adding a new kernel fusion op to PyTorch. Here's a step-by-step guide:

Step 1: Create a New Kernel Fusion OP

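Create a new file kernel_fusion_linear_attention.py in the PyTorch torch/nn/modules directory (torch/nn/functional.py is a single file, so there is no torch/nn/modules/functional directory to put it in). The module below is only a placeholder: it wraps a plain linear layer and does not yet contain a fused kernel.

import torch
from torch import nn

class LinearAttentionKernelFusion(nn.Module):
    """Placeholder module; the fused kernel itself is not implemented here."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, input):
        # Stand-in for the kernel fusion implementation: currently a plain linear projection.
        return self.linear(input)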
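As a point of reference for what the placeholder above would eventually need to compute, here is a minimal sketch of causal linear attention in its recurrent form, written in plain PyTorch. The function name `linear_attention_recurrent` is hypothetical, and the sketch assumes no feature map, no normalization, and no decay term (Mamba2 adds a decay, which is omitted here).

```python
import torch

def linear_attention_recurrent(q, k, v):
    """Causal linear attention via a running state S_t = S_{t-1} + k_t^T v_t.

    q, k: (batch, seq, d_k); v: (batch, seq, d_v).
    Assumption: no feature map, normalization, or decay is applied.
    """
    batch, seq, d_k = q.shape
    d_v = v.shape[-1]
    state = q.new_zeros(batch, d_k, d_v)
    outputs = []
    for t in range(seq):
        # Rank-1 update of the state with the current key/value pair.
        state = state + k[:, t].unsqueeze(-1) * v[:, t].unsqueeze(-2)
        # Read out with the current query: o_t = q_t @ S_t.
        outputs.append(torch.einsum("bd,bde->be", q[:, t], state))
    return torch.stack(outputs, dim=1)

q = torch.randn(2, 6, 4)
k = torch.randn(2, 6, 4)
v = torch.randn(2, 6, 3)
print(linear_attention_recurrent(q, k, v).shape)
```

The Python loop over time steps is what makes this form slow for prefill, which is the motivation for a fused kernel in the first place.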

Step 2: Register the New OP

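from .kernel_fusion_linear_attention import LinearAttentionKernelFusion

# ...

# nn.Module.apply(fn) only maps fn over submodules; it is not a functional entry
# point like torch.autograd.Function.apply. Instantiate the module instead
# (the sizes are placeholders matching the test in Step 4).
attn_kernel_fusion = LinearAttentionKernelFusion(in_features=10, out_features=10)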

Step 3: Update the PyTorch Build System

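Update the build system to include the new op. This step assumes the kernel is written in C++ as kernel_fusion_linear_attention.cpp, a file the earlier steps do not create. For a library built against an installed PyTorch, add the following lines to CMakeLists.txt; TORCH_LIBRARIES is the variable set by find_package(Torch):

find_package(Torch REQUIRED)
add_library(kernel_fusion_linear_attention SHARED kernel_fusion_linear_attention.cpp)
target_link_libraries(kernel_fusion_linear_attention "${TORCH_LIBRARIES}")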

Step 4: Test the New OP

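Test the new op by running the following code, which assumes the LinearAttentionKernelFusion class from Step 1 is importable as shown:

import torch
from kernel_fusion_linear_attention import LinearAttentionKernelFusion

model = LinearAttentionKernelFusion(10, 10)
input = torch.randn(1, 10)
output = model(input)
print(output.shape)

This should print torch.Size([1, 10]), since the module currently maps 10 input features to 10 output features.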

Verification

To verify that the fix worked, run the following code
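A minimal sketch of one such check, assuming the op is meant to match causal linear attention without normalization or decay: compare a chunked computation (the general shape of a prefill kernel) against the quadratic reference. The function names `linear_attention_reference` and `linear_attention_chunked` are hypothetical.

```python
import torch

def linear_attention_reference(q, k, v):
    # Quadratic causal form: O = tril(Q K^T) V.
    scores = q @ k.transpose(-1, -2)
    return torch.tril(scores) @ v

def linear_attention_chunked(q, k, v, chunk_size=4):
    # Process the sequence in chunks, carrying a (d_k, d_v) state between chunks.
    batch, seq, d_k = q.shape
    state = q.new_zeros(batch, d_k, v.shape[-1])
    outputs = []
    for start in range(0, seq, chunk_size):
        qc = q[:, start:start + chunk_size]
        kc = k[:, start:start + chunk_size]
        vc = v[:, start:start + chunk_size]
        # Contribution from earlier chunks, read through the carried state.
        inter = qc @ state
        # Causal contribution from positions inside the current chunk.
        intra = torch.tril(qc @ kc.transpose(-1, -2)) @ vc
        outputs.append(inter + intra)
        # Fold this chunk's keys and values into the state.
        state = state + kc.transpose(-1, -2) @ vc
    return torch.cat(outputs, dim=1)

torch.manual_seed(0)
q, k, v = (torch.randn(2, 10, 8, dtype=torch.float64) for _ in range(3))
torch.testing.assert_close(
    linear_attention_chunked(q, k, v),
    linear_attention_reference(q, k, v),
)
print("chunked and reference outputs match")
```

A real fused kernel would be checked the same way: run it and the reference on identical inputs and compare with `torch.testing.assert_close`, using looser tolerances for half-precision dtypes.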
