pytorch - `torch.compile` fails on `index_fill` after `permute`: functionalization assertion (`copy_`) while eager works [2 comments, 2 participants]

GitHub stats
pytorch/pytorch#178952 · Fetched 2026-04-08 01:56:53
Comments: 2 · Participants: 2 · Timeline events: 14 · Reactions: 0
Timeline (top): labeled ×5, mentioned ×3, subscribed ×3, commented ×2

Error Message

AssertionError: n=copy_, n.args[0]=empty, ...


🐛 Describe the bug

Summary

A minimal model using permute followed by index_fill runs correctly in eager mode, but fails under torch.compile with an assertion from AOTAutograd functionalization:

AssertionError: n=copy_, n.args[0]=empty, ...

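The failure is raised inside the backend (backend='inductor') before code generation. The graph is rejected as non-functional because it contains a copy_ whose destination is a newly created empty tensor (n.args[0]=empty) rather than one of the graph placeholders.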


Minimal Reproduction

import torch
import torch.nn as nn

class M(nn.Module):
    def forward(self, x, idx):
        x = x.permute(0, 2, 1)
        x = x.index_fill(1, idx, 0.0)
        return x

m = M()
x = torch.randn(5, 1, 3)
idx = torch.tensor([0, 2], dtype=torch.long)

# eager works
y = m(x, idx)

# compile fails
cm = torch.compile(m)
cy = cm(x, idx)
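
One detail of this input that may be worth inspecting is its memory layout after permute. The sketch below only prints shapes and strides in eager mode. Whether the stride difference it shows is what leads to the extra empty/copy_ pair in the compiled graph is an assumption, not something the issue confirms.

```python
import torch

# Same input shape as the reproduction above.
x = torch.randn(5, 1, 3)
xp = x.permute(0, 2, 1)

print(xp.shape)            # torch.Size([5, 3, 1])
print(xp.stride())         # (3, 1, 3)
print(xp.is_contiguous())  # True: the stride of a size-1 dim is ignored

# Strides that a freshly allocated tensor of the same shape gets.
fresh = torch.empty(xp.shape)
print(fresh.stride())      # (3, 1, 1)
```

The permuted tensor counts as contiguous, yet its strides differ from those of a new tensor with the same shape, so calling .contiguous() on it is not expected to change its layout.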

Actual Behavior

The compiled path fails with:

AssertionError: n=copy_, n.args[0]=empty, placeholders={arg1_1, arg0_1}

Relevant FX graph snippet:

%permute = aten.permute(...)
%index_put = aten.index_put(...)
%empty = aten.empty(...)
%copy_ = aten.copy_(...)

Full error:

torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
AssertionError: n=copy_, n.args[0]=empty, placeholders={arg1_1, arg0_1}, graph=graph():
    %arg0_1 : [num_users=1] = placeholder[target=arg0_1]
    %arg1_1 : [num_users=1] = placeholder[target=arg1_1]
    %permute : [num_users=1] = call_function[target=torch.ops.aten.permute.default](args = (%arg0_1, [0, 2, 1]), kwargs = {})
    %scalar_tensor : [num_users=1] = call_function[target=torch.ops.aten.scalar_tensor.default](args = (0.0,), kwargs = {dtype: torch.float32, layout: torch.strided, device: cpu, pin_memory: False})
    %expand : [num_users=1] = call_function[target=torch.ops.aten.expand.default](args = (%scalar_tensor, [5, 2, 1]), kwargs = {})
    %index_put : [num_users=1] = call_function[target=torch.ops.aten.index_put.default](args = (%permute, [None, %arg1_1], %expand), kwargs = {})
    %empty : [num_users=1] = call_function[target=torch.ops.aten.empty.memory_format](args = ([5, 3, 1],), kwargs = {dtype: torch.float32, layout: torch.strided, device: cpu, pin_memory: False})
    %copy_ : [num_users=1] = call_function[target=torch.ops.aten.copy_.default](args = (%empty, %index_put), kwargs = {})
    return (copy_,)
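
To narrow down which stage raises the assertion, one common approach is to compile the same module with progressively heavier backends. The sketch below is an illustration, not part of the original report: try_backends is a hypothetical helper name, and the outcome on any given PyTorch version is not asserted here.

```python
import torch
import torch._dynamo
import torch.nn as nn

class M(nn.Module):
    def forward(self, x, idx):
        x = x.permute(0, 2, 1)
        return x.index_fill(1, idx, 0.0)

def try_backends(backends=("eager", "aot_eager")):
    """Compile M with each backend and record whether one call succeeds."""
    x = torch.randn(5, 1, 3)
    idx = torch.tensor([0, 2], dtype=torch.long)
    results = {}
    for backend in backends:
        torch._dynamo.reset()  # drop cached compilations between runs
        try:
            torch.compile(M(), backend=backend)(x, idx)
            results[backend] = "ok"
        except Exception as exc:  # e.g. BackendCompilerFailed
            results[backend] = type(exc).__name__
    return results

results = try_backends()
print(results)
```

The "eager" backend exercises only Dynamo graph capture, while "aot_eager" adds AOTAutograd tracing without Inductor code generation. Adding "inductor" to the tuple runs the full pipeline used in the report.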

Versions

PyTorch version:  2.11.0
Is debug build: True
CUDA used to build PyTorch: 12.6
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version: (Ubuntu 12.3.0-1ubuntu1~22.04) 12.3.0
Clang version: Could not collect
CMake version: version 3.22.1
Libc version: glibc-2.35

Python version: 3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-6.8.0-59-generic-x86_64-with-glibc2.35
Is CUDA available: True

cc @chauhang @penguinwu

extent analysis

TL;DR

A possible workaround is to rewrite the model so that the traced graph no longer contains the copy_ that triggers the assertion.

Guidance

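  • The assertion fires because the traced graph contains a copy_ that writes into a newly created empty tensor (n.args[0]=empty) instead of a graph input, so the graph is not considered functional.
  • To mitigate this, try expressing the index_fill step with a different out-of-place operation (for example, a mask-based fill) so that this copy_ is not emitted. This is untested against the issue.
  • Verify any workaround by checking that the compiled model runs without errors and matches the eager output.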
  • If the issue persists, try updating the PyTorch version to a newer release, as the current version (
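
As an illustration of the "modify the index_fill operation" suggestion, here is a minimal sketch of a mask-based replacement. index_fill_via_mask is a hypothetical helper, not a PyTorch API, and it assumes non-negative indices. It is checked against eager index_fill only; whether it avoids the assertion under torch.compile on the affected version has not been verified.

```python
import torch

def index_fill_via_mask(x, dim, idx, value):
    """Out-of-place equivalent of x.index_fill(dim, idx, value).

    Assumes idx holds non-negative indices into dimension `dim`.
    """
    n = x.shape[dim]
    positions = torch.arange(n, device=x.device)
    # hit[i] is True when position i appears anywhere in idx.
    hit = (positions.unsqueeze(1) == idx.unsqueeze(0)).any(dim=1)
    # Reshape so the mask broadcasts along every dimension except `dim`.
    view_shape = [1] * x.dim()
    view_shape[dim] = n
    return x.masked_fill(hit.view(view_shape), value)

x = torch.randn(5, 1, 3).permute(0, 2, 1)
idx = torch.tensor([0, 2], dtype=torch.long)
expected = x.index_fill(1, idx, 0.0)
actual = index_fill_via_mask(x, 1, idx, 0.0)
print(torch.equal(actual, expected))  # True
```

In the model from the report, the call x.index_fill(1, idx, 0.0) would become index_fill_via_mask(x, 1, idx, 0.0).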
