pytorch - ✅(Solved) Fix `torch.compile` silently succeeds on `select → unsqueeze → cat` pattern where eager raises IndexError [1 pull requests, 2 comments, 3 participants]

GitHub stats
pytorch/pytorch#178881 (fetched 2026-04-08 01:57:16)
Comments: 2 · Participants: 3 · Timeline events: 67 · Reactions: 0
Timeline (top): mentioned ×26, subscribed ×26, labeled ×10, commented ×2

Error Message

import os
os.environ["TRITON_BACKENDS_IN_TREE"] = "1"

import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(num_embeddings=100, embedding_dim=32)
        self.linear = nn.Linear(32, 16)

    def forward(self, x):
        embedded = self.embedding(x)
        embedded = embedded.float()
        base_tensor = self.linear(embedded)

        # This makes dim=1 have size 1
        base_tensor = base_tensor.unsqueeze(1)  # shape: [batch, 1, seq_len, 16]

        # select(dim=1, index=0) is valid
        # select(dim=1, index=1) is OUT OF BOUNDS: size is 1!
        # select(dim=1, index=2) is OUT OF BOUNDS: size is 1!
        slice0 = torch.select(base_tensor, dim=1, index=0)
        slice1 = torch.select(base_tensor, dim=1, index=1)  # <- should FAIL
        slice2 = torch.select(base_tensor, dim=1, index=2)  # <- should FAIL

        unsqueezed0 = torch.unsqueeze(slice0, dim=1)
        unsqueezed1 = torch.unsqueeze(slice1, dim=1)
        unsqueezed2 = torch.unsqueeze(slice2, dim=1)

        result = torch.cat([unsqueezed0, unsqueezed1, unsqueezed2], dim=1)
        return result


device = "cuda"
model = Model().to(device).eval()
x = torch.randint(0, 100, (4, 10), dtype=torch.long, device=device)

# Eager: should raise IndexError
try:
    with torch.no_grad():
        eager_out = model(x)
    print(f"eager: OK shape={eager_out.shape}")
except (IndexError, RuntimeError) as e:
    print(f"eager: ERROR - {e}")
# Expected: eager: ERROR - select(): index 1 out of range...

# Compiled: should also raise, but SILENTLY SUCCEEDS
torch._dynamo.reset()
compiled = torch.compile(model, backend="inductor")
try:
    with torch.no_grad():
        comp_out = compiled(x)
    print(f"compile: OK shape={comp_out.shape}")
except Exception as e:
    print(f"compile: ERROR - {e}")
# Actual: compile: OK (should have raised the same error!)

Root Cause

The model creates an embedding + linear pipeline, then inserts an unsqueeze(1) that makes dim=1 have size 1, followed by three consecutive select(dim=1, index=0/1/2) operations. Since the tensor only has size 1 at dim=1, indices 1 and 2 are out of bounds. Eager mode correctly raises an error, but torch.compile with Inductor silently produces an output — likely because the merge_unbind_stack_aten optimization pass bypasses the bounds check.

Fix Action

Fixed

PR fix notes

PR #178926: [inductor] Preserve select bounds checks in split-cat pass

Description (problem / solution / changelog)

Fixes #178881

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo

Changed files

  • test/inductor/test_split_cat_fx_passes.py (modified, +37/-0)
  • torch/_inductor/fx_passes/split_cat.py (modified, +5/-0)

🐛 Describe the bug

torch.compile with inductor backend silently succeeds on a model that performs torch.select(dim=1, index=1) on a tensor with size 1 at dimension 1 — where eager mode correctly raises IndexError: select(): index 1 out of range for tensor of size [4, 1, 10, 16] at dimension 1.

The model creates an embedding + linear pipeline, then inserts an unsqueeze(1) that makes dim=1 have size 1, followed by three consecutive select(dim=1, index=0/1/2) operations. Since the tensor only has size 1 at dim=1, indices 1 and 2 are out of bounds. Eager mode correctly raises an error, but torch.compile with Inductor silently produces an output — likely because the merge_unbind_stack_aten optimization pass bypasses the bounds check.

This was discovered by the WhiteFox fuzzer in experiments E4 (trial_4), E9, E11, and E12 (round-001), where it was flagged as a jit_status mismatch.
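The kind of status mismatch described above (one execution mode raises, the other does not) can be detected with a small differential harness. The sketch below is only an illustration and is not taken from WhiteFox: `run_outcome`, `status_mismatch`, `strict_pick` and `lenient_pick` are hypothetical names, and the two `*_pick` functions are plain-Python stand-ins for the eager model and the compiled model. In the issue's setting one would presumably pass `model` and `torch.compile(model)` instead.

```python
from typing import Any, Callable, Tuple


def run_outcome(fn: Callable[..., Any], *args: Any) -> Tuple[str, str]:
    """Call fn and report ("ok", "") or ("error", <exception type name>)."""
    try:
        fn(*args)
    except Exception as exc:  # broad on purpose: only the status matters here
        return "error", type(exc).__name__
    return "ok", ""


def status_mismatch(reference_fn: Callable[..., Any],
                    candidate_fn: Callable[..., Any],
                    *args: Any) -> bool:
    """True when exactly one of the two callables raises on the same input."""
    return run_outcome(reference_fn, *args)[0] != run_outcome(candidate_fn, *args)[0]


def strict_pick(values, index):
    """Stand-in for eager mode: rejects out-of-range indices."""
    if not -len(values) <= index < len(values):
        raise IndexError(f"index {index} out of range for size {len(values)}")
    return values[index]


def lenient_pick(values, index):
    """Stand-in for a buggy optimized path: never raises on a bad index."""
    return values[index % len(values)]


print(status_mismatch(strict_pick, lenient_pick, [10], 1))  # True
print(status_mismatch(strict_pick, lenient_pick, [10], 0))  # False
```

Comparing only the raised/not-raised status, rather than the output values, is enough to catch this class of bug, because there is no valid eager output to compare against.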

Minimal reproducer

import os
os.environ["TRITON_BACKENDS_IN_TREE"] = "1"

import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(num_embeddings=100, embedding_dim=32)
        self.linear = nn.Linear(32, 16)

    def forward(self, x):
        embedded = self.embedding(x)
        embedded = embedded.float()
        base_tensor = self.linear(embedded)

        # This makes dim=1 have size 1
        base_tensor = base_tensor.unsqueeze(1)  # shape: [batch, 1, seq_len, 16]

        # select(dim=1, index=0) is valid
        # select(dim=1, index=1) is OUT OF BOUNDS — size is 1!
        # select(dim=1, index=2) is OUT OF BOUNDS — size is 1!
        slice0 = torch.select(base_tensor, dim=1, index=0)
        slice1 = torch.select(base_tensor, dim=1, index=1)  # <- should FAIL
        slice2 = torch.select(base_tensor, dim=1, index=2)  # <- should FAIL

        unsqueezed0 = torch.unsqueeze(slice0, dim=1)
        unsqueezed1 = torch.unsqueeze(slice1, dim=1)
        unsqueezed2 = torch.unsqueeze(slice2, dim=1)

        result = torch.cat([unsqueezed0, unsqueezed1, unsqueezed2], dim=1)
        return result


device = "cuda"
model = Model().to(device).eval()
x = torch.randint(0, 100, (4, 10), dtype=torch.long, device=device)

# Eager: should raise IndexError
try:
    with torch.no_grad():
        eager_out = model(x)
    print(f"eager: OK shape={eager_out.shape}")
except (IndexError, RuntimeError) as e:
    print(f"eager: ERROR — {e}")
# Expected: eager: ERROR — select(): index 1 out of range...

# Compiled: should also raise, but SILENTLY SUCCEEDS
torch._dynamo.reset()
compiled = torch.compile(model, backend="inductor")
try:
    with torch.no_grad():
        comp_out = compiled(x)
    print(f"compile: OK shape={comp_out.shape}")
except Exception as e:
    print(f"compile: ERROR — {e}")
# Actual: compile: OK — should have raised the same error!

Behavior summary

Mode | Behavior | Correct?
Eager | IndexError: select(): index 1 out of range for tensor of size [4, 1, 10, 16] at dimension 1 | Yes
torch.compile | Silently succeeds and returns a tensor | No

Root cause hypothesis

The merge_unbind_stack_aten pass in Inductor's split-cat optimization recognizes the pattern of consecutive select → unsqueeze → cat operations and attempts to merge them back into the original tensor (essentially recognizing that unbind→stack is an identity). However, the optimization applies even when the indices are out of bounds, skipping the runtime bounds check that eager mode performs. The pass likely statically determines the pattern without validating that each select index is within the tensor's actual dimension size.
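To make the hypothesis concrete, here is a minimal sketch of the validation such a rewrite would need before treating select → unsqueeze → cat as an identity. It is not the code from PR #178926 and does not use Inductor's APIs: `can_merge_selects` is a hypothetical helper, and it assumes the rewrite is only valid when the selects cover the whole dimension in order.

```python
from typing import Sequence


def can_merge_selects(dim_size: int, indices: Sequence[int]) -> bool:
    """Hypothetical guard for a select -> unsqueeze -> cat identity rewrite.

    The rewrite is allowed only if every index is in bounds for a dimension
    of size dim_size and the indices enumerate 0..dim_size-1 in order.
    """
    # Any out-of-bounds index must keep its original select so that the
    # runtime error eager mode raises is preserved.
    if any(not -dim_size <= i < dim_size for i in indices):
        return False
    # Map negative indices to their non-negative equivalents.
    normalized = [i % dim_size for i in indices]
    return normalized == list(range(dim_size))


print(can_merge_selects(1, [0, 1, 2]))     # False: the pattern from this issue
print(can_merge_selects(3, [0, 1, 2]))     # True: full, ordered cover
print(can_merge_selects(3, [-3, -2, -1]))  # True: same cover via negative indices
print(can_merge_selects(3, [0, 1]))        # False: partial cover is not an identity
```

In the reproducer the dimension has size 1 while the indices are 0, 1 and 2, so a guard of this shape would refuse the rewrite and leave the failing select calls in the graph.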

Error logs

Eager mode (correct behavior):

IndexError: select(): index 1 out of range for tensor of size [4, 1, 10, 16] at dimension 1

torch.compile (incorrect — should raise the same error):

(no error — silently returns output tensor)

Versions

PyTorch version: 2.12.0.dev20260327+cu126
CUDA used to build PyTorch: 12.6
OS: Ubuntu 22.04.5 LTS (x86_64) — WSL2
Python version: 3.10.12
GPU: NVIDIA GeForce RTX 3060 Laptop GPU

cc @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @aakhundov @coconutruben @jataylo

extent analysis

TL;DR

The issue can be mitigated by avoiding the use of torch.compile with the inductor backend for models that perform out-of-bounds torch.select operations, or by manually adding bounds checks before the select operations.

Guidance

  • Identify and review all torch.select operations in the model to ensure that the indices are within the valid range of the tensor's dimensions.
  • Consider adding manual bounds checks before the select operations to raise an error if the index is out of range.
  • If possible, avoid using the inductor backend with torch.compile for models that perform out-of-bounds torch.select operations, and instead use eager mode or a different backend.
  • Verify that the model behaves correctly in eager mode before compiling it with the inductor backend.

Example

# Manual bounds check before torch.select (valid indices are -size .. size-1)
index = 1
size = base_tensor.size(1)
if not -size <= index < size:
    raise IndexError(f"select(): index {index} out of range for tensor of size {list(base_tensor.size())} at dimension 1")
slice1 = torch.select(base_tensor, dim=1, index=index)

Notes

  • The root cause of the issue is likely due to the merge_unbind_stack_aten optimization pass in the inductor backend, which bypasses the runtime bounds check for torch.select operations.
  • The issue may be specific to the inductor backend and may not occur with other backends or in eager mode.

Recommendation

Apply a workaround by manually adding bounds checks before the select operations, as the inductor backend may not be correctly handling out-of-bounds indices. This will ensure that the model raises an error if an out-of-bounds index is used, rather than silently producing an incorrect output.
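The recommended workaround can be packaged as a reusable helper. The sketch below is an assumption-laden illustration: `checked_select` is a hypothetical name, and it relies only on the `size(dim)` and `select(dim, index)` methods that `torch.Tensor` provides. So that the example runs without PyTorch, it is exercised on `ShapeOnly`, a hypothetical stand-in that tracks nothing but a shape. How such a Python-level check behaves under `torch.compile` (trace time versus run time) is not covered by the source.

```python
def checked_select(tensor, dim: int, index: int):
    """Validate index against tensor.size(dim) before calling tensor.select."""
    size = tensor.size(dim)
    # select accepts indices in [-size, size - 1]; anything else is an error.
    if not -size <= index < size:
        raise IndexError(
            f"select(): index {index} out of range for dimension {dim} with size {size}"
        )
    return tensor.select(dim, index)


class ShapeOnly:
    """Hypothetical stand-in for a tensor: exposes size() and select() only."""

    def __init__(self, *shape: int):
        self.shape = tuple(shape)

    def size(self, dim: int) -> int:
        return self.shape[dim]

    def select(self, dim: int, index: int) -> "ShapeOnly":
        dim = dim % len(self.shape)
        # Selecting along dim removes that dimension from the shape.
        return ShapeOnly(*(self.shape[:dim] + self.shape[dim + 1:]))


base = ShapeOnly(4, 1, 10, 16)           # same shape as base_tensor in the issue
print(checked_select(base, 1, 0).shape)  # (4, 10, 16)
try:
    checked_select(base, 1, 1)           # out of bounds: dimension 1 has size 1
except IndexError as exc:
    print(exc)
```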
