pytorch - 💡(How to fix) Fix Inductor standalone_compile misses fallback output unbacked symbol binding

pytorch · 2026-05-27 08:24:25

torch._inductor.standalone_compile can generate Python wrapper code that uses an unbacked symbol from a fallback op output without first binding it from the runtime output tensor.

The minimal case is aten.repeat_interleave.Tensor. Its output shape is data-dependent, so the generated wrapper must bind the output extent from the runtime tensor via buf.size(0). Instead, the wrapper can emit an assert_size_stride call and a downstream allocation that use the symbol before it has been assigned.
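To make the data dependence concrete, here is a minimal sketch in plain Python. The repeat_interleave function below is a hypothetical stand-in that only mimics the single-argument semantics of the op (index i repeated counts[i] times); it is not the ATen kernel and involves no tensors.

```python
# Pure-Python stand-in for the single-argument repeat_interleave semantics.
# This is an illustration only, not the ATen implementation.
def repeat_interleave(counts):
    """Return index i repeated counts[i] times, for each position i."""
    out = []
    for i, c in enumerate(counts):
        out.extend([i] * c)
    return out


# Two inputs with the same length (the same input "shape") ...
a = repeat_interleave([1, 2, 1, 0])
b = repeat_interleave([3, 0, 0, 5])

# ... give outputs of different lengths, because the output extent is
# sum(counts): a function of the values, not of the input shape.
print(a, len(a))
print(b, len(b))
```

Because the length cannot be derived from the input shape, a compiler has to represent it with a fresh symbol and read the concrete value back from the output at run time.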

Error Message

NameError: name 'u4' is not defined

Root Cause

When fresh unbacked symbols are ignored, FallbackKernel.process_kernel can receive a fallback output whose fake shape contains unbacked symbols while compute_unbacked_bindings(...) returns None, because those symbols are no longer in the pending fresh set.

The wrapper still needs runtime bindings for any unbacked symbols present in the fallback output layout, otherwise later wrapper code uses undefined variables.

Inductor standalone_compile misses fallback output unbacked symbol binding

Summary

torch._inductor.standalone_compile can generate Python wrapper code that uses an unbacked symbol from a fallback op output without first binding it from the runtime output tensor.

Repro

import torch
from torch._inductor import standalone_compile
from torch._subclasses.fake_tensor import FakeTensor
from torch.fx.experimental.proxy_tensor import make_fx


def fn(counts, x):
    # The length of idx depends on the values in counts, not on its shape.
    idx = torch.repeat_interleave(counts)
    return x[idx].sin()


counts = torch.tensor([1, 2, 1, 0], device="cuda", dtype=torch.int64)
x = torch.randn(4, 8, device="cuda")
torch._dynamo.mark_dynamic(counts, 0, min=1, max=8)

# Trace with symbolic shapes so the data-dependent extent becomes an unbacked symbol.
gm = make_fx(fn, tracing_mode="symbolic")(counts, x)

# Recover the fake mode used during tracing from the first FakeTensor in node metadata.
fake_mode = next(
    node.meta["val"].fake_mode
    for node in gm.graph.nodes
    if isinstance(node.meta.get("val"), FakeTensor)
)

# Compile under the same tracing context while ignoring fresh unbacked symbols.
with (
    torch._guards.tracing(torch._guards.TracingContext(fake_mode)),
    fake_mode.shape_env.ignore_fresh_unbacked_symbols(),
):
    compiled = standalone_compile(
        gm,
        [counts, x],
        dynamic_shapes="from_tracing_context",
        options={},
    )

# Running the compiled wrapper is where the failure surfaces.
torch.testing.assert_close(compiled(counts, x), fn(counts, x))

Observed Failure

The generated wrapper contains code equivalent to:

buf0 = torch.ops.aten.repeat_interleave.Tensor(arg0_1)
buf1 = buf0
assert_size_stride(buf1, (u4,), (1,), "torch.ops.aten.repeat_interleave.Tensor")
buf2 = empty_strided_cuda((u4, s9), (s9, 1), torch.float32)

u4 is never defined, so execution fails with:

NameError: name 'u4' is not defined
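The failure mode can be reproduced without PyTorch by executing a toy wrapper as source text. This is only a sketch: FakeBuf, fallback_op, and check_size are hypothetical stand-ins for the runtime tensor, the fallback kernel call, and assert_size_stride, and the two source strings are simplified versions of the wrapper snippets in this report.

```python
class FakeBuf:
    """Hypothetical stand-in for a runtime tensor; only size() is modeled."""

    def __init__(self, *shape):
        self.shape = shape

    def size(self, dim):
        return self.shape[dim]


def fallback_op():
    # Stand-in for the fallback kernel; the length is only known at run time.
    return FakeBuf(4)


def check_size(buf, expected):
    # Simplified stand-in for assert_size_stride (sizes only, no strides).
    assert buf.shape == expected, (buf.shape, expected)


# Wrapper that uses the symbol without binding it first.
BROKEN = """
buf0 = fallback_op()
buf1 = buf0
check_size(buf1, (u4,))
"""

# Wrapper that binds the symbol from the runtime output before using it.
FIXED = """
buf0 = fallback_op()
u4 = buf0.size(0)
buf1 = buf0
check_size(buf1, (u4,))
"""


def run(src):
    # Execute the wrapper source in a fresh namespace and report the outcome.
    env = {"fallback_op": fallback_op, "check_size": check_size}
    try:
        exec(src, env)
    except NameError as e:
        return f"NameError: {e}"
    return f"ok, u4 = {env['u4']}"


broken_result = run(BROKEN)
fixed_result = run(FIXED)
print(broken_result)
print(fixed_result)
```

The only difference between the two wrappers is the single assignment that reads the extent from the output buffer, which is the line the generated code is missing.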

Root Cause

The wrapper still needs runtime bindings for any unbacked symbols present in the fallback output layout, otherwise later wrapper code uses undefined variables.

Expected Behavior

The generated wrapper should bind the fallback output extent before using it:

buf0 = torch.ops.aten.repeat_interleave.Tensor(arg0_1)
u4 = buf0.size(0)
buf1 = buf0
assert_size_stride(buf1, (u4,), (1,), "torch.ops.aten.repeat_interleave.Tensor")

Proposed Fix

After compute_unbacked_bindings(...), if no bindings were produced but the fallback output still contains free unbacked symbols, derive bindings directly from the output structure using _free_unbacked_symbols_with_path(...). This keeps the fix local to fallback output binding and does not change graph chunking or symbolic-shape policy.
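The shape of this fallback can be sketched in plain Python. Everything below is a toy model under stated assumptions: shapes are tuples, unbacked symbols are strings starting with "u", and derive_bindings and emit_binding_lines are hypothetical helpers. It does not reproduce the signature or return type of _free_unbacked_symbols_with_path(...) or of compute_unbacked_bindings(...).

```python
def is_unbacked(dim):
    # Toy convention for this sketch: unbacked symbols are strings like "u4".
    return isinstance(dim, str) and dim.startswith("u")


def derive_bindings(output_shapes, already_bound=()):
    """Map each unbound unbacked symbol to the (output index, dim) that defines it."""
    bindings = {}
    for out_idx, shape in enumerate(output_shapes):
        for dim_idx, dim in enumerate(shape):
            if is_unbacked(dim) and dim not in already_bound and dim not in bindings:
                bindings[dim] = (out_idx, dim_idx)
    return bindings


def emit_binding_lines(bindings, buf_names):
    # Render wrapper-style assignments such as `u4 = buf0.size(0)`.
    return [
        f"{sym} = {buf_names[out_idx]}.size({dim_idx})"
        for sym, (out_idx, dim_idx) in bindings.items()
    ]


# Models the primary path producing no bindings for the fallback output.
primary = None

# Fake output layout of the fallback op: one 1-D output with extent u4.
fallback_shapes = [("u4",)]

# Fall back to structural derivation only when the primary path found nothing.
bindings = primary or derive_bindings(fallback_shapes)
lines = emit_binding_lines(bindings, ["buf0"])
print(lines)
```

Deriving the binding from the position of the symbol in the output keeps the change confined to the fallback output, which matches the stated goal of not touching graph chunking or symbolic-shape policy.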

cc @chauhang @penguinwu @ezyang @bobrenjc93 @aditvenk @laithsakka @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @aakhundov @coconutruben @jataylo
