pytorch: torch.compile(mode="max-autotune") fails with NotImplementedError: SliceView in get_stride()

Error Message

torch._inductor.exc.InductorError: LoweringException: NotImplementedError: SliceView
  target: aten.addmm.default

Root Cause

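When a function is compiled with torch.compile(mode="max-autotune"), passing tensors that were sliced with narrow() and reshaped with view_as() into F.linear raises NotImplementedError: SliceView while aten.addmm.default is being lowered. According to the traceback, tuned_addmm in torch/_inductor/kernel/mm.py evaluates inp.get_stride()[0] == 0 on the bias argument, which at that point is a TensorBox wrapping a SliceView. The call is forwarded to the wrapped node (return self.data.get_stride() in torch/_inductor/ir.py), and because SliceView does not implement get_stride(), it ends in raise NotImplementedError(type(self).__name__). The same function runs successfully in eager mode; only compilation fails.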
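
The delegation pattern described above can be illustrated with a small toy model. This is only a sketch: the classes below borrow the names that appear in the traceback, but they are simplified stand-ins written for this page, not Inductor's real classes, and describe_stride is a hypothetical helper. It shows how a wrapper that forwards get_stride() to a node without its own implementation ends up raising an error that carries only the class name.

```python
class IRNode:
    """Toy base class: strides are unknown unless a subclass provides them."""

    def get_stride(self):
        # Mirrors the shape of the failing line in the traceback:
        # the message is just the name of the concrete class.
        raise NotImplementedError(type(self).__name__)


class Buffer(IRNode):
    """Toy node with a concrete layout, so it can report a stride."""

    def __init__(self, stride):
        self.stride = stride

    def get_stride(self):
        return self.stride


class SliceView(IRNode):
    """Toy view that wraps another node but does not override get_stride()."""

    def __init__(self, data):
        self.data = data


class TensorBox(IRNode):
    """Toy wrapper that forwards get_stride() to the node it holds."""

    def __init__(self, data):
        self.data = data

    def get_stride(self):
        return self.data.get_stride()


def describe_stride(node):
    """Hypothetical helper: report the stride or the error it triggers."""
    try:
        return f"stride={node.get_stride()}"
    except NotImplementedError as exc:
        return f"NotImplementedError: {exc}"


if __name__ == "__main__":
    print(describe_stride(TensorBox(Buffer([1]))))
    print(describe_stride(TensorBox(SliceView(Buffer([1])))))
```

In this toy model the first call succeeds because the wrapped node knows its stride, while the second fails with the same message text as the real error, since the forwarded call lands on the base implementation.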

Code Example

import torch
import torch.nn.functional as F

def fn(x, weight, bias):
    flat = torch.cat([weight.view(-1), bias.view(-1)], dim=0)
    flat_div = flat / 4.0
    
    new_weight = flat_div.narrow(0, 0, weight.numel()).view_as(weight)
    new_bias = flat_div.narrow(0, weight.numel(), bias.numel()).view_as(bias)
    
    return F.linear(x, new_weight, new_bias)

def main():
    x = torch.randn(4, 64, device="cpu")
    weight = torch.randn(32, 64, device="cpu")
    bias = torch.randn(32, device="cpu")

    print("Running Eager mode...")
    expected_out = fn(x, weight, bias)
    print("Eager mode successful.")

    print("\nRunning Compiled mode (max-autotune)...")
    opt_fn = torch.compile(fn, mode="max-autotune")
    
    try:
        actual_out = opt_fn(x, weight, bias)
        
        torch.testing.assert_close(expected_out, actual_out)
        print("PASS: Compiled output matches Eager output!")
        
    except Exception as e:
        import traceback
        traceback.print_exc()
        print("\n[!] Bug Successfully Reproduced")

if __name__ == "__main__":
    main()

Traceback

(torch-nightly) xyt19@Oasis:/tmp$ TORCHDYNAMO_VERBOSE=1 python bug.py
Running Eager mode...
Eager mode successful.

Running Compiled mode (max-autotune)...
Traceback (most recent call last):
  File "/tmp/bug.py", line 26, in main
    actual_out = opt_fn(x, weight, bias)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1158, in compile_wrapper
    raise e.remove_dynamo_frames() from None  # see TORCHDYNAMO_VERBOSE=1
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1141, in compile_wrapper
    result = fn(*args, **kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 2619, in __call__
    result = self._torchdynamo_orig_backend(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 2310, in __call__
    result = self._inner_convert(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 777, in __call__
    result = _compile(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 2094, in _compile
    guarded_code, tracer_output = compile_inner(code, one_graph, hooks)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_utils_internal.py", line 96, in wrapper_function
    return function(*args, **kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1679, in compile_inner
    result = _compile_inner(code, one_graph, hooks)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1739, in _compile_inner
    dynamo_output = compile_frame(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1584, in compile_frame
    bytecode, tracer_output = transform_code_object(code, transform)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py", line 1766, in transform_code_object
    tracer_output = transformations(instructions, code_options)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1555, in transform
    tracer_output = trace_frame(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 368, in _fn
    return fn(*args, **kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 954, in trace_frame
    run_tracer()
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 935, in run_tracer
    tracer.run()
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1883, in run
    while self.step():
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1536, in step
    self.dispatch_table[inst.opcode](self, inst)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 5449, in RETURN_VALUE
    self._return(inst)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 5422, in _return
    all_stack_locals_metadata = self.output.compile_subgraph(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 2171, in compile_subgraph
    instructions, subgraph_pycode = self.compile_and_call_fx_graph(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 2817, in compile_and_call_fx_graph
    compiled_fn = self.call_user_compiler(gm, self.example_inputs())
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 2987, in call_user_compiler
    return self._call_user_compiler(gm, example_inputs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 3049, in _call_user_compiler
    compiled_fn = compiler_fn(gm, example_inputs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/repro/after_dynamo.py", line 159, in __call__
    compiled_gm = compiler_fn(gm, example_inputs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/__init__.py", line 2482, in __call__
    return compile_fx(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 2705, in compile_fx
    return compile_fx(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 2764, in compile_fx
    return _maybe_wrap_and_compile_fx_main(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 2845, in _maybe_wrap_and_compile_fx_main
    return _compile_fx_main(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 3058, in _compile_fx_main
    raise e.remove_dynamo_frames() from None  # see TORCHDYNAMO_VERBOSE=1
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 3043, in _compile_fx_main
    return dynamo_common.aot_autograd(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/backends/common.py", line 123, in __call__
    cg = aot_module_simplified(gm, example_inputs, **self.kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 1238, in aot_module_simplified
    compiled_fn, _ = aot_stage2_compile(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/graph_compile.py", line 357, in aot_stage2_compile
    return aot_stage2_inference(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/graph_compile.py", line 481, in aot_stage2_inference
    compiled_fw = _aot_stage2b_inference_compile(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/graph_compile.py", line 408, in _aot_stage2b_inference_compile
    return _aot_stage2b_compile_forward_or_inference(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/graph_compile.py", line 2779, in _aot_stage2b_compile_forward_or_inference
    compiled_fw_func = compiler(fw_module, adjusted_flat_args)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/schemas.py", line 1460, in __call__
    output_code = self.compiler_fn(gm, example_inputs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 2906, in fw_compiler_base
    return compile_fx_forward(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 2534, in compile_fx_forward
    result = inner_compile(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 836, in compile_fx_inner
    return wrap_compiler_debug(_compile_fx_inner, compiler_name="inductor")(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/repro/after_aot.py", line 317, in debug_wrapper
    inner_compiled_fn = compiler_fn(gm, example_inputs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1078, in _compile_fx_inner
    raise InductorError(e, currentframe()).with_traceback(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1058, in _compile_fx_inner
    mb_compiled_graph = fx_codegen_and_compile(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1845, in fx_codegen_and_compile
    return scheme.codegen_and_compile(gm, example_inputs, inputs_to_check, graph_kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1521, in codegen_and_compile
    graph.run(*example_inputs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1079, in run
    return super().run(*args)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/fx/interpreter.py", line 197, in run
    self.env[node] = self.run_node(node)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1932, in run_node
    result = super().run_node(n)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/fx/interpreter.py", line 294, in run_node
    return getattr(self, n.op)(n.target, args, kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1504, in call_function
    raise LoweringException(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1481, in call_function
    out = lowerings[target](*args, **kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/lowering.py", line 511, in wrapped
    out = decomp_fn(*args, **kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/kernel/mm.py", line 673, in tuned_addmm
    inp.get_stride()[0] == 0
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/ir.py", line 9452, in get_stride
    return self.data.get_stride()
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/ir.py", line 791, in get_stride
    raise NotImplementedError(type(self).__name__)
torch._inductor.exc.InductorError: LoweringException: NotImplementedError: SliceView
  target: aten.addmm.default
  args[0]: TensorBox(
    SliceView(
      StorageBox(
        ComputedBuffer(name='buf3', layout=FixedLayout('cpu', torch.float32, size=[2080], stride=[1]), data=Pointwise(
          'cpu',
          torch.float32,
          def inner_fn(index):
              i0 = index
              tmp0 = ops.load(buf2, i0)
              tmp1 = ops.constant(4.0, torch.float32)
              tmp2 = tmp0 / tmp1
              return tmp2
          ,
          ranges=[2080],
          origin_node=div,
          origins=OrderedSet([div]),
          stack_traces = {,
            File "/tmp/bug.py", line 6, in fn,
              flat_div = flat / 4.0,
          ,
          }
        ), _split_size=None, _original_inner_fn=None, _original_ranges=None, _original_reduction_ranges=None)
      ),
      size=[32],
      reindex=lambda i0: [i0 + 2048],
      origins=OrderedSet([slice_2, div]),
      stack_traces = {,
        File "/tmp/bug.py", line 9, in fn,
          new_bias = flat_div.narrow(0, weight.numel(), bias.numel()).view_as(bias),
      ,
      },
      stack_traces = {,
        File "/tmp/bug.py", line 6, in fn,
          flat_div = flat / 4.0,
      ,
      }
    )
  )
  args[1]: TensorBox(StorageBox(
    InputBuffer(name='arg2_1', layout=FixedLayout('cpu', torch.float32, size=[4, 64], stride=[64, 1]))
  ))
  args[2]: TensorBox(
    PermuteView(data=View(data=SliceView(data=StorageBox(
      ComputedBuffer(name='buf3', layout=FixedLayout('cpu', torch.float32, size=[2080], stride=[1]), data=Pointwise(
        'cpu',
        torch.float32,
        def inner_fn(index):
            i0 = index
            tmp0 = ops.load(buf2, i0)
            tmp1 = ops.constant(4.0, torch.float32)
            tmp2 = tmp0 / tmp1
            return tmp2
        ,
        ranges=[2080],
        origin_node=div,
        origins=OrderedSet([div]),
        stack_traces = {,
          File "/tmp/bug.py", line 6, in fn,
            flat_div = flat / 4.0,
        ,
        }
      ), _split_size=None, _original_inner_fn=None, _original_ranges=None, _original_reduction_ranges=None)
    ), size=[2048], reindex=<function SliceView.create.<locals>.reindex at 0x7efccc3525f0>), size=[32, 64], reindex=<function View._dynamic_reshape_indexer.<locals>.reindex at 0x7efccc3524d0>), dims=[1, 0])
  )NotImplementedError: SliceView
  target: aten.addmm.default
  args[0]: TensorBox(
    SliceView(
      StorageBox(
        ComputedBuffer(name='buf3', layout=FixedLayout('cpu', torch.float32, size=[2080], stride=[1]), data=Pointwise(
          'cpu',
          torch.float32,
          def inner_fn(index):
              i0 = index
              tmp0 = ops.load(buf2, i0)
              tmp1 = ops.constant(4.0, torch.float32)
              tmp2 = tmp0 / tmp1
              return tmp2
          ,
          ranges=[2080],
          origin_node=div,
          origins=OrderedSet([div]),
          stack_traces = {,
            File "/tmp/bug.py", line 6, in fn,
              flat_div = flat / 4.0,
          ,
          }
        ), _split_size=None, _original_inner_fn=None, _original_ranges=None, _original_reduction_ranges=None)
      ),
      size=[32],
      reindex=lambda i0: [i0 + 2048],
      origins=OrderedSet([slice_2, div]),
      stack_traces = {,
        File "/tmp/bug.py", line 9, in fn,
          new_bias = flat_div.narrow(0, weight.numel(), bias.numel()).view_as(bias),
      ,
      },
      stack_traces = {,
        File "/tmp/bug.py", line 6, in fn,
          flat_div = flat / 4.0,
      ,
      }
    )
  )
  args[1]: TensorBox(StorageBox(
    InputBuffer(name='arg2_1', layout=FixedLayout('cpu', torch.float32, size=[4, 64], stride=[64, 1]))
  ))
  args[2]: TensorBox(
    PermuteView(data=View(data=SliceView(data=StorageBox(
      ComputedBuffer(name='buf3', layout=FixedLayout('cpu', torch.float32, size=[2080], stride=[1]), data=Pointwise(
        'cpu',
        torch.float32,
        def inner_fn(index):
            i0 = index
            tmp0 = ops.load(buf2, i0)
            tmp1 = ops.constant(4.0, torch.float32)
            tmp2 = tmp0 / tmp1
            return tmp2
        ,
        ranges=[2080],
        origin_node=div,
        origins=OrderedSet([div]),
        stack_traces = {,
          File "/tmp/bug.py", line 6, in fn,
            flat_div = flat / 4.0,
        ,
        }
      ), _split_size=None, _original_inner_fn=None, _original_ranges=None, _original_reduction_ranges=None)
    ), size=[2048], reindex=<function SliceView.create.<locals>.reindex at 0x7efccc3525f0>), size=[32, 64], reindex=<function View._dynamic_reshape_indexer.<locals>.reindex at 0x7efccc3524d0>), dims=[1, 0])
  )
Found from :
   File "/tmp/bug.py", line 11, in fn
    return F.linear(x, new_weight, new_bias)



[!] Bug Successfully Reproduced

🐛 Describe the bug

When using torch.compile with mode="max-autotune", passing a tensor sliced via narrow() and reshaped via view_as() into F.linear results in a NotImplementedError. The exception occurs in torch/_inductor/ir.py because SliceView does not implement get_stride().

The code runs successfully in Eager mode but crashes during compilation.
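
The last two frames of the traceback show the failure pattern: one get_stride() forwards to self.data.get_stride(), and the node it reaches falls through to a base implementation that raises NotImplementedError with the class name. The sketch below is a toy model of that delegation in plain Python. The Toy* classes and describe_stride() are hypothetical names made up for illustration; they are not the real torch._inductor.ir classes and say nothing about how the real ones are laid out.

```python
# Toy model only: these classes imitate the delegation seen in the traceback.
# They are NOT the real torch._inductor.ir classes.


class ToyIRNode:
    def get_stride(self):
        # Base behaviour: a node that does not override this has no stride.
        raise NotImplementedError(type(self).__name__)


class ToyBuffer(ToyIRNode):
    def __init__(self, stride):
        self.stride = stride

    def get_stride(self):
        return self.stride


class ToySliceView(ToyIRNode):
    # Wraps a data node plus an offset, but does not override get_stride().
    def __init__(self, data, offset):
        self.data = data
        self.offset = offset


class ToyBox(ToyIRNode):
    def __init__(self, data):
        self.data = data

    def get_stride(self):
        # Forward to whatever node is wrapped, as in `return self.data.get_stride()`.
        return self.data.get_stride()


def describe_stride(node):
    """Return the stride, or a message naming the class that could not supply one."""
    try:
        return node.get_stride()
    except NotImplementedError as exc:
        return f"NotImplementedError: {exc}"


buf = ToyBuffer(stride=[1])
ok = describe_stride(ToyBox(buf))
failed = describe_stride(ToyBox(ToySliceView(buf, offset=2048)))

print(ok)      # [1]
print(failed)  # NotImplementedError: ToySliceView
```

In this model the box wrapping a plain buffer answers the stride query, while the box wrapping the slice view surfaces the view's class name in the error, which is the same shape as the "NotImplementedError: SliceView" message in the report.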

Reproducible Script

import torch
import torch.nn.functional as F

def fn(x, weight, bias):
    flat = torch.cat([weight.view(-1), bias.view(-1)], dim=0)
    flat_div = flat / 4.0
    
    new_weight = flat_div.narrow(0, 0, weight.numel()).view_as(weight)
    new_bias = flat_div.narrow(0, weight.numel(), bias.numel()).view_as(bias)
    
    return F.linear(x, new_weight, new_bias)

def main():
    x = torch.randn(4, 64, device="cpu")
    weight = torch.randn(32, 64, device="cpu")
    bias = torch.randn(32, device="cpu")

    print("Running Eager mode...")
    expected_out = fn(x, weight, bias)
    print("Eager mode successful.")

    print("\nRunning Compiled mode (max-autotune)...")
    opt_fn = torch.compile(fn, mode="max-autotune")
    
    try:
        actual_out = opt_fn(x, weight, bias)
        
        torch.testing.assert_close(expected_out, actual_out)
        print("PASS: Compiled output matches Eager output!")
        
    except Exception as e:
        import traceback
        traceback.print_exc()
        print("\n[!] Bug Successfully Reproduced")

if __name__ == "__main__":
    main()

Traceback

(torch-nightly) xyt19@Oasis:/tmp$ TORCHDYNAMO_VERBOSE=1 python bug.py
Running Eager mode...
Eager mode successful.

Running Compiled mode (max-autotune)...
Traceback (most recent call last):
  File "/tmp/bug.py", line 26, in main
    actual_out = opt_fn(x, weight, bias)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1158, in compile_wrapper
    raise e.remove_dynamo_frames() from None  # see TORCHDYNAMO_VERBOSE=1
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1141, in compile_wrapper
    result = fn(*args, **kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 2619, in __call__
    result = self._torchdynamo_orig_backend(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 2310, in __call__
    result = self._inner_convert(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 777, in __call__
    result = _compile(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 2094, in _compile
    guarded_code, tracer_output = compile_inner(code, one_graph, hooks)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_utils_internal.py", line 96, in wrapper_function
    return function(*args, **kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1679, in compile_inner
    result = _compile_inner(code, one_graph, hooks)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1739, in _compile_inner
    dynamo_output = compile_frame(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1584, in compile_frame
    bytecode, tracer_output = transform_code_object(code, transform)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py", line 1766, in transform_code_object
    tracer_output = transformations(instructions, code_options)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1555, in transform
    tracer_output = trace_frame(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 368, in _fn
    return fn(*args, **kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 954, in trace_frame
    run_tracer()
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 935, in run_tracer
    tracer.run()
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1883, in run
    while self.step():
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1536, in step
    self.dispatch_table[inst.opcode](self, inst)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 5449, in RETURN_VALUE
    self._return(inst)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 5422, in _return
    all_stack_locals_metadata = self.output.compile_subgraph(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 2171, in compile_subgraph
    instructions, subgraph_pycode = self.compile_and_call_fx_graph(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 2817, in compile_and_call_fx_graph
    compiled_fn = self.call_user_compiler(gm, self.example_inputs())
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 2987, in call_user_compiler
    return self._call_user_compiler(gm, example_inputs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 3049, in _call_user_compiler
    compiled_fn = compiler_fn(gm, example_inputs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/repro/after_dynamo.py", line 159, in __call__
    compiled_gm = compiler_fn(gm, example_inputs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/__init__.py", line 2482, in __call__
    return compile_fx(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 2705, in compile_fx
    return compile_fx(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 2764, in compile_fx
    return _maybe_wrap_and_compile_fx_main(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 2845, in _maybe_wrap_and_compile_fx_main
    return _compile_fx_main(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 3058, in _compile_fx_main
    raise e.remove_dynamo_frames() from None  # see TORCHDYNAMO_VERBOSE=1
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 3043, in _compile_fx_main
    return dynamo_common.aot_autograd(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/backends/common.py", line 123, in __call__
    cg = aot_module_simplified(gm, example_inputs, **self.kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 1238, in aot_module_simplified
    compiled_fn, _ = aot_stage2_compile(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/graph_compile.py", line 357, in aot_stage2_compile
    return aot_stage2_inference(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/graph_compile.py", line 481, in aot_stage2_inference
    compiled_fw = _aot_stage2b_inference_compile(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/graph_compile.py", line 408, in _aot_stage2b_inference_compile
    return _aot_stage2b_compile_forward_or_inference(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/graph_compile.py", line 2779, in _aot_stage2b_compile_forward_or_inference
    compiled_fw_func = compiler(fw_module, adjusted_flat_args)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/schemas.py", line 1460, in __call__
    output_code = self.compiler_fn(gm, example_inputs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 2906, in fw_compiler_base
    return compile_fx_forward(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 2534, in compile_fx_forward
    result = inner_compile(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 836, in compile_fx_inner
    return wrap_compiler_debug(_compile_fx_inner, compiler_name="inductor")(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_dynamo/repro/after_aot.py", line 317, in debug_wrapper
    inner_compiled_fn = compiler_fn(gm, example_inputs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1078, in _compile_fx_inner
    raise InductorError(e, currentframe()).with_traceback(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1058, in _compile_fx_inner
    mb_compiled_graph = fx_codegen_and_compile(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1845, in fx_codegen_and_compile
    return scheme.codegen_and_compile(gm, example_inputs, inputs_to_check, graph_kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1521, in codegen_and_compile
    graph.run(*example_inputs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1079, in run
    return super().run(*args)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/fx/interpreter.py", line 197, in run
    self.env[node] = self.run_node(node)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1932, in run_node
    result = super().run_node(n)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/fx/interpreter.py", line 294, in run_node
    return getattr(self, n.op)(n.target, args, kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1504, in call_function
    raise LoweringException(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1481, in call_function
    out = lowerings[target](*args, **kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/lowering.py", line 511, in wrapped
    out = decomp_fn(*args, **kwargs)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/kernel/mm.py", line 673, in tuned_addmm
    inp.get_stride()[0] == 0
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/ir.py", line 9452, in get_stride
    return self.data.get_stride()
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_inductor/ir.py", line 791, in get_stride
    raise NotImplementedError(type(self).__name__)
torch._inductor.exc.InductorError: LoweringException: NotImplementedError: SliceView
  target: aten.addmm.default
  args[0]: TensorBox(
    SliceView(
      StorageBox(
        ComputedBuffer(name='buf3', layout=FixedLayout('cpu', torch.float32, size=[2080], stride=[1]), data=Pointwise(
          'cpu',
          torch.float32,
          def inner_fn(index):
              i0 = index
              tmp0 = ops.load(buf2, i0)
              tmp1 = ops.constant(4.0, torch.float32)
              tmp2 = tmp0 / tmp1
              return tmp2
          ,
          ranges=[2080],
          origin_node=div,
          origins=OrderedSet([div]),
          stack_traces = {,
            File "/tmp/bug.py", line 6, in fn,
              flat_div = flat / 4.0,
          ,
          }
        ), _split_size=None, _original_inner_fn=None, _original_ranges=None, _original_reduction_ranges=None)
      ),
      size=[32],
      reindex=lambda i0: [i0 + 2048],
      origins=OrderedSet([slice_2, div]),
      stack_traces = {,
        File "/tmp/bug.py", line 9, in fn,
          new_bias = flat_div.narrow(0, weight.numel(), bias.numel()).view_as(bias),
      ,
      },
      stack_traces = {,
        File "/tmp/bug.py", line 6, in fn,
          flat_div = flat / 4.0,
      ,
      }
    )
  )
  args[1]: TensorBox(StorageBox(
    InputBuffer(name='arg2_1', layout=FixedLayout('cpu', torch.float32, size=[4, 64], stride=[64, 1]))
  ))
  args[2]: TensorBox(
    PermuteView(data=View(data=SliceView(data=StorageBox(
      ComputedBuffer(name='buf3', layout=FixedLayout('cpu', torch.float32, size=[2080], stride=[1]), data=Pointwise(
        'cpu',
        torch.float32,
        def inner_fn(index):
            i0 = index
            tmp0 = ops.load(buf2, i0)
            tmp1 = ops.constant(4.0, torch.float32)
            tmp2 = tmp0 / tmp1
            return tmp2
        ,
        ranges=[2080],
        origin_node=div,
        origins=OrderedSet([div]),
        stack_traces = {,
          File "/tmp/bug.py", line 6, in fn,
            flat_div = flat / 4.0,
        ,
        }
      ), _split_size=None, _original_inner_fn=None, _original_ranges=None, _original_reduction_ranges=None)
    ), size=[2048], reindex=<function SliceView.create.<locals>.reindex at 0x7efccc3525f0>), size=[32, 64], reindex=<function View._dynamic_reshape_indexer.<locals>.reindex at 0x7efccc3524d0>), dims=[1, 0])
  )NotImplementedError: SliceView
  target: aten.addmm.default
  args[0]: TensorBox(
    SliceView(
      StorageBox(
        ComputedBuffer(name='buf3', layout=FixedLayout('cpu', torch.float32, size=[2080], stride=[1]), data=Pointwise(
          'cpu',
          torch.float32,
          def inner_fn(index):
              i0 = index
              tmp0 = ops.load(buf2, i0)
              tmp1 = ops.constant(4.0, torch.float32)
              tmp2 = tmp0 / tmp1
              return tmp2
          ,
          ranges=[2080],
          origin_node=div,
          origins=OrderedSet([div]),
          stack_traces = {,
            File "/tmp/bug.py", line 6, in fn,
              flat_div = flat / 4.0,
          ,
          }
        ), _split_size=None, _original_inner_fn=None, _original_ranges=None, _original_reduction_ranges=None)
      ),
      size=[32],
      reindex=lambda i0: [i0 + 2048],
      origins=OrderedSet([slice_2, div]),
      stack_traces = {,
        File "/tmp/bug.py", line 9, in fn,
          new_bias = flat_div.narrow(0, weight.numel(), bias.numel()).view_as(bias),
      ,
      },
      stack_traces = {,
        File "/tmp/bug.py", line 6, in fn,
          flat_div = flat / 4.0,
      ,
      }
    )
  )
  args[1]: TensorBox(StorageBox(
    InputBuffer(name='arg2_1', layout=FixedLayout('cpu', torch.float32, size=[4, 64], stride=[64, 1]))
  ))
  args[2]: TensorBox(
    PermuteView(data=View(data=SliceView(data=StorageBox(
      ComputedBuffer(name='buf3', layout=FixedLayout('cpu', torch.float32, size=[2080], stride=[1]), data=Pointwise(
        'cpu',
        torch.float32,
        def inner_fn(index):
            i0 = index
            tmp0 = ops.load(buf2, i0)
            tmp1 = ops.constant(4.0, torch.float32)
            tmp2 = tmp0 / tmp1
            return tmp2
        ,
        ranges=[2080],
        origin_node=div,
        origins=OrderedSet([div]),
        stack_traces = {,
          File "/tmp/bug.py", line 6, in fn,
            flat_div = flat / 4.0,
        ,
        }
      ), _split_size=None, _original_inner_fn=None, _original_ranges=None, _original_reduction_ranges=None)
    ), size=[2048], reindex=<function SliceView.create.<locals>.reindex at 0x7efccc3525f0>), size=[32, 64], reindex=<function View._dynamic_reshape_indexer.<locals>.reindex at 0x7efccc3524d0>), dims=[1, 0])
  )
Found from :
   File "/tmp/bug.py", line 11, in fn
    return F.linear(x, new_weight, new_bias)



[!] Bug Successfully Reproduced
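
The numbers in the IR dump (size=[2080], size=[2048], reindex=lambda i0: [i0 + 2048], dims=[1, 0]) follow directly from the shapes in the repro script. The plain-Python sketch below redoes that index arithmetic and derives the strides each view would have over the flat buffer. The helper names are hypothetical and the code is not part of Inductor; it only illustrates the values a stride query on these views could be expected to return.

```python
# Shapes taken from the repro script: weight is (32, 64), bias is (32,).
WEIGHT_SHAPE = (32, 64)
BIAS_SHAPE = (32,)

weight_numel = WEIGHT_SHAPE[0] * WEIGHT_SHAPE[1]
bias_numel = BIAS_SHAPE[0]
flat_numel = weight_numel + bias_numel  # length of the concatenated buffer


def bias_flat_index(i):
    # narrow(0, weight_numel, bias_numel): element i sits at i + weight_numel.
    return i + weight_numel


def weight_t_flat_index(col, row):
    # narrow(0, 0, weight_numel).view_as(weight) places (row, col) at row * 64 + col.
    # The transpose (dims=[1, 0]) only swaps the index order; no data moves.
    return row * WEIGHT_SHAPE[1] + col


# Stride along a dimension = change in flat index when that index grows by one.
bias_stride = (bias_flat_index(1) - bias_flat_index(0),)
weight_t_stride = (
    weight_t_flat_index(1, 0) - weight_t_flat_index(0, 0),
    weight_t_flat_index(0, 1) - weight_t_flat_index(0, 0),
)

print(flat_numel)          # 2080
print(bias_flat_index(0))  # 2048
print(bias_stride)         # (1,)
print(weight_t_stride)     # (1, 64)
```

Both views are plain strided windows into the same 2080-element buffer, so their strides are well defined even though the SliceView node in the traceback does not report them.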

Versions

PyTorch version: 2.13.0.dev20260521+cu130
Is debug build: False
CUDA used to build PyTorch: 13.0
ROCM used to build PyTorch: N/A

OS: Ubuntu 24.04.4 LTS (x86_64)
GCC version: (Ubuntu 13.3.0-6ubuntu2~24.04.1) 13.3.0
Clang version: 18.1.3 (1ubuntu1)
CMake version: version 3.28.3
Libc version: glibc-2.39

Python version: 3.10.20 (main, Mar 11 2026, 17:46:40) [GCC 14.3.0] (64-bit runtime)
Python platform: Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.39
Is CUDA available: True
CUDA runtime version: 12.0.140
Nvidia driver version: 596.49
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_engines_precompiled.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_engines_runtime_compiled.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_engines_tensor_ir.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_graph.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_heuristic.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops.so.9.21.1
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Caching allocator config: N/A

Versions of relevant libraries:
[pip3] numpy==2.2.6
[pip3] nvidia-cublas==13.1.1.3
[pip3] nvidia-cuda-cupti==13.0.85
[pip3] nvidia-cuda-nvrtc==13.0.88
[pip3] nvidia-cuda-runtime==13.0.96
[pip3] nvidia-cudnn-cu13==9.20.0.48
[pip3] nvidia-cufft==12.0.0.61
[pip3] nvidia-curand==10.4.0.35
[pip3] nvidia-cusolver==12.0.4.66
[pip3] nvidia-cusparse==12.6.3.3
[pip3] nvidia-cusparselt-cu13==0.8.1
[pip3] nvidia-nccl-cu13==2.29.7
[pip3] nvidia-nvjitlink==13.0.88
[pip3] nvidia-nvtx==13.0.85
[pip3] torch==2.13.0.dev20260521+cu130
[pip3] torchaudio==2.11.0.dev20260525+cu130
[pip3] torchvision==0.28.0.dev20260525+cu130
[pip3] triton==3.7.0+git88b227e2
[conda] numpy 2.2.6 pypi_0 pypi
[conda] nvidia-cublas 13.1.1.3 pypi_0 pypi
[conda] nvidia-cuda-cupti 13.0.85 pypi_0 pypi
[conda] nvidia-cuda-nvrtc 13.0.88 pypi_0 pypi
[conda] nvidia-cuda-runtime 13.0.96 pypi_0 pypi
[conda] nvidia-cudnn-cu13 9.20.0.48 pypi_0 pypi
[conda] nvidia-cufft 12.0.0.61 pypi_0 pypi
[conda] nvidia-curand 10.4.0.35 pypi_0 pypi
[conda] nvidia-cusolver 12.0.4.66 pypi_0 pypi
[conda] nvidia-cusparse 12.6.3.3 pypi_0 pypi
[conda] nvidia-cusparselt-cu13 0.8.1 pypi_0 pypi
[conda] nvidia-nccl-cu13 2.29.7 pypi_0 pypi
[conda] nvidia-nvjitlink 13.0.88 pypi_0 pypi
[conda] nvidia-nvtx 13.0.85 pypi_0 pypi
[conda] torch 2.13.0.dev20260521+cu130 pypi_0 pypi
[conda] torchaudio 2.11.0.dev20260525+cu130 pypi_0 pypi
[conda] torchvision 0.28.0.dev20260525+cu130 pypi_0 pypi
[conda] triton 3.7.0+git88b227e2 pypi_0 pypi
