pytorch - How to fix: [Bug] torch.kron fails with RuntimeError on non-contiguous inputs



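Fix Action

Fix / Workaround

Make the non-contiguous argument contiguous before calling torch.kron. With A and B as in the minimal reproducible example below:

print("\n--- Workaround ---")
# works if we make it contiguous first
res = torch.kron(A, B.contiguous())
print(f"Success! Shape: {res.shape}")

Output:

--- Workaround ---
Success! Shape: torch.Size([9, 16])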
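
If the workaround is needed at several call sites, it can be wrapped in a small helper. The following is a minimal sketch, not a PyTorch API: kron_contiguous is a hypothetical name introduced here for illustration, and the example runs on CPU so it can be tried without a GPU. It assumes the extra copy made by .contiguous() is acceptable for your tensor sizes.

```python
import torch


def kron_contiguous(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: make both operands contiguous, then call torch.kron.

    .contiguous() returns the tensor itself when it is already contiguous,
    so a copy is made only for inputs such as transposed views.
    """
    return torch.kron(a.contiguous(), b.contiguous())


A = torch.randn(3, 4)
B = torch.randn(4, 3).t()  # transposed view, non-contiguous

res = kron_contiguous(A, B)

# Reference Kronecker product built from broadcasting; reshape (unlike view)
# copies when the memory layout requires it.
ref = (A[:, None, :, None] * B[None, :, None, :]).reshape(9, 16)

print(res.shape)
print(torch.allclose(res, ref))
```

The broadcasting reference is only there to check the result; it is not a claim about how torch.kron is implemented internally.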


🐛 Describe the bug

torch.kron raises a RuntimeError when the second argument is non-contiguous (e.g., a transposed tensor).

In eager mode, the error is RuntimeError: view size is not compatible with input tensor's size and stride. Under torch.compile, Dynamo raises a TorchRuntimeError during FakeTensor propagation, wrapping a ValueError about incompatible tensor strides.
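
To check whether an input is affected, it can help to inspect its memory layout before calling torch.kron. The sketch below uses only is_contiguous(), stride(), and contiguous(), and runs on CPU; the variable names B_base and B_fixed are introduced here for illustration and are not part of the original report.

```python
import torch

B_base = torch.randn(4, 3)   # freshly allocated, row-major
B = B_base.t()               # transpose is a view: same storage, swapped strides

print(B_base.is_contiguous(), B_base.stride())   # True (3, 1)
print(B.is_contiguous(), B.stride())             # False (1, 3)

B_fixed = B.contiguous()     # copies the data into row-major order
print(B_fixed.is_contiguous(), B_fixed.stride()) # True (4, 1)
```

The transposed tensor has the same shape as a fresh 3x4 tensor but different strides, which is why the failure depends on how B was produced rather than on its shape.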

Minimal Reproducible Example:

import torch

# Contiguous A, Non-contiguous B
A = torch.randn(3, 4, dtype=torch.float32, device="cuda")
B = torch.randn(4, 3, dtype=torch.float32, device="cuda").t() 

print("--- Eager Mode ---")
try:
    torch.kron(A, B)
except Exception as e:
    print(f"Eager Error: {e}")

print("\n--- Compile Mode ---")
@torch.compile
def compiled_kron(x, y):
    return torch.kron(x, y)

try:
    compiled_kron(A, B)
except Exception as e:
    print(f"Compile Error: {e}")

print("\n--- Workaround ---")
# works if we make it contiguous first
res = torch.kron(A, B.contiguous())
print(f"Success! Shape: {res.shape}")

Error Log

(torch-nightly) xyt19@Oasis:/tmp$ TORCHDYNAMO_VERBOSE=1 python bug.py
--- Eager Mode ---
Eager Error: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

--- Compile Mode ---
Compile Error: RuntimeError when making fake tensor call
  Explanation: Dynamo failed to run FX node with fake tensors: call_function <built-in method kron of type object at 0x7ff4203f3e80>(*(FakeTensor(..., device='cuda:0', size=(3, 4)), FakeTensor(..., device='cuda:0', size=(3, 4))), **{}): got ValueError('Cannot view a tensor with shape torch.Size([3, 3, 4, 4]) and strides (48, 1, 3, 12) as a tensor with shape (9, 16)!')
  Hint: Your code may result in an error when running in eager. Please double check that your code doesn't contain a similar error when actually running eager/uncompiled. You can do this by removing the `torch.compile` call, or by using `torch.compiler.set_stance("force_eager")`.

  Developer debug context:

 For more details about this graph break, please visit: https://meta-pytorch.github.io/compile-graph-break-site/gb/gb4315.html

from user code:
   File "/tmp/bug.py", line 16, in compiled_kron
    return torch.kron(x, y)


--- Workaround ---
Success! Shape: torch.Size([9, 16])
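
Both messages point at a view being requested on a layout that cannot be expressed without copying. The sketch below reproduces that class of failure outside torch.kron, using the shape and strides quoted in the compile-mode error. It is an illustration under assumptions: torch.empty_strided is used only to build a tensor with that layout (its values are uninitialized), and nothing here is taken from torch.kron's actual source.

```python
import torch

# Shape and strides copied from the compile-mode error message in the log.
t = torch.empty_strided((3, 3, 4, 4), (48, 1, 3, 12))

try:
    t.view(9, 16)          # a view must merge dims (0, 1) and (2, 3) without copying
    view_failed = False
except RuntimeError as e:
    view_failed = True
    print(f"view failed: {e}")

r = t.reshape(9, 16)       # reshape falls back to a copy when a view is impossible
print(r.shape, r.is_contiguous())
```

This matches the hint in the eager error ("Use .reshape(...) instead"): reshape accepts the layout at the cost of a copy, while view does not.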

Versions

PyTorch version: 2.13.0.dev20260521+cu130
Is debug build: False
CUDA used to build PyTorch: 13.0
ROCM used to build PyTorch: N/A

OS: Ubuntu 24.04.4 LTS (x86_64)
GCC version: (Ubuntu 13.3.0-6ubuntu2~24.04.1) 13.3.0
Clang version: 18.1.3 (1ubuntu1)
CMake version: version 3.28.3
Libc version: glibc-2.39

Python version: 3.10.20 (main, Mar 11 2026, 17:46:40) [GCC 14.3.0] (64-bit runtime)
Python platform: Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.39
Is CUDA available: True
CUDA runtime version: 12.0.140
Nvidia driver version: 596.49
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_engines_precompiled.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_engines_runtime_compiled.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_engines_tensor_ir.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_graph.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_heuristic.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops.so.9.21.1
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Caching allocator config: N/A

Versions of relevant libraries:
[pip3] numpy==2.2.6
[pip3] nvidia-cublas==13.1.1.3
[pip3] nvidia-cuda-cupti==13.0.85
[pip3] nvidia-cuda-nvrtc==13.0.88
[pip3] nvidia-cuda-runtime==13.0.96
[pip3] nvidia-cudnn-cu13==9.20.0.48
[pip3] nvidia-cufft==12.0.0.61
[pip3] nvidia-curand==10.4.0.35
[pip3] nvidia-cusolver==12.0.4.66
[pip3] nvidia-cusparse==12.6.3.3
[pip3] nvidia-cusparselt-cu13==0.8.1
[pip3] nvidia-nccl-cu13==2.29.7
[pip3] nvidia-nvjitlink==13.0.88
[pip3] nvidia-nvtx==13.0.85
[pip3] torch==2.13.0.dev20260521+cu130
[pip3] torchaudio==2.11.0.dev20260525+cu130
[pip3] torchvision==0.28.0.dev20260525+cu130
[pip3] triton==3.7.0+git88b227e2
[conda] numpy 2.2.6 pypi_0 pypi
[conda] nvidia-cublas 13.1.1.3 pypi_0 pypi
[conda] nvidia-cuda-cupti 13.0.85 pypi_0 pypi
[conda] nvidia-cuda-nvrtc 13.0.88 pypi_0 pypi
[conda] nvidia-cuda-runtime 13.0.96 pypi_0 pypi
[conda] nvidia-cudnn-cu13 9.20.0.48 pypi_0 pypi
[conda] nvidia-cufft 12.0.0.61 pypi_0 pypi
[conda] nvidia-curand 10.4.0.35 pypi_0 pypi
[conda] nvidia-cusolver 12.0.4.66 pypi_0 pypi
[conda] nvidia-cusparse 12.6.3.3 pypi_0 pypi
[conda] nvidia-cusparselt-cu13 0.8.1 pypi_0 pypi
[conda] nvidia-nccl-cu13 2.29.7 pypi_0 pypi
[conda] nvidia-nvjitlink 13.0.88 pypi_0 pypi
[conda] nvidia-nvtx 13.0.85 pypi_0 pypi
[conda] torch 2.13.0.dev20260521+cu130 pypi_0 pypi
[conda] torchaudio 2.11.0.dev20260525+cu130 pypi_0 pypi
[conda] torchvision 0.28.0.dev20260525+cu130 pypi_0 pypi
[conda] triton 3.7.0+git88b227e2 pypi_0 pypi

cc @jianyuh @nikitaved @mruberry @walterddr @Lezcano
