pytorch - 💡(How to fix) Fix [pt2] RuntimeError: variable modified by inplace operation during backward in compiled mode (succeeds in eager)

Q: Expected behavior

`torch.compile` should match eager mode behavior and execute the backward pass successfully without throwing the inplace modification `RuntimeError`.

pytorch · 2026-05-28 10:48:42

Code Example

import torch
import torch.nn.functional as F
import traceback


def f(x, mean, std):
    y = F.interpolate(
        x[:, :64, :64].unsqueeze(0),
        size=(32, 32),
        mode="bilinear",
        align_corners=False,
    )
    x.sub_(mean).div_(std)
    return y, x


def run(name, fn):
    print(f"\n=== {name} ===")

    base = torch.randn(1, 3, 128, 128, device="cuda", requires_grad=True)
    x = (base * 1.0).squeeze(0)[:, :128, :128]

    mean = torch.randn(3, 1, 1, device="cuda", requires_grad=True)
    std = torch.randn(3, 1, 1, device="cuda").abs().add(0.5).requires_grad_()

    try:
        y, z = fn(x, mean, std)
        loss = y.sum() + z.sum()
        loss.backward()
        torch.cuda.synchronize()
        print("PASS")
    except Exception:
        print("FAIL")
        traceback.print_exc()


def main():
    print("torch:", torch.__version__)
    print("cuda:", torch.version.cuda)

    run("eager", f)
    run("compiled", torch.compile(f, backend="inductor", fullgraph=True))


if __name__ == "__main__":
    main()

---

(torch-nightly) xyt19@Oasis:/tmp$ python bug.py
torch: 2.13.0.dev20260521+cu130
cuda: 13.0

=== eager ===
PASS

=== compiled ===
FAIL
Traceback (most recent call last):
  File "/tmp/bug.py", line 29, in run
    loss.backward()
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_tensor.py", line 633, in backward
    torch.autograd.backward(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/autograd/__init__.py", line 395, in backward
    _engine_run_backward(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/autograd/graph.py", line 913, in _engine_run_backward
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/autograd/function.py", line 333, in apply_boxed
    return self._get_user_fn()(self, *args)
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 3454, in backward
    return CompiledFunction._bwd_fn(
  File "/home/xyt19/miniconda3/envs/torch-nightly/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/subclass_codegen.py:codegen(compiled_function_backward)", line 10, in _compiled_backward
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [3, 128, 128]] is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True, check_nan=False).

🐛 Describe the bug

When torch.compile is applied to a function that calls F.interpolate on a slice of x and then modifies x in place (x.sub_(mean).div_(std)), the backward pass fails with a RuntimeError reporting that a variable needed for gradient computation has been modified by an inplace operation.

The exact same code runs successfully in eager mode, with no inplace modification error.
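For context on what the error message is reporting, here is a minimal eager-mode sketch of the autograd version check. It is not the reproducer and makes no claim about the root cause of the compiled-mode failure; it only illustrates, on a small CPU tensor, how an in-place op bumps a tensor's version counter and how backward rejects a saved tensor whose version has changed. The choice of exp (which saves its output for backward) is an assumption made for illustration.

```python
import torch

x = torch.randn(3, requires_grad=True)

# exp() saves its own output so that backward can reuse it.
z = (x * 1.0).exp()
version_before = z._version

# The in-place op bumps the version counter of the saved tensor.
z.add_(1)
version_after = z._version

message = None
try:
    # Backward through exp() finds the saved tensor at a newer version.
    z.sum().backward()
except RuntimeError as err:
    message = str(err)

print("version before/after:", version_before, version_after)
print("error:", message)
```

The "is at version 1; expected version 0" part of the traceback in this issue is the same kind of check, raised from the compiled backward instead of from an eager autograd node.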

To Reproduce

The minimal reproducible script is the one listed under "Code Example" above.

Error logs

The script output is the console log shown after the "Code Example" script above: the eager run prints PASS, and the compiled run fails in loss.backward() with the RuntimeError.

Expected behavior

torch.compile should match eager mode behavior and execute the backward pass successfully without throwing the inplace modification RuntimeError.
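Until compiled mode matches eager here, one thing that may be worth trying is an out-of-place normalization. The sketch below is an assumption, not something the issue confirms: it has not been verified to avoid the compiled-mode error, and the name f_out_of_place is hypothetical. It also changes semantics, because the caller's x is no longer mutated.

```python
import torch
import torch.nn.functional as F


def f_out_of_place(x, mean, std):
    # Same resize as in the original f.
    y = F.interpolate(
        x[:, :64, :64].unsqueeze(0),
        size=(32, 32),
        mode="bilinear",
        align_corners=False,
    )
    # Out-of-place normalization: builds a new tensor and leaves x untouched,
    # unlike x.sub_(mean).div_(std) in the original f.
    z = (x - mean) / std
    return y, z
```

To try it with the reproducer, pass torch.compile(f_out_of_place, backend="inductor", fullgraph=True) to run() in place of the compiled f. If the caller relies on x being updated in place, this variant is not a drop-in replacement.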

Versions

PyTorch version: 2.13.0.dev20260521+cu130
Is debug build: False
CUDA used to build PyTorch: 13.0
ROCM used to build PyTorch: N/A

OS: Ubuntu 24.04.4 LTS (x86_64)
GCC version: (Ubuntu 13.3.0-6ubuntu2~24.04.1) 13.3.0
Clang version: 18.1.3 (1ubuntu1)
CMake version: version 3.28.3
Libc version: glibc-2.39

Python version: 3.10.20 (main, Mar 11 2026, 17:46:40) [GCC 14.3.0] (64-bit runtime)
Python platform: Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.39
Is CUDA available: True
CUDA runtime version: 12.0.140
Nvidia driver version: 596.49
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_engines_precompiled.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_engines_runtime_compiled.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_engines_tensor_ir.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_graph.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_heuristic.so.9.21.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops.so.9.21.1
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Caching allocator config: N/A

Versions of relevant libraries:
[pip3] numpy==2.2.6
[pip3] nvidia-cublas==13.1.1.3
[pip3] nvidia-cuda-cupti==13.0.85
[pip3] nvidia-cuda-nvrtc==13.0.88
[pip3] nvidia-cuda-runtime==13.0.96
[pip3] nvidia-cudnn-cu13==9.20.0.48
[pip3] nvidia-cufft==12.0.0.61
[pip3] nvidia-curand==10.4.0.35
[pip3] nvidia-cusolver==12.0.4.66
[pip3] nvidia-cusparse==12.6.3.3
[pip3] nvidia-cusparselt-cu13==0.8.1
[pip3] nvidia-nccl-cu13==2.29.7
[pip3] nvidia-nvjitlink==13.0.88
[pip3] nvidia-nvtx==13.0.85
[pip3] torch==2.13.0.dev20260521+cu130
[pip3] torchaudio==2.11.0.dev20260525+cu130
[pip3] torchvision==0.28.0.dev20260525+cu130
[pip3] triton==3.7.0+git88b227e2
[conda] numpy 2.2.6 pypi_0 pypi
[conda] nvidia-cublas 13.1.1.3 pypi_0 pypi
[conda] nvidia-cuda-cupti 13.0.85 pypi_0 pypi
[conda] nvidia-cuda-nvrtc 13.0.88 pypi_0 pypi
[conda] nvidia-cuda-runtime 13.0.96 pypi_0 pypi
[conda] nvidia-cudnn-cu13 9.20.0.48 pypi_0 pypi
[conda] nvidia-cufft 12.0.0.61 pypi_0 pypi
[conda] nvidia-curand 10.4.0.35 pypi_0 pypi
[conda] nvidia-cusolver 12.0.4.66 pypi_0 pypi
[conda] nvidia-cusparse 12.6.3.3 pypi_0 pypi
[conda] nvidia-cusparselt-cu13 0.8.1 pypi_0 pypi
[conda] nvidia-nccl-cu13 2.29.7 pypi_0 pypi
[conda] nvidia-nvjitlink 13.0.88 pypi_0 pypi
[conda] nvidia-nvtx 13.0.85 pypi_0 pypi
[conda] torch 2.13.0.dev20260521+cu130 pypi_0 pypi
[conda] torchaudio 2.11.0.dev20260525+cu130 pypi_0 pypi
[conda] torchvision 0.28.0.dev20260525+cu130 pypi_0 pypi
[conda] triton 3.7.0+git88b227e2 pypi_0 pypi

cc @chauhang @penguinwu @bdhirsh @bobrenjc93 @aorenste
