pytorch - 💡(How to fix) Fix `torch.compile(torch.vmap(torch.func.hessian(f)))` crashes with TorchRuntimeError

pytorch · 2026-05-09 00:55:28

Error Message

TorchRuntimeError: RuntimeError when making fake tensor call
Explanation: Dynamo failed to run FX node with fake tensors:
  call_function <function _autograd_grad at 0x...>(
    *([GradTrackingTensor(lvl=-2, value=
        GradTrackingTensor(lvl=3, value=
          BatchedTensor(lvl=1, bdim=0, value=
            FakeTensor(..., device='cuda:0', size=(4,), dtype=torch.float64))))],
      ...))

Root Cause

The crash appears to be triggered when jacfwd is the outer transform of a nested derivative and the result is then wrapped in vmap and torch.compile. It surfaces during Dynamo's fake tensor propagation. Since hessian(f) = jacfwd(jacrev(f)), torch.func.hessian places jacfwd on the outside and hits this path.
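To make the decomposition concrete, here is a minimal eager-mode sketch (no torch.compile and no vmap, so it should not touch the failing path). It assumes a CPU tensor instead of the CUDA tensor used in the report, and compares hessian(f) with the hand-written compositions and with the analytic Hessian of the test function.

```python
import torch
from torch.func import hessian, jacfwd, jacrev

def f(x):
    # Same scalar test function as in the report.
    return (x ** 4).sum()

# Assumption: CPU and eager mode, purely to illustrate the math.
x = torch.randn(8, dtype=torch.float64)

h_builtin = hessian(f)(x)          # forward-over-reverse
h_fwd_rev = jacfwd(jacrev(f))(x)   # the same composition written out
h_rev_rev = jacrev(jacrev(f))(x)   # reverse-over-reverse (the workaround form)

# For f(x) = sum(x^4) the Hessian is diagonal with entries 12 * x^2.
h_exact = torch.diag(12 * x ** 2)

print(torch.allclose(h_builtin, h_fwd_rev))
print(torch.allclose(h_rev_rev, h_exact))
```

Both compositions compute the same matrix; they differ only in which autodiff mode runs on the outside, which is the part that matters for this bug.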

Fix / Workaround

Key observations:

vmap(jacfwd(jacrev(f))) (= vmap(hessian(f))) — CRASHES
vmap(jacfwd(jacfwd(f))) — CRASHES
vmap(jacrev(jacrev(f))) — WORKS (workaround)
vmap(jacrev(jacfwd(f))) — WORKS
jacfwd(jacrev(f)) without vmap — WORKS
vmap(jacfwd(f)) without nesting — WORKS

WORKAROUND (works correctly)

compiled_workaround = torch.compile(
    torch.vmap(torch.func.jacrev(torch.func.jacrev(f))),
    fullgraph=True
)
result = compiled_workaround(x)
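Before swapping the workaround into real code, it may be worth checking that reverse-over-reverse gives the expected values. The sketch below is illustrative only: batched_hessian_workaround is a hypothetical helper name, the tensor lives on the CPU rather than on 'cuda' as in the report, and the check runs in eager mode (pass use_compile=True to wrap it in torch.compile as the workaround does).

```python
import torch
from torch.func import jacrev, vmap

def f(x):
    return (x ** 4).sum()

def batched_hessian_workaround(fn, use_compile=False):
    """Per-sample Hessian via jacrev(jacrev(fn)). Hypothetical helper, not a PyTorch API."""
    transformed = vmap(jacrev(jacrev(fn)))
    if use_compile:
        return torch.compile(transformed, fullgraph=True)
    return transformed

# Assumption: CPU tensor; the report used device='cuda'.
x = torch.randn(4, 8, dtype=torch.float64)
hess = batched_hessian_workaround(f)(x)   # one 8x8 Hessian per row of x

# Analytic reference: each sample's Hessian is diag(12 * x_i^2).
expected = torch.diag_embed(12 * x ** 2)
print(hess.shape, torch.allclose(hess, expected))
```

One trade-off to keep in mind: jacrev(jacrev(f)) runs reverse mode twice, so its cost profile can differ from the forward-over-reverse composition that hessian uses.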

Code Example

import torch

torch.manual_seed(42)
x = torch.randn(4, 8, device='cuda', dtype=torch.float64)

def f(x):
    return (x ** 4).sum()

# CRASHES with TorchRuntimeError (comment out these two lines to reach the workaround below)
compiled = torch.compile(torch.vmap(torch.func.hessian(f)), fullgraph=True)
result = compiled(x)

# WORKAROUND (works correctly)
compiled_workaround = torch.compile(
    torch.vmap(torch.func.jacrev(torch.func.jacrev(f))),
    fullgraph=True
)
result = compiled_workaround(x)

Versions

PyTorch: 2.13.0.dev20260501+cu126
Python: 3.11
CUDA: 12.6
GPU: Tesla T4

🐛 Describe the bug

torch.compile crashes when compiling a vmapped Hessian computation (torch.vmap(torch.func.hessian(f))). The crash occurs at the Dynamo level and is not Inductor-specific: the aot_eager backend also crashes.

The crash therefore requires all four ingredients together: jacfwd as the outer transform, nested differentiation, vmap, and torch.compile.
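The combinations can be checked on a given build with a small probe like the one below. This is a sketch under several assumptions: probe and cases are hypothetical names, the backend is aot_eager (which the report says also crashes), and the input is a CPU tensor, so results may not match the CUDA behaviour reported above. It records only whether each stack ran or which exception type it raised, and makes no claim about what any particular PyTorch version will print.

```python
import torch
from torch.func import jacfwd, jacrev, vmap

def f(x):
    return (x ** 4).sum()

def probe(cases, backend="aot_eager"):
    """Compile and run each transform stack once; record 'ok' or the exception type."""
    results = {}
    for name, (build, arg) in cases.items():
        try:
            torch._dynamo.reset()  # drop cached graphs so cases do not affect each other
            torch.compile(build(), backend=backend, fullgraph=True)(arg)
            results[name] = "ok"
        except Exception as exc:  # e.g. TorchRuntimeError on affected builds
            results[name] = type(exc).__name__
    return results

# Assumption: CPU tensor; the report used device='cuda'.
x = torch.randn(4, 8, dtype=torch.float64)

cases = {
    "vmap(jacfwd(jacrev(f)))": (lambda: vmap(jacfwd(jacrev(f))), x),
    "vmap(jacfwd(jacfwd(f)))": (lambda: vmap(jacfwd(jacfwd(f))), x),
    "vmap(jacrev(jacrev(f)))": (lambda: vmap(jacrev(jacrev(f))), x),
    "vmap(jacrev(jacfwd(f)))": (lambda: vmap(jacrev(jacfwd(f))), x),
    "jacfwd(jacrev(f)), no vmap": (lambda: jacfwd(jacrev(f)), x[0]),
    "vmap(jacfwd(f)), no nesting": (lambda: vmap(jacfwd(f)), x),
}

results = probe(cases)
for name, outcome in results.items():
    print(f"{name:32s} {outcome}")
```

Swapping f for another scalar function, or moving x to 'cuda', is a quick way to see which inputs are affected on a given build.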

The crash also occurs with f(x) = sin(x).sum(), f(x) = exp(x).sum(), f(x) = (x**2).sum(), but NOT with f(x) = tanh(x).sum().

cc @chauhang @penguinwu @Chillee @samdow @kshitij12345
