pytorch - `torch.compile(dynamic=True)` fails on `F.cross_entropy` with probability targets and class weight due to symbolic numel()

GitHub stats
pytorch/pytorch#181870 · Fetched 2026-04-30 06:18:00
Comments: 0 · Participants: 1 · Timeline: 18 · Reactions: 0
Timeline (top): mentioned ×6, subscribed ×6, labeled ×5, cross-referenced ×1

🐛 Describe the bug

torch.compile(dynamic=True) fails to trace torch.nn.functional.cross_entropy when using probability targets together with a per-class weight.

The same function works in eager mode. The failure also reproduces with backend="eager", so this appears to be a Dynamo/fake tensor / symbolic-shape tracing issue rather than an Inductor codegen issue.

This seems to come from the probability-target branch of cross_entropy_loss_symint. When input and target have the same symbolic shape, the implementation enters cross_entropy_loss_prob_target, where the weight validation calls weight.numel(). This concrete-shape query fails for fake tensors with symbolic sizes.
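To make the branch selection concrete, the following eager-mode sketch contrasts the two target forms that F.cross_entropy accepts. It uses only the public API; the variable names are illustrative and not taken from the issue. A floating-point target with the same shape as the input is treated as class probabilities, which is the path that fails under dynamic tracing.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(3, 5)   # logits with shape (batch, classes)
weight = torch.rand(5)  # one weight per class

# Form 1: class-index targets, shape (batch,), integer dtype.
idx_target = torch.tensor([0, 2, 4])
loss_idx = F.cross_entropy(x, idx_target, weight=weight, reduction="none")

# Form 2: probability targets, same shape as the logits, floating dtype.
# One-hot rows are used so the result can be compared with Form 1.
prob_target = F.one_hot(idx_target, num_classes=5).to(x.dtype)
loss_prob = F.cross_entropy(x, prob_target, weight=weight, reduction="none")

print(loss_idx.shape, loss_prob.shape)
print(torch.allclose(loss_idx, loss_prob, atol=1e-6))
```

Both calls succeed in eager mode and agree for one-hot targets; only the second form goes through the probability-target code that the report points to.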

Reproducer

import torch
import torch.nn.functional as F

def fn(x, target, weight):
    return F.cross_entropy(
        x,
        target,
        weight=weight,
        reduction="none",
    )

x = torch.randn(3, 5)
target = torch.rand(3, 5).softmax(dim=1)
weight = torch.rand(5)

eager_out = fn(x, target, weight)
print("EAGER_OK", eager_out.shape)

compiled = torch.compile(fn, dynamic=True, backend="eager")
compiled_out = compiled(x, target, weight)
print("COMPILED_OK", compiled_out.shape)

output:

EAGER_OK torch.Size([3])
---------------------------------------------------------------------------
TorchRuntimeError                         Traceback (most recent call last)
/tmp/ipykernel_16465/3749592953.py in <cell line: 0>()
     18 
     19 compiled = torch.compile(fn, dynamic=True, backend="eager")
---> 20 compiled_out = compiled(x, target, weight)
     21 print("COMPILED_OK", compiled_out.shape)

30 frames
/usr/local/lib/python3.12/dist-packages/torch/_dynamo/exc.py in unimplemented(gb_type, context, explanation, hints, from_exc, log_warning, skip_frame)
    651             raise Unsupported(msg, gb_type, skip_frame, real_stack=past_real_stack)
    652         # noqa: GB_REGISTRY
--> 653         raise Unsupported(
    654             msg, gb_type, skip_frame, real_stack=past_real_stack
    655         ) from from_exc

TorchRuntimeError: RuntimeError when making fake tensor call
  Explanation: Dynamo failed to run FX node with fake tensors: call_function <function cross_entropy at 0x7944d4bcf100>(*(FakeTensor(..., size=(3, 5)), FakeTensor(..., size=(3, 5))), **{'weight': FakeTensor(..., size=(5,)), 'reduction': 'none'}): got RuntimeError('Cannot call numel() on tensor with symbolic sizes/strides\nException raised from throw_cannot_call_with_symbolic at /pytorch/c10/core/TensorImpl.cpp:340 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x9d (0x7944dc17e8dd in /usr/local/lib/python3.12/dist-packages/torch/lib/libc10.so)\nframe #1: c10::TensorImpl::throw_cannot_call_with_symbolic(char const*) const + 0x8b (0x7944dc10b811 in /usr/local/lib/python3.12/dist-packages/torch/lib/libc10.so)\nframe #2: <unknown function> + 0x7eb6c (0x7944dc157b6c in /usr/local/lib/python3.12/dist-packages/torch/lib/libc10.so)\nframe #3: at::native::cross_entropy_loss_symint(at::Tensor const&, at::Tensor const&, std::optional<at::Tensor> const&, long, c10::SymInt, double) + 0x21e3 (0x7944ad083003 in /usr/local/lib/python3.12/dist-packages/torch/lib/libtorch_cpu.so)\nframe #4: <unknown function> + 0x33a7e42 (0x7944ae43ae42 in /usr/local/lib/python3.12/dist-packages/torch/lib/libtorch_cpu.so)\nframe #5: <unknown function> + 0x740e8d (0x7944d6222e8d in /usr/local/lib/python3.12/dist-packages/torch/lib/libtorch_python.so)\nframe #6: <unknown function> + 0x1ca7ad1 (0x794...
  Hint: Your code may result in an error when running in eager. Please double check that your code doesn't contain a similar error when actually running eager/uncompiled. You can do this by removing the `torch.compile` call, or by using `torch.compiler.set_stance("force_eager")`. 

  Developer debug context: 

 For more details about this graph break, please visit: https://meta-pytorch.github.io/compile-graph-break-site/gb/gb4315.html

from user code:
   File "/tmp/ipykernel_16465/3749592953.py", line 5, in fn
    return F.cross_entropy(

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

Versions

PyTorch: 2.10.0+cpu

cc @chauhang @penguinwu @ezyang @bobrenjc93 @aditvenk @laithsakka

extent analysis

TL;DR

Until the underlying bug is fixed, avoid combining torch.compile(dynamic=True) with torch.nn.functional.cross_entropy when the call uses probability targets together with a per-class weight.

Guidance

  • The error occurs because the probability-target path of cross_entropy calls weight.numel(), and numel() raises on fake tensors whose sizes are symbolic.
  • To verify the issue, run the reproducer from the issue: the eager call prints EAGER_OK, and the compiled call raises TorchRuntimeError.
  • As a temporary workaround, remove the torch.compile call or use torch.compiler.set_stance("force_eager") to force eager execution.
  • Another possible mitigation is to compile without dynamic shapes (for example dynamic=False) for code that calls torch.nn.functional.cross_entropy with probability targets. Sizes should then stay concrete during tracing, at the cost of a recompile for each new input shape. The issue does not confirm this.

Example

The reproducer in the issue is the minimal failing example: the same call to torch.nn.functional.cross_entropy succeeds in eager mode and fails under torch.compile(dynamic=True, backend="eager").
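One further option, not mentioned in the issue, is to compute the same loss from elementwise operations so that the fused cross_entropy call is not used at all. The helper name weighted_soft_ce is hypothetical, and the sketch assumes 2-D logits of shape (batch, classes), reduction="none", and no label smoothing. It follows the documented formula for probability targets: the per-sample loss is the negative sum over classes of weight × target × log-softmax.

```python
import torch
import torch.nn.functional as F

def weighted_soft_ce(x, target, weight):
    # Hypothetical helper: per-sample cross entropy for probability targets
    # with a per-class weight, assuming x and target have shape (N, C).
    log_probs = F.log_softmax(x, dim=1)
    # weight has shape (C,) and broadcasts across the batch dimension.
    return -(target * weight * log_probs).sum(dim=1)

torch.manual_seed(0)
x = torch.randn(3, 5)
target = torch.rand(3, 5).softmax(dim=1)
weight = torch.rand(5)

manual = weighted_soft_ce(x, target, weight)
reference = F.cross_entropy(x, target, weight=weight, reduction="none")
print(torch.allclose(manual, reference, atol=1e-6))
```

The intended use is to pass weighted_soft_ce to torch.compile(..., dynamic=True) in place of the original function. Whether it traces cleanly on the affected build has not been verified here.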

Notes

The root cause appears to lie in the native implementation of torch.nn.functional.cross_entropy (cross_entropy_loss_symint), which asks for a concrete numel() on a tensor whose sizes are symbolic under dynamic tracing. The workarounds above give up either compilation or dynamic shapes, so they are stopgaps until a proper fix is available.

Recommendation

Apply workaround: until a fix lands, do not use dynamic tracing in torch.compile for torch.nn.functional.cross_entropy calls that combine probability targets with a per-class weight.
