pytorch - 💡(How to fix) torch.compile/Inductor produces different result for float16 cast before elementwise add [1 comment, 2 participants]

GitHub stats
pytorch/pytorch#182131 · Fetched 2026-05-02 05:27:01
Comments: 1 · Participants: 2 · Timeline events: 79 · Reactions: 0
Timeline (top): mentioned ×36, subscribed ×36, labeled ×6, commented ×1

Error Message

eager sum:    tensor(20.5996)
compiled sum: tensor(20.6000)
max diff:     tensor(0.0010)

AssertionError                            Traceback (most recent call last)
/tmp/ipykernel_37381/3887225747.py in <cell line: 0>()
     18 print("max diff:    ", (eager[0].float() - compiled[0].float()).abs().max())
     19 
---> 20 torch.testing.assert_close(compiled[0], eager[0], rtol=0, atol=0)
     21 torch.testing.assert_close(compiled[1], eager[1], rtol=0, atol=0)

/usr/local/lib/python3.12/dist-packages/torch/testing/_comparison.py in assert_close(actual, expected, allow_subclasses, rtol, atol, equal_nan, check_device, check_dtype, check_layout, check_stride, msg)
   1629     if error_metas:
   1630         # TODO: compose all metas into one AssertionError
-> 1631         raise error_metas[0].to_error(msg)

AssertionError: Tensor-likes are not equal!

Mismatched elements: 8 / 24 (33.3%)
Greatest absolute difference: 0.0009765625 at index (1, 2, 2)
Greatest relative difference: 0.0012197494506835938 at index (0, 1, 0)

Root Cause

The report states the expected behavior rather than a cause: the compiled result should match eager execution because both operands are explicitly converted to float16 before the add.
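One hypothesis that fits the reported numbers is that the compiled kernel skips the intermediate rounding of x to float16 and rounds only once, after the add. This is an assumption, not something the issue confirms. The sketch below emulates both orders with the standard library only, using the struct formats "f" (float32) and "e" (float16). The helper names f32 and f16 and the list names are hypothetical, and the inputs only approximate the tensors built in the reproducer.

```python
import struct

def f32(v):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", v))[0]

def f16(v):
    """Round a Python float to the nearest float16 value."""
    return struct.unpack("e", struct.pack("e", v))[0]

# Same inputs as the reproducer, flattened to 24 elements.
xs = [f32(k / 10) for k in range(24)]
zs = [0.5 if k % 5 == 0 else -0.5 for k in range(24)]

# Eager-style order: round x to float16 first, then add and round the sum.
cast_then_add = [f16(z + f16(x)) for x, z in zip(xs, zs)]

# Hypothetical fused order (assumption): add in float32, round to float16 once.
add_then_cast = [f16(f32(z + x)) for x, z in zip(xs, zs)]

mismatched = [k for k in range(24) if cast_then_add[k] != add_then_cast[k]]
max_diff = max(abs(a - b) for a, b in zip(cast_then_add, add_then_cast))

print("mismatched elements:", len(mismatched), "/ 24")
print("flat indices:", mismatched)
print("max abs diff:", max_diff)
```

Under these assumptions the sketch reports 8 mismatched elements out of 24 and a maximum absolute difference of 0.0009765625, the same figures as the assertion message. Flat indices 22 and 4 correspond to (1, 2, 2) and (0, 1, 0) in the 2×3×4 layout. Matching numbers make the hypothesis plausible but do not prove what Inductor actually generates.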

Code Example

import torch

print(torch.__version__)

x = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4) / 10
c = (torch.arange(24) % 5 == 0).reshape(2, 3, 4)

def f(x, c):
    z = torch.where(c, torch.full_like(x, 0.5), torch.full_like(x, -0.5))
    y = z.to(torch.float16) + x.to(torch.float16)
    return y, y.float().sum()

eager = f(x, c)
compiled = torch.compile(f, backend="inductor", fullgraph=True)(x, c)

print("eager sum:   ", eager[1])
print("compiled sum:", compiled[1])
print("max diff:    ", (eager[0].float() - compiled[0].float()).abs().max())

torch.testing.assert_close(compiled[0], eager[0], rtol=0, atol=0)
torch.testing.assert_close(compiled[1], eager[1], rtol=0, atol=0)


🐛 Describe the bug

torch.compile with the Inductor backend produces a different result from eager execution when both operands are explicitly cast to float16 before an elementwise add.

The mismatch is visible in the intermediate tensor, not only in the final reduction.

The compiled result should match eager execution because both operands are explicitly converted to float16 before the add.

Versions

PyTorch version: 2.10.0+cpu

cc @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @aakhundov @coconutruben @jataylo

extent analysis

TL;DR

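The assertion failure can be worked around by relaxing the tolerance in torch.testing.assert_close to allow for float16 rounding differences between eager and compiled execution. This changes what the test accepts; it does not make the compiled output bit-identical to eager.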
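The effect of a tolerance can be sketched without PyTorch. The criterion below, |actual - expected| <= atol + rtol * |expected|, is the closeness condition documented for torch.testing.assert_close. The helper names within_tolerance and all_close and the two value pairs are assumptions: the pairs are hypothetical float16 neighbours, not values printed in the issue, chosen so that their differences match the greatest absolute difference and roughly the greatest relative difference from the assertion message.

```python
def within_tolerance(actual, expected, rtol, atol):
    """Closeness criterion: |actual - expected| <= atol + rtol * |expected|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)

# Hypothetical (compiled, eager) pairs; assumptions, not data from the issue.
pairs = [
    (1.7001953125, 1.69921875),        # absolute difference 2**-10
    (0.0999755859375, 0.10009765625),  # relative difference about 1.22e-3
]

def all_close(rtol, atol):
    """True when every pair satisfies the criterion."""
    return all(within_tolerance(a, e, rtol, atol) for a, e in pairs)

print("rtol=0, atol=0:     ", all_close(0.0, 0.0))
print("rtol=1e-4, atol=0:  ", all_close(1e-4, 0.0))
print("rtol=0, atol=2**-10:", all_close(0.0, 2 ** -10))
```

Because the reported greatest relative difference is about 1.2e-3, a relative tolerance of 1e-4 alone does not cover it; an absolute tolerance of at least the reported 0.0009765625 does. Whether a tolerance that loose is acceptable depends on what the test is meant to guarantee.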

Guidance

  • The discrepancy comes from eager and compiled execution handling floating-point operations differently, specifically the float16 add.
  • To verify the issue, compare the results of eager and compiled execution using a relaxed tolerance, such as rtol=1e-4 and atol=1e-…
