pytorch - 💡(How to fix) Fix torch.compile/Inductor produces different result for float16 cast before elementwise add [1 comments, 2 participants]

pytorch · 2026-05-01 16:15:03

GitHub stats

pytorch/pytorch#182131•Fetched 2026-05-02 05:27:01

Author

rookieLiu2018

Participants

dsashidh

rookieLiu2018

Timeline (top)

mentioned ×36, subscribed ×36, labeled ×6, commented ×1

Error Message

eager sum:    tensor(20.5996)
compiled sum: tensor(20.6000)
max diff:     tensor(0.0010)

AssertionError                            Traceback (most recent call last)
/tmp/ipykernel_37381/3887225747.py in <cell line: 0>()
     18 print("max diff:    ", (eager[0].float() - compiled[0].float()).abs().max())
     19 
---> 20 torch.testing.assert_close(compiled[0], eager[0], rtol=0, atol=0)
     21 torch.testing.assert_close(compiled[1], eager[1], rtol=0, atol=0)

/usr/local/lib/python3.12/dist-packages/torch/testing/_comparison.py in assert_close(actual, expected, allow_subclasses, rtol, atol, equal_nan, check_device, check_dtype, check_layout, check_stride, msg)
   1629     if error_metas:
   1630         # TODO: compose all metas into one AssertionError
-> 1631         raise error_metas[0].to_error(msg)

AssertionError: Tensor-likes are not equal!

Mismatched elements: 8 / 24 (33.3%)
Greatest absolute difference: 0.0009765625 at index (1, 2, 2)
Greatest relative difference: 0.0012197494506835938 at index (0, 1, 0)

Root Cause

The issue text does not identify a root cause. It states the expected behavior: the compiled result should match eager execution because both operands are explicitly converted to float16 before the add.


🐛 Describe the bug

torch.compile with the Inductor backend produces a different result from eager execution when both operands are explicitly cast to float16 before an elementwise add.

The mismatch is visible in the intermediate tensor, not only in the final reduction.

Reproducer

import torch

print(torch.__version__)

x = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4) / 10
c = (torch.arange(24) % 5 == 0).reshape(2, 3, 4)

def f(x, c):
    z = torch.where(c, torch.full_like(x, 0.5), torch.full_like(x, -0.5))
    y = z.to(torch.float16) + x.to(torch.float16)
    return y, y.float().sum()

eager = f(x, c)
compiled = torch.compile(f, backend="inductor", fullgraph=True)(x, c)

print("eager sum:   ", eager[1])
print("compiled sum:", compiled[1])
print("max diff:    ", (eager[0].float() - compiled[0].float()).abs().max())

torch.testing.assert_close(compiled[0], eager[0], rtol=0, atol=0)
torch.testing.assert_close(compiled[1], eager[1], rtol=0, atol=0)

Example output:

eager sum:    tensor(20.5996)
compiled sum: tensor(20.6000)
max diff:     tensor(0.0010)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
/tmp/ipykernel_37381/3887225747.py in <cell line: 0>()
     18 print("max diff:    ", (eager[0].float() - compiled[0].float()).abs().max())
     19 
---> 20 torch.testing.assert_close(compiled[0], eager[0], rtol=0, atol=0)
     21 torch.testing.assert_close(compiled[1], eager[1], rtol=0, atol=0)

/usr/local/lib/python3.12/dist-packages/torch/testing/_comparison.py in assert_close(actual, expected, allow_subclasses, rtol, atol, equal_nan, check_device, check_dtype, check_layout, check_stride, msg)
   1629     if error_metas:
   1630         # TODO: compose all metas into one AssertionError
-> 1631         raise error_metas[0].to_error(msg)
   1632 
   1633 

AssertionError: Tensor-likes are not equal!

Mismatched elements: 8 / 24 (33.3%)
Greatest absolute difference: 0.0009765625 at index (1, 2, 2)
Greatest relative difference: 0.0012197494506835938 at index (0, 1, 0)

The compiled result should match eager execution because both operands are explicitly converted to float16 before the add.

Versions

PyTorch version: 2.10.0+cpu

cc @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @aakhundov @coconutruben @jataylo

extent analysis

TL;DR

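The assertion failure can be worked around by relaxing the tolerances passed to torch.testing.assert_close. The reported mismatch is at most 0.0009765625 per element, so rtol=0, atol=0 demands a bit-exact match that the eager and compiled results do not satisfy. Relaxing the tolerance only changes what the test accepts; it does not change the compiled output.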
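As a rough illustration of why rtol=0, atol=0 fails while a small absolute tolerance passes, the sketch below re-implements in plain Python the closeness rule documented for torch.testing.assert_close, |actual - expected| <= atol + rtol * |expected|. The helper is_close and the two sample values are illustrative assumptions; only their difference, 0.0009765625, is taken from the reported output.

```python
def is_close(actual: float, expected: float, rtol: float, atol: float) -> bool:
    """Closeness rule: |actual - expected| <= atol + rtol * |expected|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)

# Hypothetical pair of values; only their gap comes from the issue
# ("Greatest absolute difference: 0.0009765625").
expected = 1.69921875
actual = expected + 0.0009765625

exact_ok = is_close(actual, expected, rtol=0.0, atol=0.0)    # bit-exact check
relaxed_ok = is_close(actual, expected, rtol=0.0, atol=1e-3)  # small absolute slack

print("rtol=0, atol=0    ->", exact_ok)
print("rtol=0, atol=1e-3 ->", relaxed_ok)
```

With rtol=0 the rule reduces to comparing the absolute difference against atol, so any atol of at least 0.0009765625 would accept the reported intermediate tensor.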

Guidance

The discrepancy arises in the float16 elementwise add itself: the mismatch is already visible in the intermediate tensor y, not only in the final sum.
To verify the issue, compare the eager and compiled results using a relaxed tolerance, such as rtol=1e-4 and atol=1e-…
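One possible explanation, which is not confirmed anywhere in the issue, is that the compiled kernel keeps x in float32, adds in float32, and rounds to float16 only once at the end, whereas eager rounds x to float16 before the add. The sketch below emulates both orders in plain Python with the standard struct module (format "e" is IEEE 754 binary16). The helper names are made up for illustration, and the code neither calls PyTorch nor inspects what Inductor actually generates.

```python
import struct

def round_f16(v: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", v))[0]

def round_f32(v: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", v))[0]

def eager_style(x32: float, z: float) -> float:
    # Cast each operand to float16 first, then add and round the sum to float16.
    return round_f16(round_f16(x32) + round_f16(z))

def fused_style(x32: float, z: float) -> float:
    # ASSUMPTION (hypothesis only): add in float32, round to float16 once.
    return round_f16(round_f32(x32 + z))

# Same inputs as the reproducer: x = arange(24) / 10 in float32,
# z = 0.5 where the index is a multiple of 5, otherwise -0.5.
xs = [round_f32(i / 10) for i in range(24)]
zs = [0.5 if i % 5 == 0 else -0.5 for i in range(24)]

eager = [eager_style(x, z) for x, z in zip(xs, zs)]
fused = [fused_style(x, z) for x, z in zip(xs, zs)]
diffs = [abs(a - b) for a, b in zip(eager, fused)]
mismatched = sum(d != 0 for d in diffs)

print("mismatched elements:", mismatched, "/", len(diffs))
print("greatest absolute difference:", max(diffs))
```

Under this assumption the emulation yields 8 mismatched elements out of 24 and a greatest absolute difference of 0.0009765625, the same figures as the reported assertion message. That is consistent with the hypothesis but does not prove it.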
