pytorch - ✅(Solved) Fix Eager and compile disagree for integer floor_divide with zero divisor [1 pull request, 2 comments, 1 participant]

pytorch • 2026-03-20 21:58:55

GitHub stats

pytorch/pytorch#178013 • Fetched 2026-04-08 01:07:40

Author

dcci

Participants

dcci

Assignees

dcci

Timeline (top)

mentioned ×26, subscribed ×26, labeled ×12, unlabeled ×5

Fix Action

Fixed

Fixed by PR: Fix eager/compiled mismatch for integer floor_divide with zero divisor (https://github.com/pytorch/pytorch/pull/178016)

PR fix notes

PR #178016: Fix eager/compiled mismatch for integer floor_divide with zero divisor

Repository: pytorch/pytorch
Author: dcci
State: open | merged: False
Link: https://github.com/pytorch/pytorch/pull/178016

Description (problem / solution / changelog)

Integer floor_divide on CUDA produced different results between eager and compiled paths when the divisor was zero, because both paths hit undefined behavior without any guard.

Eager (c10::div_floor_integer) executed a / b directly, where NVIDIA's 64-bit division emulation returned a truncated 32-bit result (0xFFFFFFFF for int64). Compiled (Triton floordiv) executed a // b via Triton's truncdiv, which returned -1.
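To make the mismatch concrete, the two values quoted above can be compared as 64-bit integers. The sketch below is plain Python arithmetic, not PyTorch code, and the helper name as_int64 is made up for illustration. It assumes the eager result is the bit pattern 0xFFFFFFFF read as an int64, as the PR text describes.

```python
def as_int64(bits: int) -> int:
    """Interpret the low 64 bits of `bits` as a signed two's-complement int64."""
    bits &= 0xFFFFFFFFFFFFFFFF
    return bits - (1 << 64) if bits & (1 << 63) else bits

# Eager result quoted in the PR: only the low 32 bits are set.
eager_value = as_int64(0xFFFFFFFF)
# Compiled result quoted in the PR: -1, i.e. all 64 bits set.
compiled_value = as_int64(0xFFFFFFFFFFFFFFFF)

print(eager_value)                    # 4294967295
print(compiled_value)                 # -1
print(eager_value == compiled_value)  # False
```

The two results share their low 32 bits but differ in the high 32 bits, which is why torch.equal reports a mismatch.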

Add a b == 0 early-return of 0 in both paths:

c10::div_floor_integer: simple b == 0 check before any arithmetic.
Triton floordiv codegen: replace b with 1 before the division, then select 0 for the final result. The guard must precede the division because LLVM may assume divisors are non-zero (UB-based optimization) and eliminate a post-division check entirely.
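The two guards can be modeled in plain Python. This is a sketch of the idea only, not the code from the PR: trunc_div, div_floor_integer_guarded, and floordiv_select are hypothetical names, and Python integers stand in for the fixed-width C++ and Triton types.

```python
def trunc_div(a: int, b: int) -> int:
    """C-style division that rounds toward zero (b must be non-zero)."""
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q

def div_floor_integer_guarded(a: int, b: int) -> int:
    """Scalar model of the eager-side fix: check b == 0 before any arithmetic."""
    if b == 0:
        return 0
    q = trunc_div(a, b)
    # Truncation rounds toward zero; step down by one when the signs differ
    # and the division is inexact, so the result rounds toward -infinity.
    if (a < 0) != (b < 0) and q * b != a:
        q -= 1
    return q

def floordiv_select(xs, ys):
    """Branch-free model of the compiled-side fix: swap, divide, then select."""
    out = []
    for a, b in zip(xs, ys):
        is_zero = b == 0
        safe_b = 1 if is_zero else b      # the division never sees a zero
        q = div_floor_integer_guarded(a, safe_b)
        out.append(0 if is_zero else q)   # select 0 for zero divisors
    return out

print(div_floor_integer_guarded(-7, 2))        # -4
print(div_floor_integer_guarded(5, 0))         # 0
print(floordiv_select([7, -7, 0], [2, 0, 0]))  # [3, 0, 0]
```

In floordiv_select the zero test is computed first and the divisor is replaced before the division runs, so no division by zero is ever executed. That ordering mirrors the PR's point that a check placed after the division could be optimized away.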

Fixes https://github.com/pytorch/pytorch/issues/178013

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo @mlazos

Changed files

c10/util/generic_math.h (modified, +4/-0)
test/inductor/test_torchinductor.py (modified, +28/-0)
torch/_inductor/codegen/triton.py (modified, +9/-1)

Code Example

import torch

t0 = torch.tensor([0.0], dtype=torch.float64).reshape((1,)).to(torch.int64).to("cuda:0")
t1 = torch.tensor([0.0], dtype=torch.float64).reshape((1,)).to(torch.int64).to("cuda:0")

# Eager
eager_out = torch.floor_divide(t0, t1)

# Compiled
torch._dynamo.reset()
def fn(t0, t1):
    return torch.floor_divide(t0, t1)

compiled_fn = torch.compile(fn, fullgraph=True)
compiled_out = compiled_fn(t0, t1)

# Compare
print('eager  :', eager_out)
print('compiled:', compiled_out)
print('equal  :', torch.equal(eager_out, compiled_out))
if eager_out.is_floating_point():
    diff = (eager_out.to(torch.float64) - compiled_out.to(torch.float64)).abs()
    print('max diff:', diff.max().item())

Versions

torch trunk

cc @ezyang @gchanan @kadeng @msaroufim @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @muchulee8 @amjames @aakhundov @coconutruben @jataylo @ptrblck @eqy @jerryzh168 @tinglvv @nWEIdia

extent analysis

Fix Plan

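The issue arises because integer torch.floor_divide on CUDA divides by zero without a guard in both the eager kernel and the Triton-generated kernel. That is undefined behavior, so the two paths are free to return different values. To fix this, check for a zero divisor before dividing.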

Step-by-Step Solution

Check whether the divisor is zero before performing the division.
If the divisor is zero, return a fixed value instead of dividing. The PR returns 0 in both the eager and compiled paths; raising an exception is the alternative.

Example Code

import torch

def safe_floor_divide(t0, t1):
    # A Python `if t1 == 0:` is data-dependent control flow: it only works for
    # one-element tensors and cannot be traced with fullgraph=True.
    # Guard element-wise instead, with no branch on tensor values.
    is_zero = t1 == 0
    # Replace zero divisors with 1 so the division itself never sees a zero.
    safe_divisor = torch.where(is_zero, torch.ones_like(t1), t1)
    quotient = torch.floor_divide(t0, safe_divisor)
    # Return 0 wherever the original divisor was zero.
    return torch.where(is_zero, torch.zeros_like(quotient), quotient)

t0 = torch.tensor([0.0], dtype=torch.float64).reshape((1,)).to(torch.int64).to("cuda:0")
t1 = torch.tensor([0.0], dtype=torch.float64).reshape((1,)).to(torch.int64).to("cuda:0")

# Eager
eager_out = safe_floor_divide(t0, t1)

# Compiled
torch._dynamo.reset()
def fn(t0, t1):
    return safe_floor_divide(t0, t1)

compiled_fn = torch.compile(fn, fullgraph=True)
compiled_out = compiled_fn(t0, t1)

# Compare
print('eager  :', eager_out)
print('compiled:', compiled_out)
print('equal  :', torch.equal(eager_out, compiled_out))

Verification

Run the modified code and confirm that the eager and compiled outputs are both 0 for the zero divisor and that the last line prints equal: True.

Extra Tips

Integer division by zero is undefined behavior in both code paths described above, so do not rely on whatever value a particular backend happens to return.
Where zero divisors are possible, validate or mask the divisor before dividing rather than cleaning up the result afterwards.
