pytorch - ✅(Solved) Fix [inductor] torch.compile produces incorrect result when applying `logit` on sliced tensor with specific shape [1 pull requests, 3 comments, 3 participants]

GitHub stats
pytorch/pytorch#177839 · Fetched 2026-04-08 01:03:17
Comments: 3
Participants: 3
Timeline: 126
Reactions: 0
Timeline (top): mentioned ×53, subscribed ×53, labeled ×9, unlabeled ×4

Fix Action

Fixed

PR fix notes

PR #177853: [cpu] Fix vectorized logit mismatch for large eps

Description (problem / solution / changelog)

Fix #177839

Summary

  1. Root cause: The CPU vectorized aten::logit path used vec::clamp for the eps bounds. When eps > 0.5, that path can collapse to 1 - eps, while the scalar/MKL kernel and the existing torch.compile decomposition preserve the ordered x < eps / x > 1 - eps checks. That made eager CPU results depend on layout/vectorization and disagree with compiled backends for sliced tensors.

  2. Proposed fix: Replace the vectorized vec::clamp call with the same ordered comparisons used by the scalar kernel, and add a regression test for the sliced non-contiguous repro across the eager, aot_eager, and inductor compile backends.

  3. Why this is the right long-term fix: This removes layout-dependent behavior from the native CPU implementation and keeps CPU eager and torch.compile aligned with the logit semantics established by the earlier decomposition fix, instead of letting vectorization silently choose a different answer for the same operator.
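
The difference between the two clamping strategies in the summary can be illustrated with a minimal pure-Python sketch. It assumes the vectorized clamp behaves like min(max(x, lo), hi); that is an assumption made for illustration, not a transcription of vec::clamp. The helper names clamp_min_max, clamp_ordered, and logit are hypothetical.

```python
import math

def clamp_min_max(x, lo, hi):
    # Assumed model of a min/max-style clamp: when lo > hi,
    # every input ends up at hi.
    return min(max(x, lo), hi)

def clamp_ordered(x, lo, hi):
    # Ordered comparisons as described in the PR notes:
    # check x < lo first, then x > hi.
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x

def logit(p):
    return math.log(p / (1.0 - p))

eps = 0.51  # eps > 0.5, so the bounds eps and 1 - eps are inverted
x = 0.3     # same fill value as the repro

via_clamp = logit(clamp_min_max(x, eps, 1.0 - eps))    # clamps to 1 - eps = 0.49
via_ordered = logit(clamp_ordered(x, eps, 1.0 - eps))  # clamps to eps = 0.51

print(round(via_clamp, 4), round(via_ordered, 4))
```

Under that assumption the two strategies give results of equal magnitude and opposite sign, which is consistent with the -0.0400 and 0.0400 values in the reported output.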

Drafted via Codex, published after manual review by @bobrenjc93

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @jerryzh168 @aditew01 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo

Changed files

  • aten/src/ATen/native/cpu/UnaryOpsKernel.cpp (modified, +4/-1)
  • test/distributed/tensor/test_decompositions.py (modified, +1/-2)
  • test/distributed/tensor/test_dtensor_ops.py (modified, +0/-1)
  • test/inductor/test_cpu_repro.py (modified, +14/-0)
  • torch/distributed/tensor/_ops/_tensor_ops.py (modified, +32/-0)

🐛 Describe the bug

When executing a compiled torch.special.logit (with eps > 0.5) on tensors of specific shapes, the outputs are inconsistent between torch.compile and eager.

Here is the code to analyze and reproduce the issue:

import torch
import itertools

def fn(x):
    return torch.special.logit(x, eps=0.51)  # Condition: eps > 0.5
    
cfunc = torch.compile(fn, backend='eager')
    
# Inconsistency-triggering condition: shape1 >= 2 and shape2 - cut_value >= 16.
for shape1, shape2, cut_value in itertools.product(range(1, 3), range(15, 21), range(1, 6)):
    shape = (shape1, shape2)
    x1 = torch.full(shape, 0.3)
    x1 = x1[:, :(shape2 - cut_value)]  # No inconsistency if removing the slicing.
    x2 = x1.clone()
    
    out1 = fn(x1)
    out2 = cfunc(x2)
    
    try:
        torch.testing.assert_close(out1, out2, equal_nan=True)
        print(f'Passed for shape={shape}, cut_value={cut_value}')
    except AssertionError:
        print(f'out1={out1},\nout2={out2}')
        print(f'Failed for shape={shape}, cut_value={cut_value}')

Output:

....
out1=tensor([[-0.0400, -0.0400, -0.0400, -0.0400, -0.0400, -0.0400, -0.0400, -0.0400,
         -0.0400, -0.0400, -0.0400, -0.0400, -0.0400, -0.0400, -0.0400, -0.0400],
        [-0.0400, -0.0400, -0.0400, -0.0400, -0.0400, -0.0400, -0.0400, -0.0400,
         -0.0400, -0.0400, -0.0400, -0.0400, -0.0400, -0.0400, -0.0400, -0.0400]]),
out2=tensor([[0.0400, 0.0400, 0.0400, 0.0400, 0.0400, 0.0400, 0.0400, 0.0400, 0.0400,
         0.0400, 0.0400, 0.0400, 0.0400, 0.0400, 0.0400, 0.0400],
        [0.0400, 0.0400, 0.0400, 0.0400, 0.0400, 0.0400, 0.0400, 0.0400, 0.0400,
         0.0400, 0.0400, 0.0400, 0.0400, 0.0400, 0.0400, 0.0400]])
Failed for shape=(2, 18), cut_value=2
Passed for shape=(2, 18), cut_value=3
Passed for shape=(2, 18), cut_value=4
Passed for shape=(2, 18), cut_value=5
...

The inconsistency-triggering condition is: shape1 >= 2 && shape2 - cut_value >= 16 && eps > 0.5. Although the result of logit appears to be undefined when eps > 0.5 (based on the equation at https://docs.pytorch.org/docs/stable/special.html#torch.special.logit), it is odd that the comparison passes for some tensor shapes but fails for others. Note that the eager, aot_eager, and inductor backends all show this issue.
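
As a quick cross-check, the stated condition can be applied to the same parameter sweep as the repro without running PyTorch. This is only a sketch of the reporter's condition; the helper name predicted_to_fail is hypothetical, and the predicate is not derived from PyTorch internals.

```python
import itertools

def predicted_to_fail(shape1, shape2, cut_value, eps=0.51):
    # Condition as stated in the report; an empirical observation only.
    return shape1 >= 2 and shape2 - cut_value >= 16 and eps > 0.5

# Same ranges as the repro loop.
sweep = itertools.product(range(1, 3), range(15, 21), range(1, 6))
failing = [combo for combo in sweep if predicted_to_fail(*combo)]

print(len(failing))
print(failing[:3])
```

For the 60 combinations in the sweep, the predicate selects 10, including (2, 18, 2) but not (2, 18, 3), which agrees with the Failed and Passed lines in the output above.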

Error logs

No response

Versions

Versions of relevant libraries:
[pip3] numpy==2.4.2
[pip3] torch==2.12.0.dev20260316+cpu
[pip3] torchvision==0.26.0.dev20260223+cpu
[pip3] triton==3.6.0
[conda] numpy 2.4.2 pypi_0 pypi
[conda] torch 2.12.0.dev20260316+cpu pypi_0 pypi
[conda] torchvision 0.26.0.dev20260223+cpu pypi_0 pypi
[conda] triton 3.6.0 pypi_0 pypi

cc @chauhang @penguinwu

extent analysis

Fix Plan

To avoid the inconsistency between torch.compile and eager execution of torch.special.logit on builds without the fix, callers can keep eps within the valid range (eps <= 0.5), so that the clamp bounds eps and 1 - eps stay ordered.

Here are the steps to fix the issue:

  • Check the eps value before calling torch.special.logit.
  • If eps > 0.5, either reduce the eps value or handle the case where logit is undefined.

Example code:

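import torch

def fn(x, eps):
    if eps > 0.5:
        # The clamp bounds [eps, 1 - eps] are inverted when eps > 0.5,
        # so cap eps at 0.5. eps is a Python float here, so use min()
        # rather than torch.clamp, which expects a tensor.
        eps = min(eps, 0.5)
    return torch.special.logit(x, eps=eps)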
