pytorch - 💡(How to fix) Fix [MPS] Boolean indexing returns inconsistent tensor sizes on repeated calls [1 participants]

GitHub stats
pytorch/pytorch#181867 · Fetched 2026-04-30 06:18:03
Comments: 0 · Participants: 1 · Timeline events: 35 · Reactions: 0
Timeline (top): mentioned ×15 · subscribed ×15 · labeled ×5


🐛 Describe the bug

On the MPS backend, boolean indexing with the same 1D tensor and the same 1D boolean mask can return tensors with inconsistent sizes across repeated calls.

x[mask].numel() is expected to always equal mask.sum(), but on MPS it sometimes returns a different size. The same code works correctly on CPU.

Minimal repro:

import torch
import platform

print("torch:", torch.__version__)
print("python:", platform.python_version())
print("platform:", platform.platform())
print("mps available:", torch.backends.mps.is_available())

device = "mps"
torch.manual_seed(0)

n = 11_842_339
m = 6_054_691

x = torch.randn(n, device=device)

mask = torch.zeros(n, dtype=torch.bool, device=device)
mask[:m] = True
mask = mask[torch.randperm(n, device=device)]

expected = mask.sum().item()
print("expected:", expected)

for step in range(10_000):
    a = x[mask]
    b = x[mask]

    if a.numel() != expected or b.numel() != expected or a.numel() != b.numel():
        print("BUG FOUND")
        print("step:", step)
        print("expected:", expected)
        print("a numel:", a.numel())
        print("b numel:", b.numel())
        break
else:
    print("No bug reproduced")

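Observed result on MPS (the labels differ from the repro above, so this output presumably comes from an earlier variant of the script):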

step: 228
expected: 6054691
actual: 7123614
y shape: torch.Size([7123614])

platform: macOS-26.4.1-arm64-arm-64bit-Mach-O

Versions

PyTorch version: 2.9.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 26.4.1 (arm64)
GCC version: Could not collect
Clang version: 21.0.0 (clang-2100.0.123.102)
CMake version: version 4.3.0
Libc version: N/A

Python version: 3.13.9 | packaged by Anaconda, Inc. | (main, Oct 21 2025, 19:11:29) [Clang 20.1.8 ] (64-bit runtime)
Python platform: macOS-26.4.1-arm64-arm-64bit-Mach-O
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Caching allocator config: N/A

CPU: Apple M4

Versions of relevant libraries:
[pip3] numpy==2.4.1
[pip3] torch==2.9.1
[pip3] torchaudio==2.9.1
[pip3] torchvision==0.24.1
[conda] numpy 2.3.5 pypi_0 pypi
[conda] numpy-base 2.4.1 py313h23175f9_0
[conda] torch 2.9.1 pypi_0 pypi
[conda] torchaudio 2.9.1 pypi_0 pypi
[conda] torchvision 0.24.1 pypi_0 pypi

cc @kulinseth @malfet @DenisVieriu97 @jhavukainen @aditvenk

extent analysis

TL;DR

The issue body does not name a fixed version. Until one is confirmed, either test a newer PyTorch release to see whether the MPS bug still reproduces, or work around it by performing boolean indexing on the CPU.

Guidance

  • Confirm the bug is specific to MPS by running the same repro with device = "cpu" and checking that x[mask].numel() always equals mask.sum().
  • Vary the tensor size, the fraction of True entries, and the seed to find out which conditions trigger the mismatch. The report used n = 11_842_339 and saw the failure at step 228.
  • As a temporary workaround, perform boolean indexing on the CPU and move the result back to the MPS device.
  • Check the PyTorch issue tracker and release notes for fixes to MPS boolean indexing before removing the workaround.
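
The first two checks above can be sketched as a small sweep. This is only an illustration built from the repro in the issue body: the helper names count_mismatches and sweep are hypothetical, and the choice of m = n // 2 is an assumption made to keep the example short.

```python
import torch


def count_mismatches(device: str, n: int, m: int, seed: int, steps: int = 100) -> int:
    """Return how many of `steps` repeated x[mask] calls had the wrong size."""
    torch.manual_seed(seed)
    x = torch.randn(n, device=device)

    # Same mask construction as the repro: m True entries, randomly permuted.
    mask = torch.zeros(n, dtype=torch.bool, device=device)
    mask[:m] = True
    mask = mask[torch.randperm(n, device=device)]

    expected = int(mask.sum().item())
    bad = 0
    for _ in range(steps):
        if x[mask].numel() != expected:
            bad += 1
    return bad


def sweep(devices, sizes, seeds, steps: int = 100):
    """Run count_mismatches over every (device, size, seed) combination.

    Returns a list of (device, n, seed, mismatches) tuples.
    Assumption for this sketch: half of the mask entries are True.
    """
    results = []
    for device in devices:
        for n in sizes:
            for seed in seeds:
                bad = count_mismatches(device, n, n // 2, seed, steps)
                results.append((device, n, seed, bad))
    return results
```

On a machine where torch.backends.mps.is_available() returns True, a call such as sweep(["cpu", "mps"], [1_000_000, 11_842_339], [0, 1, 2], steps=1_000) compares both backends at the size from the report. The report hit the mismatch at step 228, so a small steps value may miss it.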

Example

The minimal reproducible example in the issue body is the reference test case. Run it unchanged on MPS, then again with device = "cpu", and compare the results.

Notes

According to the report, the same code works correctly on CPU and fails only on MPS, which points to the MPS implementation of boolean indexing rather than to user code. The issue does not identify the root cause. A newer PyTorch release or a different backend may avoid the problem, but neither is confirmed in the issue.
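
Because the root cause is not identified, code that has to keep running on MPS can at least detect a bad result instead of silently using it. The following is a minimal sketch of such a guard; checked_bool_index is a hypothetical helper, not a PyTorch API, and it assumes the mask has the same shape as the tensor, as in the report.

```python
import torch


def checked_bool_index(x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Boolean-index x and raise if the result size violates the invariant.

    Assumes mask.shape == x.shape, so x[mask].numel() must equal mask.sum().
    """
    if mask.shape != x.shape:
        raise ValueError("this sketch only handles masks shaped like x")

    # .item() forces a device synchronization, so the check is not free.
    expected = int(mask.sum().item())
    out = x[mask]
    if out.numel() != expected:
        raise RuntimeError(
            f"x[mask] returned {out.numel()} elements, expected {expected} "
            f"(device={x.device})"
        )
    return out
```

The guard relies on mask.sum() being correct on the device, which is the same assumption the repro makes when it computes expected.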

Recommendation

Apply the workaround: perform boolean indexing on the CPU until a PyTorch release with a confirmed fix is available. The report states that the same code works correctly on CPU.
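
One way to apply this workaround is sketched below. The helper name bool_index_via_cpu is hypothetical. It copies both tensors to the CPU, indexes there, and moves the result back, so it trades two device transfers per call for a consistent result size.

```python
import torch


def bool_index_via_cpu(x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Do the boolean indexing on CPU, then move the result back to x's device."""
    selected = x.cpu()[mask.cpu()]
    return selected.to(x.device)
```

In the repro, replacing a = x[mask] with a = bool_index_via_cpu(x, mask) keeps the rest of the script on the MPS device while routing only the indexing step through the CPU.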
