pytorch - 💡(How to fix) Fix PyTorch Segmentation Fault on M2 Mac - Bug Report

Error Message

Exception Type:    EXC_BAD_ACCESS (SIGSEGV)
Exception Subtype: KERN_INVALID_ADDRESS at 0x0000000000000008
Exception Codes:   0x0000000000000001, 0x0000000000000008
Termination Reason: Namespace SIGNAL, Code 11, Segmentation fault: 11

Root Cause

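Root Cause: Memory access violation in the OpenMP threading library (libomp.dylib) at address 0x0000000000000008, which is consistent with reading a field at offset 8 through a NULL pointer. The underlying cause has not been identified.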

Fix / Workaround

None of the ten workarounds listed under "Workarounds Attempted" below resolved the crash.

Current Workaround

Cannot use PyTorch Neural Networks - must rely on scikit-learn and XGBoost models instead.

🐛 Describe the bug

PYTORCH_BUG_REPORT.md

PyTorch Segmentation Fault on M2 Mac - Bug Report

Date: May 10, 2026
Reporter: Trading System Developer
Severity: Critical - Process Crash (SIGSEGV)


🐛 Bug Summary

PyTorch crashes with a segmentation fault (SIGSEGV) when training a simple feedforward neural network on an M2 MacBook Pro with Python 3.14.0. The crash occurs during the first training epoch on real financial data (1757 samples, 67 features).


💻 Environment

Hardware

  • Model: MacBook Pro M2 (Mac14,10)
  • Chip: Apple M2 Pro (12 cores: 8 performance, 4 efficiency)
  • RAM: 16GB Unified Memory
  • Architecture: ARM-64 (Native)

Software

  • OS: macOS 26.4 (25E246)
  • Python: 3.14.0 (installed via Homebrew)
  • PyTorch: 2.11.0
  • Installation: pip install torch --break-system-packages

Dependencies

torch==2.11.0
numpy>=2.0
pandas>=2.0
scikit-learn>=1.3

🔥 Error Details

Crash Type

Exception Type:    EXC_BAD_ACCESS (SIGSEGV)
Exception Subtype: KERN_INVALID_ADDRESS at 0x0000000000000008
Exception Codes:   0x0000000000000001, 0x0000000000000008
Termination Reason: Namespace SIGNAL, Code 11, Segmentation fault: 11

Crash Location

Thread 0 Crashed:
0   libomp.dylib  void __kmp_suspend_64<false, true>(int, kmp_flag_64<false, true>*) + 44
1   libomp.dylib  kmp_flag_64<false, true>::wait(kmp_info*, int, void*) + 1896
2   libomp.dylib  __kmp_hyper_barrier_release(barrier_type, kmp_info*, int, int, int, void*) + 172
3   libomp.dylib  __kmp_fork_barrier(int, int) + 464
4   libomp.dylib  __kmp_launch_thread + 336
5   libomp.dylib  __kmp_launch_worker(void*) + 280

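Root Cause: Memory access violation in the OpenMP threading library (libomp.dylib) at address 0x0000000000000008, which is consistent with reading a field at offset 8 through a NULL pointer. The underlying cause has not been identified.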


🧪 Reproducible Test Case

Minimal Code

import torch
import torch.nn as nn
import numpy as np

# Generate data matching production shape
np.random.seed(42)
X_train = np.random.randn(1757, 67).astype(np.float32)
y_train = np.random.randint(0, 2, 1757)

# Convert to PyTorch tensors
X_tensor = torch.FloatTensor(X_train)
y_tensor = torch.LongTensor(y_train)

# Simple feedforward network
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.network = nn.Sequential(
            nn.Linear(67, 128),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(32, 2)
        )
    
    def forward(self, x):
        return self.network(x)

# Training setup
model = SimpleNet()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# CRASH OCCURS HERE
model.train()
outputs = model(X_tensor)  # <-- SEGFAULT
loss = criterion(outputs, y_tensor)
loss.backward()
optimizer.step()

Expected Behavior

Model should train normally and complete the forward pass.

Actual Behavior

The process crashes with a segmentation fault at memory address 0x0000000000000008 during the first forward or backward pass.


What WORKS

Successful Tests

  1. PyTorch basic operations:

    import torch
    x = torch.randn(100, 67)
    y = torch.nn.Linear(67, 128)(x)  # Works fine
  2. Small synthetic datasets:

    X = torch.randn(100, 67)
    y = torch.randint(0, 2, (100,))
    # Training works with small data

What FAILS

  1. Full real dataset (1757 samples, 67 features)
  2. Any configuration with real pandas-processed data
  3. Both with and without DataLoader
  4. Even when the NN runs FIRST (rules out state pollution from other models)
  5. Even with file-based logging (all print statements removed)
  6. With the CPU device selected explicitly:

    device = torch.device('cpu')
    model = model.to(device)
    # Still crashes with real 1757-sample data
  7. With reduced batch sizes:

    • Tried batch_size=64, 32, 16
    • Still crashes on full dataset

🔍 Debugging Attempts

1. Environment Isolation

# Created clean environment
python3 -m venv test_env
source test_env/bin/activate
pip install torch numpy pandas scikit-learn

# Still crashes

2. Order Independence Test

Ran Neural Network BEFORE other models (Random Forest, XGBoost) to rule out state pollution:

# Test: Train NN first, then RF, then XGB
# Result: STILL CRASHES at NN step
# Conclusion: Not caused by other libraries

3. Memory Profiling

import tracemalloc
tracemalloc.start()

# Train NN here - CRASHES before memory snapshot
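
tracemalloc only tracks allocations made through Python's own allocators, so it has nothing to report once the process dies from a native fault. A possible complement, not used in the original report, is the standard-library faulthandler module, which prints the Python-level traceback when a fatal signal such as SIGSEGV arrives. The sketch below runs a snippet in a child interpreter started with -X faulthandler so that the parent survives the crash. The helper names run_isolated and describe are hypothetical.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 300.0) -> subprocess.CompletedProcess:
    """Run `code` in a fresh interpreter with faulthandler enabled.

    If the child receives a fatal signal such as SIGSEGV, faulthandler
    writes the Python traceback to the child's stderr. The parent
    process is unaffected and can inspect the result.
    """
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def describe(result: subprocess.CompletedProcess) -> str:
    """Summarise how the child exited (on POSIX, a negative code means a signal)."""
    if result.returncode < 0:
        return f"killed by {signal.Signals(-result.returncode).name}"
    return f"exited with code {result.returncode}"
```

Passing the contents of the reproduction script as code and printing describe(result) together with result.stderr would show which Python line was executing when the fault occurred, assuming the crash reproduces in the child process.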

4. Device Testing

# Tried explicit CPU device
device = torch.device('cpu')
model.to(device)
X_tensor = X_tensor.to(device)
y_tensor = y_tensor.to(device)

# STILL CRASHES

5. Reduced Complexity

# Tried simplest possible network
model = nn.Sequential(nn.Linear(67, 2))

# STILL CRASHES on 1757-sample dataset

📊 Data Characteristics

Dataset Details

  • Shape: (1757, 67)
  • Features: Financial time series data
    • Price-based features (15)
    • Momentum indicators (12)
    • Volume features (10)
    • Volatility metrics (8)
    • Pattern features (10)
    • Market regime indicators (12)
  • Labels: Binary classification (UP/DOWN)
  • Data Type: float32
  • Preprocessing: StandardScaler normalization
  • Source: Real stock market data via yfinance

Data Preparation Pipeline

from sklearn.preprocessing import StandardScaler
import pandas as pd

# Load from cache (SQLite)
data = pd.concat([ticker_data for ticker in ['SPY', 'QQQ', ...]])

# Feature engineering
features = engineer.add_all_features(data)
features = detector.add_regime_features(features)

# Prepare for ML
X = features.drop(['target', 'symbol'], axis=1)
y = features['target']

# Scale
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Convert to tensors - CRASHES HERE
X_tensor = torch.FloatTensor(X_scaled)

🧵 Threading Analysis

The crash occurs in OpenMP threading:

libomp.dylib: void __kmp_suspend_64<false, true>(int, kmp_flag_64<false, true>*)

Hypothesis

PyTorch's OpenMP implementation may have thread synchronization issues on:

  • M2 Apple Silicon ARM architecture
  • Python 3.14.0 (released October 2025)
  • Specific data shapes/patterns

Evidence

  • Crash is in thread barrier synchronization
  • NULL pointer dereference at 0x0000000000000008
  • Occurs during first epoch initialization
  • Multiple threads involved (Threads 0 and 1 both in barrier wait)
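
A commonly reported source of crashes inside OpenMP barrier code is having more than one copy of an OpenMP runtime installed and loaded into the same process, for example when several packages each bundle their own. Whether that applies here is an assumption; the report does not establish it. As a first check, the sketch below lists files that look like OpenMP runtimes under a directory tree. The function names and the prefix list are illustrative, not part of any library.

```python
import sysconfig
from pathlib import Path

# File-name prefixes of common OpenMP runtimes (LLVM, Intel, GNU).
OPENMP_PREFIXES = ("libomp", "libiomp", "libgomp")


def find_openmp_runtimes(root):
    """Return sorted paths under `root` whose file names look like OpenMP runtimes."""
    hits = []
    for path in Path(root).rglob("lib*omp*"):
        # The glob is deliberately broad; the prefix check filters out
        # unrelated names such as libcompat.
        if path.is_file() and path.name.startswith(OPENMP_PREFIXES):
            hits.append(path)
    return sorted(hits)


def installed_openmp_runtimes():
    """Scan the active environment's site-packages directory."""
    return find_openmp_runtimes(sysconfig.get_paths()["purelib"])
```

More than one hit does not prove a conflict, since a file on disk is not necessarily loaded, but it indicates which packages to import one at a time when narrowing the crash down.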

🔄 Workarounds Attempted

  1. ❌ Reduce batch size (64 → 32 → 16)
  2. ❌ Disable multi-threading: torch.set_num_threads(1)
  3. ❌ Use CPU explicitly
  4. ❌ Simplify network architecture
  5. ❌ Remove DataLoader (direct tensors)
  6. ❌ Reduce epochs (100 → 50 → 10)
  7. ❌ Different optimizers (Adam, SGD)
  8. ❌ Different loss functions
  9. ❌ Change random seeds
  10. ❌ Fresh Python environment

NONE of these workarounds resolved the crash.
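
torch.set_num_threads(1) sets PyTorch's own intra-op thread count and can only be called after torch has been imported. One further variation that could be tried, which the report does not mention testing, is to cap thread counts through environment variables before any numerical library loads, so that every threading backend in the process sees the limit. The sketch below builds such an environment for a child process. The helper names are hypothetical, and this is a diagnostic step, not a confirmed fix.

```python
import os
import subprocess
import sys

# Environment variables read by common threading backends when they load.
THREAD_VARS = (
    "OMP_NUM_THREADS",         # OpenMP runtimes such as libomp
    "MKL_NUM_THREADS",         # Intel MKL
    "OPENBLAS_NUM_THREADS",    # OpenBLAS
    "VECLIB_MAXIMUM_THREADS",  # Apple Accelerate / vecLib
)


def single_thread_env(base=None):
    """Return a copy of `base` (default: os.environ) with thread counts capped at 1."""
    env = dict(os.environ if base is None else base)
    for name in THREAD_VARS:
        env[name] = "1"
    return env


def run_single_threaded(script_path):
    """Run a script in a child interpreter whose thread pools are capped at 1."""
    return subprocess.run([sys.executable, script_path], env=single_thread_env())
```

If the crash disappears under this environment, that would point toward a multi-threaded code path; if it persists, threading limits alone are unlikely to be the trigger.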


📝 Additional Notes

Other ML Libraries Work Fine

  • scikit-learn Random Forest: ✅ Works perfectly (74.7% accuracy)
  • XGBoost: ✅ Works perfectly (80.97% accuracy)
  • pandas, numpy: ✅ All operations normal
  • Data pipeline: ✅ Generates 1757 samples without issues

Standalone PyTorch Works

Small synthetic test cases work fine. Only fails with:

  • Real production data shape (1757, 67)
  • Data processed through pandas pipeline
  • Full training loop
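
Since 100-sample inputs run and 1757-sample inputs crash, one way to narrow the report is to find the smallest sample count that fails. The sketch below is a plain binary search over the sample count. It assumes that failure is monotonic (every size at or above the threshold crashes), which the report does not confirm. The name smallest_failing and the fails predicate are hypothetical; in practice the predicate would run each trial in a child process so that a segfault does not end the search.

```python
from typing import Callable


def smallest_failing(lo: int, hi: int, fails: Callable[[int], bool]) -> int:
    """Binary-search the smallest n in (lo, hi] for which fails(n) is True.

    Preconditions: fails(lo) is False, fails(hi) is True, and failure is
    monotonic in n (an assumption, not something the report establishes).
    """
    if fails(lo) or not fails(hi):
        raise ValueError("need a passing lower bound and a failing upper bound")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid  # threshold is at or below mid
        else:
            lo = mid  # threshold is above mid
    return hi
```

With lo=100 and hi=1757 this needs about a dozen trials. A threshold near a power of two or a multiple of the core count would be worth including in the report.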

🎯 Impact

Severity: CRITICAL

This is a segmentation fault (process crash) that makes PyTorch unusable for production machine learning workloads on M2 Macs with Python 3.14.

Affected Users

  • M2/M3 Mac users with Python 3.14
  • Financial ML practitioners
  • Anyone training on real-world datasets with similar shapes
  • Production ML systems requiring stability

Current Workaround

Cannot use PyTorch Neural Networks - must rely on scikit-learn and XGBoost models instead.


🔧 Requested Investigation

  1. Memory access patterns in OpenMP thread barriers on ARM M2
  2. Python 3.14 compatibility with PyTorch 2.11.0
  3. Tensor initialization with specific data shapes (1757, 67)
  4. Thread synchronization in forward/backward pass
  5. NULL pointer dereference at address 0x0000000000000008

📎 Full Crash Report Available

Full macOS crash report with stack trace, loaded libraries, and memory state available upon request.


🏷️ Labels

  • crash
  • segfault
  • m2-mac
  • apple-silicon
  • python-3.14
  • openmp
  • threading
  • arm64
  • critical

✉️ Contact

Available to provide:

  • Full crash dump
  • Dataset sample (anonymized)
  • Additional test cases
  • Remote debugging session

Thank you for investigating this critical issue!

Versions

Last login: Sun May 10 13:58:08 on ttys000
stellaanguiano@Esthelas-MacBook-Pro ~ % curl -sL https://raw.githubusercontent.com/pytorch/pytorch/main/torch/utils/collect_env.py | python
zsh: command not found: python
stellaanguiano@Esthelas-MacBook-Pro ~ % curl -sL https://raw.githubusercontent.com/pytorch/pytorch/main/torch/utils/collect_env.py | python3
Collecting environment information...
PyTorch version: 2.11.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 26.4 (arm64)
GCC version: Could not collect
Clang version: 21.0.0 (clang-2100.0.123.102)
CMake version: Could not collect
Libc version: N/A

Python version: 3.14.0 (main, Oct 7 2025, 09:34:52) [Clang 17.0.0 (clang-1700.0.13.3)] (64-bit runtime)
Python platform: macOS-26.4-arm64-arm-64bit-Mach-O
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Caching allocator config: N/A

CPU:
Apple M2 Pro

Versions of relevant libraries:
[pip3] numpy==2.4.4
[pip3] torch==2.11.0
[pip3] torchvision==0.26.0
[conda] Could not collect
stellaanguiano@Esthelas-MacBook-Pro ~ %

cc @malfet @aditvenk @snadampal @milpuz01 @aditew01 @nikhil-arm @fadara01 @robert-hardwick @nWEIdia @kulinseth @DenisVieriu97 @jhavukainen
