pytorch - 💡(How to fix) Fix PyTorch Segmentation Fault on M2 Mac - Bug Report

pytorch · 2026-05-11 04:42:54


Error Message

Exception Type:    EXC_BAD_ACCESS (SIGSEGV)
Exception Subtype: KERN_INVALID_ADDRESS at 0x0000000000000008
Exception Codes:   0x0000000000000001, 0x0000000000000008
Termination Reason: Namespace SIGNAL, Code 11, Segmentation fault: 11

Root Cause

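Root Cause: Invalid memory access inside the OpenMP threading library (libomp.dylib) at address 0x0000000000000008, which is consistent with reading a field at offset 8 from a NULL pointer. The report does not establish why the pointer is NULL.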

Fix / Workaround

The report identifies no fix. None of the workarounds it tried (listed under Workarounds Attempted below) resolved the crash.

Current Workaround

Cannot use PyTorch Neural Networks - must rely on scikit-learn and XGBoost models instead.


🐛 Describe the bug

PYTORCH_BUG_REPORT.md

PyTorch Segmentation Fault on M2 Mac - Bug Report

Date: May 10, 2026
Reporter: Trading System Developer
Severity: Critical - Process Crash (SIGSEGV)

🐛 Bug Summary

PyTorch crashes with a segmentation fault (SIGSEGV) when training a simple feedforward neural network on an M2 MacBook Pro with Python 3.14.0. The crash occurs during the first training epoch with real financial data (1757 samples, 67 features).

💻 Environment

Hardware

Model: MacBook Pro M2 (Mac14,10)
Chip: Apple M2 Pro (12 cores: 8 performance, 4 efficiency)
RAM: 16GB Unified Memory
Architecture: ARM-64 (Native)

Software

OS: macOS 26.4 (25E246)
Python: 3.14.0 (installed via Homebrew)
PyTorch: 2.11.0
Installation: pip install torch --break-system-packages

Dependencies

torch==2.11.0
numpy>=2.0 (2.4.4 per the collect_env output under Versions)
pandas>=2.0
scikit-learn>=1.3

🔥 Error Details

Crash Type

Exception Type:    EXC_BAD_ACCESS (SIGSEGV)
Exception Subtype: KERN_INVALID_ADDRESS at 0x0000000000000008
Exception Codes:   0x0000000000000001, 0x0000000000000008
Termination Reason: Namespace SIGNAL, Code 11, Segmentation fault: 11

Crash Location

Thread 0 Crashed:
0   libomp.dylib  void __kmp_suspend_64<false, true>(int, kmp_flag_64<false, true>*) + 44
1   libomp.dylib  kmp_flag_64<false, true>::wait(kmp_info*, int, void*) + 1896
2   libomp.dylib  __kmp_hyper_barrier_release(barrier_type, kmp_info*, int, int, int, void*) + 172
3   libomp.dylib  __kmp_fork_barrier(int, int) + 464
4   libomp.dylib  __kmp_launch_thread + 336
5   libomp.dylib  __kmp_launch_worker(void*) + 280

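Root Cause: Invalid memory access inside the OpenMP threading library (libomp.dylib) at address 0x0000000000000008, which is consistent with reading a field at offset 8 from a NULL pointer. The report does not establish why the pointer is NULL.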
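
Because every faulting frame is inside libomp.dylib, one thing that may be worth ruling out is more than one copy of the OpenMP runtime being available to the same process. On macOS, wheels such as torch and scikit-learn commonly ship their own libomp.dylib, and loading two copies into one process is a known source of crashes. Whether that applies here is an assumption, not something the report establishes. A minimal sketch that lists candidate copies (the helper name `find_openmp_runtimes` is hypothetical):

```python
import sysconfig
from pathlib import Path


def find_openmp_runtimes(roots=None):
    """Return sorted paths of files that look like OpenMP runtimes under roots."""
    if roots is None:
        # Default to this interpreter's site-packages directories.
        paths = sysconfig.get_paths()
        roots = {paths["purelib"], paths["platlib"]}
    patterns = ("libomp*.dylib", "libiomp*.dylib", "libgomp*.dylib")
    found = set()
    for root in roots:
        root = Path(root)
        if not root.is_dir():
            continue
        for pattern in patterns:
            # rglob also descends into hidden directories such as ".dylibs".
            found.update(str(p) for p in root.rglob(pattern))
    return sorted(found)


if __name__ == "__main__":
    for path in find_openmp_runtimes():
        print(path)
```

More than one path in the output would not prove anything by itself, but it would be a reason to test PyTorch in a process that never imports the other packages. Extra directories (for example a Homebrew prefix) can be passed through `roots`.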

🧪 Reproducible Test Case

Minimal Code

import torch
import torch.nn as nn
import numpy as np

# Generate data matching production shape
np.random.seed(42)
X_train = np.random.randn(1757, 67).astype(np.float32)
y_train = np.random.randint(0, 2, 1757)

# Convert to PyTorch tensors
X_tensor = torch.FloatTensor(X_train)
y_tensor = torch.LongTensor(y_train)

# Simple feedforward network
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.network = nn.Sequential(
            nn.Linear(67, 128),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(32, 2)
        )
    
    def forward(self, x):
        return self.network(x)

# Training setup
model = SimpleNet()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# CRASH OCCURS HERE
model.train()
outputs = model(X_tensor)  # <-- SEGFAULT
loss = criterion(outputs, y_tensor)
loss.backward()
optimizer.step()

Expected Behavior

Model should train normally and complete the forward pass.

Actual Behavior

The process crashes with a segmentation fault at memory address 0x0000000000000008 during the first forward or backward pass.

✅ What WORKS

Successful Tests

✅ PyTorch basic operations:

import torch
x = torch.randn(100, 67)
y = torch.nn.Linear(67, 128)(x)  # Works fine

✅ Small synthetic datasets:

X = torch.randn(100, 67)
y = torch.randint(0, 2, (100,))
# Training works with small data

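⚠️ Explicit CPU device (does not prevent the crash; also listed as ❌ under Workarounds Attempted):

device = torch.device('cpu')
model = model.to(device)
# Still crashes with real 1757-sample data

⚠️ Reduced batch sizes (do not prevent the crash; also listed as ❌ under Workarounds Attempted):
- Tried batch_size=64, 32, 16
- Still crashes on full dataset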

❌ What FAILS

❌ Full real dataset (1757 samples, 67 features)
❌ Any configuration with real pandas-processed data
❌ Both with and without DataLoader
❌ Crashes even when NN runs FIRST (not state pollution)
❌ Switching to file-based logging (all print statements removed)

🔍 Debugging Attempts

1. Environment Isolation

# Created clean environment
python3 -m venv test_env
source test_env/bin/activate
pip install torch numpy pandas scikit-learn

# Still crashes

2. Order Independence Test

Ran Neural Network BEFORE other models (Random Forest, XGBoost) to rule out state pollution:

# Test: Train NN first, then RF, then XGB
# Result: STILL CRASHES at NN step
# Conclusion: Not caused by other libraries

3. Memory Profiling

import tracemalloc
tracemalloc.start()

# Train NN here - CRASHES before memory snapshot
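
tracemalloc only tracks allocations made through Python's allocator, so it is unlikely to capture a fault inside a native library. The standard-library `faulthandler` module is aimed at this situation: it writes the Python traceback of each thread when the process receives a fatal signal such as SIGSEGV. A minimal sketch, where the helper name and log file name are hypothetical:

```python
import faulthandler


def enable_crash_traceback(log_path="segfault_traceback.log"):
    """Dump the Python traceback of every thread to log_path on a fatal signal."""
    # The file must stay open for the life of the process, so it is
    # returned to the caller instead of being closed here.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Call once at the top of the training script, before building the model:
# crash_log = enable_crash_traceback()
```

The same handler can be switched on without code changes by running the script with `python3 -X faulthandler` or with the `PYTHONFAULTHANDLER=1` environment variable. The resulting traceback shows which Python line was executing when the native crash happened.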

4. Device Testing

# Tried explicit CPU device
device = torch.device('cpu')
model.to(device)
X_tensor = X_tensor.to(device)
y_tensor = y_tensor.to(device)

# STILL CRASHES

5. Reduced Complexity

# Tried simplest possible network
model = nn.Sequential(nn.Linear(67, 2))

# STILL CRASHES on 1757-sample dataset
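
Since the report says 100 rows work and 1757 rows crash, it may help to find the smallest row count that still crashes. The sketch below runs each attempt in a child process so that a segfault does not take down the driver. The helper names are hypothetical, the child uses synthetic `torch.randn` data rather than the reporter's pipeline output, and the search assumes the crash is monotonic in the row count, which may not hold for a threading bug:

```python
import signal
import subprocess
import sys

# Child script: one forward pass of the report's simplest model on n rows.
CHILD_TEMPLATE = """
import torch
import torch.nn as nn
x = torch.randn({n}, 67)
model = nn.Sequential(nn.Linear(67, 2))
model(x)
"""


def crashes_with_sigsegv(source):
    """Run Python source in a child process; True if it died from SIGSEGV."""
    result = subprocess.run([sys.executable, "-c", source], capture_output=True)
    # On POSIX a negative return code is the number of the fatal signal.
    return result.returncode == -signal.SIGSEGV


def smallest_crashing_size(low, high, crashes):
    """Binary search for the smallest n in (low, high] where crashes(n) is True.

    Assumes crashes(low) is False and crashes(high) is True.
    """
    while high - low > 1:
        mid = (low + high) // 2
        if crashes(mid):
            high = mid
        else:
            low = mid
    return high


# Example use with the report's known-good and known-bad sizes:
# smallest_crashing_size(
#     100, 1757, lambda n: crashes_with_sigsegv(CHILD_TEMPLATE.format(n=n)))
```

If the synthetic child never crashes at any size, that would point away from the data shape and toward something specific to the real pipeline or to the other libraries loaded alongside PyTorch.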

📊 Data Characteristics

Dataset Details

Shape: (1757, 67)
Features: Financial time series data
- Price-based features (15)
- Momentum indicators (12)
- Volume features (10)
- Volatility metrics (8)
- Pattern features (10)
- Market regime indicators (12)
Labels: Binary classification (UP/DOWN)
Data Type: float32
Preprocessing: StandardScaler normalization
Source: Real stock market data via yfinance

Data Preparation Pipeline

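from sklearn.preprocessing import StandardScaler
import pandas as pd
import torch

# NOTE: ticker_data, engineer, and detector come from the reporter's
# project and are not defined in the report; the ticker list is elided.
# Load from cache (SQLite)
data = pd.concat([ticker_data for ticker in ['SPY', 'QQQ', ...]])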

# Feature engineering
features = engineer.add_all_features(data)
features = detector.add_regime_features(features)

# Prepare for ML
X = features.drop(['target', 'symbol'], axis=1)
y = features['target']

# Scale
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Convert to tensors - CRASHES HERE
X_tensor = torch.FloatTensor(X_scaled)
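
The synthetic arrays that work are freshly generated float32, while `X_scaled` comes out of a pandas and scikit-learn pipeline, so it may differ in dtype (`StandardScaler` typically returns float64 for float64 input), memory layout, or the presence of NaN/inf values. A minimal sketch for comparing the two before conversion, using hypothetical helper names:

```python
import numpy as np


def describe_array(arr):
    """Summarise properties that can differ between synthetic and pipeline data."""
    arr = np.asarray(arr)
    is_float = np.issubdtype(arr.dtype, np.floating)
    return {
        "shape": arr.shape,
        "dtype": str(arr.dtype),
        "c_contiguous": bool(arr.flags["C_CONTIGUOUS"]),
        # Non-finite checks only make sense for floating-point data.
        "nan_count": int(np.isnan(arr).sum()) if is_float else 0,
        "inf_count": int(np.isinf(arr).sum()) if is_float else 0,
    }


def to_clean_float32(arr):
    """Return a C-contiguous float32 array, failing loudly on NaN or inf."""
    out = np.ascontiguousarray(arr, dtype=np.float32)
    if not np.isfinite(out).all():
        raise ValueError("array contains NaN or inf values")
    return out
```

Printing `describe_array(X_scaled)` next to `describe_array(X_train)` from the minimal test case shows whether the two inputs really match. The cleaned array can then be handed to PyTorch with `torch.from_numpy(to_clean_float32(X_scaled))`. None of this is known to avoid the crash; it only removes input differences as a variable.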

🧵 Threading Analysis

The crash occurs in OpenMP threading:

libomp.dylib: void __kmp_suspend_64<false, true>(int, kmp_flag_64<false, true>*)

Hypothesis

PyTorch's OpenMP implementation may have thread synchronization issues on:

M2 Apple Silicon ARM architecture
Python 3.14.0 (released October 2025)
Specific data shapes/patterns
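
To test the interpreter part of this hypothesis, it can help to record exactly which kind of Python build is running, in particular whether the process is native arm64 and whether it is a free-threaded build. A small sketch using only the standard library (the function name is hypothetical):

```python
import platform
import sys
import sysconfig


def interpreter_summary():
    """Collect interpreter details relevant to the Python 3.14 / arm64 hypothesis."""
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        # 'arm64' for a native Apple Silicon process, 'x86_64' under Rosetta.
        "machine": platform.machine(),
        # True only for free-threaded (no-GIL) CPython builds.
        "free_threaded_build": bool(sysconfig.get_config_var("Py_GIL_DISABLED")),
        "executable": sys.executable,
    }


if __name__ == "__main__":
    for key, value in interpreter_summary().items():
        print(f"{key}: {value}")
```

Running the same reproduction under a different interpreter version with the same PyTorch version, then comparing these summaries, would help separate an interpreter-specific problem from a PyTorch or OpenMP one.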

Evidence

Crash is in thread barrier synchronization
NULL pointer dereference at 0x0000000000000008
Occurs during first epoch initialization
Multiple threads involved (Threads 0 and 1 both in barrier wait)

🔄 Workarounds Attempted

❌ Reduce batch size (64 → 32 → 16)
❌ Disable multi-threading: torch.set_num_threads(1)
❌ Use CPU explicitly
❌ Simplify network architecture
❌ Remove DataLoader (direct tensors)
❌ Reduce epochs (100 → 50 → 10)
❌ Different optimizers (Adam, SGD)
❌ Different loss functions
❌ Change random seeds
❌ Fresh Python environment

NONE of these workarounds resolved the crash.
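
One caveat on the `torch.set_num_threads(1)` attempt: it is called after `import torch` and only limits PyTorch's own intra-op thread pool, while other libraries in the same process may still start their own OpenMP or BLAS threads. A variant that might be worth trying is to set the thread-count environment variables before any numeric library is imported. This is a sketch with a hypothetical helper name and has not been tested against this crash:

```python
import os

# Environment variables read by common threading backends at start-up.
THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",         # OpenMP runtimes such as libomp
    "OPENBLAS_NUM_THREADS",    # OpenBLAS (used by many NumPy builds)
    "MKL_NUM_THREADS",         # Intel MKL, if present
    "VECLIB_MAXIMUM_THREADS",  # Apple Accelerate / vecLib
)


def limit_native_threads(count=1):
    """Set thread-count environment variables; call before importing torch/numpy."""
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(count)
    return {name: os.environ[name] for name in THREAD_ENV_VARS}


limit_native_threads(1)
# Import numeric libraries only after the variables are set, e.g.:
# import numpy, sklearn, torch
```

If the crash disappears with these variables set, that would support the thread-synchronization hypothesis; if it persists with a single thread everywhere, the barrier frames in the stack trace may be a symptom rather than the cause.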

📝 Additional Notes

Other ML Libraries Work Fine

scikit-learn Random Forest: ✅ Works perfectly (74.7% accuracy)
XGBoost: ✅ Works perfectly (80.97% accuracy)
pandas, numpy: ✅ All operations normal
Data pipeline: ✅ Generates 1757 samples without issues

Standalone PyTorch Works

Small synthetic test cases work fine. Only fails with:

Real production data shape (1757, 67)
Data processed through pandas pipeline
Full training loop

🎯 Impact

Severity: CRITICAL

This is a segmentation fault (process crash) that makes PyTorch unusable for production machine learning workloads on M2 Macs with Python 3.14.

Affected Users

M2/M3 Mac users with Python 3.14
Financial ML practitioners
Anyone training on real-world datasets with similar shapes
Production ML systems requiring stability

Current Workaround

Cannot use PyTorch Neural Networks - must rely on scikit-learn and XGBoost models instead.

🔧 Requested Investigation

Memory access patterns in OpenMP thread barriers on ARM M2
Python 3.14 compatibility with PyTorch 2.11.0
Tensor initialization with specific data shapes (1757, 67)
Thread synchronization in forward/backward pass
NULL pointer dereference at address 0x0000000000000008

📎 Full Crash Report Available

Full macOS crash report with stack trace, loaded libraries, and memory state available upon request.

🏷️ Labels

crash
segfault
m2-mac
apple-silicon
python-3.14
openmp
threading
arm64
critical

✉️ Contact

Available to provide:

Full crash dump
Dataset sample (anonymized)
Additional test cases
Remote debugging session

Thank you for investigating this critical issue!

Versions

% curl -sL https://raw.githubusercontent.com/pytorch/pytorch/main/torch/utils/collect_env.py | python3
Collecting environment information...
PyTorch version: 2.11.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 26.4 (arm64)
GCC version: Could not collect
Clang version: 21.0.0 (clang-2100.0.123.102)
CMake version: Could not collect
Libc version: N/A

Python version: 3.14.0 (main, Oct 7 2025, 09:34:52) [Clang 17.0.0 (clang-1700.0.13.3)] (64-bit runtime)
Python platform: macOS-26.4-arm64-arm-64bit-Mach-O
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Caching allocator config: N/A

CPU: Apple M2 Pro

Versions of relevant libraries:
[pip3] numpy==2.4.4
[pip3] torch==2.11.0
[pip3] torchvision==0.26.0
[conda] Could not collect

cc @malfet @aditvenk @snadampal @milpuz01 @aditew01 @nikhil-arm @fadara01 @robert-hardwick @nWEIdia @kulinseth @DenisVieriu97 @jhavukainen
