pytorch - Quantized Tensor.set_(Storage, storage_offset, size, stride) lacks storage bounds validation and can create OOB tensor views / segfault [1 comment, 2 participants]

pytorch · 2026-03-26 07:37:11


GitHub stats

pytorch/pytorch#178487•Fetched 2026-04-08 01:30:13


Author

Alex0Young

Participants

Alex0Young

ngimel

Timeline (top)

mentioned ×8 · subscribed ×8 · closed ×2 · labeled ×2


CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 42 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: GenuineIntel
BIOS Vendor ID: QEMU
Model name: Intel(R) Xeon(R) Gold 6266C CPU @ 3.00GHz
BIOS Model name: pc-i440fx-2.8 CPU @ 2.0GHz
BIOS CPU family: 1
CPU family: 6
Model: 85
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
Stepping: 7
BogoMIPS: 6000.00
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes f16c rdrand hypervisor lahf_lm abm 3dnowprefetch ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb arat md_clear flush_l1d arch_capabilities
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 128 KiB (4 instances)
L1i cache: 128 KiB (4 instances)
L2 cache: 4 MiB (4 instances)
L3 cache: 30.3 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0-7
Vulnerability Gather data sampling: Unknown: Dependent on hypervisor status
Vulnerability Itlb multihit: KVM: Mitigation: VMX unsupported
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed: Mitigation; Enhanced IBRS
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced / Automatic IBRS; IBPB conditional; RSB filling; PBRSB-eIBRS SW sequence; BHI SW loop, KVM SW loop
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Mitigation; TSX disabled


🐛 Describe the bug

The dispatcher entry is defined in aten/src/ATen/native/native_functions.yaml:8223:

- func: set_.source_Storage_storage_offset(...)
  dispatch:
    CPU: set_storage_cpu_
    Meta: set_storage_meta__symint
    CUDA: set_storage_cuda_
    MPS: set_storage_mps_
    QuantizedCPU, QuantizedCUDA: set_storage_quantized_

The quantized implementation is in aten/src/ATen/native/quantized/QTensor.cpp:171:

Tensor& set_storage_quantized_(
    Tensor& self,
    Storage storage,
    int64_t storage_offset,
    IntArrayRef sizes,
    IntArrayRef strides) {
  auto* self_ = self.unsafeGetTensorImpl();
  self_->set_storage_keep_dtype(std::move(storage));
  self_->set_storage_offset(storage_offset);
  self_->set_sizes_and_strides(sizes, strides);
  return self;
}

By comparison, the other backends validate the requested layout before rebinding storage:

CPU: aten/src/ATen/native/TensorShape.cpp:379
CUDA: aten/src/ATen/native/cuda/TensorShapeCUDA.cpp:31
MPS: aten/src/ATen/native/mps/TensorFactory.cpp:143

Those paths call aten/src/ATen/native/Resize.h:127, which covers:

storage_offset >= 0
size/stride length consistency
storage device compatibility
in-bounds verification via aten/src/ATen/native/Resize.h:83

The bug can be triggered with the following example, where invalid_offset is an offset that places the view outside the storage:

qtensor.set_(qtensor, storage_offset=invalid_offset, size=(18,), stride=(1,))

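Because the quantized path performs none of these checks, the call succeeds and produces an out-of-bounds tensor view. Accessing that view can crash the process with a segfault.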
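To make the out-of-bounds condition concrete, here is a minimal pure-Python model of the arithmetic behind an in-bounds check. It is a sketch for illustration, not PyTorch's implementation: the function name `required_storage_nbytes` and the numbers (an 18-element tensor with 1-byte elements backed by 18 bytes) are assumptions.

```python
def required_storage_nbytes(sizes, strides, itemsize, storage_offset):
    """Smallest storage size in bytes that holds every element of the view.

    Illustrative model only; assumes non-negative sizes and strides.
    """
    if any(n == 0 for n in sizes):
        return 0  # an empty view touches no memory
    # Index of the last element reachable through the strided layout.
    last_index = storage_offset + sum(
        (n - 1) * st for n, st in zip(sizes, strides)
    )
    return (last_index + 1) * itemsize


# Hypothetical numbers: 18 one-byte elements backed by an 18-byte storage.
storage_nbytes = 18
in_bounds = required_storage_nbytes((18,), (1,), 1, storage_offset=0)
out_of_bounds = required_storage_nbytes((18,), (1,), 1, storage_offset=1000)

print(in_bounds <= storage_nbytes)      # True
print(out_of_bounds <= storage_nbytes)  # False
```

With offset 0 the view needs exactly the 18 bytes the storage has. With offset 1000 it needs 1018 bytes, so any access through that view reads or writes memory the storage does not own.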

Versions

Collecting environment information...
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A

OS: Ubuntu 24.04.2 LTS (x86_64)
GCC version: (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
Clang version: Could not collect
CMake version: version 3.28.3
Libc version: glibc-2.39

Python version: 3.12.3 (main, Jan 8 2026, 11:30:50) [GCC 13.3.0] (64-bit runtime)
Python platform: Linux-6.8.0-59-generic-x86_64-with-glibc2.39
Is CUDA available: N/A
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect
Is XPU available: N/A
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: N/A
Caching allocator config: N/A

Versions of relevant libraries:
[pip3] No relevant packages
[conda] Could not collect

cc @jerryzh168 @jianyuh @raghuramank100 @jamesr66a @vkuzo @jgong5 @Xia-Weiwen @leslie-fang-intel

extent analysis

Fix Plan

To fix the bug, we need to add validation for the requested layout before rebinding storage in the set_storage_quantized_ function.

Here are the steps:

Check that storage_offset is non-negative.
Verify that sizes and strides have the same length.
Check that the storage is on the same device as the tensor.
Verify that the last element reachable through storage_offset, sizes, and strides lies within the storage.
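The four steps above can be modeled in plain Python before touching the C++ code. This is only a sketch of the check order: `FakeStorage` and `validate_set_storage` are hypothetical names, devices are represented as plain strings, and sizes and strides are assumed to be non-negative.

```python
from dataclasses import dataclass


@dataclass
class FakeStorage:
    """Hypothetical stand-in for a storage: only a device and a byte size."""
    device: str
    nbytes: int


def validate_set_storage(tensor_device, itemsize, storage,
                         storage_offset, sizes, strides):
    """Raise ValueError if the requested layout is invalid for `storage`."""
    # 1. The offset must be non-negative.
    if storage_offset < 0:
        raise ValueError(f"invalid storage offset {storage_offset}")
    # 2. sizes and strides must describe the same number of dimensions.
    if len(sizes) != len(strides):
        raise ValueError(
            f"unequal size length ({len(sizes)}) "
            f"and stride length ({len(strides)})"
        )
    # 3. The storage must live on the tensor's device.
    if storage.device != tensor_device:
        raise ValueError(
            f"storage on {storage.device!r}, tensor on {tensor_device!r}"
        )
    # 4. The last reachable element must fit in the storage.
    #    An empty view touches no memory, so it is always in bounds.
    if not any(n == 0 for n in sizes):
        last_index = storage_offset + sum(
            (n - 1) * st for n, st in zip(sizes, strides)
        )
        needed = (last_index + 1) * itemsize
        if needed > storage.nbytes:
            raise ValueError(
                f"layout needs {needed} bytes, storage has {storage.nbytes}"
            )


storage = FakeStorage(device="cpu", nbytes=18)
validate_set_storage("cpu", 1, storage, 0, (18,), (1,))  # passes silently

for bad_offset in (-1, 1):
    try:
        validate_set_storage("cpu", 1, storage, bad_offset, (18,), (1,))
    except ValueError as err:
        print("rejected:", err)
```

The in-bounds step uses the strides and the element size in bytes, because a check based only on the product of sizes would miss non-contiguous layouts.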

Example Code

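Tensor& set_storage_quantized_(
    Tensor& self,
    Storage storage,
    int64_t storage_offset,
    IntArrayRef sizes,
    IntArrayRef strides) {
  // Sketch only. The non-quantized backends get these checks from the shared
  // helper in aten/src/ATen/native/Resize.h; reusing that helper is likely
  // preferable to duplicating the logic here.

  // storage_offset must be non-negative
  TORCH_CHECK(
      storage_offset >= 0, "Tensor: invalid storage offset ", storage_offset);

  // sizes and strides must describe the same number of dimensions
  TORCH_CHECK(
      sizes.size() == strides.size(),
      "unequal size length (", sizes.size(),
      ") and stride length (", strides.size(), ")");

  // The new storage must live on the tensor's device
  TORCH_CHECK(
      storage.device() == self.device(),
      "storage device ", storage.device(),
      " does not match tensor device ", self.device());

  // In-bounds check: stride-aware and measured in bytes.
  // This sketch does not guard against int64_t overflow.
  bool is_empty = false;
  int64_t last_index = storage_offset;
  for (size_t i = 0; i < sizes.size(); ++i) {
    TORCH_CHECK(
        sizes[i] >= 0 && strides[i] >= 0, "negative size or stride");
    if (sizes[i] == 0) {
      is_empty = true; // an empty tensor touches no memory
    } else {
      last_index += (sizes[i] - 1) * strides[i];
    }
  }
  if (!is_empty) {
    const auto itemsize = static_cast<int64_t>(self.dtype().itemsize());
    const auto nbytes = static_cast<int64_t>(storage.nbytes());
    TORCH_CHECK(
        (last_index + 1) * itemsize <= nbytes,
        "setStorage: sizes, strides, and storage offset ", storage_offset,
        " are out of bounds for storage of size ", nbytes, " bytes");
  }

  auto* self_ = self.unsafeGetTensorImpl();
  self_->set_storage_keep_dtype(std::move(storage));
  self_->set_storage_offset(storage_offset);
  self_->set_sizes_and_strides(sizes, strides);
  return self;
}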

Verification

To verify the fix, call Tensor.set_ on a quantized tensor from Python with both valid and invalid storage_offset values. A valid layout should still succeed, and an invalid one should raise an error instead of creating an out-of-bounds view.

For example:

# A valid offset should still succeed
qtensor.set_(qtensor, storage_offset=0, size=(18,), stride=(1,))

# An invalid offset should now raise instead of crashing
try:
    qtensor.set_(qtensor, storage_offset=-1, size=(18,), stride=(1,))
except RuntimeError as e:
    # Check that the message reports the invalid storage offset
    print(e)

Extra Tips

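Validate user-supplied offsets, sizes, and strides before rebinding storage, so that invalid input raises an error instead of crashing the process.
Add regression tests for edge cases such as negative offsets, offsets past the end of the storage, and mismatched size and stride lengths.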
