[JIT] Out-of-bounds read in MemoryReadAdapter::read() via crafted TorchScript model

pytorch · 2026-05-27 22:40:28

MemoryReadAdapter::read() in caffe2/serialize/in_memory_adapter.h performs an unchecked memcpy using pos and n values sourced from ZIP local file header fields. A crafted .pt file with an oversized declared record size causes a heap buffer over-read that crashes the process (SEGV). Tested on 2.7.0 and 2.11.0; both affected.

Root Cause

// caffe2/serialize/in_memory_adapter.h, line ~20
size_t read(uint64_t pos, void* buf, size_t n, const char* what = "") const override {
  (void)what;
  memcpy(buf, (int8_t*)(data_) + pos, n);  // no bounds check
  return n;
}

pos and n come from miniz ZIP extraction (mz_zip_reader_extract_to_mem), which reads them from the ZIP local file header without cross-checking against the adapter's actual buffer size. The PoC declares constants.pkl as 2.2 GB uncompressed inside a 1,843-byte file.
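The mismatch described above (a declared record size far larger than the file that contains it) can be caught before any copy is attempted. The following is a minimal sketch, assuming the classic 30-byte ZIP local file header layout with no ZIP64 extension and no data descriptor. `local_entry_fits`, `load_le16`, and `load_le32` are hypothetical helpers written for illustration; they are not part of PyTorch or miniz.

```cpp
#include <cstddef>
#include <cstdint>

// Read little-endian integers byte by byte so the result does not depend
// on host endianness.
static std::uint16_t load_le16(const std::uint8_t* p) {
  return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static std::uint32_t load_le32(const std::uint8_t* p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: returns true only if the local file header at
// header_offset, plus the name, extra field, and compressed payload it
// declares, lies entirely inside the buffer [buf, buf + size).
bool local_entry_fits(const std::uint8_t* buf, std::size_t size,
                      std::size_t header_offset) {
  constexpr std::size_t kFixedHeader = 30;  // fixed part of the local header
  if (header_offset > size || size - header_offset < kFixedHeader) {
    return false;
  }
  const std::uint8_t* h = buf + header_offset;
  if (load_le32(h) != 0x04034b50u) {  // "PK\x03\x04" signature
    return false;
  }
  const std::uint64_t compressed = load_le32(h + 18);  // compressed size
  const std::uint64_t name_len = load_le16(h + 26);    // file name length
  const std::uint64_t extra_len = load_le16(h + 28);   // extra field length
  // The sum is at most about 2^32 + 2^17, so it cannot overflow 64 bits.
  const std::uint64_t needed = kFixedHeader + name_len + extra_len + compressed;
  return needed <= size - header_offset;
}
```

A check of this kind rejects an entry that claims gigabytes of payload inside a file of a few kilobytes. It does not replace a bounds check inside the adapter itself, since other callers may still pass unvalidated offsets.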

Call chain

torch::jit::load()
  → ScriptModuleDeserializer::deserialize()
    → PyTorchStreamReader::getRecord()
      → mz_zip_reader_extract_to_mem()
        → MemoryReadAdapter::read(pos, n)  // pos+n >> size_
          → memcpy OOB read → SEGV

Crash

UndefinedBehaviorSanitizer: SEGV on unknown address 0x00001ea55000
The signal is caused by a READ memory access.
  #0 __memcpy_avx512_unaligned_erms
  #1 caffe2::serialize::MemoryReadAdapter::read()
  #2 mz_zip_reader_extract_to_mem
  #3 PyTorchStreamReader::getRecord()
  ...
  #9 torch::jit::load()

Reproduction

import torch
torch.jit.load("poc-047-memcpy-oob.pt")
# → Segmentation fault

PoC file available on request (1,843 bytes).

Suggested fix

size_t read(uint64_t pos, void* buf, size_t n, const char* what = "") const override {
  (void)what;
  if (pos >= static_cast<uint64_t>(size_) ||
      n > static_cast<uint64_t>(size_) - pos) {
    CAFFE_THROW("Read past end of buffer: pos=", pos, " n=", n, " size=", size_);
  }
  memcpy(buf, (int8_t*)(data_) + pos, n);
  return n;
}
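
The suggested fix compares n against the remaining bytes (size_ - pos) instead of testing pos + n, which matters because the sum can wrap around for attacker-chosen values. The sketch below isolates that condition in two free functions so the difference can be seen without PyTorch. `naive_in_bounds` and `checked_in_bounds` are hypothetical names used only for illustration.

```cpp
#include <cstdint>

// Naive form: pos + n is evaluated in 64-bit unsigned arithmetic, so a huge
// n can wrap the sum to a small value that passes the comparison.
bool naive_in_bounds(std::uint64_t pos, std::uint64_t n, std::uint64_t size) {
  return pos + n <= size;
}

// Logical inverse of the throw condition in the suggested fix:
// reject when pos >= size or n > size - pos. The subtraction cannot
// underflow because pos < size is established first.
bool checked_in_bounds(std::uint64_t pos, std::uint64_t n, std::uint64_t size) {
  return pos < size && n <= size - pos;
}
```

With a 16-byte buffer, pos = 8 and n = UINT64_MAX - 3, the naive sum wraps to 4 and is accepted, while the subtraction-based form rejects the request. Like the suggested fix, this form also rejects a zero-length read at pos == size.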

Environment

PyTorch 2.11.0+cpu, 2.7.0+cpu (both affected)
Linux x86_64
Crash is deterministic

cc @EikanWang @jgong5 @wenzhe-nrv @sanchitintel
