pytorch - 💡(How to fix) Fix torch.save silently truncates files at INT_MAX bytes when the output path contains non-ASCII characters

Title

torch.save silently truncates files at INT_MAX bytes when the output path contains non-ASCII characters

🐛 Describe the bug

When torch.save(obj, path) is called with a path string that contains any non-ASCII character (tested: Chinese, Cyrillic, Japanese, emoji) and the serialized tensor is larger than 2 GiB, the output file is silently truncated to exactly 2,147,485,224 bytes, that is, INT_MAX (2,147,483,647) plus 1,577 bytes of zip header overhead. No exception is raised and the Python process exits cleanly with code 0; the bug only surfaces later, when torch.load reports that the file is not a valid zip archive. The same code with an ASCII-only path works correctly.

Passing an open binary file handle instead of a path string (see Workaround below) avoided the bug on every charset and size we tested.

Minimal reproduction

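import os, tempfile, zipfile, torch

# Same tensor and file name each time; only the parent directory name changes.
base = tempfile.mkdtemp(prefix="torch_unicode_repro_")
for label, dirname in [
    ("ascii",    "ascii_dir"),
    ("chinese",  "测试目录"),
    ("cyrillic", "тест_каталог"),
    ("emoji",    "dir_\U0001F4A9"),
]:
    out_dir = os.path.join(base, dirname)
    os.makedirs(out_dir, exist_ok=True)
    p = os.path.join(out_dir, "big.pt")

    t = torch.zeros(200, 2048, 2048, dtype=torch.float32)  # ~3.13 GiB storage
    expected = t.numel() * t.element_size()  # raw payload bytes, excluding zip overhead

    torch.save(t, p)  # path passed as a str: this is the call that truncates
    actual = os.path.getsize(p)
    # A healthy file is slightly larger than the payload, so diff should be small and negative.
    print(f"{label:<10} expected={expected:>14,}  actual={actual:>14,}  "
          f"diff={expected - actual:>13,}  zip={zipfile.is_zipfile(p)}")
    del t  # release the tensor before the next iteration allocates a new one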

Observed output

ascii      expected= 3,355,443,200  actual= 3,355,444,685  diff=       -1,485  zip=True
chinese    expected= 3,355,443,200  actual= 2,147,485,224  diff= 1,207,957,976  zip=True
cyrillic   expected= 3,355,443,200  actual= 2,147,485,224  diff= 1,207,957,976  zip=True
emoji      expected= 3,355,443,200  actual= 2,147,485,224  diff= 1,207,957,976  zip=True

Calling torch.load on any of the non-ASCII files then raises:

RuntimeError: PytorchStreamReader failed reading zip archive:
not a ZIP archive. This is an internal miniz error.

Behaviour matrix tested

charset	size	path arg	file size (bytes)	round-trip
ascii	2.1 GiB	str path	2,252,001,485	OK
ascii	3.1 GiB	str path	3,355,444,685	OK
chinese	2.1 GiB	str path	2,147,485,224	FAIL
chinese	3.1 GiB	str path	2,147,485,224	FAIL
cyrillic	2.1 GiB	str path	2,147,485,224	FAIL
cyrillic	3.1 GiB	str path	2,147,485,224	FAIL
japanese	2.1 GiB	str path	2,147,485,224	FAIL
japanese	3.1 GiB	str path	2,147,485,224	FAIL
emoji	2.1 GiB	str path	2,147,485,224	FAIL
emoji	3.1 GiB	str path	2,147,485,224	FAIL
chinese	3.1 GiB	file handle	3,355,444,777	OK
emoji	3.1 GiB	file handle	3,355,444,777	OK

The truncation byte count is identical (2,147,485,224) regardless of tensor size or which non-ASCII charset is in the path. Passing a file handle worked in every case tested.
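The byte counts quoted in this report can be related to each other with plain integer arithmetic. The snippet below is only a sanity check of those numbers (the Linux figure is the one given under Additional notes); it does not call torch.save and says nothing about the root cause.

```python
# Sanity check of the byte counts quoted in this issue. Pure arithmetic, no torch needed.
INT_MAX = 2**31 - 1                      # 2,147,483,647

payload = 200 * 2048 * 2048 * 4          # float32 tensor from the repro, in bytes
ascii_ok = 3_355_444_685                 # file size with an ASCII path
truncated = 2_147_485_224                # file size with a non-ASCII path (Windows)
linux_truncated = 2_147_481_129          # figure reported for Linux / GPFS

overshoot = truncated - INT_MAX          # how far the truncated file sits above INT_MAX
ascii_overhead = ascii_ok - payload      # container overhead in the healthy file
missing = payload - truncated            # the "diff" column in the observed output
linux_delta = linux_truncated - INT_MAX  # negative: the Linux file ends below INT_MAX

print(f"payload        = {payload:,}")
print(f"overshoot      = {overshoot:,}")
print(f"ascii_overhead = {ascii_overhead:,}")
print(f"missing        = {missing:,}")
print(f"linux_delta    = {linux_delta:,}")
```

The Windows truncation point sits 1,577 bytes above INT_MAX, while the Linux / GPFS figure sits 2,518 bytes below it, so both are close to the same limit without being identical.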

System ACP on the reporting machine is set to 65001 (UTF-8) via the Windows "Beta: Use Unicode UTF-8 for worldwide language support" option, so Unicode paths do resolve correctly: the truncated file ends up at the expected Unicode path, and about 2 GiB of data is written there before the write stops.
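For anyone reproducing this on Windows, it may help to record the active code page next to the result. A minimal sketch, assuming the Win32 GetACP function reached through ctypes; the helper name windows_ansi_code_page is made up for illustration, and it simply returns None on non-Windows platforms.

```python
import ctypes
import sys


def windows_ansi_code_page():
    """Return the active ANSI code page on Windows, or None on other platforms.

    Hypothetical helper for bug reports; it is not part of PyTorch.
    """
    if sys.platform != "win32":
        return None
    # GetACP takes no arguments and returns the code page identifier as an unsigned int.
    return int(ctypes.windll.kernel32.GetACP())


acp = windows_ansi_code_page()
print("ANSI code page:", acp)
print("Python filesystem encoding:", sys.getfilesystemencoding())
```

A value of 65001 would correspond to the UTF-8 setting described above.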

Expected behaviour

torch.save(obj, path) should produce a complete, loadable file regardless of the characters in path, or raise an exception if it cannot.

Workaround

with open(path, "wb") as f:
    torch.save(obj, f)

This works on every charset and size we tested.
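If the workaround is needed in several places, it can be wrapped in a small helper. The following is a sketch only: the name save_via_handle and the save_fn parameter are assumptions introduced here, not PyTorch API. save_fn defaults to torch.save and exists so that the helper can be exercised with another serializer (for example pickle.dump) on a machine without torch installed.

```python
import os


def save_via_handle(obj, path, save_fn=None):
    """Serialize obj to path through an open binary file object.

    Hypothetical wrapper around the workaround above. Returns the size in
    bytes of the file that was written.
    """
    if save_fn is None:
        import torch  # imported lazily so the helper itself does not require torch
        save_fn = torch.save
    # Python opens the file, so the serializer only ever sees a file object
    # and never receives the path string.
    with open(path, "wb") as f:
        save_fn(obj, f)
    return os.path.getsize(path)
```

Typical use would be save_via_handle(tensor, path) in place of torch.save(tensor, path). Whether this avoids the truncation in configurations other than those listed in the behaviour matrix is untested.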

Additional notes

We have also seen this bug on Linux / GPFS with PyTorch 2.10.0+cu128 (different truncation byte count: 2,147,481,129; same silent-failure mode). Independent reproduction on Linux ext4/tmpfs would be useful to confirm whether it is a general issue or Linux+GPFS specific.
Wrapping torch.save with a post-write zipfile.is_zipfile check is a cheap way to surface this failure immediately rather than at load time, if a similar workaround is needed downstream.
This issue was prepared with AI assistance (Claude Code). I have verified every claim against fresh runs on the nightly listed below.
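A post-write check along the lines of the second note could look like the sketch below. Note that the observed output above prints zip=True even for the truncated files, so zipfile.is_zipfile on its own may not be enough; for that reason this sketch also compares the file size against a caller-supplied lower bound. The function name check_saved_archive and the min_bytes parameter are assumptions made for illustration.

```python
import os
import zipfile


def check_saved_archive(path, min_bytes=0):
    """Raise OSError if the file at path looks truncated or is not a zip archive.

    Hypothetical helper, not part of PyTorch. min_bytes is a lower bound the
    caller expects, for example the raw payload size of the saved tensor.
    Returns the file size in bytes when both checks pass.
    """
    size = os.path.getsize(path)
    if size < min_bytes:
        raise OSError(
            f"{path!r} is {size:,} bytes, expected at least {min_bytes:,}; "
            "the file may have been truncated"
        )
    # Cheap structural check: looks for the zip end-of-central-directory record.
    if not zipfile.is_zipfile(path):
        raise OSError(f"{path!r} is not a valid zip archive")
    return size
```

For a single tensor t saved to p, as in the reproduction, a reasonable lower bound is the value the script already computes: check_saved_archive(p, min_bytes=t.numel() * t.element_size()). For the truncated files in the observed output this bound is about 1.2 GB larger than the file, so the size check would fail even though is_zipfile returned True.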

Versions

PyTorch version: 2.13.0.dev20260524+cu130
Is debug build: False
CUDA used to build PyTorch: 13.0
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 11 Pro (10.0.26200 64-bit)
Python version: 3.13.5 | packaged by Anaconda, Inc. | (main, Jun 12 2025, 16:37:03) [MSC v.1929 64 bit (AMD64)]
Python platform: Windows-11-10.0.26200-SP0
Is CUDA available: True
CUDA runtime version: 12.8.61
GPU: NVIDIA GeForce RTX 4090 (x2)
Nvidia driver version: 595.79
System ACP: 65001 (UTF-8)

[pip3] numpy==2.1.3
[pip3] torch==2.13.0.dev20260524+cu130
[pip3] torchvision==0.28.0.dev20260524+cu130

Originally observed on PyTorch 2.10.0+cu128 on both Windows 11/NTFS and Linux HPC/GPFS; re-reproduced today on the nightly above.

cc @peterjc123 @mszhanyi @skyline75489 @nbcsm @iremyux @Blackhex @nkhasbag-nv @mruberry @mikaylagawarecki
