pytorch - ✅(Solved) Fix Binaries R2 - s3 out of sync [2 pull requests]

Error Message

pip 24.0 from /usr/local/lib/python3.11/site-packages/pip (python 3.11)

Looking in indexes: https://download.pytorch.org/whl/cu129
Collecting torch==2.9.0+cu129
  Downloading https://download-r2.pytorch.org/whl/cu129/torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (30 kB)
ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
    torch==2.9.0+cu129 from https://download-r2.pytorch.org/whl/cu129/torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl#sha256=78005065547658b8c3fd2dd4d68f34625b0543494c947001b5303d253b22da5f:
        Expected sha256 05df84ccec407908cb70f89d6c2675b8220661f23d7de0cf899f4401f8ab2798
             Got        2b0a3a5d37a8d7447e56e7e4e27280f881e805fbae79130fa8874bcfe6eae333
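
The wheel URL in the error carries a `#sha256=` fragment, which is the digest the index page advertises for that file. A minimal sketch of pulling the filename and digest out of such a link, using only the Python standard library; the helper name `split_index_link` is hypothetical and not part of pip or the PyTorch tooling:

```python
from urllib.parse import urlsplit, unquote
import posixpath


def split_index_link(url: str) -> tuple[str, str, str]:
    """Return (filename, hash_name, hex_digest) from an index link with a hash fragment."""
    parts = urlsplit(url)
    # The path is percent-encoded (%2B is '+'), so decode it for display.
    filename = unquote(posixpath.basename(parts.path))
    hash_name, sep, digest = parts.fragment.partition("=")
    if not sep:
        raise ValueError("link carries no hash fragment")
    return filename, hash_name, digest.lower()


# The link below is copied from the error message above.
link = (
    "https://download-r2.pytorch.org/whl/cu129/"
    "torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl"
    "#sha256=78005065547658b8c3fd2dd4d68f34625b0543494c947001b5303d253b22da5f"
)
filename, hash_name, digest = split_index_link(link)
print(filename)
print(hash_name, digest)
```

The extracted digest can then be compared against a locally computed hash of the downloaded file.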

Root Cause

Likely because the binary was corrupted when it was migrated from CloudFront to R2. Downloading the same wheel from both hosts yields different SHA-256 digests:

$ curl -OL https://download-r2.pytorch.org/whl/cu129/torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1196M  100 1196M    0     0   125M      0  0:00:09  0:00:09 --:--:--  103M
$ shasum -a 256 torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl 
aead92a82a01b859a54c933667fe5208598f1c58361b377f1c7e4c898702b341  torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl
$ curl -OL https://download.pytorch.org/whl/cu129/torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1196M  100 1196M    0     0  44.7M      0  0:00:26  0:00:26 --:--:-- 32.1M
$ shasum -a 256 torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl 
78005065547658b8c3fd2dd4d68f34625b0543494c947001b5303d253b22da5f  torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl
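
The `shasum -a 256` comparison above can also be scripted. The following is a sketch only, assuming the wheel has already been downloaded to a local path; the function names `sha256_of_file` and `matches_expected` are hypothetical, and the demo hashes a small temporary file rather than the real wheel:

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash the file in chunks so a large wheel is never fully loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the file's digest with an expected hex digest, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demo on a small temporary file standing in for a downloaded wheel.
payload = b"stand-in for wheel bytes"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name
try:
    print(matches_expected(demo_path, hashlib.sha256(payload).hexdigest()))
    print(matches_expected(demo_path, "0" * 64))
finally:
    os.remove(demo_path)
```

For the case in this report, the expected value would be the digest from the `#sha256=` fragment of the index link.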

Fix Action

Fixed

PR fix notes

PR #5485: wheels CI: update to torch 2.10 for 'oldest' configuration

Description (problem / solution / changelog)

Nightly wheel tests on the "oldest" configuration started failing yesterday, like this:

Collecting torch
  Downloading https://download-r2.pytorch.org/whl/cu129/torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (30 kB)
ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
    torch from https://download-r2.pytorch.org/whl/cu129/torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl#sha256=78005065547658b8c3fd2dd4d68f34625b0543494c947001b5303d253b22da5f:
        Expected sha256 05df84ccec407908cb70f89d6c2675b8220661f23d7de0cf899f4401f8ab2798
             Got        2b0a3a5d37a8d7447e56e7e4e27280f881e805fbae79130fa8874bcfe6eae333

(build link)

This looks like an issue with the metadata on torch's index: https://github.com/pytorch/pytorch/issues/145501#issuecomment-4207316144

For now, this proposes just testing against torch-2.10.0 for the "oldest" configuration, to get CI working again.

Notes for Reviewers

How I tested this

On an earlier commit, ran the full nightly matrix on release/26.04. Saw it pass: https://github.com/rapidsai/cugraph/actions/runs/24143566958/job/70480989274

This change isn't RAPIDS-branch-specific, so I'm confident this will resolve this issue in the nightlies.

Changed files

  • dependencies.yaml (modified, +3/-1)

PR #7947: Add S3/R2 sync check tool and split SHA256 management out of manage_v2.py

Description (problem / solution / changelog)

Summary

  • Adds s3_management/index_tools.py — a new CLI tool for verifying and fixing
    PyTorch wheel integrity across S3 and Cloudflare R2.
  • Moves all SHA256 checksum management commands (--set-checksum,
    --recompute-sha256-pattern, --recompute-missing-sha256) out of
    manage_v2.py into index_tools.py, so that manage_v2.py is focused solely on index HTML generation/upload.
  • manage_v2.py still reports missing SHA256 checksums during index generation.

Motivation

Binaries on S3 and R2 can get out of sync (different SHA256 content hashes for
the same key), causing pip install failures with hash mismatch errors.

Fixes https://github.com/pytorch/pytorch/issues/145501
Fixes https://github.com/pytorch/pytorch/issues/179821

New capabilities in index_tools.py

S3/R2 sync verification (--check-r2-sync):

  • Downloads .whl and .whl.metadata files from both S3 and R2 for a given package/version, computes SHA256 from actual content, and reports mismatches
    or files missing on R2.
  • When version has no +, matches all local-version variants (e.g., 2.9.0
    matches cu129, cu124, cpu, etc.).
  • Correctly excludes child prefixes (e.g., whl/ does not scan whl/nightly/
    or whl/test/).
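
The version-matching and prefix rules described above can be illustrated as follows. This is a hedged sketch, not the code in index_tools.py: the function names, the assumption that object keys look like `whl/cu129/<wheel filename>`, and the default list of excluded child prefixes are all illustrative assumptions.

```python
from urllib.parse import unquote


def wheel_version_from_key(key: str) -> str:
    """Extract the version field from a wheel filename at the end of an object key."""
    filename = unquote(key.rsplit("/", 1)[-1])  # keys may carry %2B for '+'
    return filename.split("-")[1]  # wheel names are name-version-...


def version_matches(requested: str, wheel_version: str) -> bool:
    """Without '+', match every local-version variant; with '+', require equality."""
    if "+" in requested:
        return wheel_version == requested
    return wheel_version.split("+", 1)[0] == requested


def in_scope(key: str, prefix: str, excluded_children=("nightly", "test")) -> bool:
    """Keep keys under prefix, but skip the assumed child channels (hypothetical list)."""
    base = prefix.rstrip("/") + "/"
    if not key.startswith(base):
        return False
    rest = key[len(base):]
    first, sep, _ = rest.partition("/")
    return not (sep and first in excluded_children)


key = "whl/cu129/torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl"
print(wheel_version_from_key(key))
print(version_matches("2.9.0", wheel_version_from_key(key)))
print(in_scope("whl/nightly/cu129/torch-2.9.0.dev0-cp311-cp311-linux_x86_64.whl", "whl"))
```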

S3→R2 sync repair (--fix-r2-sync):

  • Same as --check-r2-sync but copies mismatched/missing files from S3 (source of truth) to R2.
  • After copy, purges Cloudflare CDN cache at download-r2.pytorch.org for each fixed file (requires CLOUDFLARE_ZONE_ID and CLOUDFLARE_API_TOKEN
    env vars).
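
The check-then-repair flow amounts to diffing two maps of key to content digest, with S3 treated as the source of truth. A minimal sketch under that assumption; `plan_r2_sync` and the sample keys and digests are hypothetical, and the actual copy and CDN purge steps are deliberately left out:

```python
def plan_r2_sync(s3_digests: dict[str, str], r2_digests: dict[str, str]) -> dict[str, list[str]]:
    """Classify every S3 key as ok, mismatched, or missing on R2."""
    report = {"ok": [], "mismatched": [], "missing": []}
    for key in sorted(s3_digests):
        if key not in r2_digests:
            report["missing"].append(key)
        elif r2_digests[key] != s3_digests[key]:
            report["mismatched"].append(key)
        else:
            report["ok"].append(key)
    return report


# Placeholder digests, not real hashes.
s3 = {"whl/cu129/a.whl": "aaa", "whl/cu129/b.whl": "bbb", "whl/cu129/c.whl": "ccc"}
r2 = {"whl/cu129/a.whl": "aaa", "whl/cu129/b.whl": "xxx"}

report = plan_r2_sync(s3, r2)
# A repair pass would copy these keys from S3 to R2, then purge each from the CDN cache.
to_copy = report["mismatched"] + report["missing"]
print(report)
print(to_copy)
```

Keys that exist only on R2 are ignored here because the description above covers only files mismatched or missing on R2.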

Usage examples

# Check sync status for torch 2.9.0 (all CUDA/CPU variants):
python s3_management/index_tools.py whl --check-r2-sync \
    --package-name torch --package-version 2.9.0

# Check a specific variant:
python s3_management/index_tools.py whl --check-r2-sync \
    --package-name torch --package-version 2.9.0+cu129

# Fix mismatches (copy S3→R2 + purge CDN cache):
python s3_management/index_tools.py whl --fix-r2-sync \
    --package-name torch --package-version 2.9.0+cu129

# SHA256 checksum commands (moved from manage_v2.py):
python s3_management/index_tools.py whl/test --set-checksum \
    --package-name torch --package-version 2.5.0+cu121
python s3_management/index_tools.py whl/nightly --recompute-missing-sha256

Test plan

- Run --check-r2-sync for a known-good package/version, confirm all show OK
- Run --check-r2-sync for torch 2.9.0+cu129 to confirm it detects the mismatch from issue #179821
- Run --fix-r2-sync for the mismatched package, confirm copy succeeds and CDN cache is purged
- Re-run --check-r2-sync after fix to confirm all OK
- Run manage_v2.py whl --do-not-upload to confirm index generation still works and reports missing checksums
- Run --recompute-missing-sha256 via index_tools.py to confirm moved SHA256 commands still work

Changed files

  • s3_management/index_tools.py (added, +768/-0)
  • s3_management/manage_v2.py (modified, +9/-343)


🐛 Describe the bug

Reporting on behalf of @jameslamb

Running

docker run --rm python:3.11 /bin/bash -c "pip --version && pip download --no-deps --index-url https://download.pytorch.org/whl/cu129 'torch==2.9.0+cu129'"

Fails with:

pip 24.0 from /usr/local/lib/python3.11/site-packages/pip (python 3.11)

Looking in indexes: https://download.pytorch.org/whl/cu129
Collecting torch==2.9.0+cu129
  Downloading https://download-r2.pytorch.org/whl/cu129/torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (30 kB)
ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
    torch==2.9.0+cu129 from https://download-r2.pytorch.org/whl/cu129/torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl#sha256=78005065547658b8c3fd2dd4d68f34625b0543494c947001b5303d253b22da5f:
        Expected sha256 05df84ccec407908cb70f89d6c2675b8220661f23d7de0cf899f4401f8ab2798
             Got        2b0a3a5d37a8d7447e56e7e4e27280f881e805fbae79130fa8874bcfe6eae333

Likely because the binary was corrupted when it was migrated from CloudFront to R2:

$ curl -OL https://download-r2.pytorch.org/whl/cu129/torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1196M  100 1196M    0     0   125M      0  0:00:09  0:00:09 --:--:--  103M
$ shasum -a 256 torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl 
aead92a82a01b859a54c933667fe5208598f1c58361b377f1c7e4c898702b341  torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl
$ curl -OL https://download.pytorch.org/whl/cu129/torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1196M  100 1196M    0     0  44.7M      0  0:00:26  0:00:26 --:--:-- 32.1M
$ shasum -a 256 torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl 
78005065547658b8c3fd2dd4d68f34625b0543494c947001b5303d253b22da5f  torch-2.9.0%2Bcu129-cp311-cp311-manylinux_2_28_x86_64.whl

Versions

¯\_(ツ)_/¯

cc @ezyang @gchanan @kadeng @msaroufim @seemethere @atalman @tinglvv @nWEIdia

extent analysis

TL;DR

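As a workaround, download the wheel directly from download.pytorch.org (the original CloudFront-served host) instead of download-r2.pytorch.org, because the R2 copy appears to have been corrupted during migration. In the reported run, pip was sent to download-r2.pytorch.org even though the index URL was https://download.pytorch.org/whl/cu129, so setting the index URL alone did not avoid the R2 copy.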

Guidance

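  • Download the wheel directly from https://download.pytorch.org/whl/cu129/ instead of https://download-r2.pytorch.org/whl/cu129/. In the report, the direct download hashed to 78005065547658b8c3fd2dd4d68f34625b0543494c947001b5303d253b22da5f, which matches the #sha256 fragment in the index link, while the R2 copy hashed to aead92a82a01b859a54c933667fe5208598f1c58361b377f1c7e4c898702b341.
  • Verify the integrity of the downloaded file by checking its SHA-256 hash with a tool such as shasum -a 256 before installing it.
  • Do not rely on a --no-check option to bypass the hash check; pip does not document one. If the issue persists, install the manually verified wheel from the local file instead.
  • The corruption is already reported to the PyTorch team in this issue. PR #7947 adds --check-r2-sync and --fix-r2-sync to detect mismatched files and re-copy them from S3 to R2.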

Example

No code snippet is provided as it's not necessary in this case.

Notes

The solution assumes that the corruption occurred during the migration from CloudFront to R2, and using the original URL may provide a valid package. However, this may not be a permanent solution, and the underlying issue should be investigated and resolved.

Recommendation

Apply workaround: download the wheel directly from https://download.pytorch.org/whl/cu129/ instead of https://download-r2.pytorch.org/whl/cu129/ and verify its SHA-256 before installing, because the R2 copy may have been corrupted during migration. For CI, PR #5485 took a different route and switched the "oldest" configuration to torch-2.10.0 to get the nightly tests working again.
