transformers - ✅(Solved) Fix `trust_remote_code` cache path collides for local models sharing a leaf directory name [1 pull requests, 1 comments, 2 participants]

nurpax · 2026-04-24T14:04:02Z


GitHub stats

huggingface/transformers#45632 • Fetched 2026-04-25 06:03:06

Author: nurpax
Participants: Jeevang1-epic, nurpax
Timeline (top): commented ×1, cross-referenced ×1, labeled ×1

Root Cause

The cache subdirectory is named after the basename of the local path (subdir), so the two models share a cache location and overwrite each other. The sequential case above happens to produce correct output only because each load rewrites the cached file before importing it.
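
The collision can be reproduced without transformers at all. The sketch below uses a hypothetical `legacy_cache_subdir` helper, which is only an assumption about the shape of the basename-based naming described here and not the library's actual code, to show that the two paths from the report reduce to the same key.

```python
import os


def legacy_cache_subdir(local_path: str) -> str:
    # Hypothetical stand-in for the reported behavior: name the cache
    # subdirectory after the last component of the local model path.
    return os.path.basename(os.path.normpath(local_path))


path_a = "pretrained_a/subdir"
path_b = "pretrained_b/subdir"

key_a = legacy_cache_subdir(path_a)
key_b = legacy_cache_subdir(path_b)

# Two distinct model directories map to the same cache key.
print(key_a, key_b, key_a == key_b)
```

Any naming scheme that looks only at the leaf directory has this property, no matter how different the parent paths or the source files are.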

Fix Action

Fix proposed (not yet merged)

Proposed in PR: Fix trust_remote_code local cache collisions for local models (#45632) (https://github.com/huggingface/transformers/pull/45642). The PR was listed as open and unmerged when this page was fetched.

PR fix notes

PR #45642: Fix trust_remote_code local cache collisions for local models (#45632)

Repository: huggingface/transformers
Author: Jeevang1-epic
State: open | merged: False
Link: https://github.com/huggingface/transformers/pull/45642

Description (problem / solution / changelog)

Fixes #45632

Summary

This PR fixes local trust_remote_code cache collisions when different local model paths share the same leaf directory name.

What changed

- Updated `get_cached_module_file` in `dynamic_module_utils.py`:
  - local cache subdirectory is now keyed by a stable SHA-256 hash of local source file bytes
  - hash includes main module file + direct relative-import source files
  - local/remote branch handling now uses explicit `is_local` logic (not basename equality)
- Added regression tests in `tests/utils/test_dynamic_module_utils.py` for:
  - same leaf dir + different source => different cache dirs
  - different paths + identical source => same cache dir
  - same main file + different relative-import source => different cache dirs
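
To make the keying scheme concrete, here is a minimal sketch of a SHA-256 key computed over a main module file plus its direct relative imports. It is not the PR's code: `local_source_hash`, the regex, and the `write_model` fixture are all hypothetical names introduced for illustration, and the sketch only handles `from .name import ...` style imports.

```python
import hashlib
import re
import tempfile
from pathlib import Path

# Assumption: only direct `from .name import ...` imports are followed.
_RELATIVE_IMPORT = re.compile(r"^\s*from\s+\.(\w+)\s+import", flags=re.MULTILINE)


def local_source_hash(module_dir: Path, module_file: str) -> str:
    """Hypothetical helper: hash the main module and its direct relative imports."""
    main_bytes = (module_dir / module_file).read_bytes()
    dep_names = sorted(set(_RELATIVE_IMPORT.findall(main_bytes.decode("utf-8"))))

    digest = hashlib.sha256()
    files = [(module_file, main_bytes)]
    for name in dep_names:
        dep_path = module_dir / f"{name}.py"
        if dep_path.is_file():
            files.append((dep_path.name, dep_path.read_bytes()))
    for file_name, data in files:
        # Include name and length so file boundaries cannot be confused.
        digest.update(f"{file_name}:{len(data)}:".encode("utf-8"))
        digest.update(data)
    return digest.hexdigest()


def write_model(root: Path, main_src: str, helper_src: str) -> Path:
    # Every model lives in a directory with the same leaf name, "subdir".
    model_dir = root / "subdir"
    model_dir.mkdir(parents=True)
    (model_dir / "custom_model.py").write_text(main_src, encoding="utf-8")
    (model_dir / "helper.py").write_text(helper_src, encoding="utf-8")
    return model_dir


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    main_src = "from .helper import MAGIC\n"
    dir_a = write_model(base / "pretrained_a", main_src, "MAGIC = 'Magic A'\n")
    dir_b = write_model(base / "pretrained_b", main_src, "MAGIC = 'Magic B'\n")
    dir_c = write_model(base / "pretrained_c", main_src, "MAGIC = 'Magic A'\n")
    hash_a = local_source_hash(dir_a, "custom_model.py")
    hash_b = local_source_hash(dir_b, "custom_model.py")
    hash_c = local_source_hash(dir_c, "custom_model.py")

print(hash_a != hash_b)  # same main file, different relative import
print(hash_a == hash_c)  # different paths, identical source
```

The three fixtures mirror the regression scenarios listed above: the leaf directory name is always `subdir`, so only the file contents decide whether two models share a key.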

Validation

- `python -m compileall src/transformers/dynamic_module_utils.py tests/utils/test_dynamic_module_utils.py`
- Added regression tests above.
- Manually validated the 3 collision/dedup scenarios via direct `get_cached_module_file` checks.

Coordination / duplicate-work check

- Coordinated on issue: https://github.com/huggingface/transformers/issues/45632
- No overlapping open PR found for this issue at submission time.

Notes

- Documentation update: not needed (bug fix only).
- AI was used to assist drafting; I reviewed the patch and validation results before submission.

Changed files

- `src/transformers/dynamic_module_utils.py` (modified, +44/-4)
- `tests/utils/test_dynamic_module_utils.py` (modified, +53/-1)

System Info

transformers: 5.5.3
huggingface_hub: 1.12.0
Python: 3.13
OS: Linux

Who can help?

No response

Information

- [ ] The official example scripts
- [x] My own modified scripts

Tasks

- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)

Reproduction

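Source files:

- [custom_model.py](https://github.com/user-attachments/files/27054919/custom_model.py)
- [main.py](https://github.com/user-attachments/files/27054920/main.py)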

Save two models with different source, then load each one:

```
$ python main.py save --path=pretrained_a/subdir --magic="Magic A"
$ python main.py save --path=pretrained_b/subdir --magic="Magic B"

$ python main.py load --path=pretrained_a/subdir
Load "pretrained_a/subdir"
Model says:  Magic A
Source path: HF_MODULES_CACHE/transformers_modules/subdir/custom_model.py
Source says: Magic A

$ python main.py load --path=pretrained_b/subdir
Load "pretrained_b/subdir"
Model says:  Magic B
Source path: HF_MODULES_CACHE/transformers_modules/subdir/custom_model.py
Source says: Magic B
```

Both models end up cached at the same path in HF_MODULES_CACHE, even though their source differs.

Expected behavior

Two models with different source on disk should get separate cache entries, or not be cached at all.

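Actual

The cache subdirectory is named after the basename of the local path (`subdir`), so the two models share a cache location and overwrite each other. The sequential case above happens to produce correct output only because each load rewrites the cached file before importing it.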

Consequences

Breaks in parallel environments such as Slurm clusters, where multiple jobs try to use the same cache directories.

1. Parallel loads race on the shared file. Two processes loading these models at the same time will write to the same path with no coordination, and the imported module can end up with arbitrary contents. "Don't load in parallel" is not a workable answer: `HF_MODULES_CACHE` is a shared directory used by other transformers code, and there are legitimate cases where multiple processes need to load different `trust_remote_code` models concurrently.
2. The cache grows without need. The source already exists on local disk - it could be loaded directly.
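
The race in the first point can be illustrated deterministically by replaying one bad interleaving in a single process. This is a simplified sketch, not transformers code: `write_cached_module` and `import_cached_module` are hypothetical stand-ins for "copy source into the shared cache path" and "import from that path".

```python
import importlib.util
import tempfile
from pathlib import Path


def write_cached_module(cache_file: Path, source: str) -> None:
    # Stand-in for a loader copying model source into the shared cache path.
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(source, encoding="utf-8")


def import_cached_module(cache_file: Path, name: str):
    # Import a module object directly from a file path.
    spec = importlib.util.spec_from_file_location(name, cache_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    shared = Path(tmp) / "transformers_modules" / "subdir" / "custom_model.py"
    # Interleaving: job A writes, job B writes, and only then job A imports.
    write_cached_module(shared, "MAGIC = 'Magic A'\n")
    write_cached_module(shared, "MAGIC = 'Magic B'\n")
    seen_by_a = import_cached_module(shared, "job_a_custom_model").MAGIC

print(seen_by_a)  # Magic B: job A imported job B's source
```

With real concurrent processes the outcome depends on timing, and a reader may also observe a partially written file, which this replay does not model.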

Suggested fix

Key the local-path cache subdirectory by a content hash of the source file(s), computed at the point the bytes are being read.

- Different source produces different cache dirs, so parallel loads of distinct models do not collide.
- Identical source is populated once, regardless of how many local paths reference it.
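
A minimal sketch of the "hash at the point the bytes are read" idea follows. `cache_local_module` is a hypothetical helper, not a transformers API, and the write-then-rename step is an extra precaution assumed here rather than something the issue asks for.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def cache_local_module(source_file: Path, cache_root: Path) -> Path:
    """Hypothetical helper: copy source_file to cache_root/<sha256>/<name>."""
    data = source_file.read_bytes()  # read once; hash and copy the same bytes
    key = hashlib.sha256(data).hexdigest()
    target = cache_root / key / source_file.name
    if target.exists():
        return target  # identical source was already populated
    target.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: write to a unique temp file and rename it into place so a
    # concurrent reader never sees a half-written module.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    os.replace(tmp_name, target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    sources = {}
    for name, magic in [
        ("pretrained_a", "Magic A"),
        ("pretrained_b", "Magic B"),
        ("pretrained_c", "Magic A"),
    ]:
        model_dir = root / name / "subdir"
        model_dir.mkdir(parents=True)
        src = model_dir / "custom_model.py"
        src.write_text(f"MAGIC = {magic!r}\n", encoding="utf-8")
        sources[name] = src

    cache_root = root / "cache"
    cached = {name: cache_local_module(src, cache_root) for name, src in sources.items()}
    distinct = cached["pretrained_a"] != cached["pretrained_b"]
    shared = cached["pretrained_a"] == cached["pretrained_c"]
    entries = len(list(cache_root.iterdir()))

print(distinct, shared, entries)
```

Because the key is derived from the same bytes that get written, a cache entry can never hold content that disagrees with its directory name, and three model directories with two distinct sources yield two cache entries.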

extent analysis

TL;DR

Modify the caching mechanism to use a content hash of the source file(s) as the cache subdirectory key to prevent collisions.

Guidance

- Start from `get_cached_module_file` in `src/transformers/dynamic_module_utils.py`, which is where PR #45642 changes the caching logic.
- Use a content hash such as SHA-256 (the choice made in PR #45642) to derive the cache subdirectory name from the source file contents.
- Review the `HF_MODULES_CACHE` directory structure to ensure it can accommodate the new caching strategy without conflicts.
- Test the modified caching mechanism with parallel loads to verify that collisions are resolved.

Example

```python
import hashlib

def get_cache_subdir(source_file):
    # Compute a SHA-256 content hash of the source file's bytes
    with open(source_file, 'rb') as f:
        return hashlib.sha256(f.read()).hexdigest()

# Use the content hash as the cache subdirectory name
# (assumes custom_model.py exists in the current directory)
cache_subdir = get_cache_subdir('custom_model.py')
```

Notes

The suggested fix requires modifying the caching logic in transformers. The example is a simplified illustration of how a content hash can be generated and used as a cache subdirectory name: it hashes a single file, whereas PR #45642 describes hashing the main module file plus its direct relative-import source files.

Recommendation

Key the local cache subdirectory by a content hash of the source file(s). This removes the collision between distinct models and lets identical source share a single cache entry.
