transformers - ✅(Solved) Fix `trust_remote_code` cache path collides for local models sharing a leaf directory name [1 pull request, 1 comment, 2 participants]

GitHub stats
huggingface/transformers#45632 (fetched 2026-04-25 06:03:06)
Comments: 1
Participants: 2
Timeline events: 3 (commented ×1, cross-referenced ×1, labeled ×1)
Reactions: 0

Root Cause

The cache subdirectory is named after the basename of the local path (subdir), so the two models share a cache location and overwrite each other. The sequential loads in the reproduction produce correct output only because each load rewrites the cached file before importing it.

Fix Action

Fixed

PR fix notes

PR #45642: Fix trust_remote_code local cache collisions for local models (#45632)

Description (problem / solution / changelog)

Fixes #45632

Summary

This PR fixes local trust_remote_code cache collisions when different local model paths share the same leaf directory name.

What changed

  • Updated get_cached_module_file in dynamic_module_utils.py:
    • local cache subdirectory is now keyed by a stable SHA-256 hash of local source file bytes
    • hash includes main module file + direct relative-import source files
    • local/remote branch handling now uses explicit is_local logic (not basename equality)
  • Added regression tests in tests/utils/test_dynamic_module_utils.py for:
    • same leaf dir + different source => different cache dirs
    • different paths + identical source => same cache dir
    • same main file + different relative-import source => different cache dirs
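
The keying scheme listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the PR: find_relative_imports and local_cache_key are hypothetical helpers, the regex only recognizes simple "from .name import ..." lines, and the file names mirror the reproduction in the issue.

```python
import hashlib
import re
import tempfile
from pathlib import Path

# Matches lines such as "from .helper import x" (direct, single-level imports only).
_REL_IMPORT = re.compile(r"^\s*from\s+\.(\w+)\s+import\s", re.MULTILINE)


def find_relative_imports(module_file: Path) -> list[Path]:
    """Hypothetical helper: sibling .py files directly imported by module_file."""
    names = sorted(set(_REL_IMPORT.findall(module_file.read_text())))
    return [module_file.parent / f"{name}.py" for name in names]


def local_cache_key(module_file: Path) -> str:
    """Hypothetical helper: SHA-256 over the main file and its direct relative imports."""
    digest = hashlib.sha256()
    for path in [module_file, *find_relative_imports(module_file)]:
        data = path.read_bytes()
        # Prefix each file with its name and length so file boundaries are unambiguous.
        digest.update(f"{path.name}:{len(data)}:".encode())
        digest.update(data)
    return digest.hexdigest()


def write_model(root: Path, main_src: str, helper_src: str) -> Path:
    """Create root/subdir with a main module and one relatively imported helper."""
    model_dir = root / "subdir"
    model_dir.mkdir(parents=True)
    (model_dir / "helper.py").write_text(helper_src)
    main = model_dir / "custom_model.py"
    main.write_text(main_src)
    return main


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    main_src = "from .helper import MAGIC\n"
    a = write_model(tmp / "pretrained_a", main_src, 'MAGIC = "Magic A"\n')
    b = write_model(tmp / "pretrained_b", main_src, 'MAGIC = "Magic B"\n')
    c = write_model(tmp / "pretrained_c", main_src, 'MAGIC = "Magic A"\n')

    key_a, key_b, key_c = (local_cache_key(p) for p in (a, b, c))
    # Same leaf dir and main file, different relative-import source: different keys.
    print(key_a != key_b)  # True
    # Different paths, identical source: same key.
    print(key_a == key_c)  # True
```

Because the key depends only on file bytes, the location of the model on disk no longer influences the cache directory name.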

Validation

  • python -m compileall src/transformers/dynamic_module_utils.py tests/utils/test_dynamic_module_utils.py
  • Added regression tests above.
  • Manually validated the 3 collision/dedup scenarios via direct get_cached_module_file checks.

Coordination / duplicate-work check

Notes

  • Documentation update: not needed (bug fix only).
  • AI was used to assist drafting; I reviewed the patch and validation results before submission.

Changed files

  • src/transformers/dynamic_module_utils.py (modified, +44/-4)
  • tests/utils/test_dynamic_module_utils.py (modified, +53/-1)


System Info

  • transformers: 5.5.3
  • huggingface_hub: 1.12.0
  • Python: 3.13
  • OS: Linux

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Source files:

Save two models with different source, then load each one:

$ python main.py save --path=pretrained_a/subdir --magic="Magic A"
$ python main.py save --path=pretrained_b/subdir --magic="Magic B"

$ python main.py load --path=pretrained_a/subdir
Load "pretrained_a/subdir"
Model says:  Magic A
Source path: HF_MODULES_CACHE/transformers_modules/subdir/custom_model.py
Source says: Magic A

$ python main.py load --path=pretrained_b/subdir
Load "pretrained_b/subdir"
Model says:  Magic B
Source path: HF_MODULES_CACHE/transformers_modules/subdir/custom_model.py
Source says: Magic B

Both models end up cached at the same path in HF_MODULES_CACHE, even though their source differs.

Expected behavior

Two models with different source on disk should get separate cache entries, or not be cached at all.

Actual

The cache subdirectory is named after the basename of the local path (subdir), so the two models share a cache location and overwrite each other. The sequential case above happens to produce correct output only because each load rewrites the cached file before importing it.
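
A short sketch of why the two paths collide. It only illustrates the naming scheme described above and is not the library code; "HF_MODULES_CACHE" is used as a placeholder string, as in the output shown earlier.

```python
import os

paths = ["pretrained_a/subdir", "pretrained_b/subdir"]

# Keying by the leaf directory name discards everything that distinguishes the two paths.
keys = [os.path.basename(os.path.normpath(p)) for p in paths]
print(keys)  # ['subdir', 'subdir']

# Both models therefore map to the same cache file.
cache_files = {
    os.path.join("HF_MODULES_CACHE", "transformers_modules", key, "custom_model.py")
    for key in keys
}
print(len(cache_files))  # 1
```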

Consequences

Breaks in parallel environments, such as Slurm clusters, where multiple jobs try to use the same cache dirs.

  1. Parallel loads race on the shared file. Two processes loading these models at the same time will write to the same path with no coordination, and the imported module can end up with arbitrary contents. "Don't load in parallel" is not a workable answer: HF_MODULES_CACHE is a shared directory used by other transformers code, and there are legitimate cases where multiple processes need to load different trust_remote_code models concurrently.

  2. The cache grows without need. The source already exists on local disk - it could be loaded directly.
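
To illustrate the second point, a single-file module can be imported straight from its location on disk with the standard library, without copying it anywhere. This is only a sketch of the idea, not how transformers loads remote code: load_module_from_path is a hypothetical helper, and a module that uses relative imports would additionally need a package context.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_module_from_path(name: str, path: Path):
    """Import a single-file module directly from its path on disk."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "custom_model.py"
    source.write_text('MAGIC = "Magic A"\n')
    module = load_module_from_path("custom_model_a", source)
    print(module.MAGIC)  # Magic A
```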

Suggested fix

Key the local-path cache subdirectory by a content hash of the source file(s), computed at the point the bytes are being read.

  • Different source produces different cache dirs, so parallel loads of distinct models do not collide.
  • Identical source is populated once, regardless of how many local paths reference it.
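
A minimal sketch of this suggestion, assuming a single source file per model. cache_local_source is a hypothetical helper, not a transformers function; it reads the bytes once and uses the same bytes for both the hash and the cached copy.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_local_source(source_file: Path, cache_root: Path) -> Path:
    """Hypothetical helper: copy source_file into a cache dir named by its content hash."""
    data = source_file.read_bytes()  # read once; hash and copy the same bytes
    subdir = cache_root / hashlib.sha256(data).hexdigest()
    target = subdir / source_file.name
    if not target.exists():  # identical source is populated only once
        subdir.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
    return target


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    cache_root = tmp / "cache"
    sources = {}
    for name, magic in [("pretrained_a", "Magic A"), ("pretrained_b", "Magic B"), ("pretrained_c", "Magic A")]:
        model_dir = tmp / name / "subdir"
        model_dir.mkdir(parents=True)
        sources[name] = model_dir / "custom_model.py"
        sources[name].write_text(f'MAGIC = "{magic}"\n')

    cached = {name: cache_local_source(path, cache_root) for name, path in sources.items()}
    differs = cached["pretrained_a"] != cached["pretrained_b"]
    shared = cached["pretrained_a"] == cached["pretrained_c"]
    print(differs)  # True: different source, different cache dirs
    print(shared)   # True: identical source, one cache entry
```

The exists-then-write step here is not atomic, so a real implementation that expects concurrent writers would likely want to write to a temporary file and rename it into place.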

extent analysis

TL;DR

Modify the caching mechanism to use a content hash of the source file(s) as the cache subdirectory key to prevent collisions.

Guidance

  • The caching logic to change is get_cached_module_file in src/transformers/dynamic_module_utils.py, the function that PR #45642 modifies.
  • Use a stable hash of the source file contents to name the cache subdirectory. PR #45642 uses SHA-256 over the main module file plus its direct relative-import source files.
  • Review the HF_MODULES_CACHE directory structure to ensure it can accommodate the new caching strategy without conflicts.
  • Test the modified caching mechanism with parallel loads to verify that collisions are resolved.
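
One way such a parallel check could look, as a sketch under stated assumptions: threads stand in for separate processes, cache_source and load are hypothetical helpers rather than transformers functions, and the atomic rename via os.replace assumes a POSIX filesystem.

```python
import hashlib
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def cache_source(data: bytes, cache_root: Path) -> Path:
    """Hypothetical helper: store data under a content-hash directory via an atomic rename."""
    subdir = cache_root / hashlib.sha256(data).hexdigest()
    subdir.mkdir(parents=True, exist_ok=True)
    target = subdir / "custom_model.py"
    # Write to a unique temp file first, then rename, so readers never see a partial file.
    fd, tmp_name = tempfile.mkstemp(dir=subdir)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.replace(tmp_name, target)
    return target


def load(data: bytes, cache_root: Path) -> bool:
    """Simulated load: cache the source, read it back, and check it matches what was written."""
    return cache_source(data, cache_root).read_bytes() == data


with tempfile.TemporaryDirectory() as tmp:
    cache_root = Path(tmp)
    sources = [b'MAGIC = "Magic A"\n', b'MAGIC = "Magic B"\n']
    # 20 concurrent simulated loads alternating between the two sources.
    jobs = [sources[i % 2] for i in range(20)]
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(lambda data: load(data, cache_root), jobs))
    num_dirs = len(list(cache_root.iterdir()))
    print(all(results))  # True: every loader read back its own source
    print(num_dirs)      # 2: one cache directory per distinct source
```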

Example

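import hashlib
from pathlib import Path


def get_cache_subdir(source_file):
    """Return the hex SHA-256 digest of the file's bytes, usable as a cache subdirectory name."""
    # Hash the raw bytes so that identical source always maps to the same name
    return hashlib.sha256(Path(source_file).read_bytes()).hexdigest()


# Use the content hash as the cache subdirectory name
# (assumes custom_model.py exists in the current directory)
cache_subdir = get_cache_subdir('custom_model.py')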

Notes

The suggested fix requires modifying the caching logic in transformers (get_cached_module_file in dynamic_module_utils.py, per PR #45642). The example is a simplified illustration of how a content hash can be generated and used as a cache subdirectory name; it hashes a single file, whereas the PR also includes the direct relative-import source files in the hash.

Recommendation

Adopt the fix from PR #45642: key the local cache subdirectory by a content hash of the source file(s). This removes the collision between distinct models and lets identical source share a single cache entry.
