hermes - ✅ (Solved) Fix bug: TOCTOU race in _transcribe_local() can trigger duplicate faster-whisper model downloads [1 pull request, 1 comment, 2 participants]

GitHub stats
NousResearch/hermes-agent#24767 · Fetched 2026-05-14 03:51:56
Comments: 1 · Participants: 2 · Timeline: 5 · Reactions: 0
Timeline (top): labeled ×3, commented ×1, cross-referenced ×1

tools/transcription_tools.py caches the local Whisper model in _local_model / _local_model_name (lines 104–105). The lazy-init guard in _transcribe_local() is:

_local_model: Optional[object] = None
_local_model_name: Optional[str] = None

def _transcribe_local(file_path: str, model_name: str):
    global _local_model, _local_model_name
    ...
    if _local_model is None or _local_model_name != model_name:
        _local_model = _load_local_whisper_model(model_name)
        _local_model_name = model_name

There is no lock, so the check and the assignment that follows it are not atomic (a time-of-check to time-of-use race). When two threads call _transcribe_local() concurrently before the first load completes, both pass the is None guard and both call _load_local_whisper_model(). The result is a duplicate model download (~150 MB on first use) and the same model loaded into memory twice.

Fix Action

Fix

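Add import threading, define a module-level _local_model_lock = threading.Lock(), and apply double-checked locking in _transcribe_local(): check the cache without the lock, and on a miss acquire the lock and check again before loading.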

PR fix notes

PR #24786: fix(transcription): add threading lock to prevent TOCTOU race in _transcribe_local()

Description (problem / solution / changelog)

Summary

Fixes #24767 — TOCTOU race in _transcribe_local() allows duplicate faster-whisper model downloads when concurrent threads pass the unguarded _local_model is None check simultaneously.

Changes

  • tools/transcription_tools.py: Add import threading and a module-level _local_model_lock = threading.Lock(). Apply double-checked locking around the lazy model initialization so only one thread downloads/loads the model while others wait, then all share the cached instance. The same lock protects the CUDA-fallback-to-CPU eviction path.

Root Cause

The module-global _local_model / _local_model_name singleton had no synchronization. Two threads calling _transcribe_local() before the first load completes would both pass the is None guard and both call _load_local_whisper_model(), triggering duplicate ~150 MB downloads and wasting memory.

Testing

  • All 118 existing transcription tests pass (tests/tools/test_transcription_tools.py, tests/tools/test_transcription.py).
  • The double-checked locking pattern avoids lock contention on the hot path (no lock acquired when model is already loaded and name matches).
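The hot-path claim can be checked with a lock that counts its own acquisitions. This is only a sketch under assumptions: CountingLock and ModelCache are hypothetical helpers written for the illustration and are not part of tools/transcription_tools.py or its test suite.

```python
import threading
from typing import Callable, Optional, Tuple


class CountingLock:
    """threading.Lock wrapper that counts acquisitions (illustration only)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self) -> "CountingLock":
        self._lock.acquire()
        self.acquisitions += 1  # incremented while the lock is held
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self._lock.release()


class ModelCache:
    """Double-checked cache around an arbitrary loader callable."""

    def __init__(self, loader: Callable[[str], object]) -> None:
        self._loader = loader
        self.lock = CountingLock()
        self._cached: Optional[Tuple[str, object]] = None

    def get(self, model_name: str) -> object:
        cached = self._cached
        if cached is not None and cached[0] == model_name:
            return cached[1]  # hot path: the lock is never touched
        with self.lock:
            cached = self._cached  # re-check under the lock
            if cached is None or cached[0] != model_name:
                cached = (model_name, self._loader(model_name))
                self._cached = cached
            return cached[1]
```

After one cache.get("base"), the acquisition count should be 1 and stay there however many times the same name is requested; it only rises again when the model name changes.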

Changed files

  • tools/transcription_tools.py (modified, +17/-9)

Impact

  • First-time use: two concurrent transcriptions cause duplicate ~150 MB model downloads.
  • Model reload on switch: concurrent calls with the same new model name each reload the model, wasting GPU/CPU memory.
  • No threading import in this file — the fix requires adding one.
