hermes - Fix FileSyncManager: a failed sync() advances the rate-limit clock, suppressing the documented retry

Bug description

FileSyncManager.sync() in tools/environments/file_sync.py is rate-limited to once per _sync_interval (default _SYNC_INTERVAL_SECONDS = 5.0) via _last_sync_time. Its docstring states:

Transactional: state only committed if ALL operations succeed. On failure, state rolls back so the next cycle retries everything.

However, the failure path bumps the rate-limit clock, so the promised retry is suppressed:

except Exception as exc:
    self._synced_files = prev_files
    self._pushed_hashes = prev_hashes
    self._last_sync_time = time.monotonic()   # <-- bumps the rate-limit clock on FAILURE
    logger.warning("file_sync: sync failed, rolled back state: %s", exc)

The rate-limit guard at the top of sync() then short-circuits the next non-forced call:

if now - self._last_sync_time < self._sync_interval:
    return

So after a failed cycle, the next non-forced sync() within _sync_interval returns early without retrying, even though the state was just rolled back specifically so it could retry.

Why it matters

The rate-limited (non-forced) sync() runs before every command on the remote backends:

  • tools/environments/ssh.py:286
  • tools/environments/modal.py:402
  • tools/environments/daytona.py:217

A single transient upload failure (network blip, dropped SSH channel) therefore leaves the remote with stale files for every command issued within the next _sync_interval (default 5s). Forced syncs (connect/setup paths, force=True) bypass the guard, which is why the symptom is intermittent and easy to miss.

Reproduction

  1. Use any push-based backend (SSH, Modal, or Daytona), or drive FileSyncManager directly.
  2. Make upload_fn raise once to simulate a transient transport failure during sync().
  3. Within _sync_interval, trigger another (non-forced) sync().
  4. Observe that the second sync() returns immediately and does not re-upload the rolled-back files.
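
The steps above can be approximated without a real backend. The sketch below is a toy model, not the actual FileSyncManager: the class name ToySyncManager, its constructor arguments, the injectable clock, and the string return values are all assumptions made for illustration. Only the rate-limit guard and the except block mirror the snippets quoted in this report.

```python
import logging
import time

logger = logging.getLogger("file_sync_demo")


class ToySyncManager:
    """Hypothetical stand-in for FileSyncManager (not the real class)."""

    def __init__(self, upload_fn, sync_interval=5.0, clock=time.monotonic):
        self._upload_fn = upload_fn
        self._sync_interval = sync_interval
        self._clock = clock  # injectable so the demo needs no real sleeping
        self._synced_files = {}
        self._last_sync_time = float("-inf")  # first call is never rate-limited

    def sync(self, files, force=False):
        now = self._clock()
        # Rate-limit guard, as quoted in the bug description.
        if not force and now - self._last_sync_time < self._sync_interval:
            return "skipped"
        prev_files = dict(self._synced_files)
        try:
            for path, content in files.items():
                self._upload_fn(path, content)
                self._synced_files[path] = content
            self._last_sync_time = self._clock()
            return "synced"
        except Exception as exc:
            self._synced_files = prev_files
            # Reported behavior: the clock advances even though the cycle failed.
            self._last_sync_time = self._clock()
            logger.warning("file_sync: sync failed, rolled back state: %s", exc)
            return "failed"


calls = []


def flaky_upload(path, content):
    """Raise on the first call only, simulating a transient transport error."""
    calls.append(path)
    if len(calls) == 1:
        raise ConnectionError("simulated transient failure")


fake_now = [100.0]
mgr = ToySyncManager(flaky_upload, clock=lambda: fake_now[0])

first = mgr.sync({"a.txt": "v1"})   # upload raises, state rolls back
fake_now[0] += 1.0                  # still inside the 5 s interval
second = mgr.sync({"a.txt": "v1"})  # rate-limited away, no retry

print(first, second, calls)
```

In this model the second call never reaches flaky_upload, so the rolled-back file is not retried until the interval elapses or a forced sync runs.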

Expected

After a failed cycle, the next cycle retries, as the docstring promises; a failed sync should not advance the rate-limit clock.

Actual

The failed cycle advances _last_sync_time, so the next non-forced cycle is rate-limited away and the remote stays stale until the interval elapses.

Fix

Don't update _last_sync_time on the failure path; let the clock advance only on a successful or no-op cycle.
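
A minimal sketch of that change, again on a toy model rather than the real class. FixedToySyncManager, its constructor, the injectable clock, and the string return values are hypothetical; the only point being illustrated is that the except branch leaves _last_sync_time untouched.

```python
import time


class FixedToySyncManager:
    """Hypothetical stand-in showing the proposed behavior (not the real class)."""

    def __init__(self, upload_fn, sync_interval=5.0, clock=time.monotonic):
        self._upload_fn = upload_fn
        self._sync_interval = sync_interval
        self._clock = clock
        self._synced_files = {}
        self._last_sync_time = float("-inf")

    def sync(self, files, force=False):
        now = self._clock()
        if not force and now - self._last_sync_time < self._sync_interval:
            return "skipped"
        prev_files = dict(self._synced_files)
        try:
            for path, content in files.items():
                if self._synced_files.get(path) != content:
                    self._upload_fn(path, content)
                    self._synced_files[path] = content
        except Exception:
            self._synced_files = prev_files
            # No clock update here, so the next non-forced call retries.
            return "failed"
        # Reached on success and on a no-op cycle (nothing needed uploading).
        self._last_sync_time = self._clock()
        return "synced"


attempts = []


def flaky_upload(path, content):
    """Raise on the first call only, simulating a transient transport error."""
    attempts.append(path)
    if len(attempts) == 1:
        raise ConnectionError("simulated transient failure")


fake_now = [100.0]
mgr = FixedToySyncManager(flaky_upload, clock=lambda: fake_now[0])

results = []
for _ in range(3):
    results.append(mgr.sync({"a.txt": "v1"}))
    fake_now[0] += 1.0  # every call falls inside the same 5 s window

print(results, attempts)
```

Here the second call retries the upload immediately after the failure, and only the third call is rate-limited. One consequence worth weighing in the real fix: if a backend fails persistently, every command would attempt a full sync, so a separate backoff for repeated failures may be desirable.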

Environment

  • Affected file: tools/environments/file_sync.py (current main)
  • Backends impacted: SSH, Modal, Daytona (push-based file sync)
