[Bug]: hermes skills reset --restore corrupts manifest when rmtree fails on read-only Nix-store dirs

Bug Description

hermes skills reset <name> --restore --yes (and the underlying tools.skills_sync.reset_bundled_skill(restore=True)) can leave a bundled skill in a user-modified-UNTRACKED-like state when the on-disk skill directory contains read-only files and directories inherited from an immutable package source (the Nix store).

The current order of operations in reset_bundled_skill is:

  1. del manifest[name] then _write_manifest(manifest) — manifest is rewritten without the entry.
  2. shutil.rmtree(dest) — the local skill copy is supposed to be removed.
  3. sync_skills(quiet=True) — re-baselines the skill from bundled.

If step 2 fails (e.g. because shutil.copytree from a Nix-store source preserved read-only directory modes), the function returns an error early and never reaches the re-sync, but the manifest entry is already gone. The local skill is now manifest-less: sync_skills sees "skill_name not in manifest && dest exists && dir_hash(dest) != bundled_hash" and silently skips it on every run, so the skill stays stuck with no manifest entry to baseline it against.
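The failure mode can be modeled in a few lines. This is a toy sketch, not the Hermes code: is_skipped_by_sync and buggy_reset are hypothetical helpers, the manifest is a plain dict, and the hashes are placeholder strings. It only mirrors the ordering and the skip condition described above.

```python
def is_skipped_by_sync(name, manifest, dest_exists, dest_hash, bundled_hash):
    """Skip condition as described in this report (hypothetical helper)."""
    return name not in manifest and dest_exists and dest_hash != bundled_hash


def buggy_reset(name, manifest, delete_user_copy):
    """Current order: mutate the manifest first, then try to delete.

    delete_user_copy is a callable standing in for shutil.rmtree(dest).
    """
    manifest.pop(name, None)  # step 1: entry is gone from here on
    try:
        delete_user_copy()    # step 2: may raise on read-only trees
    except OSError as exc:
        # Early return: the re-sync never runs, the entry is not restored.
        return {"ok": False, "action": "manifest_cleared", "message": str(exc)}
    return {"ok": True, "action": "reset"}  # "reset" is a placeholder value


def failing_delete():
    raise PermissionError(13, "Permission denied")


if __name__ == "__main__":
    manifest = {"audiocraft-audio-generation": "hash-at-last-sync"}
    result = buggy_reset("audiocraft-audio-generation", manifest, failing_delete)
    print(result["ok"], manifest)
    print(is_skipped_by_sync("audiocraft-audio-generation", manifest,
                             True, "stale-hash", "bundled-hash"))
```

After the failed delete, the entry is absent and the skip condition holds, which is the stuck state this report describes.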

This is related to but distinct from #34860 (stale .bak after sync) and #29856 (misleading success message without --restore).

Steps to Reproduce

  1. Run Hermes from a Nix-store install where bundled skills are copied to ~/.hermes/skills/ with read-only permissions (r-xr-xr-x).
  2. Pick a bundled skill whose directory basename differs from its frontmatter name (e.g. audiocraft-audio-generation lives in folder mlops/models/audiocraft, serving-llms-vllm in mlops/inference/vllm, evaluating-llms-harness in lm-evaluation-harness, ideation in creative-ideation, segment-anything-model in segment-anything).
  3. Run: hermes skills reset <frontmatter-name> --restore --yes.
  4. Observe: command exits with rc=0 but stderr contains:
    Cleared manifest entry for 'audiocraft-audio-generation' but could not delete user copy at <path>: [Errno 13] Permission denied
  5. Inspect the manifest: the entry for that skill is now gone.
  6. Restart Hermes (or run sync_skills() manually). The skill is no longer tracked. audit_bundled_skill_drift.py reports it as user-modified-UNTRACKED. Subsequent sync_skills runs do not re-baseline it because of the not_in_manifest && dir_hash(dest) != bundled_hash skip path.

Reproduced locally on macOS / Hermes 0.15.1 (Nix install) when re-syncing 67 stale-bundled skills after upgrade from 0.11.x. The five skills with frontmatter-name ≠ folder-name all hit this path.
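The permission problem in step 1 can be checked outside Hermes. The sketch below is a standalone POSIX illustration, assuming nothing about Hermes internals: it builds a throwaway tree with Nix-store-like modes, copies it with shutil.copytree (which copies permission bits), and then attempts shutil.rmtree. All paths and helper names are made up for the demo. When run as root the delete is expected to succeed, so the script reports the outcome instead of assuming it.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def make_readonly_source(root: Path) -> Path:
    """Build a small tree with read-only modes (dirs 0o555, files 0o444)."""
    src = root / "store" / "demo-skill"
    (src / "references").mkdir(parents=True)
    (src / "SKILL.md").write_text("name: demo\n")
    (src / "references" / "notes.md").write_text("notes\n")
    # Bottom-up so each directory is locked after its contents.
    for dirpath, _dirnames, filenames in os.walk(src, topdown=False):
        for fname in filenames:
            os.chmod(os.path.join(dirpath, fname), 0o444)
        os.chmod(dirpath, 0o555)
    return src


def copy_and_inspect(root: Path):
    """Copy the read-only tree, report the copied modes and the rmtree outcome."""
    src = make_readonly_source(root)
    dest = root / "home" / "demo-skill"
    shutil.copytree(src, dest)
    dir_mode = stat.S_IMODE(dest.stat().st_mode)
    file_mode = stat.S_IMODE((dest / "SKILL.md").stat().st_mode)
    try:
        shutil.rmtree(dest)
        rmtree_error = None
    except OSError as exc:
        rmtree_error = exc
    return dir_mode, file_mode, rmtree_error


def restore_write_bits(root: Path) -> None:
    """Give the owner write access to every directory so cleanup can proceed."""
    for dirpath, _dirnames, _filenames in os.walk(root):
        mode = stat.S_IMODE(os.stat(dirpath).st_mode)
        os.chmod(dirpath, mode | stat.S_IWUSR)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dir_mode, file_mode, err = copy_and_inspect(Path(tmp))
        print(oct(dir_mode), oct(file_mode), repr(err))
        restore_write_bits(Path(tmp))
```

Deleting a file needs write permission on its parent directory, not on the file itself, which is why the read-only directories are what block rmtree here.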

Expected Behavior

Either of:

  • (a) reset_bundled_skill(restore=True) should chmod -R u+w dest before shutil.rmtree, mirroring what user-space wrappers like hermes_skill_sync.py already do (subprocess.check_call(["chmod", "-R", "u+w", str(target)])).
  • (b) The manifest entry should only be deleted after shutil.rmtree(dest) succeeds. Current order is non-transactional: failure mid-flight corrupts the manifest.

Ideally both: chmod first, then delete, then (only after both succeed) drop the manifest entry, then re-sync.
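A self-contained sketch of that combined ordering follows. It is an illustration under stated assumptions, not the Hermes implementation: reset_with_restore, make_tree_writable, the write_manifest callback and the "reset" action value are hypothetical, and the final re-sync is left to the caller. Unlike a fixed 0o755/0o644, the helper only adds the owner write bit, which matches chmod -R u+w and leaves other mode bits untouched.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def make_tree_writable(dest: Path) -> None:
    """Add the owner write bit to every directory and regular file under dest."""
    for dirpath, _dirnames, filenames in os.walk(dest):
        paths = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in paths:
            if os.path.islink(path):
                continue  # do not chmod through symlinks
            mode = stat.S_IMODE(os.stat(path).st_mode)
            os.chmod(path, mode | stat.S_IWUSR)


def reset_with_restore(name, dest: Path, manifest: dict, write_manifest):
    """Hypothetical ordering: chmod, delete, and only then touch the manifest."""
    if dest.exists():
        try:
            make_tree_writable(dest)
            shutil.rmtree(dest)
        except OSError as exc:
            # Manifest is untouched, so a later retry starts from a clean state.
            return {
                "ok": False,
                "action": "rmtree_failed",
                "message": f"Could not delete user copy at {dest}: {exc}",
            }
    if name in manifest:
        del manifest[name]
        write_manifest(manifest)
    return {"ok": True, "action": "reset"}  # placeholder action name


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dest = Path(tmp) / "demo-skill"
        dest.mkdir()
        (dest / "SKILL.md").write_text("name: demo\n")
        os.chmod(dest / "SKILL.md", 0o444)
        os.chmod(dest, 0o555)
        manifest = {"demo": "hash-at-last-sync"}
        print(reset_with_restore("demo", dest, manifest, lambda m: None), manifest)
```

In this sketch a chmod failure is treated the same as a delete failure: the function returns before the manifest is modified, so the skill remains tracked.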

Actual Behavior

# tools/skills_sync.py — reset_bundled_skill, current sequence:

# Step 1: drop the manifest entry so next sync treats it as new
if in_manifest:
    del manifest[name]
    _write_manifest(manifest)        # ← manifest already mutated

# Step 2 (optional): delete the user's copy so next sync re-copies bundled
deleted_user_copy = False
if restore:
    ...
    if dest.exists():
        try:
            shutil.rmtree(dest)      # ← can fail on read-only Nix-copytree dirs
            deleted_user_copy = True
        except (OSError, IOError) as e:
            return {                 # ← bails out with manifest already corrupted
                "ok": False,
                "action": "manifest_cleared",
                "message": (
                    f"Cleared manifest entry for '{name}' but could not "
                    f"delete user copy at {dest}: {e}"
                ),
                ...
            }

The error message is honest about the symptom, but it does not make clear what the dropped manifest entry means in practice: the skill is now in a state that no further sync_skills run will fix without manual intervention.

Affected Component

Skills (skill loading, skill hub, skill guard)

Messaging Platform

N/A (CLI only)

Suggested Fix

if restore:
    if not is_bundled:
        return {... "bundled_missing" ...}
    dest = _compute_relative_dest(bundled_by_name[name], bundled_dir)
    if dest.exists():
        # Make the tree writable first — Nix-store originals copied via
        # shutil.copytree preserve `r-xr-xr-x` modes which break rmtree.
        for root, dirs, files in os.walk(dest):
            try:
                os.chmod(root, 0o755)
            except OSError:
                pass
            for f in files:
                try:
                    os.chmod(os.path.join(root, f), 0o644)
                except OSError:
                    pass
        try:
            shutil.rmtree(dest)
            deleted_user_copy = True
        except (OSError, IOError) as e:
            # Don't touch the manifest if rmtree failed.
            return {
                "ok": False,
                "action": "rmtree_failed",
                "message": f"Could not delete user copy at {dest}: {e}",
                ...
            }

# Only NOW drop the manifest entry, after we know the rmtree succeeded.
if in_manifest:
    del manifest[name]
    _write_manifest(manifest)

Workaround

Pre-chmod before invoking reset:

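skill=audiocraft-audio-generation   # frontmatter name of the skill to reset
DEST=$(grep -rl "^name: $skill\$" ~/.hermes/skills --include=SKILL.md | grep -v '\.bak/\|\.restore-backups' | head -1)
[ -n "$DEST" ] || { echo "No SKILL.md found for $skill" >&2; exit 1; }   # avoid chmod on "." when nothing matches
chmod -R u+w "$(dirname "$DEST")"
hermes skills reset "$skill" --restore --yes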

This is what ~/.hermes/skills/devops/bundled-skill-patches/scripts/hermes_skill_sync.py does internally before each reset.

Debug Report

Hermes Agent v0.15.1 (2026.5.29), Nix install on macOS (APFS).
