hermes - ✅(Solved) Fix [Bug]: Background review agent and curator can overwrite bundled/hub skills via skill_manage [1 pull request, 3 comments, 3 participants]

Q: Expected behavior

`skill_manage` should refuse write operations (edit, patch, delete, write_file, remove_file) on bundled and hub-installed skills at the tool level, returning a clear error. The background review prompt should also include an explicit "DO NOT touch bundled or hub-installed skills" instruction as defense-in-depth.

DanielMaly · 2026-05-05T15:24:54Z


hermes · 2026-05-05 15:24:54


GitHub stats

NousResearch/hermes-agent#20273 • Fetched 2026-05-06 06:37:36


Timeline (top)

labeled ×4, commented ×3, cross-referenced ×2

Error Message

The background review agent can call skill_manage(action='patch', name='<bundled-skill>', ...) and no error is returned.

skill_manage should refuse write operations (edit, patch, delete, write_file, remove_file) on bundled and hub-installed skills at the tool level, returning a clear error. The background review prompt should also include an explicit "DO NOT touch bundled or hub-installed skills" instruction as defense-in-depth.

Fix Action

Fix / Workaround

skill_manage has no code-level write guard for bundled or hub-installed skills. The only protections are _pinned_guard() (blocks pinned skills) and _security_scan_skill() (blocks dangerous content, off by default). Any agent session with access to skill_manage can freely edit, patch, or delete bundled skills.

The background review prompt (_SKILL_REVIEW_PROMPT) has no instruction to avoid bundled or hub-installed skills. Combined with the lack of a code guard, the background review agent can and does patch bundled skills during normal conversations. The main agent is unaware of this, because the review runs in a background thread after the response is delivered.

To reproduce:

1. Have bundled skills installed in ~/.hermes/skills/
2. Have a conversation that triggers the background review (≥10 tool iterations, configurable via skills.creation_nudge_interval)
3. The background review agent can call skill_manage(action='patch', name='<bundled-skill>', ...) and no error is returned
4. The bundled skill is now modified in ~/.hermes/skills/
5. On next hermes update, the skills sync detects the hash divergence and prints ~ N user-modified (kept). The agent's modification is silently preserved, not overwritten. There is no path back to the upstream version short of manually deleting the skill directory.

PR fix notes

PR #20560: fix(security): guard bundled and hub skills from skill_manage writes

Repository: NousResearch/hermes-agent
Author: steezkelly
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/20560

Description (problem / solution / changelog)

Summary

Add a tool-level provenance guard that refuses skill_manage mutations for bundled and hub-installed skills.
Apply the guard to edit, patch, delete, write_file, and remove_file before any filesystem mutation.
Use the existing tools.skill_usage.is_agent_created() provenance path instead of inventing a new marker, so bundled .bundled_manifest and hub .hub/lock.json state remain the source of truth.
Add regression tests for bundled manifest protection and nested hub lock install paths.
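The guard described in this summary can be sketched roughly as follows. This is a minimal illustration, not the code from PR #20560: the function names are hypothetical, and the on-disk formats assumed for .bundled_manifest (one "name:hash" entry per line) and .hub/lock.json (an "installed" mapping with an "install_path" per skill) are guesses made only so the example can run.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional, Set

# The five mutating actions named in the PR summary.
WRITE_ACTIONS = {"edit", "patch", "delete", "write_file", "remove_file"}


def protected_skill_names(skills_root: Path) -> Set[str]:
    """Collect skill names recorded as bundled or hub-installed.

    ASSUMED formats (not confirmed by the source):
      .bundled_manifest -> one "name:hash" entry per line
      .hub/lock.json    -> {"installed": {name: {"install_path": "..."}}}
    """
    names: Set[str] = set()
    manifest = skills_root / ".bundled_manifest"
    if manifest.exists():
        for line in manifest.read_text().splitlines():
            if line.strip():
                names.add(line.split(":", 1)[0].strip())
    lock = skills_root / ".hub" / "lock.json"
    if lock.exists():
        installed = json.loads(lock.read_text()).get("installed", {})
        for name, entry in installed.items():
            names.add(name)
            # Nested install paths such as "community/foo" also match by leaf name.
            names.add(Path(entry.get("install_path", name)).name)
    return names


def provenance_guard(action: str, name: str, skills_root: Path) -> Optional[str]:
    """Return an error message if the write must be refused, else None."""
    if action in WRITE_ACTIONS and name in protected_skill_names(skills_root):
        return (
            f"Refusing '{action}' on '{name}': bundled and hub-installed skills "
            "are read-only. Copy the skill to a new local name to customize it."
        )
    return None


# Small demonstration against a throwaway skills directory.
demo_root = Path(tempfile.mkdtemp())
(demo_root / ".bundled_manifest").write_text("git-helper:abc123\n")
print(provenance_guard("patch", "git-helper", demo_root))  # refusal message
print(provenance_guard("patch", "my-own-skill", demo_root))  # None
```

The point of the sketch is ordering: the check runs before any filesystem mutation and returns an error to the caller, so the boundary holds even when the calling prompt has been subverted.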

Fixes #20273.

Why this approach

The current boundary is prompt-level only: autonomous review/curator prompts may say not to touch bundled or hub-installed skills, but skill_manage itself still accepts writes. That makes prompt injection against autonomous skill review a persistent code/content mutation path.

This PR enforces the boundary in the mutation tool itself. If legitimate customization is desired, the safe path is to copy/fork the skill into a new local skill name first.
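The copy/fork path mentioned above could look something like the sketch below. fork_skill is a hypothetical helper, not part of hermes. It assumes each skill lives in its own directory under the skills root and that provenance is keyed by skill name, so a copy under a new name is not covered by the bundled manifest or hub lock. The SKILL.md file name in the demonstration is also an assumption.

```python
import shutil
import tempfile
from pathlib import Path


def fork_skill(skills_root: Path, source_name: str, new_name: str) -> Path:
    """Copy a protected skill into a new local skill directory (hypothetical helper)."""
    src = skills_root / source_name
    dst = skills_root / new_name
    if not src.is_dir():
        raise FileNotFoundError(f"No such skill: {source_name}")
    if dst.exists():
        raise FileExistsError(f"Skill already exists: {new_name}")
    shutil.copytree(src, dst)  # the original directory is left untouched
    return dst


# Demonstration: customize the fork while the bundled copy stays pristine.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "git-helper").mkdir()
(demo_root / "git-helper" / "SKILL.md").write_text("upstream content\n")

fork = fork_skill(demo_root, "git-helper", "git-helper-local")
(fork / "SKILL.md").write_text("customized content\n")

print((demo_root / "git-helper" / "SKILL.md").read_text().strip())
print((fork / "SKILL.md").read_text().strip())
```

A real implementation would likely also need to update any name field stored inside the skill's own metadata; that detail is left out here because the source does not describe it.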

This also supersedes the open #19379 approach in one important way: this PR uses the existing provenance machinery (is_agent_created(), backed by bundled manifest and hub lock metadata) rather than a .hub-source marker.

Verification

Clean worktree created from current origin/main, then cherry-picked only this security fix.

scripts/run_tests.sh tests/tools/test_skill_manager_tool.py
88 passed

python -m py_compile tools/skill_manager_tool.py tests/tools/test_skill_manager_tool.py
passed

git diff --check origin/main...HEAD
clean

I also checked focused ruff on these files:

ruff check tools/skill_manager_tool.py tests/tools/test_skill_manager_tool.py

That command still reports the same pre-existing baseline lint issues on current origin/main for these files (unused imports / E702 semicolons in existing tests, E402 imports in the tool module). This PR does not introduce those baseline lint findings.

Changed files

tests/tools/test_skill_manager_tool.py (modified, +67/-0)
tools/skill_manager_tool.py (modified, +49/-3)


Describe the bug

The missing write guard in skill_manage affects two autonomous subsystems that run without user supervision:

1. Background review agent (most urgent)

_spawn_background_review() (run_agent.py:3465) runs after every conversation turn with ≥10 tool iterations. It forks an AIAgent with enabled_toolsets=["memory", "skills"] and a prompt (_SKILL_REVIEW_PROMPT at run_agent.py:3270) that says:

"Be ACTIVE — most sessions produce at least one skill update, even if small. A pass that does nothing is a missed learning opportunity."

2. Curator

The curator (agent/curator.py) runs on a 7-day schedule. Its prompt (CURATOR_REVIEW_PROMPT at line 261) does include the instruction "DO NOT touch bundled or hub-installed skills", but this is a prompt-level instruction, not an enforced boundary. A poisoned community skill could inject instructions that override it (prompt injection → persistent code modification via skill_manage).

To reproduce

1. Have bundled skills installed in ~/.hermes/skills/
2. Have a conversation that triggers the background review (≥10 tool iterations, configurable via skills.creation_nudge_interval)
3. The background review agent can call skill_manage(action='patch', name='<bundled-skill>', ...) and no error is returned
4. The bundled skill is now modified in ~/.hermes/skills/
5. On next hermes update, the skills sync detects the hash divergence and prints ~ N user-modified (kept). The agent's modification is silently preserved, not overwritten. There is no path back to the upstream version short of manually deleting the skill directory.

Why this is a problem

The issue is not that edits get overwritten — they don't. The skills sync (tools/skills_sync.py) uses hash-based detection: if the user copy differs from the origin hash, it's treated as "user-modified" and skipped. This means:

Silent corruption — The agent patches a bundled skill, the sync preserves the patch, and neither the user nor the sync reports anything wrong. The ~ N user-modified (kept) message sounds benign, like intentional customization.
No path back to upstream — Once the hash diverges, hermes update will never restore the original bundled content. The user's copy is permanently forked with whatever the agent wrote.
The sync can't distinguish agent accidents from user intent — Both look like hash divergence. The agent's unintended edits are indistinguishable from deliberate user customization.
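To make the three points above concrete, here is a rough sketch of hash-based sync detection. It is not the code in tools/skills_sync.py: dir_hash and sync_decision are hypothetical names, and the hashing scheme (SHA-256 over relative paths and file bytes) is an assumption. It shows only that a hash comparison records that a skill changed, not who changed it.

```python
import hashlib
import tempfile
from pathlib import Path


def dir_hash(skill_dir: Path) -> str:
    """Hash every file's relative path and bytes in a stable order."""
    digest = hashlib.sha256()
    for path in sorted(p for p in skill_dir.rglob("*") if p.is_file()):
        digest.update(path.relative_to(skill_dir).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()


def sync_decision(skill_dir: Path, origin_hash: str) -> str:
    """Pristine copies are updated; anything else is kept as 'user-modified'."""
    if dir_hash(skill_dir) == origin_hash:
        return "update"
    return "user-modified (kept)"


# Demonstration: an agent patch and a deliberate user edit look identical.
skill = Path(tempfile.mkdtemp()) / "git-helper"
skill.mkdir()
(skill / "SKILL.md").write_text("upstream content\n")
origin = dir_hash(skill)  # hash recorded when the bundled skill was installed

print(sync_decision(skill, origin))  # pristine copy
(skill / "SKILL.md").write_text("content rewritten by the review agent\n")
print(sync_decision(skill, origin))  # diverged copy, author unknown
```

Because the decision depends only on the hash, blocking the write at the tool level is the only place in this flow where an unintended agent edit can be told apart from a user's customization.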

Expected behavior

skill_manage should refuse write operations (edit, patch, delete, write_file, remove_file) on bundled and hub-installed skills at the tool level, returning a clear error. The background review prompt should also include an explicit "DO NOT touch bundled or hub-installed skills" instruction as defense-in-depth.

Related issues and PRs

#19379 — PR that adds _bundled_hub_guard() mirroring _pinned_guard(), which would fix both paths. Currently has failing tests and merge conflicts.
#1780 — Proposes a diff/merge layer to preserve agent edits to bundled skills across updates. This is related but takes the opposite view: it assumes the agent editing bundled skills is intentional and worth preserving. If this bug is fixed (writes blocked), #1780 becomes moot — there would be no agent edits to preserve. If a future "fork and customize" workflow is desired, it should be explicit opt-in, not an accidental side-effect of an unguarded tool.

Additional context

SKILLS_GUIDANCE in agent/prompt_builder.py:176 (injected into every session where skill_manage is available) actively encourages the behavior: "When using a skill and finding it outdated, incomplete, or wrong, patch it immediately — don't wait to be asked." No caveat about bundled vs. agent-created skills.
The curator.auxiliary.model / curator.auxiliary.provider config options defined in hermes_cli/config.py:952 are not wired up — the curator always uses the main model config regardless of these settings.

extent analysis

TL;DR

Implement a code-level write guard in skill_manage to prevent modifications to bundled and hub-installed skills.

Guidance

Add a check in skill_manage that returns an error on any attempt to edit, patch, delete, write_file, or remove_file against a bundled or hub-installed skill.
Update the background review prompt to include an explicit instruction to avoid modifying bundled or hub-installed skills.
Consider PR #20560, which adds a tool-level provenance guard built on the existing is_agent_created() check. The earlier PR #19379 adds a similar _bundled_hub_guard() function but currently has failing tests and merge conflicts.
Review and update SKILLS_GUIDANCE in agent/prompt_builder.py to include a caveat about not modifying bundled skills.

Example

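WRITE_ACTIONS = {"edit", "patch", "delete", "write_file", "remove_file"}

def skill_manage(action, name, **kwargs):
    # is_bundled_or_hub_installed() is a hypothetical helper, not a confirmed hermes API
    if action in WRITE_ACTIONS and is_bundled_or_hub_installed(name):
        return "Error: Cannot modify bundled or hub-installed skills"
    ...  # existing code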

Notes

The current implementation of skill_manage lacks a code-level write guard, allowing the background review agent and curator to modify bundled skills. This can lead to silent corruption and permanent forking of skills. The proposed fix involves adding a check to prevent modifications to bundled and hub-installed skills.

Recommendation

Apply the workaround by adding a code-level write guard to skill_manage to prevent modifications to bundled and hub-installed skills, as this will address the immediate issue and prevent further corruption.
