hermes - ✅(Solved) Fix Curator umbrella-skill consolidation can leave cron jobs with stale skill references [1 pull request, 3 comments, 3 participants]

GitHub stats
NousResearch/hermes-agent#18671 (fetched 2026-05-03 04:55:02)
Comments: 3
Participants: 3
Timeline events: 10
Reactions: 0
Timeline (top): labeled ×4, commented ×3, closed ×1, cross-referenced ×1

Root Cause

Cron jobs are unattended by design. A missing skill warning can be easy to miss, and a job can still complete successfully while running with weaker guidance than intended.

This is especially risky for jobs that perform publishing, posting, reporting, or other external side effects where the attached skills encode guardrails and QA rules.

Fix Action

Fixed

PR fix notes

PR #18731: fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671)

Description (problem / solution / changelog)

Summary

Two fixes for the curator + cron-link silent-failure class. Closes #18671.

  1. absorbed_into on skill delete — curator reconciler stops guessing what "archived" means.
  2. Cron skill links are backed up with the snapshot and restored on rollback — rolling back a curator run actually returns cron jobs to their pre-run state.

1. absorbed_into on skill delete

Root cause

_reconcile_classification in agent/curator.py inferred consolidation vs pruning from two brittle signals: the curator's post-hoc YAML summary block, and a substring heuristic scanning sibling tool calls for the removed skill's name. Both miss in real consolidations — models forget the YAML under reasoning pressure, and the heuristic misses when the umbrella's patch content describes the absorbed behavior abstractly instead of literally naming the old slug. When both miss, the skill fell through to "no-evidence fallback" pruned, and #18253's cron-rewriter then dropped the cron ref entirely instead of mapping it to the umbrella. Same observable symptom as pre-#18253: Skill(s) not found and skipped on the next cron run.
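
The heuristic failure described above can be illustrated with a minimal sketch. This is not the code in agent/curator.py: the function name, the shape of the sibling tool calls, and the sample patch text are all assumptions made for illustration. It only shows why a literal substring scan misses when the umbrella's patch describes the absorbed behavior without naming the old slug.

```python
# Hypothetical stand-in for the legacy substring heuristic (not the real implementation).
def heuristic_absorbed_into(removed_slug, sibling_calls):
    """Return the skill whose patch text literally mentions removed_slug, else None."""
    for call in sibling_calls:
        if removed_slug in call.get("content", ""):
            return call["skill"]
    return None


# The umbrella patch names the old slug literally: the heuristic hits.
calls_literal = [
    {"skill": "umbrella-skill", "content": "Merged from old-skill-a: check formatting."}
]
# The umbrella patch describes the behavior abstractly: the heuristic misses.
calls_abstract = [
    {"skill": "umbrella-skill", "content": "Adds the review formatting checklist."}
]

hit = heuristic_absorbed_into("old-skill-a", calls_literal)
miss = heuristic_absorbed_into("old-skill-a", calls_abstract)
```

In the second case nothing links old-skill-a to the umbrella, which is the situation where the skill was classified as pruned and its cron reference dropped.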

Changes

  • tools/skill_manager_tool.py: skill_manage(action='delete') accepts absorbed_into:
    • absorbed_into='<umbrella>' → consolidated; target must exist on disk (validated)
    • absorbed_into='' → explicit prune, no forwarding target
    • missing → legacy path, reconciler falls through to heuristic/YAML (backward compat)
    • rejects absorbed_into=<self> and nonexistent targets
  • agent/curator.py — new _extract_absorbed_into_declarations() pulls declarations off llm_meta.tool_calls. _reconcile_classification accepts absorbed_declarations= and treats it as authoritative — beats YAML block and heuristic. Curator prompt updated to require the arg on every delete.
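
The contract and precedence listed above might look roughly like the following sketch. It is an illustration under assumptions, not the PR's code: validate_absorbed_into and classify are hypothetical names, a set of skill names stands in for the on-disk existence check, the source labels other than "model" are invented, and the relative order of YAML versus heuristic on the legacy path is assumed.

```python
from typing import Dict, Optional, Set, Tuple


def validate_absorbed_into(name: str, absorbed_into: Optional[str],
                           existing_skills: Set[str]) -> Optional[str]:
    """Return an error message, or None when the argument is acceptable."""
    if absorbed_into is None or absorbed_into == "":
        return None  # missing = legacy path, "" = explicit prune
    if absorbed_into == name:
        return "absorbed_into cannot name the skill being deleted"
    if absorbed_into not in existing_skills:  # stand-in for the on-disk check
        return f"absorbed_into target '{absorbed_into}' does not exist"
    return None


def classify(name: str, declarations: Dict[str, str], yaml_map: Dict[str, str],
             heuristic_map: Dict[str, str]) -> Tuple[str, Optional[str], str]:
    """Return (classification, forwarding target, evidence source)."""
    if name in declarations:  # authoritative: beats YAML block and heuristic
        target = declarations[name]
        if target:
            return ("consolidated", target, "declared")
        return ("pruned", None, "declared")
    if name in yaml_map:  # legacy path: the model's YAML summary block
        return ("consolidated", yaml_map[name], "model")
    if name in heuristic_map:  # legacy path: substring heuristic
        return ("consolidated", heuristic_map[name], "heuristic")
    return ("pruned", None, "fallback")  # no-evidence fallback
```

With a declaration present, the YAML block and the heuristic are never consulted, so a forgotten YAML summary no longer changes the outcome.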

2. Cron skill links through snapshot + rollback

Root cause

snapshot_skills() captured the skills tree and .curator_backups/…/skills.tar.gz held it safely, but ~/.hermes/cron/jobs.json was never captured. After a rollback, skills bounced back to disk but cron jobs still pointed at whatever umbrellas the curator had rewritten them to. User experience: "I rolled back but my cron jobs still use the merged skills."
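
Closing this gap amounts to copying the cron file into the snapshot and recording what was captured. The sketch below is a guess at the shape of that step, not the code in agent/curator_backup.py: backup_cron_jobs is a hypothetical helper, jobs.json is assumed to hold a top-level list of job objects, and the reason text is illustrative. Only the cron-jobs.json file name and the manifest keys (backed_up, jobs_count, reason, parse_warning) come from the PR notes.

```python
import json
import shutil
from pathlib import Path


def backup_cron_jobs(jobs_file: Path, snapshot_dir: Path) -> dict:
    """Copy jobs.json next to the skills tarball and return a manifest block."""
    if not jobs_file.exists():
        return {"backed_up": False, "jobs_count": 0, "reason": "jobs.json not found"}
    # Keep a byte-for-byte copy even if the file turns out not to parse.
    shutil.copy2(jobs_file, snapshot_dir / "cron-jobs.json")
    block = {"backed_up": True, "jobs_count": 0}
    try:
        # Assumption: the file is a JSON list with one object per job.
        block["jobs_count"] = len(json.loads(jobs_file.read_text(encoding="utf-8")))
    except (json.JSONDecodeError, TypeError) as exc:
        block["parse_warning"] = str(exc)
    return block
```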

Changes

  • agent/curator_backup.py: snapshot_skills() additionally copies cron/jobs.json as cron-jobs.json alongside the tarball. Manifest gains a cron_jobs block (backed_up, jobs_count, optional reason/parse_warning).
  • agent/curator_backup.py — new _restore_cron_skill_links(snapshot_dir) reconciles backed-up skills into the live jobs.json surgically:
    • only skills/skill fields touched; schedule/prompt/timestamps/enabled/etc. are live state and preserved
    • matched by job id; jobs the user deleted after the snapshot are NOT resurrected; jobs the user created after are untouched
    • writes through cron.jobs.save_jobs() under the same _jobs_file_lock the scheduler uses — no race with tick()
    • failures here don't fail the overall rollback (skills tree is the core guarantee)
  • rollback() calls _restore_cron_skill_links after the skills extract succeeds; the returned message summarizes the reconciliation ("cron links: N job(s) had skill links restored, M backed-up job(s) no longer exist").
  • hermes_cli/curator.py — rollback confirm dialog shows cron-backup status from the manifest so the user knows what's about to happen.
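
The reconciliation rules above can be sketched as a pure function over two job lists. This is an illustration, not the body of _restore_cron_skill_links: restore_skill_links is a hypothetical name, jobs are assumed to be dicts with an id key, and locking and the save through cron.jobs.save_jobs() are left out. Removing a skill field that the backup did not have is also an assumption about the intended behavior.

```python
import copy

SKILL_FIELDS = ("skills", "skill")  # the only fields a rollback may touch


def restore_skill_links(live_jobs, backed_up_jobs):
    """Return (new_live_jobs, restored_count, missing_count)."""
    backup_by_id = {job["id"]: job for job in backed_up_jobs}
    live_ids = {job["id"] for job in live_jobs}
    result, restored = [], 0
    for job in live_jobs:
        job = copy.deepcopy(job)  # never mutate the caller's data
        old = backup_by_id.get(job["id"])
        if old is not None:  # jobs created after the snapshot are left untouched
            changed = False
            for field in SKILL_FIELDS:
                if field in old:
                    if job.get(field) != old[field]:
                        job[field] = copy.deepcopy(old[field])
                        changed = True
                elif field in job:  # assumption: field absent in backup is removed
                    del job[field]
                    changed = True
            if changed:
                restored += 1
        result.append(job)
    # Backed-up jobs the user deleted since the snapshot are counted, not resurrected.
    missing = sum(1 for job_id in backup_by_id if job_id not in live_ids)
    return result, restored, missing


live = [
    {"id": "j1", "prompt": "edited after snapshot", "skills": ["hermes-agent-dev"]},
    {"id": "j3", "prompt": "created after snapshot", "skills": ["other-skill"]},
]
backup = [
    {"id": "j1", "prompt": "original prompt",
     "skills": ["pr-review-format", "pr-review-checklist", "pr-triage-salvage"]},
    {"id": "j2", "prompt": "deleted after snapshot", "skills": ["pr-review-format"]},
]
result, restored, missing = restore_skill_links(live, backup)
```

Here j1 gets its three original skill names back while keeping its edited prompt, j3 is untouched, and j2 is only counted as a backed-up job that no longer exists.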

Validation

Scenario | Before | After
Model consolidates, emits YAML, heuristic hits | |
Model consolidates, forgets YAML, heuristic misses | ✗ fell through to prune, cron ref dropped | absorbed_into declared → cron rewritten
Model truly prunes | inferred | explicit absorbed_into=""
Rollback restores skills tree | |
Rollback restores cron skill links | ✗ jobs still point at umbrellas | ✓ surgical restore; non-skill fields preserved
Rollback with pre-feature snapshot (no cron-jobs.json) | n/a | ✓ skills tree still restored; cron untouched; no error
User deleted a job after snapshot | n/a | ✓ not resurrected
User added a job after snapshot | n/a | ✓ untouched

Test counts

  • Targeted: 484/484 pass (tests/agent/test_curator*.py + tests/cron/ + tests/tools/test_skill_manager_tool.py).
  • 18 new tests for absorbed_into (tool contract + extractor + reconciler + mixed legacy-and-declared runs).
  • 9 new tests for cron-link backup/rollback (snapshot shapes, rollback reconciliation rules, pre-feature-snapshot compatibility, standalone unit on _restore_cron_skill_links).

E2E

  • #18671 classification repro: umbrella + 3 narrow skills, cron job referencing all 3. Model emits no YAML, heuristic misses. Delete calls carry absorbed_into. Result: PR skills correctly classified consolidated, cron rewritten to ['hermes-agent-dev'], stale-junk pruned via absorbed_into="".
  • Backward-compat classification: delete without absorbed_into, model emits YAML → routed via existing "model" source, cron still rewritten correctly. Legacy path untouched.
  • Full snapshot → curator rewrite → rollback: cron job skills=[pr-review-format, pr-review-checklist, pr-triage-salvage]. Snapshot captures cron. Curator rewrites skills to [hermes-agent-dev]. Rollback restores both the skills tree AND the cron skills list to the original three names. Non-skill cron fields (id, name, prompt) preserved across the round trip.

Changed files

  • agent/curator.py (modified, +108/-1)
  • agent/curator_backup.py (modified, +257/-4)
  • hermes_cli/curator.py (modified, +13/-1)
  • tests/agent/test_curator_backup.py (modified, +278/-0)
  • tests/agent/test_curator_classification.py (modified, +263/-0)
  • tests/tools/test_skill_manager_tool.py (modified, +70/-0)
  • tools/skill_manager_tool.py (modified, +60/-5)

Code Example

Skill(s) not found and skipped: <old-skill-a>, <old-skill-b>, <old-skill-c>

---

skills: [old-skill-a, old-skill-b, old-skill-c]

---

old-skill-a -> umbrella-skill
old-skill-b -> umbrella-skill
old-skill-c -> umbrella-skill

---

Skill(s) not found and skipped: <old-skill-a>, <old-skill-b>, <old-skill-c>

Bug Description

When the skill curator/consolidation workflow replaces a set of narrow skills with umbrella skills, scheduled cron jobs that reference the old skill names can keep stale skills entries.

On the next cron execution, Hermes reports warnings such as:

Skill(s) not found and skipped: <old-skill-a>, <old-skill-b>, <old-skill-c>

The job may continue, but it starts without the intended procedural context unless the cron definition is manually edited.

This looks related to the scenario addressed by #18253, but the behavior is still observable after umbrella-skill consolidation in a live cron setup.

Steps to Reproduce

  1. Create or have an existing cron job with attached skills:
    skills: [old-skill-a, old-skill-b, old-skill-c]
  2. Run skill curation/consolidation so these skills are archived/absorbed into a new umbrella skill, for example:
    old-skill-a -> umbrella-skill
    old-skill-b -> umbrella-skill
    old-skill-c -> umbrella-skill
  3. List the active skills: the old names are no longer resolvable as top-level skills.
  4. Let the cron run, or trigger it manually.

Expected Behavior

After consolidation, cron job skill references should be migrated automatically or flagged with an actionable migration report.

Possible acceptable behaviors:

  • rewrite cron skills references from removed skills to the correct umbrella skill(s);
  • preserve aliases for archived skill names so existing crons keep loading the umbrella context;
  • block/require approval before archiving a skill that is referenced by an enabled cron;
  • emit a curator report listing every cron that will need a skill-reference migration before any mutation happens.
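
The first and last options can be combined: rewrite the references and return a report of every job that changed. The sketch below is one possible shape for that, assuming each job is a dict with an id and a skills list. rewrite_cron_skills and the report layout are hypothetical and are not the #18253 cron-rewriter.

```python
def rewrite_cron_skills(jobs, mapping):
    """Rewrite removed skill names in place and return a migration report.

    mapping: removed skill name -> umbrella name, or "" when the skill was
    pruned with no forwarding target.
    """
    report = []
    for job in jobs:
        new_skills = []
        for skill in job.get("skills", []):
            target = mapping.get(skill, skill)  # unmapped skills stay as they are
            if target and target not in new_skills:  # drop pruned, dedupe umbrellas
                new_skills.append(target)
        if new_skills != job.get("skills", []):
            report.append({"id": job["id"], "before": job["skills"], "after": new_skills})
            job["skills"] = new_skills
    return report


jobs = [{"id": "j1", "skills": ["old-skill-a", "old-skill-b", "old-skill-c", "keep-me"]}]
mapping = {
    "old-skill-a": "umbrella-skill",
    "old-skill-b": "umbrella-skill",
    "old-skill-c": "umbrella-skill",
}
report = rewrite_cron_skills(jobs, mapping)
```

Running the same loop without the final assignment would give the dry-run report from the last option, listing affected crons before any mutation happens.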

Actual Behavior

The cron keeps stale skill names and logs/skips them at runtime:

Skill(s) not found and skipped: <old-skill-a>, <old-skill-b>, <old-skill-c>

This creates a silent quality/regression risk for scheduled jobs: the cron may still finish with ok, but it did not receive the procedural context it was configured to use.

Why This Matters

Cron jobs are unattended by design. A missing skill warning can be easy to miss, and a job can still complete successfully while running with weaker guidance than intended.

This is especially risky for jobs that perform publishing, posting, reporting, or other external side effects where the attached skills encode guardrails and QA rules.

Related

  • #18373 — broader curator safety / user-skill consolidation issue
  • #18253 — appears intended to handle cron skill ref rewrites after consolidation

Environment

Observed on a normal Hermes Agent install with recurring cron jobs and skill curator/umbrella-skill consolidation. No install-specific paths, private skill names, or private cron content are included here.

extent analysis

TL;DR

Update cron job definitions to reference the new umbrella skill after skill consolidation to prevent stale skill entries.

Guidance

  • Review and update cron job definitions to use the new umbrella skill name instead of the archived skill names.
  • Consider implementing a migration report or approval process to flag cron jobs that need skill-reference updates before skill consolidation.
  • Verify that the updated cron job runs successfully with the intended procedural context by checking the job logs for any warnings or errors.
  • Test the cron job with different scenarios to ensure it behaves as expected after the skill consolidation.
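
The review step can be made mechanical with a small audit that lists enabled jobs whose attached skills no longer resolve. This is a sketch under assumptions: find_stale_skill_refs is a hypothetical helper, jobs are assumed to be dicts with id, skills, and an optional enabled flag, and the set of active skill names has to be supplied by the caller.

```python
def find_stale_skill_refs(jobs, active_skills):
    """Map job id -> list of attached skill names that are not active."""
    stale = {}
    for job in jobs:
        if not job.get("enabled", True):  # disabled jobs will not run, skip them
            continue
        missing = [s for s in job.get("skills", []) if s not in active_skills]
        if missing:
            stale[job["id"]] = missing
    return stale


jobs_to_audit = [
    {"id": "daily-report", "skills": ["old-skill-a", "umbrella-skill"]},
    {"id": "weekly-post", "skills": ["umbrella-skill"]},
    {"id": "paused-job", "enabled": False, "skills": ["old-skill-b"]},
]
stale_refs = find_stale_skill_refs(jobs_to_audit, {"umbrella-skill"})
```

An empty result after consolidation means no enabled cron should hit the "Skill(s) not found and skipped" warning for a removed skill.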

Example

No code snippet accompanies this analysis. The code-level fix is in PR #18731 (see the PR fix notes above); the guidance here covers the manual process for affected cron definitions.

Notes

The current behavior may be related to the scenario addressed by #18253, but the issue persists after umbrella-skill consolidation. The proposed solution focuses on updating cron job definitions and implementing a migration report or approval process.

Recommendation

Apply workaround where the fix from PR #18731 is not yet available: update cron job definitions to reference the new umbrella skill after consolidation. This is the most immediate way to remove stale skill entries and ensure cron jobs run with the intended procedural context.
