hermes - 💡(How to fix) Fix RFC: OS-style skill scheduling (priority queues, resource budgets, deadlock detection, layered loading)

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

I'd like to gauge interest in upstreaming an OS-style scheduling layer for skills — treating skills as bounded "processes" governed by a small kernel rather than capability-discovery functions auto-loaded on trigger match.

This is orthogonal to "skills self-improve during use" (which is an ML angle) — it's a resource-governance angle that becomes relevant in multi-profile / heavy-tool deployments where the current ad-hoc model starts to break down.

I've been running a partial version of this in production for ~3 weeks. Not all of it is in code yet — some pieces are enforced as conventions in CLAUDE.md-style docs and would benefit from kernel-level enforcement.

Root Cause

I'd like to gauge interest in upstreaming an OS-style scheduling layer for skills — treating skills as bounded "processes" governed by a small kernel rather than capability-discovery functions auto-loaded on trigger match.

This is orthogonal to "skills self-improve during use" (which is an ML angle) — it's a resource-governance angle that becomes relevant in multi-profile / heavy-tool deployments where the current ad-hoc model starts to break down.

I've been running a partial version of this in production for ~3 weeks. Not all of it is in code yet — some pieces are enforced as conventions in CLAUDE.md-style docs and would benefit from kernel-level enforcement.

Code Example

# skills/my-skill/SKILL.md frontmatter
name: my-skill
priority: 50              # 0-100; default 50; higher = preempts lower
tier: domain              # meta | platform | engineering | domain | creative
budget:
  max_tokens: 8000
  max_tool_calls: 10
  max_wall_seconds: 60
writes_to:                # for mutex / queue (see RFC #31385... err, separate proposal)
  - .hermes/state/foo.json
depends_on:               # for deadlock detection
  - bash
  - file-read

---

meta        → always loaded (skill discovery, kernel rules)
platform    → loaded when on the corresponding transport (Feishu/Discord/etc.)
engineering → loaded when toolset includes Bash/Read/Edit/Grep
domain      → loaded on explicit slash command or strong context match
creative    → loaded only on explicit invocation

---

when a skill is triggered:
  if priority(new) > priority(running) and not running.holding_lock:
    preempt running; resume after new completes
  else:
    enqueue at priority

while running:
  check budget (tokens, tool calls, wall time)
  on budget exhausted:
    fail-soft with partial result + budget_exceeded marker

on skill chain:
  maintain call graph; if cycle detected, refuse with explanation

---

{"t":"2026-05-24T09:00:00Z","skill":"git-workflow","priority":50,"tier":"engineering","tokens":1240,"tool_calls":3,"wall_ms":4200,"outcome":"ok"}
{"t":"...","skill":"diagram","priority":30,"tier":"creative","tokens":5800,"tool_calls":0,"wall_ms":12000,"outcome":"budget_exceeded","budget_field":"tokens"}
RAW_BUFFERClick to expand / collapse

Summary

I'd like to gauge interest in upstreaming an OS-style scheduling layer for skills — treating skills as bounded "processes" governed by a small kernel rather than capability-discovery functions auto-loaded on trigger match.

This is orthogonal to "skills self-improve during use" (which is an ML angle) — it's a resource-governance angle that becomes relevant in multi-profile / heavy-tool deployments where the current ad-hoc model starts to break down.

I've been running a partial version of this in production for ~3 weeks. Not all of it is in code yet — some pieces are enforced as conventions in CLAUDE.md-style docs and would benefit from kernel-level enforcement.

Why

In a small / single-profile deployment, the existing skills model works great: skill matches trigger → loads → runs → unloads. In heavier deployments I've seen four classes of pain that look like classic OS problems:

  1. Stable prefix bloat — too many skills auto-load and the system prompt grows to where prompt-cache effectiveness drops sharply. There's no built-in tiering of "load this skill only when X is true."
  2. Skill A → Skill B → Skill A trigger cycles — no detection; the agent just loops until a context window fills.
  3. No preemption / priority — a "creative" skill triggered mid-turn can starve a "critical infrastructure" skill of attention; there's no priority signal.
  4. No resource accounting — a skill that spends 50K tokens generating fixtures looks identical (post-hoc) to one that spent 500 tokens; no way to set a budget per skill class, and no way to track which skills are token-heavy over time.

These map cleanly onto OS scheduler problems, and OS solutions translate.

Design sketch

Skill manifest extensions (backward-compatible defaults):

# skills/my-skill/SKILL.md frontmatter
name: my-skill
priority: 50              # 0-100; default 50; higher = preempts lower
tier: domain              # meta | platform | engineering | domain | creative
budget:
  max_tokens: 8000
  max_tool_calls: 10
  max_wall_seconds: 60
writes_to:                # for mutex / queue (see RFC #31385... err, separate proposal)
  - .hermes/state/foo.json
depends_on:               # for deadlock detection
  - bash
  - file-read

5-tier layered loading (lazy, by tier):

meta        → always loaded (skill discovery, kernel rules)
platform    → loaded when on the corresponding transport (Feishu/Discord/etc.)
engineering → loaded when toolset includes Bash/Read/Edit/Grep
domain      → loaded on explicit slash command or strong context match
creative    → loaded only on explicit invocation

Lower tiers consume more prompt budget; user / config sets max_loaded_tokens_per_tier.

Scheduler core:

when a skill is triggered:
  if priority(new) > priority(running) and not running.holding_lock:
    preempt running; resume after new completes
  else:
    enqueue at priority

while running:
  check budget (tokens, tool calls, wall time)
  on budget exhausted:
    fail-soft with partial result + budget_exceeded marker

on skill chain:
  maintain call graph; if cycle detected, refuse with explanation

Process accounting log:

{"t":"2026-05-24T09:00:00Z","skill":"git-workflow","priority":50,"tier":"engineering","tokens":1240,"tool_calls":3,"wall_ms":4200,"outcome":"ok"}
{"t":"...","skill":"diagram","priority":30,"tier":"creative","tokens":5800,"tool_calls":0,"wall_ms":12000,"outcome":"budget_exceeded","budget_field":"tokens"}

Lets users discover "which skills are my real cost centers" without per-turn debugging.

Why this isn't covered by existing features

  • hermes skills enable/disable is a binary opt-in; this is dynamic priority + budget + tiering
  • auto_resume etc. budget guards are per-turn; this is per-skill
  • "Skills self-improve during use" is content-level; this is execution-level

What this is NOT

  • Not a replacement for skill auto-discovery — the matcher still proposes; the scheduler decides whether to admit
  • Not an RL-trained policy — pure declarative manifest + simple scheduler (extensible if someone wants to learn priorities later)
  • Not a sandbox / security boundary (skills still run with the agent's full privileges) — purely resource governance
  • Not opinionated about the underlying transport / model

Questions before I open a PR

  1. In scope? This is a non-trivial kernel addition. Does it belong as a first-class feature, an optional skill, or out-of-tree?
  2. Migration: existing skills have no priority / tier / budget — defaults should be backward compatible (priority=50, tier=domain, budget=unlimited). OK?
  3. Scope split: full scheduler is a lot. Acceptable to start with just tiered loading + budget enforcement (no preemption, no cycle detection) as Phase 1?
  4. Manifest schema versioning — bump SKILL.md frontmatter version, or additive fields with implicit defaults?

Not opening a PR yet. Related batch I just opened: #31385 (subprocess executor bridge), #31387 (drift hook — withdrawn in favor of #28946), #31388 (multi-profile memory), #31392 (agent-native task relay). Happy to sequence all of these however you prefer.

Thanks!

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

hermes - 💡(How to fix) Fix RFC: OS-style skill scheduling (priority queues, resource budgets, deadlock detection, layered loading)