[hermes] feat: Tool Permission Gating System (per-tool rules + mode-based fallback)

Problem

Hermes currently has a narrow approval system that only gates dangerous shell commands via pattern matching inside terminal_tool. All other tools (file writes, code execution, browser automation, delegation, memory writes) execute without any user confirmation. The only controls available are:

approvals.mode: manual|smart|off — controls dangerous shell pattern prompts
--yolo / /yolo — bypasses dangerous shell checks entirely
Hardline blocklist — unconditional block for catastrophic commands (rm -rf /, format disk, etc.)

This means Hermes effectively runs in "full autonomy" mode by default with a thin safety net for shell commands. There is no way to:

Require approval before file writes
Allow specific shell commands (e.g. git *) while blocking others
Run in a fully supervised mode where every tool call is confirmed
Set per-project permission rules (e.g. "allow edits to src/ but ask for config files")
Have rules that survive mode changes (deny rules that even YOLO cannot bypass)

Motivation: High-Sensitivity Scenarios

Production environments: Agent running via gateway on a server with access to prod DBs — you want terminal(psql:*) to always ask, regardless of mode
Shared infrastructure: Multiple users on the same gateway — per-session or per-user permission policies
Compliance/audit: Organizations need to enforce that certain operations always require human confirmation
Pair programming: User wants to review writes but let the agent read freely
Untrusted contexts: Running agent on a codebase you don't fully control (AGENTS.md injection, prompt injection via file content)
Progressive trust: Start supervised, approve tools one-by-one as you build confidence, eventually go autonomous

Prior Art

Claude Code (Anthropic)

Rules-first architecture with mode-fallback. Per-tool content rules with pattern matching (prefix, exact, wildcard glob). Rules compose from multiple sources (policy/org settings, CLI flags, user settings, project settings, local settings). Tools self-declare permission logic. Deny and ask rules are bypass-immune — even in YOLO-equivalent mode, specific tool+content rules still fire. Approval prompts suggest what "always allow" rule to save. Has an LLM classifier mode where a secondary model triages tool calls.

OpenAI Codex CLI

Three modes: suggest (plan only), auto-edit (file writes pass), full-auto (everything passes). Smart approvals via auxiliary LLM. Network-disabled sandbox as alternative to permission gating. No per-tool granularity — modes are all-or-nothing.

Goose (Block)

Tools declare requires_approval: bool. Global toggle for approval mode. No per-command granularity. Extension-level permissions (enable/disable entire toolsets).

Cline (VS Code)

Auto-approve per tool type (read/write/shell). Per-command regex patterns for auto-approve. Diff view for file edits. No per-project rules.

Proposed Design Direction

Rules-first architecture with mode fallback:

tool call arrives
  → per-tool DENY rules → blocked (mode-immune)
  → per-tool ASK rules → prompt (mode-immune)  
  → tool-specific permission checks (per-tool matching logic)
  → safety checks (hardline blocklist, sensitive paths)
  → MODE fallback (autonomous/cautious/supervised/plan)
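
The ordering above can be sketched as a short-circuiting pipeline: each stage either returns a decision or defers to the next one, and the mode is consulted only when nothing else matched. This is a minimal sketch, not Hermes code. The names Decision, Stage, evaluate, and the hard-coded stage functions are all hypothetical, and where allow rules sit in the chain is an assumption (here they are folded into the tool-specific stage).

```python
from enum import Enum
from typing import Callable, Optional, Sequence


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"
    DENY = "deny"


# A stage inspects (tool, arg) and returns a Decision, or None to defer
# to the next stage. (Hypothetical interface, for illustration only.)
Stage = Callable[[str, str], Optional[Decision]]


def evaluate(tool: str, arg: str, stages: Sequence[Stage],
             mode_fallback: Callable[[str, str], Decision]) -> Decision:
    """Return the first decision any stage produces, else the mode fallback."""
    for stage in stages:
        decision = stage(tool, arg)
        if decision is not None:
            return decision
    return mode_fallback(tool, arg)


# Hard-coded stand-ins for the four stages in the flow above.
def deny_rules(tool: str, arg: str) -> Optional[Decision]:
    return Decision.DENY if tool == "send_message" else None


def ask_rules(tool: str, arg: str) -> Optional[Decision]:
    return Decision.ASK if tool == "cronjob" else None


def tool_checks(tool: str, arg: str) -> Optional[Decision]:
    # Assumption: explicit allow rules are applied at this stage.
    return Decision.ALLOW if tool == "read_file" else None


def safety_checks(tool: str, arg: str) -> Optional[Decision]:
    is_catastrophic = tool == "terminal" and arg.strip() == "rm -rf /"
    return Decision.DENY if is_catastrophic else None


def autonomous(tool: str, arg: str) -> Decision:
    # Mode fallback for "autonomous": anything not caught earlier passes.
    return Decision.ALLOW


PIPELINE = [deny_rules, ask_rules, tool_checks, safety_checks]

if __name__ == "__main__":
    calls = [("send_message", "hi"), ("cronjob", "0 * * * *"),
             ("terminal", "rm -rf /"), ("terminal", "ls")]
    for call in calls:
        print(call, evaluate(*call, PIPELINE, autonomous).value)
```

Because deny and ask run before the mode is ever read, switching tool_mode (or passing --yolo) cannot change their outcome, which is what "mode-immune" means in the flow.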

Config structure:

approvals:
  # Mode fallback (governs tools with no explicit rules)
  tool_mode: autonomous    # autonomous | cautious | supervised | plan
  
  # Per-tool rules (override mode, some are bypass-immune)
  rules:
    allow:
      - "terminal(git:*)"          # prefix match
      - "terminal(npm test)"       # exact match
      - "write_file(src/**)"       # path glob
      - "read_file"                # entire tool
    deny:
      - "terminal(rm -rf:*)"       # always blocked, even in autonomous
      - "send_message"             # never auto-send
    ask:
      - "terminal(npm publish:*)"  # always prompt, even with --yolo
      - "write_file(.env*)"        # sensitive file patterns
      - "cronjob"                  # always confirm scheduling
  
  # Existing keys (unchanged)
  mode: smart              # dangerous shell command handling
  cron_mode: deny          # cron dangerous commands
  timeout: 60              # gateway approval timeout
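
One way the rule strings in this config could be parsed and matched is sketched below. It assumes three pattern kinds inferred from the comments in the config: a trailing ":*" means command-prefix match, a pattern containing glob characters is matched as a glob, and anything else is an exact match. The names Rule, parse_rule, and matches are hypothetical, and using fnmatch for "**" is a simplification (fnmatch lets "*" cross "/" boundaries, so "src/**" behaves like "src/*").

```python
import re
from fnmatch import fnmatchcase
from typing import NamedTuple, Optional

# "tool" or "tool(pattern)"; the pattern is everything inside the outer parens.
RULE_RE = re.compile(r"^(?P<tool>[A-Za-z_][\w-]*)(?:\((?P<pattern>.*)\))?$")


class Rule(NamedTuple):
    tool: str
    pattern: Optional[str]  # None means the rule covers the entire tool


def parse_rule(text: str) -> Rule:
    match = RULE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"malformed rule: {text!r}")
    return Rule(match.group("tool"), match.group("pattern"))


def matches(rule: Rule, tool: str, arg: str) -> bool:
    """Check whether a tool call (tool name + primary argument) hits a rule."""
    if rule.tool != tool:
        return False
    if rule.pattern is None:
        return True
    if rule.pattern.endswith(":*"):
        # Prefix match on whole words: "git:*" hits "git status", not "gitk".
        prefix = rule.pattern[:-2]
        return arg == prefix or arg.startswith(prefix + " ")
    if any(ch in rule.pattern for ch in "*?["):
        return fnmatchcase(arg, rule.pattern)
    return arg == rule.pattern


if __name__ == "__main__":
    rule = parse_rule("terminal(git:*)")
    print(rule, matches(rule, "terminal", "git status"))
```

Matching the prefix on a word boundary is a deliberate choice in this sketch: a naive startswith("git") would also allow gitk or any binary whose name begins with the same letters.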

Key architectural decisions to make:

Should tools implement a check_permissions() method, or should matching be centralized?
How do project-level rules work? (.hermes/permissions.yaml in the repo?)
Does the "always" choice on an approval prompt persist to config, or does it apply to the current session only?
How does this interact with MCP tools and plugin-registered tools?
Should deny rules block in the style of the hardline blocklist (an error message returned to the agent), or deny with a user-visible prompt?
How does this compose with the existing pre_tool_call plugin hook?
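
For the first question, one possible middle ground is a hybrid: a tool may optionally expose check_permissions(), and the gate falls back to centralized matching when the method is absent or returns no opinion. The sketch below only illustrates that option and is not a proposal from the issue. The tool classes, the resolve helper, and the string return values are hypothetical.

```python
from typing import Callable, Optional

# Centralized matcher signature: (tool_name, arg) -> "allow" | "ask" | "deny"
CentralMatcher = Callable[[str, str], str]


class WriteFileTool:
    """Hypothetical tool that declares its own permission logic."""
    name = "write_file"

    def check_permissions(self, arg: str) -> Optional[str]:
        # Ask for dotenv-style files anywhere in the tree; otherwise no opinion.
        filename = arg.rsplit("/", 1)[-1]
        return "ask" if filename.startswith(".env") else None


class ReadFileTool:
    """Hypothetical tool with no check_permissions(); relies on central rules."""
    name = "read_file"


def resolve(tool: object, arg: str, central: CentralMatcher) -> str:
    """Prefer the tool's own verdict, else defer to the centralized matcher."""
    check = getattr(tool, "check_permissions", None)
    if callable(check):
        decision = check(arg)
        if decision is not None:
            return decision
    return central(tool.name, arg)


if __name__ == "__main__":
    allow_all: CentralMatcher = lambda name, arg: "allow"
    print(resolve(WriteFileTool(), "config/.env.local", allow_all))
    print(resolve(ReadFileTool(), "README.md", allow_all))
```

A duck-typed lookup like this would also let MCP and plugin-registered tools participate without subclassing anything, though whether that is desirable is one of the open questions above.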

Scope

This is a significant architectural addition. Suggested phasing:

Phase 1: Mode system (tool_mode config + CLI flags + slash command) — the blunt instrument
Phase 2: Per-tool rules (allow/deny/ask with tool-name matching) — coarse granularity
Phase 3: Content matching for terminal (prefix/glob on commands) + path matching for file tools — fine granularity
Phase 4: Project-level rules, rule composition from multiple sources, approval suggestions
Phase 5: Auto mode (LLM classifier) — already partially exists as approvals.mode: smart

Phase 1 plan is drafted at .hermes/plans/tool-approval-modes-impl.md.
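
As a rough illustration of what the Phase 1 mode fallback might look like, the sketch below maps each tool_mode value to a decision. The issue names the four modes but does not define them, so the semantics here are assumptions: autonomous allows everything, supervised asks for everything, cautious asks only for non-read-only tools, and plan denies anything that is not read-only. The READ_ONLY_TOOLS set and the mode_fallback function are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    AUTONOMOUS = "autonomous"
    CAUTIOUS = "cautious"
    SUPERVISED = "supervised"
    PLAN = "plan"


# Assumption: which tools count as read-only would need to be declared
# somewhere; this set is a placeholder.
READ_ONLY_TOOLS = frozenset({"read_file"})


def mode_fallback(mode: Mode, tool: str) -> str:
    """Decide for a tool call that no explicit rule matched."""
    read_only = tool in READ_ONLY_TOOLS
    if mode is Mode.AUTONOMOUS:
        return "allow"
    if mode is Mode.SUPERVISED:
        return "ask"
    if mode is Mode.CAUTIOUS:
        return "allow" if read_only else "ask"
    # Mode.PLAN: reads pass, anything with side effects is refused.
    return "allow" if read_only else "deny"


if __name__ == "__main__":
    # Mode(...) parses the string stored under approvals.tool_mode.
    mode = Mode("cautious")
    for tool in ("read_file", "write_file", "terminal"):
        print(mode.value, tool, mode_fallback(mode, tool))
```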
