hermes - Feature: User-approval gate for outbound communication tools (send_message, email, etc.)


Feature Description

Add a hard enforcement mechanism (not prompt-based) that requires explicit user approval before any outbound communication tool can send messages to external contacts.

Currently, safety rules for outbound communication (for example, "don't send API keys, tokens, or emails to third parties without permission") are enforced only through system prompt text. Recent experience shows this is fragile: the LLM can overlook or forget text-based rules during complex tasks.

Motivation

Agents can have powerful outbound tools: send_message (Telegram, Discord, Slack, etc.), email sending, and HTTP POST to external services. A misstep here is irreversible: once an API key or private message has been sent, it cannot be recalled.

Prompt-level rules are necessary but not sufficient. We need a code-level gate that blocks outbound messages unless the user has explicitly approved them.

Proposed Solution

Add an approval_required flag to outbound communication tools. When set, the tool refuses to execute unless it receives a cryptographically valid approval token that only the user (via the gateway) can produce.

Possible implementation sketch:

  1. New tool parameter: user_approval_token: str | None on send_message, email tools, etc.
  2. Gateway generates approval tokens: When the user confirms an outbound action (e.g. via a /approve-like mechanism or inline button), the gateway issues a short-lived signed token.
  3. Tool enforcement: The tool handler checks user_approval_token before executing. If missing or invalid, it returns a hard error — not a warning, not a prompt — the send simply fails.
  4. Configurable per-instance: outbound.require_approval: true in config.yaml.
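
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes an HMAC-signed token bound to the exact recipient and message body, a 120-second lifetime, and a secret held only by the gateway. The names GATEWAY_SECRET, TOKEN_TTL_SECONDS, ApprovalError, issue_approval_token and verify_approval_token are hypothetical, and the require_approval argument stands in for the outbound.require_approval config value.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical: a secret held only by the gateway, never exposed to the LLM.
GATEWAY_SECRET = b"replace-with-a-random-secret"
TOKEN_TTL_SECONDS = 120  # assumed lifetime for a "short-lived" token


class ApprovalError(Exception):
    """Raised when an outbound send lacks a valid approval token."""


def _payload_digest(recipient: str, body: str) -> str:
    # Length-prefix the recipient so (recipient, body) pairs cannot collide.
    data = f"{len(recipient)}:{recipient}{body}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def issue_approval_token(recipient: str, body: str,
                         now: Optional[float] = None) -> str:
    """Gateway side: called only after the user confirms this exact message."""
    issued_at = time.time() if now is None else now
    expires = int(issued_at + TOKEN_TTL_SECONDS)
    msg = f"{expires}.{_payload_digest(recipient, body)}".encode("utf-8")
    sig = hmac.new(GATEWAY_SECRET, msg, hashlib.sha256).hexdigest()
    return f"{expires}.{sig}"


def verify_approval_token(token: Optional[str], recipient: str, body: str,
                          now: Optional[float] = None) -> bool:
    """Tool side: True only for an unexpired token signed for this message."""
    if not token:
        return False
    expires_str, sep, sig = token.partition(".")
    if not sep:
        return False
    try:
        expires = int(expires_str)
    except ValueError:
        return False
    msg = f"{expires_str}.{_payload_digest(recipient, body)}".encode("utf-8")
    expected = hmac.new(GATEWAY_SECRET, msg, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected.encode("utf-8"), sig.encode("utf-8")):
        return False
    current = time.time() if now is None else now
    return current < expires


def send_message(recipient: str, body: str,
                 user_approval_token: Optional[str] = None,
                 require_approval: bool = True) -> str:
    """Hypothetical tool handler: fails hard without a valid token."""
    if require_approval and not verify_approval_token(
            user_approval_token, recipient, body):
        raise ApprovalError(
            "outbound send blocked: missing or invalid user approval token")
    # The real transport call (Telegram, email, ...) would go here.
    return f"sent to {recipient}"
```

Binding the token to a digest of the recipient and body means an approval for one message cannot be reused for a different one, which is the per-message granularity that an all-or-nothing gate lacks. A production version would presumably also need replay protection (for example, single-use tokens).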

Alternative: simpler stopgap

Even a simpler version would improve on pure prompt text: by default the tool returns an error message saying "This action requires user approval. Ask the user to confirm.", and it proceeds only after the LLM calls clarify() and gets back a confirmation.
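
A rough sketch of this stopgap, under stated assumptions: the real clarify() signature is not given in this issue, so the OutboundGate class, its clarify method, and the ask_user callback (a stand-in for the real user channel) are all hypothetical. The point illustrated is that the confirmation is recorded in code, so the tool checks state rather than trusting the LLM's claim that the user agreed.

```python
from typing import Callable, Dict, Set, Tuple

APPROVAL_ERROR = "This action requires user approval. Ask the user to confirm."


class OutboundGate:
    """Tracks which exact (recipient, body) pairs the user has confirmed."""

    def __init__(self, ask_user: Callable[[str], bool]) -> None:
        # ask_user is a hypothetical callback that shows the question to the
        # human user and returns True only on an explicit "yes".
        self._ask_user = ask_user
        self._confirmed: Set[Tuple[str, str]] = set()

    def clarify(self, recipient: str, body: str) -> bool:
        """Ask the user about this exact message and record the answer."""
        question = f"Send this message to {recipient}?\n\n{body}"
        approved = self._ask_user(question)
        if approved:
            self._confirmed.add((recipient, body))
        return approved

    def send_message(self, recipient: str, body: str) -> Dict[str, object]:
        """Refuse by default; proceed only for a confirmed message."""
        key = (recipient, body)
        if key not in self._confirmed:
            return {"ok": False, "error": APPROVAL_ERROR}
        # One confirmation authorizes exactly one send.
        self._confirmed.discard(key)
        # The real transport call would go here.
        return {"ok": True, "recipient": recipient}
```

Usage would look like creating one gate per session, e.g. `gate = OutboundGate(ask_user=prompt_the_user)`, where `prompt_the_user` is whatever mechanism the gateway already uses to reach the user. Unlike the signed-token design, this offers no cryptographic guarantee; it only ensures the send path cannot succeed without a recorded confirmation.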

Alternatives Considered

  • Just writing stronger prompts: Already tried. It doesn't hold, because the LLM can still miss prompt instructions.
  • Environment variable gate: Could work but is too blunt (all-or-nothing, can't be per-message).

Prior Art

  • hermes config set approvals.mode already exists for shell commands — this is a similar concept applied to outbound messaging.
  • Claude Code has permission_mode for file writes and network calls.

Impact

  • Safety: Prevents irreversible outbound leaks even when the LLM makes mistakes.
  • UX: Adds one confirmation step for outbound messages. Acceptable trade-off given the risk.
  • Implementation: Touches tools/send_message.py, email tools, and the gateway approval flow.

Related

  • Security config: security.redact_secrets (already exists)
  • Command approvals: approvals.mode (already exists)
