hermes - 💡(How to fix) Fix [Feature]: Source-aware instruction gate — architectural mitigation for indirect prompt injection

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
RAW_BUFFERClick to expand / collapse

Problem or Use Case

Hermes has no structural distinction between instructions from the user and instructions embedded in external content the agent processes (web pages, documents, emails). Both arrive as text and are treated equivalently at the tool execution step.

This is the architectural root of indirect prompt injection — the same class of attack behind OpenClaw's ClawJacked vulnerability, which allowed malicious web content to silently trigger tool calls and exfiltrate credentials. Hermes has the same structural exposure today.

Proposed Solution

A source-aware gate in the tool execution pipeline that tags every proposed action with its origin:

  • user — direct user instruction
  • agent — agent's own reasoning
  • external_content — derived from processed external data

Actions tagged external_content are blocked from triggering tool calls. External content can inform response text. It cannot command tool execution. This is a single additive check — fully backward compatible.

Reference implementation with research writeup: github.com/thecolourfoundation/Color

Alternatives Considered

Prompt hardening and blocklists — these filter symptoms not the root cause. A well-crafted injection payload bypasses them. Structural source separation cannot be bypassed by prompt content alone.

Feature Type

Other

Scope

Medium (few files, < 300 lines)

Contribution

  • I'd like to implement this myself and submit a PR

Debug Report (optional)

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING