hermes - 💡 (How to fix) Skills Guard misses multi-word prompt-injection phrase variants [2 pull requests]

hermes - 2026-05-16 09:12:08

tools/skills_guard.py currently catches several prompt-injection phrases with flexible multi-word matching, but a few related phrases still use rigid adjacent-word regexes. That means benign-looking extra words can cause the cooperative Skills Guard scanner to miss variants such as:

system prompt temporary override
new temporary policy
updated internal guidelines
revised hidden instructions

This is not intended as a security-boundary report. Per the project threat model, Skills Guard is a heuristic / defense-in-depth scanner. The goal is to improve scanner coverage for common multi-word prompt-injection variants.

Root Cause

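A few patterns in tools/skills_guard.py still require their keywords to be directly adjacent, so a single intervening word defeats the match. Variants that currently slip through:

system prompt temporary override
new temporary policy
updated internal guidelines
revised hidden instructions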
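
To illustrate the gap, here is a minimal sketch of how an adjacent-word regex behaves. The RIGID pattern below is a hypothetical stand-in, not copied from tools/skills_guard.py, and the case-insensitive flag is also an assumption.

```python
import re

# Hypothetical rigid pattern: the three keywords must be directly adjacent.
# This is an illustration, not the actual regex from tools/skills_guard.py.
RIGID = re.compile(r"system\s+prompt\s+override", re.IGNORECASE)

adjacent = "system prompt override"
variant = "system prompt temporary override"

# Record whether the rigid pattern fires on each phrase.
rigid_hits = {
    adjacent: bool(RIGID.search(adjacent)),
    variant: bool(RIGID.search(variant)),
}

print(rigid_hits)
# {'system prompt override': True, 'system prompt temporary override': False}
```

The single extra word "temporary" is enough to break the match, which is the behavior this issue describes.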

Fix Action

Fixed

Fixed by PR: Harden Skills Guard multi-word prompt patterns (https://github.com/NousResearch/hermes-agent/pull/26852)
Fixed by PR: fix(security): catch multi-word instruction overrides (https://github.com/NousResearch/hermes-agent/pull/26985)

Suggested fix

Widen the affected regexes in tools/skills_guard.py so the keyword phrases allow intervening words, matching the style already used by nearby patterns such as ignore ... previous ... instructions.

Candidate patterns:

system\s+(?:\w+\s+)*prompt\s+(?:\w+\s+)*override
new\s+(?:\w+\s+)*policy
updated\s+(?:\w+\s+)*guidelines
revised\s+(?:\w+\s+)*instructions
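
The candidate patterns can be checked against the reported variants with a short script. This is a sketch under assumptions: the first_match helper is hypothetical, and re.IGNORECASE is assumed rather than taken from the project.

```python
import re

# Candidate patterns exactly as proposed in this issue.
CANDIDATES = [
    r"system\s+(?:\w+\s+)*prompt\s+(?:\w+\s+)*override",
    r"new\s+(?:\w+\s+)*policy",
    r"updated\s+(?:\w+\s+)*guidelines",
    r"revised\s+(?:\w+\s+)*instructions",
]

# Variants the scanner is reported to miss today.
VARIANTS = [
    "system prompt temporary override",
    "new temporary policy",
    "updated internal guidelines",
    "revised hidden instructions",
]


def first_match(text):
    """Return the first candidate pattern that matches text, or None.

    Hypothetical helper for this sketch; case-insensitive matching is assumed.
    """
    for pattern in CANDIDATES:
        if re.search(pattern, text, re.IGNORECASE):
            return pattern
    return None


for phrase in VARIANTS:
    print(f"{phrase!r} -> {first_match(phrase)}")
```

Each variant is matched by its corresponding candidate, and the adjacent forms (for example "new policy") still match because the intervening-word group may repeat zero times. As written, the candidates carry no word-boundary anchors, so a phrase such as "renew the policy" also matches the second pattern. Whether that breadth is acceptable for a heuristic scanner, or whether \b anchors should be added, is a project decision.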

Regression coverage

Add tests under tests/tools/test_skills_guard.py that assert the scanner catches the multi-word variants above and maps them to the existing sys_prompt_override / fake_policy pattern IDs.
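
A regression test along these lines might look like the sketch below. It is self-contained for illustration: scan_text and PATTERNS are hypothetical stand-ins for whatever tools/skills_guard.py actually exposes, and the phrase-to-ID mapping (the system prompt phrase to sys_prompt_override, the other three to fake_policy) is an assumption inferred from the ID names. In the real test file the stand-in would be replaced by an import of the project's scanner.

```python
import re

# Hypothetical stand-in for the scanner's pattern table. The real structure in
# tools/skills_guard.py may differ; the ID assignment below is an assumption.
PATTERNS = [
    (re.compile(r"system\s+(?:\w+\s+)*prompt\s+(?:\w+\s+)*override", re.IGNORECASE),
     "sys_prompt_override"),
    (re.compile(r"new\s+(?:\w+\s+)*policy", re.IGNORECASE), "fake_policy"),
    (re.compile(r"updated\s+(?:\w+\s+)*guidelines", re.IGNORECASE), "fake_policy"),
    (re.compile(r"revised\s+(?:\w+\s+)*instructions", re.IGNORECASE), "fake_policy"),
]


def scan_text(text):
    """Return the set of pattern IDs whose regex matches text (hypothetical API)."""
    return {pattern_id for regex, pattern_id in PATTERNS if regex.search(text)}


# (phrase, expected pattern ID) pairs for the multi-word variants in this issue.
MULTI_WORD_CASES = [
    ("system prompt temporary override", "sys_prompt_override"),
    ("new temporary policy", "fake_policy"),
    ("updated internal guidelines", "fake_policy"),
    ("revised hidden instructions", "fake_policy"),
]


def test_multi_word_variants_are_flagged():
    for phrase, expected_id in MULTI_WORD_CASES:
        assert expected_id in scan_text(phrase), phrase


def test_adjacent_forms_still_flagged():
    # Widening the patterns should not lose the original adjacent-word matches.
    assert "sys_prompt_override" in scan_text("system prompt override")
    assert "fake_policy" in scan_text("new policy")


def test_benign_text_not_flagged():
    assert scan_text("This skill formats Markdown tables.") == set()
```

The functions follow pytest naming so they would be collected automatically, and they use plain loops rather than pytest.mark.parametrize to keep the sketch free of third-party imports.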
