hermes - 💡(How to fix) Fix No harness-level defense against prompt injection in tool outputs [1 participants]

hermes2026-05-02 21:05:22

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

NousResearch/hermes-agent#18981•Fetched 2026-05-03 04:53:07

View on GitHub

Comments

Participants

Timeline

Reactions

Author

YdocYNj

Participants

YdocYNj

Timeline (top)

labeled ×4

RAW_BUFFERClick to expand / collapse

Problem

When Hermes Agent calls tools that return external data — browser_navigate, terminal(curl ...), read_file on third-party repos, MCP tool outputs, search results — there is no harness-level mechanism to validate or guard that data before it enters the model's context. The model sees raw tool output interleaved with its conversation and system prompt, with nothing preventing injected instructions in that output from being treated as commands.

The BridgeWard skill teaches the model a skeptical-reading discipline, but it is a behavioral defense only: the model must remember to load the skill before processing external data, which fails in practice (especially under load, distraction, or when speed is prioritized).

OWASP LLM01 (2025) identifies this as a fundamental architectural gap: prompt injection defenses that rely on model compliance are not defenses at all.

Existing infrastructure that could close this gap

The transform_tool_result plugin hook (model_tools.py line 753) already fires on every tool result before it is appended to the conversation context. It allows a plugin to replace the result string entirely. The gateway/builtin_hooks/ directory exists and is currently empty.

Desired outcome

A way to mark certain tools (or all tools returning network/external data) so that their output is automatically annotated or wrapped with a security preamble before the model sees it — without requiring the model to "remember" to load a skill first. This would provide defense in depth: behavioral (skill) + architectural (harness injection).

I'm not prescribing the implementation — just surfacing that the gap exists and that the transform_tool_result hook + builtin_hooks/ directory look like natural fit points.

extent analysis

TL;DR

Implement a plugin using the transform_tool_result hook to automatically annotate or wrap tool output with a security preamble before it enters the model's context.

Guidance

Identify tools that return external data and determine the appropriate security preamble to apply.
Develop a plugin to utilize the transform_tool_result hook, replacing the result string with the annotated or wrapped output.
Consider implementing a default annotation for all tools returning network/external data to provide a baseline level of defense.
Test the plugin to ensure it correctly annotates tool output without interfering with model functionality.

Example

# Example plugin using transform_tool_result hook
def transform_tool_result(result, tool_name):
    # Apply security preamble to tool output
    secure_preamble = "[EXTERNAL DATA] "
    return secure_preamble + result

Notes

The implementation details of the plugin and security preamble will depend on the specific requirements and constraints of the system. This solution assumes that the transform_tool_result hook is sufficient to address the identified gap.

Recommendation

Apply workaround by implementing a plugin using the transform_tool_result hook, as this provides a targeted solution to the identified architectural gap without requiring changes to the model or existing skills.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#memory management #API rate limit #retriever error #indexing error #inference speed

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

hermes - 💡(How to fix) Fix No harness-level defense against prompt injection in tool outputs [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Problem

Existing infrastructure that could close this gap

Desired outcome

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

TRENDING

hermes - 💡(How to fix) Fix No harness-level defense against prompt injection in tool outputs [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Problem

Existing infrastructure that could close this gap

Desired outcome

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING