hermes - 💡(How to fix) Fix Bug: extract_media() false-positives on example paths in quoted text / code blocks

StepCodex · 2026-05-31T04:44:03Z

## Problem

`extract_media()` scans the full response text with a single regex pass (`MEDIA_TAG_CLEANUP_RE`) without distinguishing **live delivery tags** from **example paths mentioned in prose**. This causes false positives when:

1. A skill description, error message, or doc string contains a literal example tag inside a code block or quote.
2. The agent explains *how* to use the MEDIA tag in its reply (e.g. "include `MEDIA:/path/to/file` in your response").
3. A tool returns output that happens to contain a path matching the regex.

**Effect:** The matching text is stripped from the user-visible response and the path is added to the media list. `validate_media_delivery_path` then either rejects it (a silent drop) or, if the path happens to exist, delivers an unintended file.

## Minimal Reproduction

Ask the agent to explain the MEDIA delivery syntax. Its reply will likely contain something like:

> To send an image, include `MEDIA:/path/to/image.jpg` in your response.

The backtick-wrapped example matches `MEDIA_TAG_CLEANUP_RE`, is stripped from the text, and the gateway attempts to deliver `/path/to/image.jpg`.

## Root Cause

`extract_media()` in `gateway/platforms/base.py` (~line 2577):

```python
for match in media_pattern.finditer(content):
    path = match.group("path").strip()
    ...
    media.append((os.path.expanduser(path), has_voice_tag))
```

The scan is context-blind. It does not skip:

- Fenced code blocks (`` ``` ... ``` ``)
- Inline code spans (`` `MEDIA:...` ``)
- Blockquotes (`> ...`)
- Tool output embedded in the response

## Proposed Fix Direction

Before running `MEDIA_TAG_CLEANUP_RE`, mask content inside fenced code blocks, inline code spans, and blockquotes by replacing each protected span with equal-length whitespace. Because the masked copy has the same length as the original, match offsets found on it still apply to the original text, so the cleanup substitution stays correct while the false positives disappear.

Happy to submit a PR if the direction looks right. We have a working patch in our deployment.

## Environment

- macOS 15, launchd-managed gateway
- Feishu platform adapter
- Triggered by: skills that document MEDIA syntax, tool error messages containing file paths
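
The masking step described under Proposed Fix Direction could look roughly like the sketch below. It is not the patch mentioned in the report. `EXAMPLE_MEDIA_RE` is a hypothetical stand-in, because the real `MEDIA_TAG_CLEANUP_RE` pattern is not shown, and `mask_protected_spans` and `extract_media_sketch` are illustrative names. The sketch assumes Markdown-style fences, code spans, and `>` blockquotes, and it does not attempt to detect embedded tool output.

```python
import re

# Hypothetical stand-in for the project's media-tag regex; the real
# MEDIA_TAG_CLEANUP_RE is not shown in the report and may differ.
EXAMPLE_MEDIA_RE = re.compile(r"MEDIA:(?P<path>[^\s`]+)")

# Spans to protect. Fences are handled first so their backticks are
# already blanked when the inline-code pattern runs.
FENCE_RE = re.compile(r"^```.*?^```[^\n]*$", re.DOTALL | re.MULTILINE)
INLINE_CODE_RE = re.compile(r"`[^`\n]+`")
BLOCKQUOTE_RE = re.compile(r"^[ \t]*>.*$", re.MULTILINE)


def _blank(match: re.Match) -> str:
    # Replace every character except newlines with a space, so the
    # masked text keeps the same length and line structure.
    return re.sub(r"[^\n]", " ", match.group(0))


def mask_protected_spans(text: str) -> str:
    """Return a same-length copy of text with protected spans blanked."""
    masked = text
    for pattern in (FENCE_RE, INLINE_CODE_RE, BLOCKQUOTE_RE):
        masked = pattern.sub(_blank, masked)
    return masked


def extract_media_sketch(text: str) -> tuple[list[str], str]:
    """Find tags on the masked copy, then cut them out of the original."""
    masked = mask_protected_spans(text)
    paths: list[str] = []
    kept: list[str] = []
    cursor = 0
    for match in EXAMPLE_MEDIA_RE.finditer(masked):
        paths.append(match.group("path"))
        # Offsets from the masked copy are valid for the original text.
        kept.append(text[cursor:match.start()])
        cursor = match.end()
    kept.append(text[cursor:])
    return paths, "".join(kept)
```

With this approach, a tag on its own line is still extracted and stripped, while the same tag inside backticks, a fenced block, or a blockquote stays in the visible text. An unclosed fence is left unmasked in this sketch, so a real implementation would need to decide how to treat it.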
