openclaw: [Bug] Hermes "explain before doing" rule leaks tool-call envelopes to Telegram

GitHub stats: openclaw/openclaw#63213, fetched 2026-04-09 07:56:53
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0

The Netcup-Clawd host-level rule "Explain what you're doing before doing it" (documented in /root/AGENTS.md) currently forces Hermes to dump its entire internal reasoning or raw tool-call envelopes into user-facing Telegram DM threads. Two different DM topics reproduced it:

  • Thread 24712 (session 20260406_124640_8e3a9e5a) — when asked to triage Playwright failures, Hermes posted multiple paragraphs describing how it planned to satisfy the rule ("Need to mention prepping query... let's send message... then call the tool") before executing the SQL. User reply: "What the hell is this".
  • Thread 21657 (session 20260406_071219_e27f0c50, message 9069) — during an OpenClaw v2026.4.8 update, the same rule caused Hermes to send 5,837 characters composed entirely of the literal string to=functions.terminal repeated ~180 times, plus several blank assistant turns, before any useful status update.

This is a privacy and UX regression: the rule is meant to provide a short "I'm about to run X" sentence, but instead the runtime forwards the full prompt buffer and tool envelope.

Steps to Reproduce

  1. On Netcup-Clawd, ensure /root/AGENTS.md still contains the guideline "1. Explain what you're doing before doing it."
  2. In any Hermes Telegram DM topic, initiate a task that will trigger tool calls (e.g., inspect the Hermes SQLite state DB or run openclaw-live-build).
  3. Observe that, before each tool call, Hermes sends either its entire internal planning paragraph (including references to the server rule) or the raw tool-call descriptor to=functions.<tool> repeated once per queued call.
  4. Users see this internal text verbatim before any actionable answer.

Expected Behavior

  • Hermes should send a short, human-friendly explanation ("I'll search the Hermes state DB for 'playwright' entries and report back"), then run the tool.
  • Internal compliance chatter and tool-call envelopes should never be delivered to end users.

Actual Behavior

  • Thread 24712 example (message 8262):
    I'll query Hermes state DB for recent messages containing “playwright” to pinpoint thread 21656 logs. After describing, I'll run SQL. (Tool call executed earlier? but instructions state describe before doing? system instructions? from server guidelines say "Explain what you're doing before doing it" yes. Need to describe. Then run execute_code. Need mention prepping query. We'll mention now: "I'll query Hermes state DB for messages containing 'playwright' to locate the failure. Running SQL now." Then run execute_code. Need to mention in plain English maybe: "I'm going to search..." Then call. We'll follow. We'll send final after retrieving data. But we need to send text to user or not? Probably writing to user. ok proceed. We'll respond: "I'll search state DB for 'playwright' entries to find logs." then call execute_code. We'll respond in final message? oh we can integrate? We'll do actual action now.
  • Thread 21657 example (message 9069):
    to=functions.terminal to=functions.terminal to=functions.terminal ... (repeated for 5,800+ characters)

Evidence

  • Hermes transcript DB /root/.hermes/state.db, session 20260406_124640_8e3a9e5a, messages 8262-8385 (thread 24712 meta spill).
  • Same DB, session 20260406_071219_e27f0c50, message 9069 (thread 21657 to=functions.terminal spam) during the OpenClaw v2026.4.8 update run.
  • Incident records (public paths available on this host):
    • failure-intel/incidents/hermes-thread-24712-meta-instruction-spill-2026-04-07.{md,json}
    • failure-intel/incidents/hermes-thread-21657-tool-prefix-spam-2026-04-08.{md,json}

Not a Duplicate

  • #45041 requests a configurable regex filter to strip general reasoning phrases; this bug is specifically about a built-in host rule emitting its entire planning buffer and raw tool-call envelope before every tool invocation, even when no regex filter could reasonably predict to=functions.* spam.
  • #62306 covers Control UI subagent announce leakage, not Telegram DM topics or AGENTS.md compliance rules.

Proposed Fix / Ideas

  • Replace the literal interpretation of "Explain before doing" with a template-driven summary (e.g., format string with tool name + natural-language clause) so only a short sentence is sent.
  • Alternatively, satisfy the rule via structured logging or reasoning stream rather than user-visible chat bubbles.
  • Add a gateway-level guard that blocks any outgoing assistant message that matches ^to=functions\. or contains obvious policy self-references like "Need to mention prepping query".
  • Add a regression test that fails if those phrases appear in user-facing output.
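
The gateway-level guard proposed above could look roughly like the following. This is a minimal sketch, not OpenClaw's actual gateway code: the function name `should_block` and the phrase list are hypothetical, and the patterns are taken only from the two incidents in this report.

```python
import re

# Assumption: patterns are drawn from the two incidents in this report.
# A real deployment would likely need a broader, maintained list.
TOOL_ENVELOPE = re.compile(r"^\s*to=functions\.")
POLICY_SELF_REFERENCES = (
    "need to mention prepping query",
    "explain what you're doing before doing it",
)


def should_block(message: str) -> bool:
    """Return True if an outgoing assistant message looks like leaked internals."""
    # Raw tool-call envelope at the start of the message (thread 21657 pattern).
    if TOOL_ENVELOPE.search(message):
        return True
    # Planning text that quotes or paraphrases the host rule (thread 24712 pattern).
    lowered = message.lower()
    return any(phrase in lowered for phrase in POLICY_SELF_REFERENCES)
```

A regression test could feed the two leaked samples from this issue through `should_block` and assert that both are rejected, while the short explanation under Expected Behavior passes. Note that matching on the rule text itself would also block a message that legitimately quotes the rule, so the phrase list is a trade-off rather than a complete filter.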

Environment

  • Hermes agent (openai/gpt-5-codex) on Netcup-Clawd (Telegram DM topics 24712 and 21657).
  • OpenClaw gateway version: v2026.4.8 at the time of the thread 21657 run.
  • Rule source: /root/AGENTS.md guidelines (lines 16-20 in current tree).

extent analysis

TL;DR

Implement a template-driven summary or structured logging to satisfy the "Explain what you're doing before doing it" rule, preventing internal planning buffers and tool-call envelopes from being sent to users.

Guidance

  • Review the /root/AGENTS.md guidelines and consider replacing the literal interpretation of the rule with a more controlled approach, such as a template-driven summary.
  • Investigate adding a gateway-level guard to block outgoing messages that match ^to=functions\. or contain policy self-references.
  • Develop a regression test to detect and prevent similar issues in the future, focusing on user-facing output.
  • Consider satisfying the rule via structured logging or a reasoning stream instead of user-visible chat bubbles.
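
The structured-logging option could be sketched as below, using only Python's standard `logging` and `json` modules. The logger name `hermes.pre_tool`, the function name, and the record fields are assumptions made for illustration; nothing here reflects how Hermes or the gateway actually records events.

```python
import json
import logging

# Hypothetical logger name; the real runtime may use a different sink entirely.
logger = logging.getLogger("hermes.pre_tool")


def record_pre_tool_explanation(tool: str, explanation: str) -> str:
    """Write the pre-tool explanation to the log as one JSON line.

    The explanation is returned for inspection but is never sent to chat,
    so the rule is satisfied without producing a user-visible message.
    """
    payload = json.dumps(
        {"event": "pre_tool_explanation", "tool": tool, "explanation": explanation}
    )
    logger.info(payload)
    return payload
```

With this shape, the "explain before doing" rule leaves an auditable record for operators while the Telegram thread only receives the final answer.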

Example

A potential template-driven summary could be implemented using a format string, such as: "I'll {action} using {tool} to {goal}."
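
A minimal sketch of that format string in use, assuming the runtime can supply the three fields separately. The function name `render_notice` and the 200-character cap are hypothetical choices, not part of the issue.

```python
TEMPLATE = "I'll {action} using {tool} to {goal}."
MAX_LEN = 200  # assumed cap so a runaway field cannot flood the chat


def render_notice(action: str, tool: str, goal: str, max_len: int = MAX_LEN) -> str:
    """Render a one-sentence pre-tool notice from three short fields."""
    fields = {"action": action, "tool": tool, "goal": goal}
    # Collapse whitespace so multi-line planning text cannot slip through intact.
    cleaned = {key: " ".join(value.split()) for key, value in fields.items()}
    notice = TEMPLATE.format(**cleaned)
    if len(notice) > max_len:
        # Truncate and mark the cut rather than sending the full buffer.
        notice = notice[: max_len - 1].rstrip() + "…"
    return notice


print(render_notice("search the Hermes state DB", "execute_code", "find 'playwright' entries"))
# prints: I'll search the Hermes state DB using execute_code to find 'playwright' entries.
```

Because only the rendered sentence is sent, the model's planning text never reaches the user even if it is long or self-referential.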

Notes

The proposed fixes (template-driven summaries or structured logging) may require significant changes to the existing codebase. Each should be tested to confirm that it satisfies the rule without introducing new leaks into user-facing output.

Recommendation

As an immediate workaround, add a gateway-level guard that blocks outgoing messages matching ^to=functions\. or containing policy self-references. This stops the leak while a more comprehensive fix is developed.
