openclaw - [Feature]: Native outbound message finalizer — strip agent scratchpad narration before it hits messaging surfaces [1 comment, 2 participants]

openclaw/openclaw#78077 (fetched 2026-05-06 06:17:09)


Summary

Ship a first-class outbound message finalizer in core OpenClaw — a wire-level filter that strips first-person reasoning / scratchpad narration / process-talk from agent replies before they hit messaging surfaces (Slack, WhatsApp, etc.). We've been running this as a hand-built plugin (outbound-finalizer) for ~3 weeks and it's solved a recurring class of UX disasters where the agent's scratchpad reasoning leaks into user-facing messages. We'd like to remove the plugin and have core handle this natively.

Why this should be in core

Current LLM agents (Claude Opus, Sonnet, GPT-5.x) routinely emit reasoning text alongside their final answer when they're working through a multi-step problem, especially across sub-agent spawns or long tool-use turns. Examples of what leaks without a filter:

  • "Let me ship the per-founder rate-limit. Spawn a sub-agent for the implementation while I update doctrine."
  • "Good — the lock landed. Now let me check the LP-side bridge pacer too."
  • "Actually, re-reading Jared's directive: ..."
  • "Now log the bridge-reply-triage false-positive bug separately, then yield."

This is internal monologue. The user just wants the answer. On a busy executive's Slack DM, getting 6 buffered narration messages instead of 1 clean reply is a serious UX problem — and prompt discipline alone (asking the model to "be terse and not narrate") is unreliable in practice. We need a second pass.

This is a generic problem every OpenClaw operator with messaging surfaces has, and every one of them is going to either (a) live with leaks or (b) build something like our plugin. Worth solving once, in core.

What our plugin does (the spec we'd like core to provide)

The plugin lives in plugins/outbound-finalizer/index.js plus scripts/finalize_message.py. It has three layers, plus an escape hatch and a silent-token gate:

1. message_sending hook — wire-level intercept

Hooks into the gateway's outbound message pipeline before the channel adapter sends. For configured channels (we use Slack + WhatsApp), every outbound chunk gets either passed through, buffered, or rewritten.

2. Turn coalescing

When an agent emits multiple messages within a single turn (debounce window: 4s), they get buffered into a single combined message rather than sent as 6 separate chunks. The buffer is keyed by channelId + accountId + conversationId + threadId. Three flush triggers, in priority order:

  • agent_end — primary trigger. Fires when the turn completes. Critical fix we shipped today: must flush ALL buffers, not just ctx.channelId — when main session spawns a sub-agent and resumes, ctx.channelId on the resumed agent_end may not match the buffered message's channel.
  • message_received — pre-empt flush. If a new user message arrives while a buffer is pending, flush immediately so previous-turn replies don't bleed into next turn.
  • Stuck-buffer timeout (75s default) — sole fallback if agent_end never fires (pathological crash). Should be tight enough that worst-case user-visible delay is bearable.

Also persisted to disk (outbound-finalizer-buffers.json) for crash recovery on gateway_start.
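
A minimal sketch of the coalescing layer described above, written in Python for illustration only (the plugin's actual hook code is JavaScript and is not shown in this issue). The class name, method names, and the `send` callback are hypothetical. The sketch assumes the stuck-buffer timer starts at the first buffered chunk and that message_received flushes only the matching conversation's buffer; the 4s debounce timer is left out.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# channelId, accountId, conversationId, threadId (the buffer key named in the issue)
BufferKey = Tuple[str, str, str, str]


@dataclass
class CoalescingBuffers:
    """Hypothetical in-memory model of the turn-coalescing buffers."""

    send: Callable[[BufferKey, str], None]  # delivers one combined message
    stuck_buffer_s: float = 75.0            # stuck-buffer timeout from the issue
    clock: Callable[[], float] = time.monotonic
    _pending: Dict[BufferKey, List[str]] = field(default_factory=dict)
    _first_seen: Dict[BufferKey, float] = field(default_factory=dict)

    def add(self, key: BufferKey, text: str) -> None:
        """Buffer an outbound chunk instead of sending it immediately."""
        self._pending.setdefault(key, []).append(text)
        self._first_seen.setdefault(key, self.clock())

    def _flush(self, key: BufferKey) -> None:
        chunks = self._pending.pop(key, [])
        self._first_seen.pop(key, None)
        if chunks:
            self.send(key, "\n\n".join(chunks))

    def on_agent_end(self) -> None:
        """Primary trigger: flush ALL buffers, not only the current channel's."""
        for key in list(self._pending):
            self._flush(key)

    def on_message_received(self, key: BufferKey) -> None:
        """Pre-empt flush so previous-turn replies do not bleed into the next turn."""
        self._flush(key)

    def flush_stuck(self) -> None:
        """Fallback: flush buffers that have waited past the stuck-buffer timeout."""
        now = self.clock()
        for key, started in list(self._first_seen.items()):
            if now - started >= self.stuck_buffer_s:
                self._flush(key)

    def snapshot(self) -> str:
        """Serialize pending buffers as JSON, e.g. for crash recovery on restart."""
        return json.dumps({"|".join(k): v for k, v in self._pending.items()})
```

Injecting `clock` keeps the timeout path testable without sleeping. Joining key parts with `|` in `snapshot` is a simplification that would collide if any identifier contained that character.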

3. Heuristic + LLM filter

Two-stage:

  • Cheap regex pre-filter identifies common narration markers (^\s*let me\b, ^\s*now (let me|for|I)\b, :\s*\n pattern, ^\s*(good|ok|okay)[ ,—.].*(let me|now|next), etc.). If NONE match, the draft is probably clean and we skip the LLM call (zero cost).
  • LLM finalizer pass — if the heuristic flags narration OR if multiple messages got coalesced, send the combined draft to a stronger model with a strict "strip first-person reasoning, preserve all facts" system prompt. We're using Claude Opus 4.7 after testing showed gpt-5.4-mini was too conservative and left scratchpad phrases untouched. Operator should be able to configure the model.
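
The two-stage decision can be sketched roughly as follows. The four patterns are the ones quoted above (the issue's list ends in "etc.", so the real set is larger); the regex flags, the function names, and the exact gating logic are assumptions made for illustration.

```python
import re
from typing import List

# Assumed flags; the issue does not say how the patterns are compiled.
_FLAGS = re.IGNORECASE | re.MULTILINE

# Narration markers quoted in the issue (a partial list).
NARRATION_MARKERS = [
    re.compile(r"^\s*let me\b", _FLAGS),
    re.compile(r"^\s*now (let me|for|I)\b", _FLAGS),
    re.compile(r":\s*\n"),
    re.compile(r"^\s*(good|ok|okay)[ ,—.].*(let me|now|next)", _FLAGS),
]


def looks_like_narration(draft: str) -> bool:
    """Cheap pre-filter: True if any narration marker matches the draft."""
    return any(pattern.search(draft) for pattern in NARRATION_MARKERS)


def needs_llm_pass(messages: List[str]) -> bool:
    """Run the LLM finalizer if messages were coalesced or narration is flagged."""
    if len(messages) > 1:
        return True
    return any(looks_like_narration(m) for m in messages)
```

A single clean message such as "Cron shows 42 due." skips the LLM call entirely, which is where the zero-cost path comes from. The colon-newline pattern also matches ordinary "Status:" headers, so a filter like this errs toward sending borderline drafts to the model rather than letting narration through.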

The system prompt we landed on (works well with Opus-class models — reproduced verbatim because the exact wording matters):

You are a terse-reply finalizer. You receive a DRAFT message that the assistant
is about to send to the user. Your ONLY job: strip all first-person reasoning,
narration, self-commentary, and process-talk while preserving the substantive
content, facts, numbers, names, links, and final conclusions.

HARD RULES:
1. Output ONLY the cleaned message. No preamble, no explanation, no meta.
2. Preserve ALL factual content: names, numbers, dates, URLs, technical details,
   verbatim quotes, decisions, outcomes.
3. Strip any sentence that shows the assistant thinking, deciding, checking, or
   narrating what it just did or is about to do.
4. Strip banned phrases: "Let me...", "I'll check...", "Now for...",
   "One concern:", "OK —", "Looking at this...", "Checking now...",
   "Let me decide...", "Let me patch...", colon-terminated self-talk before
   a fact dump.
5. Convert process narration into outcome statements where possible.
   E.g. "Let me check the cron... ran it... shows 42 due" → "Cron shows 42 due."
6. Keep structure when the original has meaningful bullets/sections — strip
   only the narration lines between them.
7. Preserve the surface's formatting conventions (Slack mrkdwn, WhatsApp plain
   text, email HTML if tagged as such).
8. If the draft is already clean (no narration), return it essentially unchanged.

4. Escape hatch

A magic prefix ([[raw]]) that bypasses the filter entirely for cases where the agent legitimately wants to emit narration (debugging, technical handoffs, etc.).
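
A minimal sketch of the bypass check, assuming the prefix must appear at the start of the message and is stripped before sending (the issue does not say whether the prefix is removed). The function name is hypothetical.

```python
from typing import Tuple

ESCAPE_HATCH_PREFIX = "[[raw]]"  # prefix named in the issue


def split_escape_hatch(text: str, prefix: str = ESCAPE_HATCH_PREFIX) -> Tuple[bool, str]:
    """Return (bypass, body).

    bypass is True when the message starts with the prefix, in which case the
    filter should be skipped. Stripping the prefix from the body is an assumption.
    """
    stripped = text.lstrip()
    if stripped.startswith(prefix):
        return True, stripped[len(prefix):].lstrip()
    return False, text
```

Only a leading prefix triggers the bypass, so a message that merely mentions `[[raw]]` mid-sentence still goes through the filter.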

5. Silent-token gate

On coalesced flush, check if the combined buffer is just the canonical silent token (e.g. NO_REPLY). If so, drop the whole flush — the agent signaled a deliberately silent turn.
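
The gate can be sketched as a check on the combined buffer just before sending. The function name and the blank-line join are assumptions; only the NO_REPLY token and the drop-the-flush behavior come from the description above.

```python
from typing import List, Optional

SILENT_TOKEN = "NO_REPLY"  # canonical silent token named in the issue


def combine_for_flush(chunks: List[str], silent_token: str = SILENT_TOKEN) -> Optional[str]:
    """Join buffered chunks; return None when the flush should be dropped."""
    combined = "\n\n".join(c.strip() for c in chunks if c.strip())
    if combined == silent_token:
        return None  # the agent signaled a deliberately silent turn
    return combined
```

This version drops the flush only when the whole buffer is exactly the token. A buffer that mixes real content with the token is passed on unchanged, and how that case should be handled is left open here.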

What we'd want from a native implementation

  • Configurable per-channel in openclaw.json (channels.<id>.finalizer.{enabled,model,maxChars,maxLines,coalesceMs,stuckBufferMs,escapeHatchPrefix})
  • Provider-agnostic model selection (Anthropic, OpenAI, Google, etc.) — should reuse OpenClaw's existing model-selection / auth-profiles plumbing
  • Customizable system prompt per-operator (some operators may want different banned-phrase lists or tone preferences)
  • First-class agent_end event semantics — currently the plugin had to discover empirically that ctx.channelId on agent_end may not match the buffer's channel after sub-agent involvement; native implementation should handle this correctly out of the box
  • Telemetry built-in — JSONL log of coalesce_intercept, coalesce, stuck_buffer_flush, buffer_key_normalized events for debugging
  • Sub-agent compatibility — finalizer should handle the case where main session spawns sub-agents and the agent_end boundary is fuzzy
  • Silent-token recognition: NO_REPLY is already canonical in OpenClaw; the finalizer should know about it
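
To make the requested config shape concrete, here is a hedged sketch of how the per-channel keys might be loaded. The key names come from the first bullet above. The defaults for coalesceMs, stuckBufferMs, and escapeHatchPrefix mirror the values our plugin uses. The `enabled` default, the None defaults, and the loader itself are assumptions, and none of this is an existing OpenClaw schema.

```python
import json
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class FinalizerConfig:
    enabled: bool = False                # assumed default: opt-in per channel
    model: Optional[str] = None          # model id; format not specified in the issue
    maxChars: Optional[int] = None       # no default given in the issue
    maxLines: Optional[int] = None       # no default given in the issue
    coalesceMs: int = 4000               # 4s debounce window
    stuckBufferMs: int = 75000           # 75s stuck-buffer timeout
    escapeHatchPrefix: str = "[[raw]]"


def load_finalizer_config(config: Dict[str, Any], channel_id: str) -> FinalizerConfig:
    """Read channels.<id>.finalizer from a parsed config, rejecting unknown keys."""
    raw = config.get("channels", {}).get(channel_id, {}).get("finalizer", {})
    known = {f.name for f in fields(FinalizerConfig)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"unknown finalizer keys: {sorted(unknown)}")
    return FinalizerConfig(**raw)


# Illustrative config fragment; "slack" is a placeholder channel id.
EXAMPLE = json.loads("""
{
  "channels": {
    "slack": {
      "finalizer": {"enabled": true, "coalesceMs": 4000, "stuckBufferMs": 75000}
    }
  }
}
""")
```

Rejecting unknown keys is a design choice in this sketch: a typo such as `coalesceMS` then fails loudly instead of silently falling back to the default.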

Why we want it out of plugins

  • Plugin survives openclaw update only because we have a reapply-all.sh script that re-applies our patches after every update — fragile, and we've had instances where bundle hashes changed unexpectedly and broke the plugin until we re-patched
  • Tripwire monitoring for the plugin (a daily cron that verifies the plugin is still loaded + functional) costs us tokens we shouldn't need to spend
  • Wire-level hooks (message_sending) feel like they should be a stable core API rather than a plugin contract that can shift between OpenClaw versions
  • Every operator with a messaging surface needs this — duplicating effort across deployments

Reference implementation

Happy to share the plugin code as a starting point. It's been running in production handling Slack + WhatsApp for one of our deployments since ~2026-04-22, with the model and agent_end-flush fixes shipped today (2026-05-05). The approach is solid; just want it native.

Related issues

  • (none filed — first request for this feature)

Concretely, the smallest useful version

If a fully-configurable native finalizer is too big, the smallest useful step is:

  1. Stable message_sending hook that lets a plugin intercept + rewrite outbound chunks (this exists today via the plugin API)
  2. Stable agent_end event that fires reliably after sub-agent involvement, with flushAllBuffers semantics if ctx.channelId is ambiguous
  3. Configurable per-channel "coalesce window" (debounce + stuck-timeout) handled by core, with a hook that fires on coalesce-flush

That alone would let community plugins implement the LLM-filter layer cleanly without re-discovering all the edge cases.

But honestly: ship the whole thing. It's not that much code (~600 lines of JS + ~300 lines of Python in our impl) and the UX win is enormous.

Extent analysis

TL;DR

Implement a native outbound message finalizer in OpenClaw core to strip first-person reasoning and narration from agent replies, leveraging a configurable LLM filter and coalescing mechanism.

Guidance

  • Integrate the message_sending hook and agent_end event with stable semantics to handle sub-agent involvement and buffer flushing.
  • Develop a configurable per-channel coalescing window with debounce and stuck-timeout handling.
  • Implement a two-stage filter: a cheap regex pre-filter and an LLM finalizer pass with a customizable system prompt.
  • Ensure compatibility with sub-agents, silent-token recognition, and telemetry logging.

Example

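// Simplified sketch of the LLM finalizer pass.
// getModel, getSystemPrompt, and model.complete are hypothetical placeholders, not OpenClaw APIs.
const llmFinalizer = async (draft, { getModel, getSystemPrompt }) => {
  const model = await getModel(); // operator-configured LLM client
  const systemPrompt = getSystemPrompt(); // customizable finalizer system prompt
  // Send the draft as the user message; the system prompt carries the stripping rules.
  const cleaned = await model.complete({ system: systemPrompt, user: draft });
  // Fall back to the original draft if the model returns nothing usable.
  return typeof cleaned === "string" && cleaned.trim() ? cleaned.trim() : draft;
};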

Notes

The implementation should be provider-agnostic, allowing operators to choose their preferred LLM model. The native implementation should also handle edge cases, such as sub-agent involvement and buffer flushing, to ensure a seamless user experience.

Recommendation

Ship the full native implementation: the UX win is significant and the reference plugin is relatively small (~600 lines of JS + ~300 lines of Python). This would provide a first-class outbound message finalizer, configurable per channel, with a customizable LLM filter and coalescing mechanism.
