openclaw: Add outbound guardrails and verified delivery semantics for human chat channels

StepCodex · 2026-05-29T03:37:19Z



Problem

Human-facing chat channels can accidentally receive internal or operational text that was intended only for agent diagnostics, including tool/runtime/status details, debugging context, and implementation notes. This creates a risk that private operational state or low-level failure details leak into conversations where only clean user-facing outcomes should appear.

There are also delivery semantics gaps that can cause agents or background workflows to report work as sent, attached, delivered, or done before there is a verified delivery record. Related failure modes include:

- Conflating direct-message destinations with group-chat destinations.
- Reporting an artifact as sent based only on a helper/tool success response rather than a verified message record.
- Sending or referencing local filesystem paths instead of attaching the actual artifact.
- Accidentally sending stale artifact versions after content or styling changes.
- Losing clear state around whether a message/artifact is drafted, queued, attempted, sent, verified, failed, or superseded.

These issues are especially risky for channels that reach humans directly, where internal diagnostics should stay local and outbound status should be precise.

Proposal

Add platform-level guardrails for outbound human-chat channels:

Outbound sanitizer / policy gate
- Intercept messages before they leave through human-chat connectors.
- Block or require explicit override for internal runtime/tool/status text, stack traces, shell snippets, local-only paths, credentials-like material, raw diagnostics, or other non-human-facing content.
- Provide a clear failure reason to the agent/runtime without forwarding unsafe text to the human channel.
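
A minimal sketch of such a gate, assuming a small regex rule set. The rule names, the patterns, and the `check_outbound` and `GateDecision` names are hypothetical illustrations, not openclaw's actual policy or API. The gate returns a decision with named reasons so the runtime can report why a message was blocked without forwarding the text itself.

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Hypothetical rule set: (reason name, pattern). The patterns are
# illustrative assumptions and far from exhaustive.
_RULES = [
    ("stack_trace", re.compile(
        r"Traceback \(most recent call last\)|^\s+at \S+ \(.+:\d+:\d+\)",
        re.MULTILINE)),
    ("local_path", re.compile(
        r"(?:^|\s)(?:/(?:home|Users|tmp|var)/\S+|[A-Za-z]:\\\S+)")),
    ("credential_like", re.compile(
        r"(?i)\b(?:api[_-]?key|token|secret|password)\b\s*[:=]\s*\S+")),
    ("shell_snippet", re.compile(r"^\s*\$ \S+", re.MULTILINE)),
]


@dataclass(frozen=True)
class GateDecision:
    allowed: bool
    reasons: Tuple[str, ...]  # names of the rules that matched


def check_outbound(text: str, override: bool = False) -> GateDecision:
    """Decide whether one outbound human-chat message may be sent.

    A message that matches any rule is blocked unless the caller passes
    an explicit override; the matched reasons are returned either way.
    """
    reasons = tuple(name for name, pattern in _RULES if pattern.search(text))
    if reasons and not override:
        return GateDecision(allowed=False, reasons=reasons)
    return GateDecision(allowed=True, reasons=reasons)
```

A connector would call `check_outbound` before posting and, on a blocked decision, hand only `reasons` back to the agent. Keeping the reasons on an overridden decision leaves an audit trail for explicit overrides.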
Verified-send ledger
- Record each outbound attempt with destination type, destination identifier, channel, content hash/attachment identity, timestamp, and result.
- Distinguish helper/tool success from verified message presence in the destination conversation.
- Make verified delivery queryable by agents before they claim something was sent, attached, delivered, or done.
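
The ledger described above could look roughly like the following in-memory sketch. The `SendLedger` and `SendRecord` names, the result strings, and the use of SHA-256 for the content hash are assumptions made for illustration. The point it shows is that a helper success only moves a record to `helper_ok`, while `is_verified` answers true only after a separate verification step has recorded the message in the destination conversation.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SendRecord:
    destination_type: str   # e.g. "direct" or "group"
    destination_id: str
    channel: str
    content_hash: str       # SHA-256 of the message or attachment bytes
    timestamp: float
    result: str             # "attempted", "helper_ok", "verified", "failed"
    message_id: Optional[str] = None


class SendLedger:
    """Hypothetical in-memory ledger; a real one would be persisted."""

    def __init__(self) -> None:
        self._records: List[SendRecord] = []

    def record_attempt(self, destination_type: str, destination_id: str,
                       channel: str, content: bytes) -> SendRecord:
        record = SendRecord(
            destination_type=destination_type,
            destination_id=destination_id,
            channel=channel,
            content_hash=hashlib.sha256(content).hexdigest(),
            timestamp=time.time(),
            result="attempted",
        )
        self._records.append(record)
        return record

    def mark_helper_ok(self, record: SendRecord) -> None:
        # The send helper returned success; this is not yet verification.
        record.result = "helper_ok"

    def mark_verified(self, record: SendRecord, message_id: str) -> None:
        # Called only after the message was observed in the destination.
        record.result = "verified"
        record.message_id = message_id

    def is_verified(self, destination_type: str, destination_id: str,
                    channel: str, content: bytes) -> bool:
        digest = hashlib.sha256(content).hexdigest()
        return any(
            r.result == "verified"
            and r.destination_type == destination_type
            and r.destination_id == destination_id
            and r.channel == channel
            and r.content_hash == digest
            for r in self._records
        )
```

An agent would query `is_verified` with the exact destination and content before claiming that something was sent, attached, delivered, or done.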
Artifact delivery states
- Track artifact lifecycle states such as draft, current, superseded, queued, send attempted, sent unverified, verified, and failed.
- Prevent stale artifact versions from being sent once a newer current version exists, unless explicitly overridden.
- Encourage precise reporting such as done / not done / blocker / next action when verification fails.
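
One way to sketch the lifecycle and the stale-version guard is shown below. The state names come from the list above; the `ArtifactRegistry` class, its methods, and `StaleArtifactError` are hypothetical. Staleness is tracked separately from delivery state here, so an overridden send of an old version can still be identified as stale.

```python
from enum import Enum
from typing import Dict, Tuple


class ArtifactState(Enum):
    DRAFT = "draft"
    CURRENT = "current"
    SUPERSEDED = "superseded"
    QUEUED = "queued"
    SEND_ATTEMPTED = "send attempted"
    SENT_UNVERIFIED = "sent unverified"
    VERIFIED = "verified"
    FAILED = "failed"


class StaleArtifactError(Exception):
    """Raised when a superseded version is queued without an override."""


class ArtifactRegistry:
    """Hypothetical registry keyed by (artifact_id, version)."""

    def __init__(self) -> None:
        self._states: Dict[Tuple[str, int], ArtifactState] = {}
        self._current: Dict[str, int] = {}

    def publish(self, artifact_id: str, version: int) -> None:
        # Publishing a new version supersedes the previous current one.
        previous = self._current.get(artifact_id)
        if previous is not None:
            self._states[(artifact_id, previous)] = ArtifactState.SUPERSEDED
        self._states[(artifact_id, version)] = ArtifactState.CURRENT
        self._current[artifact_id] = version

    def state(self, artifact_id: str, version: int) -> ArtifactState:
        return self._states[(artifact_id, version)]

    def is_stale(self, artifact_id: str, version: int) -> bool:
        return self._current.get(artifact_id) != version

    def queue_for_send(self, artifact_id: str, version: int,
                       override: bool = False) -> None:
        if self.is_stale(artifact_id, version) and not override:
            raise StaleArtifactError(
                f"{artifact_id} v{version} is superseded by "
                f"v{self._current.get(artifact_id)}")
        self._states[(artifact_id, version)] = ArtifactState.QUEUED
```

The later transitions (send attempted, sent unverified, verified, failed) are omitted to keep the sketch short; they would follow the same pattern of explicit, named state changes.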
Explicit direct/group destination separation
- Model direct threads and group threads as distinct destination types.
- Require outbound calls and delivery verification to bind to the exact intended destination.
- Prevent successful delivery to one destination from satisfying verification for another.
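
As an illustration of the separation, the destination type can be made part of the destination's identity, so that a direct thread and a group thread never compare equal even when they share an identifier. The `Destination` type and the `verification_matches` helper are assumed names for this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class DestinationType(Enum):
    DIRECT = "direct"
    GROUP = "group"


@dataclass(frozen=True)
class Destination:
    type: DestinationType
    channel: str
    identifier: str


def verification_matches(intended: Destination, delivered: Destination) -> bool:
    """A delivery record satisfies verification only for the exact
    destination it was bound to: same type, channel, and identifier."""
    return intended == delivered


# Usage: the same identifier in a direct thread and a group thread
# is treated as two different destinations.
dm = Destination(DestinationType.DIRECT, "chat", "12345")
group = Destination(DestinationType.GROUP, "chat", "12345")
print(verification_matches(dm, group))  # False
```

Because the dataclass is frozen, a `Destination` is also hashable and can serve as a ledger key.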
Regression tests
- Add tests for sanitizer blocks, verified-send requirements, direct-vs-group separation, attachment-vs-path behavior, and stale artifact prevention.
- Include background/scheduled workflow coverage so automated jobs cannot bypass the same outbound policy gate.
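
A regression test for the background-workflow case might be shaped like the sketch below. Everything in it is a stand-in: `policy_gate` is a deliberately trivial gate, and `FakeConnector`, `deliver`, and `scheduled_job` are hypothetical names. It assumes a design where every path, interactive or scheduled, reaches the connector only through one `deliver` function, which is what the second test pins down.

```python
import unittest


class BlockedOutbound(Exception):
    """Raised by the stand-in gate; carries the reason, not the text."""


def policy_gate(text: str) -> str:
    # Trivial stand-in for a real sanitizer: blocks Python tracebacks only.
    if "Traceback (most recent call last)" in text:
        raise BlockedOutbound("stack_trace")
    return text


class FakeConnector:
    """Records what would have reached the human channel."""

    def __init__(self) -> None:
        self.sent = []

    def post(self, text: str) -> None:
        self.sent.append(text)


def deliver(connector: FakeConnector, text: str) -> None:
    # Single choke point: nothing is posted unless the gate passes it.
    connector.post(policy_gate(text))


def scheduled_job(connector: FakeConnector, text: str) -> None:
    # Background workflows use the same path as interactive sends.
    deliver(connector, text)


class OutboundGateTests(unittest.TestCase):
    def test_clean_message_is_delivered(self) -> None:
        connector = FakeConnector()
        deliver(connector, "Your report is ready.")
        self.assertEqual(connector.sent, ["Your report is ready."])

    def test_scheduled_job_cannot_bypass_gate(self) -> None:
        connector = FakeConnector()
        with self.assertRaises(BlockedOutbound):
            scheduled_job(connector, "Traceback (most recent call last): ...")
        self.assertEqual(connector.sent, [])
```

Saved as a module, these tests can be run with `python -m unittest`.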

Acceptance criteria

- Human-chat outbound messages pass through a centralized sanitizer/policy gate.
- Agents cannot truthfully report sent/delivered/attached/done for a human-chat artifact without a verified delivery record.
- Direct and group destinations are represented and verified separately.
- Local-only artifact paths are not treated as delivered attachments.
- Superseded artifact versions are blocked or clearly marked before send.
- Regression tests cover the main failure modes above.
