openclaw - ✅(Solved) Fix [Feature]: Automatic Output Sanitization for Sensitive Data [4 pull requests, 1 participant]

937089773 · 2026-03-12T07:58:51Z

openclaw/openclaw#43830•Fetched 2026-04-08 00:18:20

Author: 937089773
Participants: 937089773
Timeline (top): cross-referenced ×4, labeled ×1

Add automatic output sanitization to prevent accidental leakage of sensitive information (API keys, tokens, passwords, PII) in agent responses, regardless of whether the requester is a tester or administrator.

Fix Action

Fixed

Fixed by PR: feat(plugins): add before_agent_reply hook (claiming pattern) (https://github.com/openclaw/openclaw/pull/20067)
Fixed by PR: feat(privacy): add privacy detection and replacement filter for LLM traffic (https://github.com/openclaw/openclaw/pull/30329)
Fixed by PR: fix(privacy): harden stream filter and address review feedback (https://github.com/openclaw/openclaw/pull/45619)
Fixed by PR: feat(privacy): add privacy detection and replacement filter for LLM traffic (https://github.com/openclaw/openclaw/pull/45783)

PR fix notes

PR #20067: feat(plugins): add before_agent_reply hook (claiming pattern)

Repository: openclaw/openclaw
Author: JoshuaLelon
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/20067

Description (problem / solution / changelog)

Summary

Adds a before_agent_reply plugin hook that fires after slash commands but before the LLM agent runs
Plugins can return { handled: true, reply } to short-circuit agent processing (forms, wizards, approval gates, etc. as plugins without touching core)
Uses the runClaimingHook pattern (sequential by priority, first { handled: true } wins) — same pattern as inbound_claim
Populates full PluginHookAgentContext including trigger, channelId, messageProvider

Motivation

Per VISION.md, core stays lean and optional capability should ship as plugins. Right now there's no way for a plugin to intercept an inbound message and return a synthetic reply before the LLM runs — anything that needs pre-LLM interception has to modify core. This hook fills that gap.

Closes #8807.

Design

Hook name: before_agent_reply

When it fires: After handleInlineActions returns kind: "continue", before stageSandboxMedia / runPreparedReply. This means /help and other slash commands still work normally, even during a plugin dialog.

Event type:

{ cleanedBody: string }  // final user message heading to LLM

Result type (claiming pattern):

{
  handled: boolean;      // true = claim this message, short-circuit the LLM
  reply?: ReplyPayload;  // synthetic reply (omit to silently swallow)
  reason?: string;       // for logging/debugging
}

Context: Full PluginHookAgentContext (agentId, sessionKey, sessionId, workspaceDir, messageProvider, trigger, channelId).

Execution: runClaimingHook — async, sequential by priority (highest first). First handler to return { handled: true } wins; remaining handlers are not called. When handled: true without reply, the message is swallowed via SILENT_REPLY_TOKEN.
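
The execution rule above (sequential, highest priority first, first `{ handled: true }` wins) can be sketched as follows. This is a minimal illustration and not the PR's `runClaimingHook`: the `Handler` shape, the `runClaimingSketch` name, and the choice to treat a throwing handler as a decline are all assumptions.

```typescript
// Hypothetical types for illustration; the real ones live in src/plugins/types.ts.
type ReplyPayload = { text: string };
type BeforeAgentReplyEvent = { cleanedBody: string };
type BeforeAgentReplyResult = { handled: boolean; reply?: ReplyPayload; reason?: string };

type Handler = {
  pluginId: string;
  priority: number;
  run: (event: BeforeAgentReplyEvent) => Promise<BeforeAgentReplyResult | undefined>;
};

// Runs handlers one at a time, highest priority first, and stops at the first claim.
async function runClaimingSketch(
  handlers: Handler[],
  event: BeforeAgentReplyEvent,
): Promise<BeforeAgentReplyResult | undefined> {
  const ordered = [...handlers].sort((a, b) => b.priority - a.priority);
  for (const handler of ordered) {
    try {
      const result = await handler.run(event);
      if (result && result.handled) return result; // first claim wins, the rest are skipped
    } catch (err) {
      // Assumption: a failing handler counts as a decline instead of aborting the chain.
      console.warn(`handler ${handler.pluginId} failed`, err);
    }
  }
  return undefined; // nobody claimed the message, so it continues to the LLM
}
```

A caller would treat `undefined` as "continue to the agent" and a claimed result without `reply` as a silent swallow.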

Changes

| File | Lines | What |
|---|---|---|
| `src/plugins/types.ts` | +21 | Hook name, event/result types (with `handled: boolean`), handler map entry |
| `src/plugins/hooks.ts` | +21 | `runBeforeAgentReply` using `runClaimingHook`, imports, re-exports |
| `src/auto-reply/reply/get-reply.ts` | +26 | Hook call site after inline actions, before LLM |
| `src/plugins/hooks.before-agent-reply.test.ts` | +123 | 8 tests: single claim, no hooks, first-claim-wins, swallow, decline-then-claim, all decline, error handling, hasHooks |

Test plan

pnpm test -- src/plugins/hooks.before-agent-reply.test.ts — all 8 tests pass
pnpm test -- src/plugins/hooks — all 27 existing hook tests unaffected
pnpm test -- src/auto-reply/reply/get-reply — existing get-reply tests pass
pnpm oxlint / pnpm format — clean
pnpm tsgo — no new type errors (pre-existing upstream errors only)
git diff upstream/main --stat — exactly 4 files, no unrelated changes, no deletions of inbound_claim code

Changed files

src/auto-reply/reply/get-reply.ts (modified, +26/-0)
src/plugins/hooks.before-agent-reply.test.ts (added, +123/-0)
src/plugins/hooks.ts (modified, +21/-0)
src/plugins/types.ts (modified, +21/-0)

PR #30329: feat(privacy): add privacy detection and replacement filter for LLM traffic

Repository: openclaw/openclaw
Author: bestcarly
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/30329

Description (problem / solution / changelog)

Summary

Add a complete privacy filter pipeline that detects 50+ types of sensitive information (emails, phone numbers, API keys, credentials, PII, etc.) in text before sending to LLM, replaces them with format-preserving fake values, and restores originals in LLM responses
Support user-defined custom detection rules via JSON5 config files, enabling domain-specific patterns (e.g. employee IDs, regional phone formats), rule overrides, and selective disabling
Integrate privacy filtering into the LLM runner (prompt filtering + response restoration) and log redaction pipeline

Motivation

When users interact with an LLM through OpenClaw, their prompts and context may contain sensitive information such as API keys, passwords, phone numbers, ID numbers, and other PII. This data is sent to external LLM providers, creating privacy risks. This module provides transparent, automatic privacy protection by:

Detecting sensitive content using regex patterns and keyword matching with contextual validation
Replacing detected content with format-preserving fake values so the LLM can still understand semantic context
Restoring originals in LLM responses so the user sees correct information
Persisting mappings with AES-256-GCM encryption for session continuity

Key Design Decisions

Format-preserving replacement: Fake values maintain the same format (e.g. email → email, phone → phone) so LLM responses remain coherent
Session-scoped idempotency: Same original text always maps to the same replacement within a session
Custom rules via JSON5: Users can extend/override/disable built-in rules without modifying source code
ReDoS prevention: User-provided regex patterns are validated for nested quantifiers and length limits
Named validator registry: Solves the problem of JSON not supporting function serialization for complex validations
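
Format-preserving replacement and session-scoped idempotency can be illustrated with a small round trip. The sketch below handles e-mail addresses only; the class name, the `userN@example.com` fake format, and the in-memory maps are assumptions for illustration and do not reflect the PR's `replacer.ts`.

```typescript
// Hypothetical sketch of a session-scoped mapping, limited to e-mail addresses.
class SessionReplacerSketch {
  private readonly forward = new Map<string, string>(); // original -> fake
  private readonly reverse = new Map<string, string>(); // fake -> original
  private counter = 0;

  // Returns the same fake address for the same original within this session.
  replaceEmail(original: string): string {
    const known = this.forward.get(original);
    if (known !== undefined) return known;
    this.counter += 1;
    const fake = `user${this.counter}@example.com`; // still shaped like an e-mail
    this.forward.set(original, fake);
    this.reverse.set(fake, original);
    return fake;
  }

  // Scrubs every e-mail address in outbound text.
  filter(text: string): string {
    const email = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
    return text.replace(email, (match) => this.replaceEmail(match));
  }

  // Puts originals back wherever a known fake value appears in model output.
  restore(text: string): string {
    let out = text;
    this.reverse.forEach((original, fake) => {
      out = out.split(fake).join(original);
    });
    return out;
  }
}
```

Because the mapping lives on the instance, filtering the same address twice yields the same placeholder, and `restore(filter(text))` returns the original text as long as the text did not already contain a generated fake value.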

Test plan

167 tests across 8 test files all passing
Core detection: all 64 enabled rule types have coverage (rules.test.ts)
Replacement round-trip: filter → restore produces original text (stream-wrapper.test.ts)
Custom rules: validation, merge, disable, template expansion, regex safety (custom-rules.test.ts)
Encrypted persistence: save/load/TTL expiry (mapping-store.test.ts)
Config schema: Zod validation for privacy config (privacy-config.test.ts)
No regressions in existing test suite

This contribution was developed with AI assistance (Claude, Codex).

Changed files

src/agents/pi-embedded-runner/run/attempt.ts (modified, +17/-0)
src/config/privacy-config.test.ts (added, +37/-0)
src/config/types.base.ts (modified, +18/-0)
src/config/types.openclaw.ts (modified, +8/-1)
src/config/zod-schema.ts (modified, +27/-0)
src/logging/redact.test.ts (modified, +11/-1)
src/logging/redact.ts (modified, +85/-9)
src/privacy/README.en.md (added, +508/-0)
src/privacy/README.md (added, +506/-0)
src/privacy/custom-rules.test.ts (added, +472/-0)
src/privacy/custom-rules.ts (added, +295/-0)
src/privacy/detector.test.ts (added, +212/-0)
src/privacy/detector.ts (added, +353/-0)
src/privacy/index.ts (added, +41/-0)
src/privacy/mapping-store.test.ts (added, +149/-0)
src/privacy/mapping-store.ts (added, +285/-0)
src/privacy/replacer.test.ts (added, +140/-0)
src/privacy/replacer.ts (added, +314/-0)
src/privacy/rules.test.ts (added, +167/-0)
src/privacy/rules.ts (added, +518/-0)
src/privacy/stream-wrapper.test.ts (added, +140/-0)
src/privacy/stream-wrapper.ts (added, +359/-0)
src/privacy/types.ts (added, +183/-0)

PR #45619: fix(privacy): harden stream filter and address review feedback

Repository: openclaw/openclaw
Author: bestcarly
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/45619

Description (problem / solution / changelog)

Summary

Problem: the initial privacy filter implementation had gaps that left secrets exposed in certain message paths and could produce unrestorable placeholder cascades.
Why it matters: unfiltered toolResult messages, systemPrompt, and double-filtered prompts can leak sensitive data to external LLM providers or show garbled placeholders to users.
What changed: removed redundant filterPrompt call to prevent cascaded mappings; added filtering for toolResult-role messages and systemPrompt; hardened mapping-store writes, log redaction config, custom rule validation, and TTL cleanup.
What did NOT change (scope boundary): no new features, config keys, network calls, or auth changes; only hardens existing privacy filter paths.

Change Type (select all)

Bug fix
Security hardening

Scope (select all touched areas)

Gateway / orchestration
Memory / storage

Linked Issue/PR

Related #45619 (addresses bot review feedback)

User-visible / Behavior Changes

toolResult messages are now privacy-filtered before being sent to LLM providers.
systemPrompt is now privacy-filtered before provider dispatch.
Prompt text is no longer double-filtered, preventing placeholder leakage.
Log redaction now respects privacy.enabled and privacy.rules config.
Expired mapping entries are cleaned up on session startup per configured TTL.
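
The TTL cleanup in the last item can be pictured as a filter over stored entries. This is a sketch under assumptions: the `MappingEntry` shape and the `pruneExpired` name are hypothetical, and the PR's mapping store may track expiry differently.

```typescript
// Hypothetical entry shape; createdAt is a Unix timestamp in milliseconds.
type MappingEntry = { original: string; replacement: string; createdAt: number };

// Keeps only entries younger than ttlMs. `now` is injectable so tests are deterministic.
function pruneExpired(entries: MappingEntry[], ttlMs: number, now: number = Date.now()): MappingEntry[] {
  return entries.filter((entry) => now - entry.createdAt < ttlMs);
}
```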

Security Impact (required)

New permissions/capabilities? No
Secrets/tokens handling changed? Yes
New/changed network calls? No
Command/tool execution surface changed? No
Data access scope changed? No
If any Yes, explain risk + mitigation:

Secrets handling improved: more message types are now filtered (toolResult, systemPrompt) and double-filtering is eliminated. Risk is low — these are strictly additive hardening changes with no new attack surface.

Repro + Verification

Environment

OS: macOS
Runtime/container: Node 22 + Vitest
Model/provider: N/A (unit tests)

Steps

Enable privacy filtering in config.
Send a message containing sensitive data (e.g. email, API key) that triggers a tool call.
Observe that toolResult content and systemPrompt are filtered before reaching the LLM provider.

Expected

All message roles (user, assistant, toolResult) are filtered.
systemPrompt is filtered.
No cascaded/double placeholders appear in user-visible output.
Expired mappings are cleaned up on startup.

Actual

Matched expected behavior across 153 unit tests.

Evidence

Failing test/log before + passing after

 Test Files  6 passed (6)
      Tests  153 passed (153)
   Duration  969ms

Human Verification (required)

Verified scenarios: ran full privacy test suite locally; confirmed filterMessages handles all three message roles; confirmed filterPrompt removal eliminates cascaded mappings; confirmed TTL cleanup runs on context init.
Edge cases checked: malformed custom rule keywords (non-string), non-text content blocks in toolResult, empty/missing systemPrompt.
What you did not verify: full end-to-end with live LLM provider across all channels.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

Backward compatible? Yes
Config/env changes? No
Migration needed? No

Failure Recovery (if this breaks)

How to disable/revert this change quickly: set privacy.enabled to false.
Files/config to restore: revert this single commit.
Known bad symptoms reviewers should watch for: placeholder leakage in model output, missing tool results in conversations.

Risks and Mitigations

Risk: filtering toolResult messages could over-redact legitimate tool output that resembles sensitive patterns.
- Mitigation: same detection rules and false-positive suppression as user/assistant messages; no new rule types added.

Changed files

src/agents/pi-embedded-runner/run/attempt.ts (modified, +24/-0)
src/config/privacy-config.test.ts (added, +37/-0)
src/config/types.base.ts (modified, +26/-0)
src/config/types.openclaw.ts (modified, +8/-1)
src/config/zod-schema.ts (modified, +27/-0)
src/logging/redact.test.ts (modified, +11/-1)
src/logging/redact.ts (modified, +85/-9)
src/privacy/README.en.md (added, +508/-0)
src/privacy/README.md (added, +506/-0)
src/privacy/custom-rules.test.ts (added, +472/-0)
src/privacy/custom-rules.ts (added, +295/-0)
src/privacy/detector.test.ts (added, +212/-0)
src/privacy/detector.ts (added, +353/-0)
src/privacy/index.ts (added, +41/-0)
src/privacy/mapping-store.test.ts (added, +149/-0)
src/privacy/mapping-store.ts (added, +285/-0)
src/privacy/replacer.test.ts (added, +140/-0)
src/privacy/replacer.ts (added, +314/-0)
src/privacy/rules.test.ts (added, +167/-0)
src/privacy/rules.ts (added, +518/-0)
src/privacy/stream-wrapper.test.ts (added, +140/-0)
src/privacy/stream-wrapper.ts (added, +359/-0)
src/privacy/types.ts (added, +183/-0)

PR #45783: feat(privacy): add privacy detection and replacement filter for LLM traffic

Repository: openclaw/openclaw
Author: bestcarly
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/45783

Description (problem / solution / changelog)

Summary

Problem: prompts and tool context sent to external LLM providers can contain secrets, credentials, phone numbers, and other PII — OpenClaw has log redaction but nothing that scrubs the actual content before it leaves the gateway.
Why it matters: users relying on third-party or proxy-hosted LLM services have no automatic protection against sensitive data exposure in outbound API calls.
What changed: adds a complete privacy filter pipeline — detection (50+ rule types), format-preserving replacement, encrypted mapping persistence, bidirectional stream filtering (outbound scrub + inbound restore), custom user rules via JSON5, and log redaction integration.
What did NOT change (scope boundary): no auth, channel, provider selection, or network destination changes; only transforms eligible text at the LLM boundary and log-redaction path.

Change Type (select all)

Feature
Security hardening

Scope (select all touched areas)

Gateway / orchestration
Memory / storage

Linked Issue/PR

Related #37815
Related #43830
Related #44195
Supersedes #30329
Supersedes #45619

User-visible / Behavior Changes

New privacy config section: privacy.enabled (default true), privacy.rules ("basic" | "extended" | custom path), privacy.encryption, privacy.mappings (TTL, store path), privacy.log.
Sensitive content (emails, phone numbers, API keys, tokens, Chinese PII, etc.) is automatically replaced with format-preserving placeholders before outbound LLM requests.
Original values are restored in returned model text where mappings exist.
All message roles (user, assistant, toolResult) and systemPrompt are filtered.
Log redaction now uses the same privacy detector for broader coverage, respecting privacy.enabled and privacy.rules.
Users can define custom detection rules via JSON5 config files.

Security Impact (required)

New permissions/capabilities? No
Secrets/tokens handling changed? Yes
New/changed network calls? No
Command/tool execution surface changed? No
Data access scope changed? No
If any Yes, explain risk + mitigation:

This PR reads outbound/inbound LLM text and stores replacement mappings locally. Risk is limited by: AES-256-GCM encrypted-at-rest mapping storage with atomic write-then-rename, file locking for concurrent access, opt-out via privacy.enabled: false, ReDoS safety validation for custom regex rules, and no expansion of command execution or network access.
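
For readers unfamiliar with AES-256-GCM, the following sketch shows authenticated encryption of a single mapping value with Node's built-in `crypto` module. The `seal`/`unseal` names and the base64 record layout are assumptions, and key management, file locking, and the atomic write-then-rename step are out of scope here.

```typescript
import { Buffer } from "node:buffer";
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical on-disk record: every field is base64-encoded.
type Sealed = { iv: string; tag: string; data: string };

// Encrypts plaintext with a 32-byte key and a fresh random nonce.
function seal(plaintext: string, key: Buffer): Sealed {
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"), // must be read after final()
    data: data.toString("base64"),
  };
}

// Decrypts a record; final() throws if the data or tag was tampered with.
function unseal(sealed: Sealed, key: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(sealed.iv, "base64"));
  decipher.setAuthTag(Buffer.from(sealed.tag, "base64"));
  const plain = Buffer.concat([decipher.update(Buffer.from(sealed.data, "base64")), decipher.final()]);
  return plain.toString("utf8");
}
```

The authentication tag is what distinguishes GCM from plain encryption: a modified record fails to decrypt instead of silently yielding garbage.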

Repro + Verification

Environment

OS: macOS (Darwin 25.3.0)
Runtime/container: Node 22 + Vitest 4.0.18
Model/provider: N/A (unit tests)

Steps

Enable privacy filtering (default on).
Send text containing sensitive values through the LLM filter path.
Observe outbound text has placeholders, inbound text is restored.

Expected

Sensitive values replaced before LLM egress across all message roles.
Returned model text restores original values where mappings exist.
Log redaction covers the same sensitive patterns.
Custom rules are validated (including ReDoS safety) and merged safely.
Mapping persistence is crash-safe (atomic writes) and encrypted.

Actual

Matched expected behavior across 168 unit tests covering all components.

Evidence

Failing test/log before + passing after

Full test suite output (8 test files, 168 tests, 0 failures):

 ✓ src/privacy/mapping-store.test.ts    (8 tests)   426ms
 ✓ src/privacy/stream-wrapper.test.ts   (11 tests)  478ms
 ✓ src/privacy/rules.test.ts            (65 tests)  12ms
 ✓ src/privacy/detector.test.ts         (26 tests)  11ms
 ✓ src/privacy/custom-rules.test.ts     (32 tests)  8ms
 ✓ src/logging/redact.test.ts           (13 tests)  9ms
 ✓ src/privacy/replacer.test.ts         (11 tests)  4ms
 ✓ src/config/privacy-config.test.ts    (2 tests)   3ms

 Test Files  8 passed (8)
      Tests  168 passed (168)
   Duration  1.30s

Test coverage includes:

detector.test.ts (26 tests): regex + keyword matching, contextual validation, false-positive reduction, custom rule loading
replacer.test.ts (11 tests): format-preserving replacement, session-scoped idempotency, restore round-trip
mapping-store.test.ts (8 tests): AES-256-GCM encrypted persistence, atomic writes, file locking, TTL cleanup, concurrent access
stream-wrapper.test.ts (11 tests): outbound message filtering (user/assistant/toolResult/systemPrompt), inbound stream restoration (text + tool-call arguments), privacy config gating
rules.test.ts (65 tests): all 50+ built-in detection rules across email, phone, API key, token, PII, Chinese ID patterns
custom-rules.test.ts (32 tests): JSON5 rule parsing, validation, ReDoS safety rejection, rule merging, preset extension
redact.test.ts (13 tests): log redaction integration, privacy-enabled gating, configurable rulesets
privacy-config.test.ts (2 tests): Zod schema validation for privacy config section

Human Verification (required)

Verified scenarios: ran full test suite locally; confirmed detection, replacement/restoration round-trip, encrypted persistence, custom rule loading/validation, config schema parsing, log redaction integration, and all message role filtering.
Edge cases checked: heuristic false-positive suppression (bare_password, high_entropy_string skipped in stream filter), malformed custom rule keywords (non-string rejection), mapping TTL expiry and cleanup, non-fatal persistence-failure handling, empty/missing systemPrompt, non-text content blocks in toolResult, concurrent mapping store access with file locking.
What you did not verify: full end-to-end validation with live LLM providers across all channel combinations; performance profiling under high-throughput production traffic.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

Backward compatible? Yes
Config/env changes? Yes (new optional privacy config section; defaults to enabled)
Migration needed? No

Failure Recovery (if this breaks)

How to disable/revert this change quickly: set privacy.enabled to false in config.
Files/config to restore: revert this single commit.
Known bad symptoms reviewers should watch for: placeholder leakage in model output (pf_* tokens visible to users), over-redaction of legitimate content, mapping persistence failures in logs.

Risks and Mitigations

Risk: false positives could replace non-sensitive strings and degrade model context.
- Mitigation: contextual validation, heuristic-type suppression in stream filter, custom disable/override support.
Risk: custom regex rules could be catastrophically slow (ReDoS).
- Mitigation: compileSafeRegex validation rejects ambiguous alternation under repetition, repeated .* groups, and other unsafe patterns before loading.
Risk: local mapping persistence could expose originals if stored insecurely.
- Mitigation: AES-256-GCM encryption, owner-only file permissions, atomic write-then-rename, session-scoped TTL cleanup.
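
The ReDoS mitigation can be approximated with a few static checks on the pattern source before compiling it. The sketch below is a deliberately conservative heuristic and not the PR's `compileSafeRegex`; the function name, the 200-character limit, and the two detection expressions are assumptions, and it will reject some harmless patterns (for example ones with escaped quantifier characters inside a group).

```typescript
const MAX_PATTERN_LENGTH = 200; // assumed limit, not taken from the PR

// Returns a compiled RegExp, or null when the source looks unsafe or is invalid.
function compileSafeRegexSketch(source: string, flags: string = "g"): RegExp | null {
  if (source.length > MAX_PATTERN_LENGTH) return null;
  // A quantified group whose body already contains a quantifier, e.g. (a+)+ or (\d*)*
  const nestedQuantifier = /\([^()]*[+*][^()]*\)[+*{]/;
  // Two or more unbounded wildcards in one pattern, e.g. .*foo.*
  const repeatedWildcard = /\.[*+].*\.[*+]/;
  if (nestedQuantifier.test(source) || repeatedWildcard.test(source)) return null;
  try {
    return new RegExp(source, flags);
  } catch {
    return null; // not a valid regular expression
  }
}
```

Rejecting at load time keeps a bad custom rule from ever running against live traffic; a stricter implementation would parse the pattern instead of pattern-matching its source.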

Changed files

src/agents/btw.test.ts (modified, +55/-0)
src/agents/btw.ts (modified, +19/-1)
src/agents/pi-embedded-runner/run/attempt.ts (modified, +119/-25)
src/config/privacy-config.test.ts (added, +37/-0)
src/config/types.base.ts (modified, +16/-0)
src/config/types.openclaw.ts (modified, +8/-1)
src/config/zod-schema.ts (modified, +27/-0)
src/logging/redact.test.ts (modified, +233/-2)
src/logging/redact.ts (modified, +184/-9)
src/privacy/README.en.md (added, +508/-0)
src/privacy/README.md (added, +506/-0)
src/privacy/custom-rules.test.ts (added, +685/-0)
src/privacy/custom-rules.ts (added, +441/-0)
src/privacy/detector.test.ts (added, +256/-0)
src/privacy/detector.ts (added, +365/-0)
src/privacy/index.ts (added, +41/-0)
src/privacy/mapping-store.test.ts (added, +198/-0)
src/privacy/mapping-store.ts (added, +293/-0)
src/privacy/replacer.test.ts (added, +185/-0)
src/privacy/replacer.ts (added, +370/-0)
src/privacy/rules.test.ts (added, +185/-0)
src/privacy/rules.ts (added, +521/-0)
src/privacy/stream-wrapper.test.ts (added, +663/-0)
src/privacy/stream-wrapper.ts (added, +759/-0)
src/privacy/types.ts (added, +183/-0)
src/tts/tts-core.ts (modified, +18/-4)
src/tts/tts.test.ts (modified, +36/-0)

Code Example


Agent Response → Sanitization Middleware → Channel Delivery

---

{
  agents: {
    defaults: {
      postProcess: [
        {
          type: "output-sanitizer",
          config: { mode: "auto" }
        }
      ]
    }
  }
}

---

const patterns = {
  awsAccessKey: /\bAKIA[0-9A-Z]{16}\b/g,
  openAIKey: /\bsk-[a-zA-Z0-9]{48}\b/g,
  anthropicKey: /\bsk-ant-[a-zA-Z0-9-]{80,}\b/g,
  githubToken: /\bghp_[a-zA-Z0-9]{36}\b/g,
  // Inside a character class "|" is a literal pipe, so the TLD class is [A-Za-z].
  email: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  creditCard: /\b\d{4}-\d{4}-\d{4}-\d{4}\b/g,
  password: /password\s*[:=]\s*['"]([^'"]+)['"]/gi,
  privateKey: /-----BEGIN.*PRIVATE KEY-----/g,
  jwt: /eyJ[a-zA-Z0-9_-]+\.eyJ[a-zA-Z0-9_-]+/g,
  // Connection URLs for common database schemes.
  databaseUrl: /\b(postgres|mysql|mongodb|redis):\/\/[^\s]+/g,
};
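
A pattern table like the one above could be applied to an agent response roughly as follows. This is a sketch only: the `sanitizeOutput` name, the `[REDACTED:name]` marker, and the three-pattern subset are assumptions, and a real middleware would also need the exception and mode handling described in the proposal.

```typescript
// A small, assumed subset of the proposed patterns.
const redactionPatterns: Array<{ name: string; regex: RegExp }> = [
  { name: "awsAccessKey", regex: /\bAKIA[0-9A-Z]{16}\b/g },
  { name: "githubToken", regex: /\bghp_[a-zA-Z0-9]{36}\b/g },
  { name: "ssn", regex: /\b\d{3}-\d{2}-\d{4}\b/g },
];

// Replaces every match with a typed marker and counts hits per pattern name.
function sanitizeOutput(text: string): { text: string; counts: Record<string, number> } {
  const counts: Record<string, number> = {};
  let out = text;
  for (const { name, regex } of redactionPatterns) {
    out = out.replace(regex, () => {
      counts[name] = (counts[name] || 0) + 1;
      return `[REDACTED:${name}]`;
    });
  }
  return { text: out, counts };
}
```

`String.prototype.replace` with a global regex always scans from the start of the string, so sharing the regex objects across calls is safe here; calling `.test()` or `.exec()` on the same objects would not be, because those methods keep `lastIndex` state.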

---

{
  "event": "output_sanitized",
  "timestamp": "2026-03-12T07:35:00Z",
  "sessionKey": "agent:main:webchat",
  "redactions": [
    {
      "type": "api_key",
      "pattern": "openai_key",
      "line": 15,
      "action": "replaced_with_redacted"
    },
    {
      "type": "email",
      "pattern": "email",
      "line": 28,
      "action": "masked_local_part"
    }
  ],
  "totalRedactions": 2
}
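
An event of this shape could be assembled without ever copying the matched value into the log, which is the point of `includeOriginal: false`. The sketch below assumes hypothetical `RedactionRule` and `buildSanitizedEvent` names, and it emits timestamps with milliseconds (the `toISOString()` format) instead of the second-precision form shown above.

```typescript
type RedactionRule = { type: string; pattern: string; regex: RegExp; action: string };
type Redaction = { type: string; pattern: string; line: number; action: string };
type SanitizedEvent = {
  event: "output_sanitized";
  timestamp: string;
  sessionKey: string;
  redactions: Redaction[];
  totalRedactions: number;
};

// Records which rule fired on which 1-based line; matched text is never stored.
function buildSanitizedEvent(
  text: string,
  sessionKey: string,
  rules: RedactionRule[],
  now: Date = new Date(),
): SanitizedEvent {
  const redactions: Redaction[] = [];
  text.split("\n").forEach((lineText, index) => {
    for (const rule of rules) {
      // Count matches on a global copy so the caller's regex state is untouched.
      const flags = rule.regex.flags.indexOf("g") >= 0 ? rule.regex.flags : rule.regex.flags + "g";
      const hits = lineText.match(new RegExp(rule.regex.source, flags)) ?? [];
      for (let i = 0; i < hits.length; i++) {
        redactions.push({ type: rule.type, pattern: rule.pattern, line: index + 1, action: rule.action });
      }
    }
  });
  return {
    event: "output_sanitized",
    timestamp: now.toISOString(),
    sessionKey,
    redactions,
    totalRedactions: redactions.length,
  };
}
```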

---

{
  agents: {
    defaults: {
      outputSanitization: {
        enabled: true,
        mode: "auto"
      }
    },
    list: [
      {
        id: "main",
        outputSanitization: {
          enabled: true,
          mode: "strict"  // Override for main agent
        }
      },
      {
        id: "public",
        outputSanitization: {
          enabled: true,
          mode: "strict",
          scope: "all"  // Maximum security for public agent
        }
      }
    ]
  }
}
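
The per-agent override above implies a resolution order: built-in defaults, then `agents.defaults`, then the matching `agents.list` entry. A minimal sketch of that merge follows; the type names, the `resolveSanitization` function, and the built-in fallback values are assumptions, since the proposal does not specify them.

```typescript
type SanitizationMode = "auto" | "strict" | "off";
type OutputSanitization = { enabled: boolean; mode: SanitizationMode; scope?: "all" | "tools" | "responses" };
type AgentsConfig = {
  defaults: { outputSanitization?: Partial<OutputSanitization> };
  list: Array<{ id: string; outputSanitization?: Partial<OutputSanitization> }>;
};

// Assumed fallback when nothing is configured (mirrors the current manual-only behavior).
const BUILT_IN: OutputSanitization = { enabled: false, mode: "off", scope: "all" };

// Later sources win: built-in values, then agents.defaults, then the agent's own entry.
function resolveSanitization(config: AgentsConfig, agentId: string): OutputSanitization {
  const agent = config.list.find((entry) => entry.id === agentId);
  return {
    ...BUILT_IN,
    ...(config.defaults.outputSanitization ?? {}),
    ...(agent?.outputSanitization ?? {}),
  };
}
```

With the configuration shown above, `main` and `public` resolve to `strict`, while any agent not in the list inherits `auto` from `defaults`.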

Summary

Problem to solve

Currently, OpenClaw has an output-sanitizer skill that provides guidance on redacting sensitive information, but it is not automatically applied to agent outputs. This means:

Security Risk: When agents read configuration files (like openclaw.json) or other sensitive files, they may return complete content including API keys, tokens, and passwords
Testing Issues: Test personnel requesting full file information receive unredacted sensitive data
No Automatic Protection: There is no configuration option to enable automatic output filtering
Manual Dependency: Users must manually remember to apply sanitization rules

Proposed solution

1. Configuration Option

Add a new configuration section to enable automatic output sanitization:

{
  outputSanitization: {
    enabled: true,              // Master switch
    mode: "auto",               // auto | strict | off
    scope: "all",               // all | tools | responses
    redactPatterns: [
      // Built-in patterns (can be overridden)
      "\\bAKIA[0-9A-Z]{16}\\b",                    // AWS Access Key
      "\\bsk-[a-zA-Z0-9]{48}\\b",                  // OpenAI API Key
      "\\bsk-ant-[a-zA-Z0-9-]{80,}\\b",            // Anthropic Key
      "\\bghp_[a-zA-Z0-9]{36}\\b",                 // GitHub Token
      "\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}\\b",  // Email
      "\\b\\d{3}-\\d{2}-\\d{4}\\b",                // SSN
      "\\b\\d{4}-\\d{4}-\\d{4}-\\d{4}\\b",         // Credit Card
      "password\\s*[:=]\\s*['\"]([^'\"]+)['\"]",    // Passwords
      "-----BEGIN.*PRIVATE KEY-----",              // Private Keys
      "eyJ[a-zA-Z0-9_-]+\\.eyJ[a-zA-Z0-9_-]+",     // JWT Tokens
      "(postgres|mysql|mongodb|redis)://[^\\s]+",  // Database URLs
    ],
    exceptions: {
      // Allow specific patterns in certain contexts
      allowInCodeBlocks: false,    // true = skip redaction inside code blocks
      allowInQuotes: false,        // true = skip redaction inside quoted strings
      allowList: [],               // Specific patterns to never redact
    },
    reporting: {
      enabled: true,              // Log redaction events
      level: "warn",              // log level for redactions
      includeOriginal: false,     // Never log original sensitive values
    }
  }
}
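
The `redactPatterns` entries are plain strings, so they have to be compiled before use. A minimal sketch of that step, assuming the `g` flag is applied to every pattern and that an invalid entry should be reported rather than crash the loader. `compileRedactPatterns` and `CompiledPatterns` are hypothetical names, not existing OpenClaw APIs:

```typescript
// Hypothetical helper: turns the string patterns from the config into RegExp objects.
interface CompiledPatterns {
  patterns: RegExp[];
  invalid: string[]; // pattern sources that failed to compile
}

function compileRedactPatterns(sources: string[]): CompiledPatterns {
  const patterns: RegExp[] = [];
  const invalid: string[] = [];
  for (const source of sources) {
    try {
      // Assumption: every pattern is applied globally ("g" flag).
      patterns.push(new RegExp(source, "g"));
    } catch {
      // A malformed user-supplied pattern is collected instead of thrown.
      invalid.push(source);
    }
  }
  return { patterns, invalid };
}
```

Collecting the invalid sources lets the caller log a configuration warning while the valid patterns still apply.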

2. Modes of Operation

`auto` (Recommended)

Apply sanitization to all outputs
Use smart detection (context-aware)
Preserve code structure when possible
Mask sensitive values while maintaining readability

`strict`

Aggressive redaction
Any potential sensitive pattern is redacted
Higher false positive rate, maximum security

`off`

Disable automatic sanitization
Manual application only (current behavior)

3. Scope Options

`all`

Sanitize all agent outputs (responses, tool results, file reads)

`tools`

Only sanitize tool outputs (file reads, exec results, etc.)

`responses`

Only sanitize final agent responses
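
The mode and scope options combine into one yes/no decision per output. A minimal sketch of that gate, assuming every output is tagged as either a tool result or a final response. The `OutputKind` labels and the `shouldSanitize` name are illustrative only:

```typescript
type Mode = "auto" | "strict" | "off";
type Scope = "all" | "tools" | "responses";
type OutputKind = "tool" | "response"; // hypothetical tags for where the text came from

interface GateConfig {
  enabled: boolean;
  mode: Mode;
  scope: Scope;
}

// Returns true when the given output should pass through the sanitizer.
function shouldSanitize(cfg: GateConfig, kind: OutputKind): boolean {
  if (!cfg.enabled || cfg.mode === "off") return false;
  if (cfg.scope === "all") return true;
  return cfg.scope === "tools" ? kind === "tool" : kind === "response";
}
```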

4. Implementation Approach

Option A: Middleware Layer (Recommended)

Add a sanitization middleware in the agent output pipeline:

Agent Response → Sanitization Middleware → Channel Delivery

Pros:

Centralized control
Consistent behavior across all channels
Easy to configure and debug
Performance impact minimal

Cons:

Requires changes to core agent loop
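
The middleware idea in Option A can be sketched as a wrapper around whatever function delivers text to a channel. This is an illustration only: `withSanitization` and the toy `redactGithubTokens` sanitizer are hypothetical, and the real delivery function in OpenClaw may have a different signature.

```typescript
type Sanitizer = (text: string) => string;

// Wraps a delivery function so every outgoing text is sanitized first.
// Generic in R so it works for both sync and Promise-returning deliverers.
function withSanitization<R>(
  deliver: (text: string) => R,
  sanitize: Sanitizer,
): (text: string) => R {
  return (text) => deliver(sanitize(text));
}

// Toy sanitizer for the example: redacts GitHub tokens only.
const redactGithubTokens: Sanitizer = (text) =>
  text.replace(/\bghp_[a-zA-Z0-9]{36}\b/g, "[REDACTED]");
```

Because the wrapper sits at a single choke point, every channel that delivers through it gets the same behavior, which is the main advantage listed for this option.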

Option B: Skill-Based Auto-Application

Enhance the existing output-sanitizer skill so that it runs automatically.

Pros:

Leverages existing skill infrastructure
Less invasive code changes
Skill can be updated independently

Cons:

Skills run after agent response
May not catch all output paths
More complex to maintain

Option C: Post-Processing Hook

Add a post-processing hook that can be configured:

{
  agents: {
    defaults: {
      postProcess: [
        {
          type: "output-sanitizer",
          config: { mode: "auto" }
        }
      ]
    }
  }
}

Pros:

Flexible and extensible
Can add multiple processors
Easy to enable/disable per agent

Cons:

Adds complexity to configuration
Requires new hook infrastructure
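
One way the `postProcess` list could be executed is as a chain in which each processor receives the previous processor's output. The sketch below assumes a registry keyed by the `type` field; the registry, `Processor` type, and `runPostProcessors` function are hypothetical and do not exist in OpenClaw today.

```typescript
interface PostProcessorSpec {
  type: string;
  config?: Record<string, unknown>;
}

type Processor = (text: string, config: Record<string, unknown>) => string;

// Applies the configured processors in order, feeding each the previous result.
function runPostProcessors(
  text: string,
  specs: PostProcessorSpec[],
  registry: Map<string, Processor>,
): string {
  return specs.reduce((acc, spec) => {
    const processor = registry.get(spec.type);
    if (!processor) {
      // Fail loudly: silently skipping a sanitizer would defeat its purpose.
      throw new Error(`Unknown post-processor: ${spec.type}`);
    }
    return processor(acc, spec.config ?? {});
  }, text);
}
```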

5. Redaction Strategy

Pattern Matching

Use regex patterns to detect sensitive data:

const patterns = {
  awsAccessKey: /\bAKIA[0-9A-Z]{16}\b/g,
  openAIKey: /\bsk-[a-zA-Z0-9]{48}\b/g,
  anthropicKey: /\bsk-ant-[a-zA-Z0-9-]{80,}\b/g,
  githubToken: /\bghp_[a-zA-Z0-9]{36}\b/g,
  email: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  creditCard: /\b\d{4}-\d{4}-\d{4}-\d{4}\b/g,
  password: /password\s*[:=]\s*['"]([^'"]+)['"]/gi,
  privateKey: /-----BEGIN.*PRIVATE KEY-----/g,
  jwt: /eyJ[a-zA-Z0-9_-]+\.eyJ[a-zA-Z0-9_-]+/g,
  databaseUrl: /(postgres|mysql|mongodb|redis):\/\/[^\s]+/g,
};
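
A pattern table like this can be applied in one pass per pattern with `String.prototype.replace`, counting hits per pattern name along the way. The sketch below uses three of the patterns from the table; `redactAll` and `samplePatterns` are hypothetical names.

```typescript
const samplePatterns: Record<string, RegExp> = {
  awsAccessKey: /\bAKIA[0-9A-Z]{16}\b/g,
  githubToken: /\bghp_[a-zA-Z0-9]{36}\b/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
};

interface RedactResult {
  text: string;
  counts: Record<string, number>; // number of redactions per pattern name
}

function redactAll(text: string, patterns: Record<string, RegExp>): RedactResult {
  const counts: Record<string, number> = {};
  let out = text;
  for (const [name, regex] of Object.entries(patterns)) {
    // replace() with a global regex starts from index 0 on every call,
    // so sharing the RegExp objects between calls is safe here.
    out = out.replace(regex, () => {
      counts[name] = (counts[name] ?? 0) + 1;
      return "[REDACTED]";
    });
  }
  return { text: out, counts };
}
```

Note that calling `.test()` or `.exec()` on the same global regexes elsewhere would carry `lastIndex` state between calls, which is why the sketch sticks to `replace`.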

Replacement Strategy

Conservative (Default):

Replace with [REDACTED]
Preserve context and structure
Example: apiKey: "sk-abc123..." → apiKey: "[REDACTED]"

Masked:

Show partial value
Example: sk-abc123...xyz → sk-****...****

Hashed:

Replace with hash (for debugging)
Example: sk-abc123... → [REDACTED:sha256:abc123...]
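
The three strategies differ only in what string replaces the matched value. A minimal sketch, assuming the masked form keeps a three-character prefix (such as `sk-`) and the hashed form keeps the first 12 hex characters of a SHA-256 digest. Both lengths are illustrative choices, not something the proposal specifies:

```typescript
import { createHash } from "node:crypto";

type Strategy = "conservative" | "masked" | "hashed";

// Returns the text that should replace a detected sensitive value.
function replacementFor(value: string, strategy: Strategy): string {
  switch (strategy) {
    case "conservative":
      return "[REDACTED]";
    case "masked":
      // Assumption: keep a 3-character prefix, hide everything else.
      return value.length > 3 ? `${value.slice(0, 3)}****` : "****";
    case "hashed": {
      // Assumption: a truncated SHA-256 digest is enough to correlate log entries.
      const digest = createHash("sha256").update(value).digest("hex").slice(0, 12);
      return `[REDACTED:sha256:${digest}]`;
    }
  }
}
```

The hashed form lets two log entries be recognized as the same secret, but a hash of a short or guessable value can be brute-forced, so it is probably best limited to debugging.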

6. Reporting and Logging

When redaction occurs, log a structured event:

{
  "event": "output_sanitized",
  "timestamp": "2026-03-12T07:35:00Z",
  "sessionKey": "agent:main:webchat",
  "redactions": [
    {
      "type": "api_key",
      "pattern": "openai_key",
      "line": 15,
      "action": "replaced_with_redacted"
    },
    {
      "type": "email",
      "pattern": "email",
      "line": 28,
      "action": "masked_local_part"
    }
  ],
  "totalRedactions": 2
}
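
An event of this shape can be assembled from match positions alone, so the original value never reaches the log. The sketch below is one possible way to do that; `Detector`, `lineOf`, and `buildSanitizedEvent` are hypothetical names, and the `action` value is hard-coded for brevity.

```typescript
interface Redaction {
  type: string;
  pattern: string;
  line: number;
  action: string;
}

interface SanitizedEvent {
  event: "output_sanitized";
  timestamp: string;
  sessionKey: string;
  redactions: Redaction[];
  totalRedactions: number;
}

interface Detector {
  type: string;    // e.g. "api_key"
  pattern: string; // e.g. "openai_key"
  regex: RegExp;   // must carry the "g" flag, as matchAll requires it
}

// 1-based line number of a character offset within the text.
function lineOf(text: string, index: number): number {
  return text.slice(0, index).split("\n").length;
}

function buildSanitizedEvent(
  sessionKey: string,
  text: string,
  detectors: Detector[],
  now: Date = new Date(),
): SanitizedEvent {
  const redactions: Redaction[] = [];
  for (const d of detectors) {
    for (const match of text.matchAll(d.regex)) {
      // Only metadata is recorded; the matched value itself is never stored.
      redactions.push({
        type: d.type,
        pattern: d.pattern,
        line: lineOf(text, match.index ?? 0),
        action: "replaced_with_redacted",
      });
    }
  }
  return {
    event: "output_sanitized",
    timestamp: now.toISOString(),
    sessionKey,
    redactions,
    totalRedactions: redactions.length,
  };
}
```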

7. Per-Agent Configuration

Allow per-agent overrides:

{
  agents: {
    defaults: {
      outputSanitization: {
        enabled: true,
        mode: "auto"
      }
    },
    list: [
      {
        id: "main",
        outputSanitization: {
          enabled: true,
          mode: "strict"  // Override for main agent
        }
      },
      {
        id: "public",
        outputSanitization: {
          enabled: true,
          mode: "strict",
          scope: "all"  // Maximum security for public agent
        }
      }
    ]
  }
}
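
Resolving the effective settings for one agent amounts to layering three sources: built-in fallbacks, `agents.defaults`, and the matching `agents.list` entry. A minimal sketch under that assumption follows; `resolveSanitization` is a hypothetical name, and the fallback of `enabled: false` mirrors the backward-compatibility default proposed in this issue.

```typescript
interface SanitizationSettings {
  enabled?: boolean;
  mode?: "auto" | "strict" | "off";
  scope?: "all" | "tools" | "responses";
}

interface AgentsConfig {
  defaults?: { outputSanitization?: SanitizationSettings };
  list?: { id: string; outputSanitization?: SanitizationSettings }[];
}

// Later sources win: built-in fallback < agents.defaults < per-agent override.
function resolveSanitization(
  agents: AgentsConfig,
  agentId: string,
): Required<SanitizationSettings> {
  const fallback: Required<SanitizationSettings> = {
    enabled: false,
    mode: "auto",
    scope: "all",
  };
  const override = agents.list?.find((a) => a.id === agentId)?.outputSanitization;
  return { ...fallback, ...agents.defaults?.outputSanitization, ...override };
}
```

With the configuration above, the `main` agent would resolve to `strict` mode, while an agent that is not in the list would inherit `auto` from the defaults.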

8. Backward Compatibility

Default to enabled: false to maintain current behavior
Add migration path in openclaw doctor
Document breaking changes clearly
Provide upgrade guide

Alternatives considered

No response

Impact

Priority: High
Complexity: Medium
Risk: Low (backward compatible)
Impact: High (security improvement)

Evidence/examples

No response

Additional information

No response

extent analysis

Problem Summary

Add a global output‑sanitizer middleware that automatically redacts secrets in every agent response (tool results, final messages, etc.) and make it configurable via outputSanitization in openclaw.json.

Root Cause Analysis

The existing output‑sanitizer skill is only invoked manually.
No hook in the response pipeline forces its execution, so any raw file read or tool output can leak secrets.

Fix Plan

1. Add Config Schema

// src/config/schema.ts
export interface OutputSanitizationConfig {
  enabled: boolean;          // master switch
  mode: 'auto' | 'strict' | 'off';
  scope: 'all' | 'tools' | 'responses';
  redactPatterns: string[];  // regex strings
  exceptions?: {
    allowInCodeBlocks?: boolean;
    allowInQuotes?: boolean;
    allowList?: string[];
  };
  reporting?: {
    enabled: boolean;
    level: 'info' | 'warn' | 'error';
    includeOriginal: boolean;
  };
}

Add defaults (mirroring the JSON in the issue) in defaultConfig.ts.
Make the top‑level config load it:

// src/config/load.ts
import { OutputSanitizationConfig } from './schema';
const defaultSanitizer: OutputSanitizationConfig = {
  enabled: false,
  mode: 'auto',
  scope: 'all',
  redactPatterns: [
    '\\bAKIA[0-9A-Z]{16}\\b',
    '\\bsk-[a-zA-Z0-9]{48}\\b',
    '\\bsk-ant-[a-zA-Z0-9-]{80,}\\b',
    '\\bghp_[a-zA-Z0-9]{36}\\b',
    '\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}\\b',
    '\\b\\d{

