openclaw - ✅(Solved) Fix [Feature]: Automatic Output Sanitization for Sensitive Data [4 pull requests, 1 participant]

GitHub stats
openclaw/openclaw#43830 (fetched 2026-04-08 00:18:20)
Comments: 0
Participants: 1
Timeline events: 5
Reactions: 0
Timeline (top): cross-referenced ×4, labeled ×1

Add automatic output sanitization to prevent accidental leakage of sensitive information (API keys, tokens, passwords, PII) in agent responses, regardless of whether the requester is a tester or administrator.

Fix Action

Fixed

PR fix notes

PR #20067: feat(plugins): add before_agent_reply hook (claiming pattern)

Description (problem / solution / changelog)

Summary

  • Adds a before_agent_reply plugin hook that fires after slash commands but before the LLM agent runs
  • Plugins can return { handled: true, reply } to short-circuit agent processing (forms, wizards, approval gates, etc. as plugins without touching core)
  • Uses the runClaimingHook pattern (sequential by priority, first { handled: true } wins) — same pattern as inbound_claim
  • Populates full PluginHookAgentContext including trigger, channelId, messageProvider

Motivation

Per VISION.md, core stays lean and optional capability should ship as plugins. Right now there's no way for a plugin to intercept an inbound message and return a synthetic reply before the LLM runs — anything that needs pre-LLM interception has to modify core. This hook fills that gap.

Closes #8807.

Design

Hook name: before_agent_reply

When it fires: After handleInlineActions returns kind: "continue", before stageSandboxMedia / runPreparedReply. This means /help and other slash commands still work normally, even during a plugin dialog.

Event type:

{ cleanedBody: string }  // final user message heading to LLM

Result type (claiming pattern):

{
  handled: boolean;      // true = claim this message, short-circuit the LLM
  reply?: ReplyPayload;  // synthetic reply (omit to silently swallow)
  reason?: string;       // for logging/debugging
}

Context: Full PluginHookAgentContext (agentId, sessionKey, sessionId, workspaceDir, messageProvider, trigger, channelId).

Execution: runClaimingHook — async, sequential by priority (highest first). First handler to return { handled: true } wins; remaining handlers are not called. When handled: true without reply, the message is swallowed via SILENT_REPLY_TOKEN.
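The execution rule above (sequential, highest priority first, first claim wins) can be sketched as follows. This is a minimal illustration, not the PR's code: `runClaimingSketch`, `Registration`, and the `ReplyPayload` shape are hypothetical names, and the real runClaimingHook may handle errors and context differently.

```typescript
// Hypothetical shapes for illustration; the real types live in src/plugins/types.ts.
type ReplyPayload = { text: string };
type BeforeAgentReplyEvent = { cleanedBody: string };
type BeforeAgentReplyResult = { handled: boolean; reply?: ReplyPayload; reason?: string };
type Handler = (event: BeforeAgentReplyEvent) => Promise<BeforeAgentReplyResult | undefined>;

interface Registration {
  priority: number; // higher priority runs first
  handler: Handler;
}

// Run handlers one at a time, highest priority first, and stop at the first claim.
async function runClaimingSketch(
  registrations: Registration[],
  event: BeforeAgentReplyEvent,
): Promise<BeforeAgentReplyResult | undefined> {
  const ordered = [...registrations].sort((a, b) => b.priority - a.priority);
  for (const { handler } of ordered) {
    const result = await handler(event);
    if (result !== undefined && result.handled) {
      return result; // remaining handlers are not called
    }
  }
  return undefined; // nobody claimed the message: continue to the LLM
}
```

A caller would treat `undefined` as "run the agent as usual" and a claimed result without `reply` as a silent swallow.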

Changes

File | Lines | What
src/plugins/types.ts | +21 | Hook name, event/result types (with handled: boolean), handler map entry
src/plugins/hooks.ts | +21 | runBeforeAgentReply using runClaimingHook, imports, re-exports
src/auto-reply/reply/get-reply.ts | +26 | Hook call site after inline actions, before LLM
src/plugins/hooks.before-agent-reply.test.ts | +123 | 8 tests: single claim, no hooks, first-claim-wins, swallow, decline-then-claim, all decline, error handling, hasHooks

Test plan

  • pnpm test -- src/plugins/hooks.before-agent-reply.test.ts — all 8 tests pass
  • pnpm test -- src/plugins/hooks — all 27 existing hook tests unaffected
  • pnpm test -- src/auto-reply/reply/get-reply — existing get-reply tests pass
  • pnpm oxlint / pnpm format — clean
  • pnpm tsgo — no new type errors (pre-existing upstream errors only)
  • git diff upstream/main --stat — exactly 4 files, no unrelated changes, no deletions of inbound_claim code

Changed files

  • src/auto-reply/reply/get-reply.ts (modified, +26/-0)
  • src/plugins/hooks.before-agent-reply.test.ts (added, +123/-0)
  • src/plugins/hooks.ts (modified, +21/-0)
  • src/plugins/types.ts (modified, +21/-0)

PR #30329: feat(privacy): add privacy detection and replacement filter for LLM traffic

Description (problem / solution / changelog)

Summary

  • Add a complete privacy filter pipeline that detects 50+ types of sensitive information (emails, phone numbers, API keys, credentials, PII, etc.) in text before sending to LLM, replaces them with format-preserving fake values, and restores originals in LLM responses
  • Support user-defined custom detection rules via JSON5 config files, enabling domain-specific patterns (e.g. employee IDs, regional phone formats), rule overrides, and selective disabling
  • Integrate privacy filtering into the LLM runner (prompt filtering + response restoration) and log redaction pipeline

Motivation

When users interact with LLM through OpenClaw, their prompts and context may contain sensitive information such as API keys, passwords, phone numbers, ID numbers, and other PII. This data is sent to external LLM providers, creating privacy risks. This module provides transparent, automatic privacy protection by:

  1. Detecting sensitive content using regex patterns and keyword matching with contextual validation
  2. Replacing detected content with format-preserving fake values so the LLM can still understand semantic context
  3. Restoring originals in LLM responses so the user sees correct information
  4. Persisting mappings with AES-256-GCM encryption for session continuity

Key Design Decisions

  • Format-preserving replacement: Fake values maintain the same format (e.g. email → email, phone → phone) so LLM responses remain coherent
  • Session-scoped idempotency: Same original text always maps to the same replacement within a session
  • Custom rules via JSON5: Users can extend/override/disable built-in rules without modifying source code
  • ReDoS prevention: User-provided regex patterns are validated for nested quantifiers and length limits
  • Named validator registry: Solves the problem of JSON not supporting function serialization for complex validations
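To make the first two design decisions concrete, here is a small sketch of format-preserving replacement with session-scoped idempotency, limited to emails. `SessionMappingSketch` and the `userN@example.com` fake format are assumptions for illustration; the PR's replacer covers many more rule types and persists its mappings encrypted.

```typescript
// Simplified email pattern, used only for this illustration.
const EMAIL_RE = /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g;

// Hypothetical in-memory mapping for a single session.
class SessionMappingSketch {
  private forward = new Map<string, string>(); // original -> fake
  private reverse = new Map<string, string>(); // fake -> original

  // The same original always maps to the same fake value within this session.
  fakeFor(original: string): string {
    const existing = this.forward.get(original);
    if (existing !== undefined) return existing;
    const fake = "user" + (this.forward.size + 1) + "@example.com"; // still shaped like an email
    this.forward.set(original, fake);
    this.reverse.set(fake, original);
    return fake;
  }

  // Outbound: replace every detected email before the text leaves for the LLM.
  filter(text: string): string {
    return text.replace(EMAIL_RE, (match) => this.fakeFor(match));
  }

  // Inbound: put the originals back wherever a known fake value appears.
  restore(text: string): string {
    let out = text;
    this.reverse.forEach((original, fake) => {
      out = out.split(fake).join(original);
    });
    return out;
  }
}
```

Because the fake value keeps the email shape, the model can still reason about "an address", and restoring is a plain lookup in the reverse map.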

Test plan

  • 167 tests across 8 test files all passing
  • Core detection: all 64 enabled rule types have coverage (rules.test.ts)
  • Replacement round-trip: filter → restore produces original text (stream-wrapper.test.ts)
  • Custom rules: validation, merge, disable, template expansion, regex safety (custom-rules.test.ts)
  • Encrypted persistence: save/load/TTL expiry (mapping-store.test.ts)
  • Config schema: Zod validation for privacy config (privacy-config.test.ts)
  • No regressions in existing test suite

This contribution was developed with AI assistance (Claude, Codex).

Changed files

  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +17/-0)
  • src/config/privacy-config.test.ts (added, +37/-0)
  • src/config/types.base.ts (modified, +18/-0)
  • src/config/types.openclaw.ts (modified, +8/-1)
  • src/config/zod-schema.ts (modified, +27/-0)
  • src/logging/redact.test.ts (modified, +11/-1)
  • src/logging/redact.ts (modified, +85/-9)
  • src/privacy/README.en.md (added, +508/-0)
  • src/privacy/README.md (added, +506/-0)
  • src/privacy/custom-rules.test.ts (added, +472/-0)
  • src/privacy/custom-rules.ts (added, +295/-0)
  • src/privacy/detector.test.ts (added, +212/-0)
  • src/privacy/detector.ts (added, +353/-0)
  • src/privacy/index.ts (added, +41/-0)
  • src/privacy/mapping-store.test.ts (added, +149/-0)
  • src/privacy/mapping-store.ts (added, +285/-0)
  • src/privacy/replacer.test.ts (added, +140/-0)
  • src/privacy/replacer.ts (added, +314/-0)
  • src/privacy/rules.test.ts (added, +167/-0)
  • src/privacy/rules.ts (added, +518/-0)
  • src/privacy/stream-wrapper.test.ts (added, +140/-0)
  • src/privacy/stream-wrapper.ts (added, +359/-0)
  • src/privacy/types.ts (added, +183/-0)

PR #45619: fix(privacy): harden stream filter and address review feedback

Description (problem / solution / changelog)

Summary

  • Problem: the initial privacy filter implementation had gaps that left secrets exposed in certain message paths and could produce unrestorable placeholder cascades.
  • Why it matters: unfiltered toolResult messages, systemPrompt, and double-filtered prompts can leak sensitive data to external LLM providers or show garbled placeholders to users.
  • What changed: removed redundant filterPrompt call to prevent cascaded mappings; added filtering for toolResult-role messages and systemPrompt; hardened mapping-store writes, log redaction config, custom rule validation, and TTL cleanup.
  • What did NOT change (scope boundary): no new features, config keys, network calls, or auth changes; only hardens existing privacy filter paths.

Change Type (select all)

  • Bug fix
  • Security hardening

Scope (select all touched areas)

  • Gateway / orchestration
  • Memory / storage

Linked Issue/PR

  • Related #45619 (addresses bot review feedback)

User-visible / Behavior Changes

  • toolResult messages are now privacy-filtered before being sent to LLM providers.
  • systemPrompt is now privacy-filtered before provider dispatch.
  • Prompt text is no longer double-filtered, preventing placeholder leakage.
  • Log redaction now respects privacy.enabled and privacy.rules config.
  • Expired mapping entries are cleaned up on session startup per configured TTL.
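The first three behavior changes amount to "every outbound text field is scrubbed exactly once". A minimal sketch of that idea follows; `filterOutboundSketch`, `OutboundRequest`, and the injected `scrub` function are hypothetical names and do not reflect the actual filterMessages signature.

```typescript
type Role = "user" | "assistant" | "toolResult";

interface ChatMessage {
  role: Role;
  content: string;
}

interface OutboundRequest {
  systemPrompt?: string;
  messages: ChatMessage[];
}

// Scrub each message of every role, plus the system prompt, in a single pass.
// Running scrub a second time over already-filtered text is what the PR
// describes as producing cascaded placeholders, so each field is visited once.
function filterOutboundSketch(
  request: OutboundRequest,
  scrub: (text: string) => string,
): OutboundRequest {
  const filtered: OutboundRequest = {
    messages: request.messages.map((m) => ({ role: m.role, content: scrub(m.content) })),
  };
  if (request.systemPrompt !== undefined) {
    filtered.systemPrompt = scrub(request.systemPrompt);
  }
  return filtered;
}
```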

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? Yes
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No
  • If any Yes, explain risk + mitigation:

Secrets handling improved: more message types are now filtered (toolResult, systemPrompt) and double-filtering is eliminated. Risk is low — these are strictly additive hardening changes with no new attack surface.

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 22 + Vitest
  • Model/provider: N/A (unit tests)

Steps

  1. Enable privacy filtering in config.
  2. Send a message containing sensitive data (e.g. email, API key) that triggers a tool call.
  3. Observe that toolResult content and systemPrompt are filtered before reaching the LLM provider.

Expected

  • All message roles (user, assistant, toolResult) are filtered.
  • systemPrompt is filtered.
  • No cascaded/double placeholders appear in user-visible output.
  • Expired mappings are cleaned up on startup.

Actual

  • Matched expected behavior across 153 unit tests.

Evidence

  • Failing test/log before + passing after
 Test Files  6 passed (6)
      Tests  153 passed (153)
   Duration  969ms

Human Verification (required)

  • Verified scenarios: ran full privacy test suite locally; confirmed filterMessages handles all three message roles; confirmed filterPrompt removal eliminates cascaded mappings; confirmed TTL cleanup runs on context init.
  • Edge cases checked: malformed custom rule keywords (non-string), non-text content blocks in toolResult, empty/missing systemPrompt.
  • What you did not verify: full end-to-end with live LLM provider across all channels.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: set privacy.enabled to false.
  • Files/config to restore: revert this single commit.
  • Known bad symptoms reviewers should watch for: placeholder leakage in model output, missing tool results in conversations.

Risks and Mitigations

  • Risk: filtering toolResult messages could over-redact legitimate tool output that resembles sensitive patterns.
    • Mitigation: same detection rules and false-positive suppression as user/assistant messages; no new rule types added.

Changed files

  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +24/-0)
  • src/config/privacy-config.test.ts (added, +37/-0)
  • src/config/types.base.ts (modified, +26/-0)
  • src/config/types.openclaw.ts (modified, +8/-1)
  • src/config/zod-schema.ts (modified, +27/-0)
  • src/logging/redact.test.ts (modified, +11/-1)
  • src/logging/redact.ts (modified, +85/-9)
  • src/privacy/README.en.md (added, +508/-0)
  • src/privacy/README.md (added, +506/-0)
  • src/privacy/custom-rules.test.ts (added, +472/-0)
  • src/privacy/custom-rules.ts (added, +295/-0)
  • src/privacy/detector.test.ts (added, +212/-0)
  • src/privacy/detector.ts (added, +353/-0)
  • src/privacy/index.ts (added, +41/-0)
  • src/privacy/mapping-store.test.ts (added, +149/-0)
  • src/privacy/mapping-store.ts (added, +285/-0)
  • src/privacy/replacer.test.ts (added, +140/-0)
  • src/privacy/replacer.ts (added, +314/-0)
  • src/privacy/rules.test.ts (added, +167/-0)
  • src/privacy/rules.ts (added, +518/-0)
  • src/privacy/stream-wrapper.test.ts (added, +140/-0)
  • src/privacy/stream-wrapper.ts (added, +359/-0)
  • src/privacy/types.ts (added, +183/-0)

PR #45783: feat(privacy): add privacy detection and replacement filter for LLM traffic

Description (problem / solution / changelog)

Summary

  • Problem: prompts and tool context sent to external LLM providers can contain secrets, credentials, phone numbers, and other PII — OpenClaw has log redaction but nothing that scrubs the actual content before it leaves the gateway.
  • Why it matters: users relying on third-party or proxy-hosted LLM services have no automatic protection against sensitive data exposure in outbound API calls.
  • What changed: adds a complete privacy filter pipeline — detection (50+ rule types), format-preserving replacement, encrypted mapping persistence, bidirectional stream filtering (outbound scrub + inbound restore), custom user rules via JSON5, and log redaction integration.
  • What did NOT change (scope boundary): no auth, channel, provider selection, or network destination changes; only transforms eligible text at the LLM boundary and log-redaction path.

Change Type (select all)

  • Feature
  • Security hardening

Scope (select all touched areas)

  • Gateway / orchestration
  • Memory / storage

Linked Issue/PR

  • Related #37815
  • Related #43830
  • Related #44195
  • Supersedes #30329
  • Supersedes #45619

User-visible / Behavior Changes

  • New privacy config section: privacy.enabled (default true), privacy.rules ("basic" | "extended" | custom path), privacy.encryption, privacy.mappings (TTL, store path), privacy.log.
  • Sensitive content (emails, phone numbers, API keys, tokens, Chinese PII, etc.) is automatically replaced with format-preserving placeholders before outbound LLM requests.
  • Original values are restored in returned model text where mappings exist.
  • All message roles (user, assistant, toolResult) and systemPrompt are filtered.
  • Log redaction now uses the same privacy detector for broader coverage, respecting privacy.enabled and privacy.rules.
  • Users can define custom detection rules via JSON5 config files.

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? Yes
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No
  • If any Yes, explain risk + mitigation:

This PR reads outbound/inbound LLM text and stores replacement mappings locally. Risk is limited by: AES-256-GCM encrypted-at-rest mapping storage with atomic write-then-rename, file locking for concurrent access, opt-out via privacy.enabled: false, ReDoS safety validation for custom regex rules, and no expansion of command execution or network access.

Repro + Verification

Environment

  • OS: macOS (Darwin 25.3.0)
  • Runtime/container: Node 22 + Vitest 4.0.18
  • Model/provider: N/A (unit tests)

Steps

  1. Enable privacy filtering (default on).
  2. Send text containing sensitive values through the LLM filter path.
  3. Observe outbound text has placeholders, inbound text is restored.

Expected

  • Sensitive values replaced before LLM egress across all message roles.
  • Returned model text restores original values where mappings exist.
  • Log redaction covers the same sensitive patterns.
  • Custom rules are validated (including ReDoS safety) and merged safely.
  • Mapping persistence is crash-safe (atomic writes) and encrypted.

Actual

  • Matched expected behavior across 168 unit tests covering all components.

Evidence

  • Failing test/log before + passing after

Full test suite output (8 test files, 168 tests, 0 failures):

 ✓ src/privacy/mapping-store.test.ts    (8 tests)   426ms
 ✓ src/privacy/stream-wrapper.test.ts   (11 tests)  478ms
 ✓ src/privacy/rules.test.ts            (65 tests)  12ms
 ✓ src/privacy/detector.test.ts         (26 tests)  11ms
 ✓ src/privacy/custom-rules.test.ts     (32 tests)  8ms
 ✓ src/logging/redact.test.ts           (13 tests)  9ms
 ✓ src/privacy/replacer.test.ts         (11 tests)  4ms
 ✓ src/config/privacy-config.test.ts    (2 tests)   3ms

 Test Files  8 passed (8)
      Tests  168 passed (168)
   Duration  1.30s

Test coverage includes:

  • detector.test.ts (26 tests): regex + keyword matching, contextual validation, false-positive reduction, custom rule loading
  • replacer.test.ts (11 tests): format-preserving replacement, session-scoped idempotency, restore round-trip
  • mapping-store.test.ts (8 tests): AES-256-GCM encrypted persistence, atomic writes, file locking, TTL cleanup, concurrent access
  • stream-wrapper.test.ts (11 tests): outbound message filtering (user/assistant/toolResult/systemPrompt), inbound stream restoration (text + tool-call arguments), privacy config gating
  • rules.test.ts (65 tests): all 50+ built-in detection rules across email, phone, API key, token, PII, Chinese ID patterns
  • custom-rules.test.ts (32 tests): JSON5 rule parsing, validation, ReDoS safety rejection, rule merging, preset extension
  • redact.test.ts (13 tests): log redaction integration, privacy-enabled gating, configurable rulesets
  • privacy-config.test.ts (2 tests): Zod schema validation for privacy config section

Human Verification (required)

  • Verified scenarios: ran full test suite locally; confirmed detection, replacement/restoration round-trip, encrypted persistence, custom rule loading/validation, config schema parsing, log redaction integration, and all message role filtering.
  • Edge cases checked: heuristic false-positive suppression (bare_password, high_entropy_string skipped in stream filter), malformed custom rule keywords (non-string rejection), mapping TTL expiry and cleanup, non-fatal persistence-failure handling, empty/missing systemPrompt, non-text content blocks in toolResult, concurrent mapping store access with file locking.
  • What you did not verify: full end-to-end validation with live LLM providers across all channel combinations; performance profiling under high-throughput production traffic.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? Yes (new optional privacy config section; defaults to enabled)
  • Migration needed? No

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: set privacy.enabled to false in config.
  • Files/config to restore: revert this single commit.
  • Known bad symptoms reviewers should watch for: placeholder leakage in model output (pf_* tokens visible to users), over-redaction of legitimate content, mapping persistence failures in logs.

Risks and Mitigations

  • Risk: false positives could replace non-sensitive strings and degrade model context.
    • Mitigation: contextual validation, heuristic-type suppression in stream filter, custom disable/override support.
  • Risk: custom regex rules could be catastrophically slow (ReDoS).
    • Mitigation: compileSafeRegex validation rejects ambiguous alternation under repetition, repeated .* groups, and other unsafe patterns before loading.
  • Risk: local mapping persistence could expose originals if stored insecurely.
    • Mitigation: AES-256-GCM encryption, owner-only file permissions, atomic write-then-rename, session-scoped TTL cleanup.
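The ReDoS mitigation can be illustrated with a deliberately simplified validator. This is not the PR's compileSafeRegex: `compileSafeRegexSketch`, its 200-character default limit, and the two heuristics below are assumptions, and the sketch does not attempt the "ambiguous alternation" check mentioned above.

```typescript
// Returns a compiled RegExp, or null when the pattern looks unsafe or invalid.
function compileSafeRegexSketch(pattern: string, maxLength: number = 200): RegExp | null {
  // Length limit: very long user patterns are rejected outright.
  if (pattern.length > maxLength) return null;

  // Nested quantifier heuristic: a group that contains + or * and is itself
  // repeated, e.g. (a+)+ or (\d*)*, is a classic catastrophic-backtracking shape.
  const nestedQuantifier = /\([^()]*[+*][^()]*\)[+*{]/;
  if (nestedQuantifier.test(pattern)) return null;

  // Repeated wildcard heuristic: two or more ".*" runs in one pattern.
  const wildcardRuns = pattern.split(".*").length - 1;
  if (wildcardRuns >= 2) return null;

  // Finally, make sure the pattern is syntactically valid.
  try {
    return new RegExp(pattern, "g");
  } catch {
    return null;
  }
}
```

A heuristic like this rejects some safe patterns and can miss some unsafe ones, so it is best read as a first gate rather than a guarantee.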

Changed files

  • src/agents/btw.test.ts (modified, +55/-0)
  • src/agents/btw.ts (modified, +19/-1)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +119/-25)
  • src/config/privacy-config.test.ts (added, +37/-0)
  • src/config/types.base.ts (modified, +16/-0)
  • src/config/types.openclaw.ts (modified, +8/-1)
  • src/config/zod-schema.ts (modified, +27/-0)
  • src/logging/redact.test.ts (modified, +233/-2)
  • src/logging/redact.ts (modified, +184/-9)
  • src/privacy/README.en.md (added, +508/-0)
  • src/privacy/README.md (added, +506/-0)
  • src/privacy/custom-rules.test.ts (added, +685/-0)
  • src/privacy/custom-rules.ts (added, +441/-0)
  • src/privacy/detector.test.ts (added, +256/-0)
  • src/privacy/detector.ts (added, +365/-0)
  • src/privacy/index.ts (added, +41/-0)
  • src/privacy/mapping-store.test.ts (added, +198/-0)
  • src/privacy/mapping-store.ts (added, +293/-0)
  • src/privacy/replacer.test.ts (added, +185/-0)
  • src/privacy/replacer.ts (added, +370/-0)
  • src/privacy/rules.test.ts (added, +185/-0)
  • src/privacy/rules.ts (added, +521/-0)
  • src/privacy/stream-wrapper.test.ts (added, +663/-0)
  • src/privacy/stream-wrapper.ts (added, +759/-0)
  • src/privacy/types.ts (added, +183/-0)
  • src/tts/tts-core.ts (modified, +18/-4)
  • src/tts/tts.test.ts (modified, +36/-0)

Summary

Add automatic output sanitization to prevent accidental leakage of sensitive information (API keys, tokens, passwords, PII) in agent responses, regardless of whether the requester is a tester or administrator.

Problem to solve

Currently, OpenClaw has an output-sanitizer skill that provides guidance on redacting sensitive information, but it is not automatically applied to agent outputs. This means:

  1. Security Risk: When agents read configuration files (like openclaw.json) or other sensitive files, they may return complete content including API keys, tokens, and passwords
  2. Testing Issues: Test personnel requesting full file information receive unredacted sensitive data
  3. No Automatic Protection: There is no configuration option to enable automatic output filtering
  4. Manual Dependency: Users must manually remember to apply sanitization rules

Proposed solution

1. Configuration Option

Add a new configuration section to enable automatic output sanitization:

{
  outputSanitization: {
    enabled: true,              // Master switch
    mode: "auto",               // auto | strict | off
    scope: "all",               // all | tools | responses
    redactPatterns: [
      // Built-in patterns (can be overridden)
      "\\bAKIA[0-9A-Z]{16}\\b",                    // AWS Access Key
      "\\bsk-[a-zA-Z0-9]{48}\\b",                  // OpenAI API Key
      "\\bsk-ant-[a-zA-Z0-9-]{80,}\\b",            // Anthropic Key
      "\\bghp_[a-zA-Z0-9]{36}\\b",                 // GitHub Token
      "\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}\\b",  // Email
      "\\b\\d{3}-\\d{2}-\\d{4}\\b",                // SSN
      "\\b\\d{4}-\\d{4}-\\d{4}-\\d{4}\\b",         // Credit Card
      "password\\s*[:=]\\s*['\"]([^'\"]+)['\"]",    // Passwords
      "-----BEGIN.*PRIVATE KEY-----",              // Private Keys
      "eyJ[a-zA-Z0-9_-]+\\.eyJ[a-zA-Z0-9_-]+",     // JWT Tokens
      "<db-scheme>://[^\\s]+",                     // Database URLs
    ],
    exceptions: {
      // Allow specific patterns in certain contexts
      allowInCodeBlocks: false,    // true = skip redaction inside code blocks
      allowInQuotes: false,        // true = skip redaction inside quoted strings
      allowList: [],               // Specific patterns to never redact
    },
    reporting: {
      enabled: true,              // Log redaction events
      level: "warn",              // log level for redactions
      includeOriginal: false,     // Never log original sensitive values
    }
  }
}

2. Modes of Operation

auto (Recommended)

  • Apply sanitization to all outputs
  • Use smart detection (context-aware)
  • Preserve code structure when possible
  • Mask sensitive values while maintaining readability

strict

  • Aggressive redaction
  • Any potential sensitive pattern is redacted
  • Higher false positive rate, maximum security

off

  • Disable automatic sanitization
  • Manual application only (current behavior)

3. Scope Options

all

  • Sanitize all agent outputs (responses, tool results, file reads)

tools

  • Only sanitize tool outputs (file reads, exec results, etc.)

responses

  • Only sanitize final agent responses
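Taken together, the master switch, mode, and scope decide whether a given piece of output is sanitized at all. A minimal sketch of that decision, assuming two hypothetical output kinds (`toolResult` and `finalResponse`) and a hypothetical `shouldSanitizeSketch` helper:

```typescript
type Mode = "auto" | "strict" | "off";
type Scope = "all" | "tools" | "responses";
type OutputKind = "toolResult" | "finalResponse"; // assumed classification of outputs

interface SanitizationSwitches {
  enabled: boolean;
  mode: Mode;
  scope: Scope;
}

// Decide whether an output of the given kind should pass through the sanitizer.
function shouldSanitizeSketch(cfg: SanitizationSwitches, kind: OutputKind): boolean {
  if (!cfg.enabled || cfg.mode === "off") return false; // master switch or mode disables everything
  if (cfg.scope === "all") return true;
  if (cfg.scope === "tools") return kind === "toolResult";
  return kind === "finalResponse"; // scope === "responses"
}
```

How aggressively to redact (auto versus strict) would then be a separate decision made inside the sanitizer itself.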

4. Implementation Approach

Option A: Middleware Layer (Recommended)

Add a sanitization middleware in the agent output pipeline:

Agent Response → Sanitization Middleware → Channel Delivery

Pros:

  • Centralized control
  • Consistent behavior across all channels
  • Easy to configure and debug
  • Performance impact minimal

Cons:

  • Requires changes to core agent loop
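Option A can be pictured as wrapping the channel delivery function so that nothing reaches a channel unsanitized. The names below (`Deliver`, `withSanitizationSketch`) are hypothetical and do not correspond to existing OpenClaw APIs.

```typescript
// A channel delivery function: takes the final text and sends it somewhere.
type Deliver = (text: string) => void;

// Wrap a delivery function so every outgoing text is sanitized first.
// Agent Response -> Sanitization Middleware -> Channel Delivery
function withSanitizationSketch(
  sanitize: (text: string) => string,
  deliver: Deliver,
): Deliver {
  return (text: string): void => {
    deliver(sanitize(text));
  };
}
```

Because the wrapper sits in front of delivery, every channel that uses the wrapped function gets the same behavior, which is the "centralized control" benefit listed above.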

Option B: Skill-Based Auto-Application

Enhance the output-sanitizer skill so that it runs automatically on agent output.

Pros:

  • Leverages existing skill infrastructure
  • Less invasive code changes
  • Skill can be updated independently

Cons:

  • Skills run after agent response
  • May not catch all output paths
  • More complex to maintain

Option C: Post-Processing Hook

Add a post-processing hook that can be configured:

{
  agents: {
    defaults: {
      postProcess: [
        {
          type: "output-sanitizer",
          config: { mode: "auto" }
        }
      ]
    }
  }
}

Pros:

  • Flexible and extensible
  • Can add multiple processors
  • Easy to enable/disable per agent

Cons:

  • Adds complexity to configuration
  • Requires new hook infrastructure
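For Option C, the configured postProcess list could be resolved against a registry of processors and applied in order. This is only a sketch of the idea: `runPostProcessSketch`, `ProcessorSketch`, and the registry are assumed names, since the issue itself notes that the hook infrastructure does not exist yet.

```typescript
// One entry of the postProcess array from the config example above.
interface PostProcessEntry {
  type: string;
  config?: Record<string, unknown>;
}

// A processor receives the current text and its own config, and returns new text.
type ProcessorSketch = (text: string, config: Record<string, unknown>) => string;

// Apply the configured processors left to right; fail loudly on unknown types.
function runPostProcessSketch(
  entries: PostProcessEntry[],
  registry: Map<string, ProcessorSketch>,
  text: string,
): string {
  let current = text;
  for (const entry of entries) {
    const processor = registry.get(entry.type);
    if (processor === undefined) {
      throw new Error("Unknown post-processor: " + entry.type);
    }
    current = processor(current, entry.config ?? {});
  }
  return current;
}
```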

5. Redaction Strategy

Pattern Matching

Use regex patterns to detect sensitive data:

const patterns = {
  awsAccessKey: /\bAKIA[0-9A-Z]{16}\b/g,
  openAIKey: /\bsk-[a-zA-Z0-9]{48}\b/g,
  anthropicKey: /\bsk-ant-[a-zA-Z0-9-]{80,}\b/g,
  githubToken: /\bghp_[a-zA-Z0-9]{36}\b/g,
  email: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  creditCard: /\b\d{4}-\d{4}-\d{4}-\d{4}\b/g,
  password: /password\s*[:=]\s*['"]([^'"]+)['"]/gi,
  privateKey: /-----BEGIN.*PRIVATE KEY-----/g,
  jwt: /eyJ[a-zA-Z0-9_-]+\.eyJ[a-zA-Z0-9_-]+/g,
  databaseUrl: /\b(postgres|mysql|mongodb|redis):\/\/[^\s]+/g,
};
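A pattern table like the one above could be applied as in the following sketch, which also counts hits per pattern name so that reporting never needs the matched value. `patternsSketch` holds just three of the patterns to keep the example short, and `redactSketch` is a hypothetical helper.

```typescript
// A subset of the patterns from the issue, kept as an ordered list.
const patternsSketch: Array<{ name: string; regex: RegExp }> = [
  { name: "awsAccessKey", regex: /\bAKIA[0-9A-Z]{16}\b/g },
  { name: "githubToken", regex: /\bghp_[a-zA-Z0-9]{36}\b/g },
  { name: "ssn", regex: /\b\d{3}-\d{2}-\d{4}\b/g },
];

// Replace every match with [REDACTED] and record how often each pattern fired.
function redactSketch(text: string): { text: string; hits: Record<string, number> } {
  const hits: Record<string, number> = {};
  let out = text;
  for (const { name, regex } of patternsSketch) {
    out = out.replace(regex, () => {
      hits[name] = (hits[name] ?? 0) + 1;
      return "[REDACTED]";
    });
  }
  return { text: out, hits };
}
```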

Replacement Strategy

Conservative (Default):

  • Replace with [REDACTED]
  • Preserve context and structure
  • Example: apiKey: "sk-abc123..." → apiKey: "[REDACTED]"

Masked:

  • Show partial value
  • Example: sk-abc123...xyz → sk-****...****

Hashed:

  • Replace with hash (for debugging)
  • Example: sk-abc123... → [REDACTED:sha256:abc123...]
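The conservative and masked strategies can be sketched as a single function over an already-detected value. `replaceValueSketch` and its prefix rule (keep a short leading prefix ending in a dash, such as "sk-") are assumptions; the hashed strategy is left out here because it needs a cryptographic hash dependency.

```typescript
type StrategySketch = "conservative" | "masked";

// Turn a detected secret into its replacement text.
function replaceValueSketch(value: string, strategy: StrategySketch): string {
  if (strategy === "conservative") {
    return "[REDACTED]";
  }
  // Masked: keep a short vendor-style prefix (e.g. "sk-") and hide everything else.
  const dash = value.indexOf("-");
  const prefix = dash >= 0 && dash <= 6 ? value.slice(0, dash + 1) : "";
  return prefix + "****...****";
}
```

Note that the masked form reveals nothing about the length or characters of the secret beyond the prefix, which keeps it readable without making the value easier to guess.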

6. Reporting and Logging

When redaction occurs, log a structured event:

{
  "event": "output_sanitized",
  "timestamp": "2026-03-12T07:35:00Z",
  "sessionKey": "agent:main:webchat",
  "redactions": [
    {
      "type": "api_key",
      "pattern": "openai_key",
      "line": 15,
      "action": "replaced_with_redacted"
    },
    {
      "type": "email",
      "pattern": "email",
      "line": 28,
      "action": "masked_local_part"
    }
  ],
  "totalRedactions": 2
}
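An event of this shape could be assembled as sketched below. `buildSanitizedEventSketch` and `lineOfSketch` are hypothetical helpers; the key point is that the event carries only the pattern name, line, and action, never the matched value, in line with includeOriginal: false.

```typescript
interface RedactionSketch {
  type: string;
  pattern: string;
  line: number;
  action: string;
}

interface SanitizedEventSketch {
  event: "output_sanitized";
  timestamp: string;
  sessionKey: string;
  redactions: RedactionSketch[];
  totalRedactions: number;
}

// 1-based line number of a character offset within the text.
function lineOfSketch(text: string, index: number): number {
  return text.slice(0, index).split("\n").length;
}

// Build the structured log event; the original secret is never part of it.
function buildSanitizedEventSketch(
  sessionKey: string,
  redactions: RedactionSketch[],
  now: Date = new Date(),
): SanitizedEventSketch {
  return {
    event: "output_sanitized",
    timestamp: now.toISOString(),
    sessionKey,
    redactions,
    totalRedactions: redactions.length,
  };
}
```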

7. Per-Agent Configuration

Allow per-agent overrides:

{
  agents: {
    defaults: {
      outputSanitization: {
        enabled: true,
        mode: "auto"
      }
    },
    list: [
      {
        id: "main",
        outputSanitization: {
          enabled: true,
          mode: "strict"  // Override for main agent
        }
      },
      {
        id: "public",
        outputSanitization: {
          enabled: true,
          mode: "strict",
          scope: "all"  // Maximum security for public agent
        }
      }
    ]
  }
}
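One plausible way to resolve these overrides is a shallow merge: built-in defaults first, then agents.defaults, then the matching agents.list entry. The sketch below assumes that precedence and uses hypothetical names (`resolveForAgentSketch`, `BUILT_IN_SKETCH`); the issue does not specify the actual merge rules.

```typescript
interface SanitizationSettings {
  enabled: boolean;
  mode: "auto" | "strict" | "off";
  scope: "all" | "tools" | "responses";
}

interface AgentsConfigSketch {
  defaults: { outputSanitization?: Partial<SanitizationSettings> };
  list: Array<{ id: string; outputSanitization?: Partial<SanitizationSettings> }>;
}

// Assumed built-in fallback: disabled, to keep current behavior.
const BUILT_IN_SKETCH: SanitizationSettings = { enabled: false, mode: "auto", scope: "all" };

// Later sources win: built-in < agents.defaults < the agent's own entry.
function resolveForAgentSketch(config: AgentsConfigSketch, agentId: string): SanitizationSettings {
  const agent = config.list.filter((a) => a.id === agentId)[0];
  const override = agent !== undefined ? agent.outputSanitization : undefined;
  return {
    ...BUILT_IN_SKETCH,
    ...config.defaults.outputSanitization,
    ...override,
  };
}
```

With the configuration shown above, "main" resolves to strict mode while an agent that is not listed falls back to the defaults (enabled, auto).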

8. Backward Compatibility

  • Default to enabled: false to maintain current behavior
  • Add migration path in openclaw doctor
  • Document breaking changes clearly
  • Provide upgrade guide

Alternatives considered

No response

Impact

  • Priority: High
  • Complexity: Medium
  • Risk: Low (backward compatible)
  • Impact: High (security improvement)

Evidence/examples

No response

Additional information

No response

extent analysis

Problem Summary

Add a global output‑sanitizer middleware that automatically redacts secrets in every agent response (tool results, final messages, etc.) and make it configurable via outputSanitization in openclaw.json.

Root Cause Analysis

  • The existing output‑sanitizer skill is only invoked manually.
  • No hook in the response pipeline forces its execution, so any raw file read or tool output can leak secrets.

Fix Plan

1. Add Config Schema

// src/config/schema.ts
export interface OutputSanitizationConfig {
  enabled: boolean;          // master switch
  mode: 'auto' | 'strict' | 'off';
  scope: 'all' | 'tools' | 'responses';
  redactPatterns: string[];  // regex strings
  exceptions?: {
    allowInCodeBlocks?: boolean;
    allowInQuotes?: boolean;
    allowList?: string[];
  };
  reporting?: {
    enabled: boolean;
    level: 'info' | 'warn' | 'error';
    includeOriginal: boolean;
  };
}

Add defaults (mirroring the JSON in the issue) in defaultConfig.ts.
Make the top‑level config load it:

// src/config/load.ts
import { OutputSanitizationConfig } from './schema';
const defaultSanitizer: OutputSanitizationConfig = {
  enabled: false,
  mode: 'auto',
  scope: 'all',
  redactPatterns: [
    '\\bAKIA[0-9A-Z]{16}\\b',
    '\\bsk-[a-zA-Z0-9]{48}\\b',
    '\\bsk-ant-[a-zA-Z0-9-]{80,}\\b',
    '\\bghp_[a-zA-Z0-9]{36}\\b',
    '\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}\\b',
    '\\b\\d{
