openclaw - 💡 (How to fix) Feature Request: PII Redaction Layer [1 participant]

GitHub stats
openclaw/openclaw#56911 · Fetched 2026-04-08 01:46:09
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0


Feature Request: PII Redaction Layer

Problem Statement

OpenClaw's power comes from deep integration with users' personal data — emails, calendars, files, contacts, financial information. All of this context is sent unredacted to LLM providers. Users who follow the recommended patterns (USER.md with personal details, MEMORY.md with life context) are sending sensitive information to third-party APIs with no automatic protection.

This is a trust and adoption blocker, especially for privacy-conscious users and enterprise deployments.

Proposed Solution

Pre-Processing Redaction Layer

  • Sits between the agent and the LLM provider in the gateway.
  • Scans all outgoing context for PII patterns.
  • Replaces detected PII with reversible placeholders: [PHONE_1], [EMAIL_2], [ADDRESS_1].
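The scan-and-replace step above could be sketched roughly as follows. This is a minimal illustration, not the gateway's implementation: the function name `redact`, the two patterns, and the choice of per-category numbering ([EMAIL_1], [PHONE_1]) are all assumptions made for the example, and the patterns are deliberately simplistic.

```python
import re

# Illustrative patterns only (assumptions); real detection needs broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"(?:\+?1[-.\s]?)?\(?[0-9]{3}\)?[-.\s]?[0-9]{3}[-.\s]?[0-9]{4}"),
}

def redact(text):
    """Return (redacted_text, mapping), where mapping is placeholder -> original."""
    mapping = {}
    seen = {}      # original value -> placeholder, so repeats reuse one placeholder
    counters = {}  # category -> last index handed out

    for category, pattern in PATTERNS.items():
        def substitute(match, category=category):
            value = match.group(0)
            if value not in seen:
                counters[category] = counters.get(category, 0) + 1
                placeholder = f"[{category}_{counters[category]}]"
                seen[value] = placeholder
                mapping[placeholder] = value
            return seen[value]
        text = pattern.sub(substitute, text)
    return text, mapping
```

Reusing one placeholder for a repeated value keeps the redacted text consistent, so the model can still tell that two mentions refer to the same address or number.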

Supported Categories

  • Phone numbers (international formats)
  • Email addresses
  • Physical addresses
  • Social Security Numbers / national IDs
  • Credit card numbers
  • Bank account numbers
  • Dates of birth
  • IP addresses
  • API keys and tokens

Reversible Mapping

  • Local-only mapping file stores placeholder → original pairs.
  • Restoration happens at tool execution time: when the agent needs to send an actual email or make a call, the real value is injected.
  • Mapping file is session-scoped and never sent to providers.
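Restoration at tool execution time might look something like the sketch below. The `restore` helper and the shape of the tool arguments are hypothetical; the issue does not specify how tool calls are represented in the gateway.

```python
def restore(value, mapping):
    """Recursively replace placeholders inside strings, lists, and dicts."""
    if isinstance(value, str):
        for placeholder, original in mapping.items():
            value = value.replace(placeholder, original)
        return value
    if isinstance(value, list):
        return [restore(item, mapping) for item in value]
    if isinstance(value, dict):
        return {key: restore(item, mapping) for key, item in value.items()}
    return value  # numbers, booleans, and None pass through unchanged
```

Because the function returns new objects instead of mutating its input, the redacted arguments can still be logged or sent back to the provider, while only the restored copy reaches the tool.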

Configuration

pii:
  enabled: true
  categories:
    phone: true
    email: true
    address: true
    financial: true    # SSN, credit cards, bank accounts
    api_keys: true
  exclude_providers:
    - local/ollama     # no need to redact for local models
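
As a rough illustration of how a gateway might interpret this configuration, the sketch below mirrors the YAML as a Python dict. The helper names `should_redact` and `enabled_categories` are assumptions, and how OpenClaw would actually load or validate the file is not stated in the issue.

```python
# The proposed YAML, mirrored as a plain dict for illustration.
PII_CONFIG = {
    "enabled": True,
    "categories": {
        "phone": True,
        "email": True,
        "address": True,
        "financial": True,
        "api_keys": True,
    },
    "exclude_providers": ["local/ollama"],
}

def should_redact(config, provider):
    """Redact unless the feature is off or the provider is excluded."""
    if not config.get("enabled", False):
        return False
    return provider not in config.get("exclude_providers", [])

def enabled_categories(config):
    """List the category names that are switched on."""
    return [name for name, on in config.get("categories", {}).items() if on]
```

Defaulting `enabled` to `False` when the key is missing is a choice made for this sketch; a privacy-first deployment might prefer the opposite default.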

Prior Art

A community skill (pii-redaction) already exists with regex-based detection and redact/restore modes. Promoting this to a core gateway feature would increase adoption and ensure consistent protection.

User Impact

  • Privacy: Users can share personal context freely, knowing sensitive data is scrubbed before reaching providers.
  • Trust: Lowers the barrier for users hesitant to put real personal data in USER.md and MEMORY.md.
  • Compliance: Helps with GDPR, CCPA, and other data protection requirements.
  • Enterprise: A prerequisite for organizational deployments where data handling policies are strict.

Technical Considerations

  • Detection accuracy: Regex-based detection is fast but produces both false positives and false negatives. A hybrid approach (regex + lightweight NER) should improve accuracy.
  • Performance: The layer must add less than 50 ms of latency per message. Regex scanning typically takes under 5 ms.
  • Restoration timing: Critical to restore PII at the right moment — before tool execution (sending email), not before LLM processing.
  • Context coherence: The LLM must still be able to reason about redacted data structurally ("call [PHONE_1]" should work).
  • Existing skill: The pii-redaction skill in the community provides a solid starting point for the regex patterns and restore logic.
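
On the detection-accuracy point, one inexpensive complement to regex matching for the credit card category is the Luhn checksum, which payment card numbers satisfy. The issue does not propose this; it is offered only as a sketch of how a regex hit could be validated before redaction. The names `CARD_RE`, `luhn_valid`, and `find_card_numbers` are hypothetical.

```python
import re

# Candidate matcher (an assumption): 13 to 19 digits, optionally separated
# by single spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")

def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13 or len(digits) > 19:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

def find_card_numbers(text):
    """Keep only regex candidates that also pass the checksum."""
    return [m.group(0) for m in CARD_RE.finditer(text) if luhn_valid(m.group(0))]
```

A check like this filters out order numbers and other long digit runs that merely look like card numbers, at negligible cost against the latency budget.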

Priority

MEDIUM. Important for trust, adoption, and compliance. Not blocking current users but would significantly expand the addressable user base. Low implementation cost given the existing community skill.

extent analysis

Fix Plan

To implement the PII redaction layer, follow these steps:

  • Step 1: Create a regex-based detection system
    • Utilize the existing pii-redaction community skill as a starting point
    • Implement regex patterns for each supported category (phone numbers, email addresses, physical addresses, etc.)
    • Example regex pattern for phone numbers: \+?1?\s*\(?([0-9]{3})\)?[-.\s]?([0-9]{3})[-.\s]?([0-9]{4})
  • Step 2: Develop a reversible mapping system
    • Create a local-only mapping file to store placeholder → original pairs
    • Implement a function to replace detected PII with reversible placeholders (e.g., [PHONE_1])
    • Example code snippet in Python:

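import re

# Phone pattern from Step 1 (North American formats only)
PHONE_PATTERN = r'\+?1?\s*\(?([0-9]{3})\)?[-.\s]?([0-9]{3})[-.\s]?([0-9]{4})'

def replace_pii(text, category, placeholder):
    # Replace detected PII with placeholder.
    # Only the phone pattern is wired up here; `category` would select the pattern.
    text = re.sub(PHONE_PATTERN, placeholder, text)
    return text

def restore_pii(text, mapping):
    # Restore original PII from mapping (placeholder -> original)
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text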

  • Step 3: Integrate the PII redaction layer into the gateway
    • Add the PII redaction layer between the agent and the LLM provider
    • Configure the layer to scan outgoing context for PII patterns and replace detected PII with reversible placeholders
    • Example configuration in YAML:

pii:
  enabled: true
  categories:
    phone: true
    email: true
    address: true
    financial: true
    api_keys: true
  exclude_providers:
    - local/ollama

  • Step 4: Implement restoration timing and context coherence
    • Restore original PII at the right moment (before tool execution)
    • Ensure the LLM can still reason about redacted data structurally

Verification

To verify the fix, test the PII redaction layer with various input scenarios, including:

  • Phone numbers in different formats
  • Email addresses
  • Physical addresses
  • Financial information (SSN, credit cards, bank accounts)
  • API keys and tokens

Check that the layer correctly replaces detected PII with reversible placeholders and restores the original PII at the right moment.
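
A round-trip check along these lines could cover the phone-number case. It is a sketch under stated assumptions: `redact_phones`, `restore`, and `check_round_trip` are hypothetical helpers, and the pattern is a simplified North American one, not the community skill's.

```python
import re

# Simplified North American phone pattern (an assumption for this sketch).
PHONE_RE = re.compile(r"(?:\+?1[-.\s]?)?\(?[0-9]{3}\)?[-.\s]?[0-9]{3}[-.\s]?[0-9]{4}")

def redact_phones(text):
    """Replace each phone match with a numbered placeholder."""
    mapping = {}
    def substitute(match):
        placeholder = f"[PHONE_{len(mapping) + 1}]"
        mapping[placeholder] = match.group(0)
        return placeholder
    return PHONE_RE.sub(substitute, text), mapping

def restore(text, mapping):
    """Put the original values back in place of the placeholders."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text

def check_round_trip(samples):
    """Return the samples that leak through redaction or fail to restore."""
    failures = []
    for number in samples:
        original = f"Call {number} tomorrow."
        redacted, mapping = redact_phones(original)
        leaked = number in redacted
        if leaked or restore(redacted, mapping) != original:
            failures.append(number)
    return failures

PHONE_SAMPLES = ["555-123-4567", "(555) 123-4567", "+1 555.123.4567", "5551234567"]
```

An empty list from `check_round_trip(PHONE_SAMPLES)` means every sample was both hidden from the redacted text and restored exactly. The same harness shape would extend to the other categories by swapping in their patterns and samples.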

Extra Tips

  • Consider using a hybrid approach (regex + lightweight NER) to improve detection accuracy
  • Monitor performance and adjust the implementation as needed to ensure <50 ms of added latency per message
