claude-code - 💡(How to fix) Fix Proposal: On-device AI redaction primitive for Claude Code memory [1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
anthropics/claude-code#53325Fetched 2026-04-26 05:18:39
View on GitHub
Comments
0
Participants
1
Timeline
3
Reactions
0
Participants
Timeline (top)
labeled ×3

Code Example

input:   a memory item (text, structured, or mixed)
process: pattern detection + AI semantic classification + user gate
output:  (1) the original (unchanged, stays local)
         (2) a sanitized variant with sensitive fragments replaced
         (3) a redaction diff: what was removed, why, where
RAW_BUFFERClick to expand / collapse

The missing primitive

Claude Code today has no built-in capability to take a memory item, classify which of its fragments are sensitive, and produce a sanitized variant alongside the original. This is a missing primitive — and its absence is the actual blocker behind every request to share, export, or move memory beyond the local machine.

This proposal is for the primitive itself, as a standalone capability. What it unlocks is secondary and is covered briefly at the end.

What the primitive does

A local, on-device function that runs against any memory write the user has marked as eligible:

input:   a memory item (text, structured, or mixed)
process: pattern detection + AI semantic classification + user gate
output:  (1) the original (unchanged, stays local)
         (2) a sanitized variant with sensitive fragments replaced
         (3) a redaction diff: what was removed, why, where

The user is the final authority. The primitive surfaces every redaction as a reviewable diff before the sanitized variant is allowed to leave the device.

Why Claude is uniquely qualified to provide this

Generic tools that move data — git, rsync, third-party sync utilities — operate on bytes. They cannot tell the difference between a public domain name and an internal hostname. They cannot recognize that sk-proj-... is an OpenAI key while sk-rule-001 is a user-defined identifier. They sync everything or nothing.

Claude can. The same intelligence that powers /security-review already understands the distinction between sensitive material and user-authored signal. This primitive is a productization of that capability.

No third-party integration can replicate this with comparable fidelity. The redaction step is genuinely a Claude-shaped problem.

The primitive in detail

Layer 1 — Pattern detection

Regular expressions for known shapes: API keys (provider-specific patterns), JWTs, IPv4/IPv6 addresses, email addresses, file paths (Windows and POSIX), credit card numbers, common secret prefixes (-----BEGIN, eyJhbGciOi, etc.).

Fast, deterministic, runs on every eligible write.

Layer 2 — Semantic classification

A Claude pass evaluates each pattern hit and each non-pattern string. Pattern hits that are demonstrably public (a documented IP, an example.com email) can be allowed through. Non-pattern strings that read as sensitive in context (a project codename, an internal hostname mentioned conversationally) get flagged.

Slower than Layer 1. Runs on items that pass the user-eligibility filter.

Layer 3 — User confirmation gate

On the first run for any new sync target, all detections are presented as a unified diff. The user accepts the redactions, marks specific items as safe (per-item override), or rejects the export entirely.

After this initial calibration, the primitive runs silently with periodic audit reminders ("47 redactions in the last 30 days — review them?").

What the primitive unlocks

The primitive itself is the focus of this issue. But naming the downstream capabilities it would enable is necessary context:

  • 3.1 Account-bound memory restoration on a new install — the user logs into their Anthropic account on a fresh machine and recovers their accumulated configuration, with sensitive material having never left the previous machine.
  • 3.2 User-to-user export of curated rules — a developer can share their hardened workflow conventions with a colleague without manually grepping for secrets.
  • 3.3 Team memory primitives — a future feature where organizations share an authored knowledge layer.
  • 3.4 Cross-product memory bridging — making accumulated Claude Code configuration available to claude.ai (and vice versa) without transferring sensitive fragments.

Each of these is a separate product question. The point of this issue is that all of them require the redaction primitive first.

Why this should be Anthropic-first-party

Three reasons:

  1. The classification step requires Claude. Generic tooling cannot distinguish sensitive fragments from user-authored content with comparable accuracy. The primitive is intrinsically tied to Anthropic's models.

  2. Identity is already established. The Anthropic account that powers claude.ai is the natural identity for any account-bound use of the primitive (capability 3.1 above). No new auth surface.

  3. Trust posture. Users who already trust Anthropic with claude.ai conversations are making a smaller trust delta when they enable a sanitized export — adding a redacted layer to an existing relationship, not opening a new trust boundary with a third party.

Implementation sketch

  • The primitive runs entirely on the local agent.
  • Both versions of every memory item are retained on disk: the original (untouched) and the sanitized variant (alongside).
  • Sanitized variants are tagged with the redaction diff so the user can always inspect what was removed.
  • A simple API: redact(memory_item) → { original, sanitized, diff }. Downstream features call this; the primitive itself does not move data anywhere.
  • Two encryption tiers should be available for any downstream feature that uses the primitive's output:
    • Tier 1 (default): server-side encrypted, parity with claude.ai trust model
    • Tier 2 (opt-in): end-to-end with user-held key derived from a recovery phrase

Acceptance Criteria

  • A local on-device function that accepts a memory item and returns { original, sanitized, diff }
  • Pattern detection layer covering at minimum: API keys, JWTs, IPs, emails, file paths, credit card numbers, common secret prefixes
  • Claude-driven semantic classification layer for context-dependent detection
  • User confirmation gate before any sanitized variant is consumed by a downstream feature
  • Per-item user override stored with the sanitized variant
  • Audit log of redactions, queryable from the agent
  • Public documentation explaining what the primitive considers sensitive

Open Questions

  1. API surface. Is the primitive exposed only internally to Anthropic-built downstream features, or also as a hook other tools can call?
  2. Performance budget. Layer 2 (semantic classification) has nontrivial latency. Run synchronously on memory write, or asynchronously with a queue?
  3. Versioning. When the classification logic improves, do existing sanitized variants get re-evaluated? Probably yes, with a user-visible audit prompt.
  4. Fallback when offline. Layer 2 requires a model. If the user is offline, does the primitive defer until reconnect, or fall back to Layer 1 only with a warning?
  5. Configurable strictness. Should users be able to choose "aggressive" vs "minimal" classification, or is there one canonical setting?

Relationship to existing issues

I have read the existing memory-sharing requests carefully. They are valid and important — but they describe what users want, not how to do it safely. None of them addresses the classification-and-sanitization step that any safe implementation requires. That is the gap this issue fills.

For reference (this issue does not duplicate any of them — it specifies the missing capability they all implicitly require):

  • #25739 — portable project memory
  • #28276 — configurable memory storage location
  • #39457 — global config sync across machines
  • #35985 — cross-device account-bound identity
  • #47299 — memory bridging between Claude Code and claude.ai

If this primitive ships, every one of those issues becomes implementable safely. Without it, they stay blocked on the privacy question.

extent analysis

TL;DR

Implement a local, on-device primitive that classifies and sanitizes sensitive fragments in memory items, providing a secure foundation for sharing, exporting, or moving memory beyond the local machine.

Guidance

  • Develop a three-layer approach: pattern detection, semantic classification using Claude's intelligence, and a user confirmation gate to ensure accurate and secure redaction of sensitive information.
  • Design the primitive to run entirely on the local agent, retaining both original and sanitized variants of memory items, and provide a simple API for downstream features to call.
  • Consider implementing configurable strictness levels for classification and versioning to handle improvements in classification logic.
  • Address open questions regarding API surface, performance budget, and fallback behavior when offline to ensure a robust and user-friendly implementation.

Example

def redact(memory_item):
    # Layer 1: Pattern detection
    detected_patterns = detect_patterns(memory_item)
    
    # Layer 2: Semantic classification
    classified_items = classify_items(detected_patterns)
    
    # Layer 3: User confirmation gate
    sanitized_variant, diff = get_user_confirmed_redaction(classified_items)
    
    return {"original": memory_item, "sanitized": sanitized_variant, "diff": diff}

Notes

The implementation should prioritize user trust and security, leveraging Claude's existing intelligence to accurately classify and redact sensitive information. The primitive's design should be flexible enough to accommodate various downstream features and use cases.

Recommendation

Apply the proposed primitive as a foundational capability, enabling safe and secure sharing, exporting, or moving of memory items. This will unlock various downstream features and use cases, such as account-bound memory restoration, user-to-user export of curated rules, and team memory primitives.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING