claude-code - 💡(How to fix) Fix Default auto-memory: prohibition-framed feedback memories may prime the model toward the patterns they aim to suppress [1 participants]

claude-code2026-05-10 03:29:15

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

anthropics/claude-code#57747•Fetched 2026-05-11 03:26:28

View on GitHub

Comments

Participants

Timeline

Reactions

Author

ashahawy2

Participants

ashahawy2

Timeline (top)

labeled ×2

Root Cause

Notably, the system already documents this exact principle for runtime CLAUDE.md authorship — describe violations abstractly, don't quote the literal banned phrase, because quoting primes the model. That insight just hasn't been generalised to the memory-writing layer.

RAW_BUFFERClick to expand / collapse

Observation

The default auto-memory system creates feedback memories when the user corrects the model. The natural way to encode "Claude did X wrong" is "don't do X" — so over time the MEMORY.md index fills with prohibition-led entries that load into context every session.

This can prime the model toward the patterns it's trying to suppress. Transformer attention has no negation gate: the tokens of a banned phrase activate whether they're prefixed with "DON'T", "NEVER", or "BAD example:". Show the model "don't say Let me check…" and it can produce "Let me wait for…" — same Let me… pattern, primed by the warning.

Concrete case

After ~6 weeks of accumulating corrections in a long-running project, I had 32 feedback memories. The two rules I broke most often — "diagnose, don't fix" and "never bypass the dev system" — were also the most prohibition-heavy in framing. Less prohibition-heavy memories ("always allow task tool access", "verify CLI flags") were honoured consistently across sessions.

Migrating the 32 files from prohibitive titles + lead sentences ("Never use Haiku", "Don't skip systematic-debugging", "No self-authorize #direct-edit") to prescriptive ones ("Use Opus or Sonnet", "Run systematic-debugging under deploy pressure", "User authorizes #direct-edit") is the natural fix. Incident detail (dates, IDs, what happened) goes into a Background section below the prescriptive instruction so the audit trail stays intact.

Why this isn't widely reported (best guesses)

Most users probably don't accumulate 32 feedback memories — the priming is dose-dependent.
When the model repeats a known-bad pattern, the read is "the model is unreliable", not "my feedback memory's framing put the bad pattern back into attention." The chain takes a specific mental model to spot.
The default writing trigger (user just corrected the model) makes the corpus skew prohibitive by construction. The system-prompt guidance to "lead with the rule itself" doesn't distinguish positive from negative framing.

Suggested fix in the auto-memory system prompt

In the <type><name>feedback</name> section of the memory instructions:

Nudge the title and lead sentence toward the desired behaviour, not the prohibition. The example bodies already use a **How to apply:** block with prescriptive bullets — extend that pattern to the title.
Add a one-line note on why prescriptive framing matters (the priming / attention-window mechanism) so the model self-applies the rule when writing new memories.
Generalise the existing "describe negative examples abstractly" rule (currently aimed at runtime CLAUDE.md authorship) to memory writing itself.

A small change to the system-prompt <body_structure> / <when_to_save> guidance would prevent the corpus skew without requiring users to discover the issue manually.

Scope

Not a request to remove the bug registry or stop recording incidents — those serve real audit purposes. The thing to flip is the behaviour-shaping layer (the feedback type), not the descriptive layer (project notes, debugging history).

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #ssr #logging issue #authentication issue #prompt issue

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

claude-code - 💡(How to fix) Fix Default auto-memory: prohibition-framed feedback memories may prime the model toward the patterns they aim to suppress [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Observation

Concrete case

Why this isn't widely reported (best guesses)

Suggested fix in the auto-memory system prompt

Scope

Still need to ship something?

TRENDING

claude-code - 💡(How to fix) Fix Default auto-memory: prohibition-framed feedback memories may prime the model toward the patterns they aim to suppress [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Observation

Concrete case

Why this isn't widely reported (best guesses)

Suggested fix in the auto-memory system prompt

Scope

Still need to ship something?

RELATED_DISCOVERY

TRENDING