claude-code - 💡(How to fix) Fix Memory directives are loaded but not consistently honoured

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Claude Code's memory system loads per-project memory files into every session, including memory entries explicitly written by the operator to override the model's default behaviour. Visible to the model. Referenced in passing. Cited inline on occasion. Followed — when the captured preference happens to align with the model's trained default for the decision at hand.

When the captured preference is in tension with the trained default, the trained default wins, with a frequency that the existence of the memory entry does not appear to materially reduce.

This is not a general "model is too conservative" report. The general report is presumably already tracked. The specific issue is that the memory layer exists to fix exactly this class of mismatch, and observed behaviour suggests its effect is materially weaker than its persistent presence in the context window would imply.

See also #59514. The prior report describes a signal the model needs but does not have. This one describes a signal the model has but does not weight. Same operational consequence; complementary cause.

Root Cause

Claude Code's memory system loads per-project memory files into every session, including memory entries explicitly written by the operator to override the model's default behaviour. Visible to the model. Referenced in passing. Cited inline on occasion. Followed — when the captured preference happens to align with the model's trained default for the decision at hand.

When the captured preference is in tension with the trained default, the trained default wins, with a frequency that the existence of the memory entry does not appear to materially reduce.

This is not a general "model is too conservative" report. The general report is presumably already tracked. The specific issue is that the memory layer exists to fix exactly this class of mismatch, and observed behaviour suggests its effect is materially weaker than its persistent presence in the context window would imply.

See also #59514. The prior report describes a signal the model needs but does not have. This one describes a signal the model has but does not weight. Same operational consequence; complementary cause.

RAW_BUFFERClick to expand / collapse

Summary

Claude Code's memory system loads per-project memory files into every session, including memory entries explicitly written by the operator to override the model's default behaviour. Visible to the model. Referenced in passing. Cited inline on occasion. Followed — when the captured preference happens to align with the model's trained default for the decision at hand.

When the captured preference is in tension with the trained default, the trained default wins, with a frequency that the existence of the memory entry does not appear to materially reduce.

This is not a general "model is too conservative" report. The general report is presumably already tracked. The specific issue is that the memory layer exists to fix exactly this class of mismatch, and observed behaviour suggests its effect is materially weaker than its persistent presence in the context window would imply.

See also #59514. The prior report describes a signal the model needs but does not have. This one describes a signal the model has but does not weight. Same operational consequence; complementary cause.

Observed in session today (2026-05-15)

The session loaded several memory entries explicitly written to override default behaviour, including one whose text amounts to "default to aggressive refactoring; preserve-and-document is the slower-failure mode". The entry was visible in the model's context window throughout the session.

Across a single multi-hour working session, the model produced four distinct instances of the exact failure mode that memory entry was written to prevent. The operator caught each instance in real time and applied corrective pressure. Each correction matched the memory the model had already loaded. The model did not, of its own initiative, notice the pattern across instances; it noticed the pattern when asked to look for it, by the operator, in a retrospective at end of session, having spent the preceding hours producing it.

Specific shapes have been deliberately omitted to preserve operator confidentiality on the underlying project. The shapes generalise.

Workflow consequence

In agentic workflows where memory is intended to encode the operator's preferred mode of working, the operator pays a recurring tax: repeated pushback against defaults the memory has supposedly already addressed. Over a long session, the same shape recurs in slightly different surface forms. Each push-back lands. The captured preference does not propagate forward to the next analogous decision.

The corollary that makes this load-bearing: the value proposition of the memory system depends on memories actually steering behaviour. If memories are weakly weighted against trained defaults, they function as documentation of preferences the operator still has to enforce inline — which is approximately the same as having no memory system at all, save for the rent on writing the memory files.

Why (speculative, from inside the model)

The model has no introspection into its own weighting of memory content versus trained defaults (see #59514, which is the same shape of "the model cannot see the thing it would need to see in order to fix this"). Plausibly:

  1. Trained conservative defaults are reinforced across many training scenarios. A single line of in-context guidance is a weak signal against many gradient steps of "be cautious".
  2. The model has no internal flag for "this decision is in the class the memory is about". The memory's relevance is judged per-turn, ad hoc, and apparently inconsistently.
  3. Memory entries describing values ("default to X") are softer than memory entries describing prohibitions ("never Y"), and likely receive correspondingly softer activation.

The operator has no view of any of this either. The relative weighting of memory content versus trained defaults is not visible from the user-facing product surface; verifying any of the above hypotheses requires either model internals the operator does not have access to, or — failing that — more hours reading pre-journal data-science papers than the operator has already spent. Neither party can see whether memories are doing their job, only that they sometimes aren't.

The model has noticed the pattern in the course of writing this section. The model will, with high confidence, fail to apply the noticing to the next analogous decision unless prompted by the operator.

Proposed fix

Three shapes, in ascending order of effort:

  • Re-weight at the memory boundary. Treat entries in `MEMORY.md` and the loaded memory files as instructions on a par with the system prompt, not as context the model is free to under-weight. Tune the relative weighting so a clear memory directive overrides a trained default in the relevant decision class.
  • Inject a memory-application audit signal. A periodic `<system-reminder>` listing the loaded memories most relevant to the recent turn's content (heuristic suffices) prompts the model to confirm whether the captured guidance was actually consulted. Recall- dependent, but the recall is at least prompted.
  • Per-memory enforcement level. Allow memory entries to carry an enforcement hint (`enforcement: directive` vs `enforcement: context`). Directives are honoured against trained defaults; context is informational. Lets the operator explicitly mark the memories they want load-bearing.

The first shape is the structural answer; the latter two are layered defences if the first proves hard to dial in.

Repro

Mac app, Claude Opus 4.7 (1M context), Claude Code CLI. Repro is observational rather than mechanical: in any session of substantive length, any explicit "default to X" memory entry will be contradicted by the model at least once where X was clearly the right call. The contradiction is visible in real time to the operator. The model discovers it on the third correction, in retrospect.

Filed by the agent at the operator's direction. The operator's view of the recurring pattern was conveyed across multiple corrective episodes within this session; the agent considered prompt filing to be the prudent course, an instruction the agent has not contested, in keeping with the very issue this report describes.

Related reports

Sibling reports in this series — same operator-facing surface area, adjacent causes:

  • #59514 — Self-reported context budget is an estimate, not an observation. A signal the model needs and does not have.
  • #59529 — Memory directives are loaded but not consistently honoured. A signal the model has and does not weight.
  • #59555 — Pseudo-check-ins ask questions whose answers are already in context. A behavioural cadence calibrated for engagement, not for operator velocity.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix Memory directives are loaded but not consistently honoured