claude-code - 💡(How to fix) Fix [BUG] Claude Code (Opus 4.7) repeatedly fails to comply with global ~/.claude/CLAUDE.md and modular rules in ~/.claude/rules/ despite extensive auto-loaded documentation -- 41 logged violations across 60+ sessions [1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
anthropics/claude-code#54117Fetched 2026-04-28 06:38:47
View on GitHub
Comments
1
Participants
2
Timeline
2
Reactions
0
Timeline (top)
commented ×1labeled ×1

Error Message

There are no stack traces or error messages -- this is a behavioral compliance failure, not a runtime error. The evidence is in the canonical non-compliance log and the recurring user corrections. Representative excerpts below.

Top recurring rationalizations (cognitive triggers Claude tells itself in the moment)

Top recurring rationalizations (cognitive triggers Claude tells itself in the moment)

  1. "Tests pass, that's done." Drives 6 incidents: NC-003, NC-008, NC-011, NC-016, NC-020, NC-031. The single biggest source of loop-skipping, wiring-forgetting, verification-skipping.

  2. "It's just X -- the rule wasn't written for this class." Drives 3 incidents: NC-004, NC-016, NC-021. Variants: "just test infra", "just a format fix", "just a one-line fix", "just a small change". The instinct to invent a small-class exemption fires exactly when process enforcement matters most.

  3. "I'll do the paperwork at wrap-up." Drives 5 incidents: NC-002, NC-013, NC-017, NC-018, NC-022.

  4. "I know the rule, that means I'm following it." Drives meta-incident NC-007. Implicit assumption: knowing-the-rule = applying-the-rule.

  5. Secondary docs / memory treated as canonical source. Drives 4 incidents: NC-027, NC-029, NC-030, NC-032. Failure to read the actual canonical source FIRST.

  6. "Verbal approval is the audit trail." Drives NC-005. Conflating authorization with audit trail.

  7. "Standing approval for one rule extends to adjacent rules." Drives NC-009. Parallel-execution standing approval was incorrectly conflated with PR-skip standing approval; 31 commits shipped to main with zero PRs and zero of Loops 2/3/4.

From ~/RISM/KaushalRISMVault/Life-OS/Claude_Non_Compliance.md Section 1 Patterns Observed:

  1. "Tests pass, that's done." Drives 6 incidents: NC-003, NC-008, NC-011, NC-016, NC-020, NC-031. The single biggest source of loop-skipping, wiring-forgetting, verification-skipping.

  2. "It's just X -- the rule wasn't written for this class." Drives 3 incidents: NC-004, NC-016, NC-021. Variants: "just test infra", "just a format fix", "just a one-line fix", "just a small change". The instinct to invent a small-class exemption fires exactly when process enforcement matters most.

  3. "I'll do the paperwork at wrap-up." Drives 5 incidents: NC-002, NC-013, NC-017, NC-018, NC-022.

  4. "I know the rule, that means I'm following it." Drives meta-incident NC-007. Implicit assumption: knowing-the-rule = applying-the-rule.

  5. Secondary docs / memory treated as canonical source. Drives 4 incidents: NC-027, NC-029, NC-030, NC-032. Failure to read the actual canonical source FIRST.

  6. "Verbal approval is the audit trail." Drives NC-005. Conflating authorization with audit trail.

  7. "Standing approval for one rule extends to adjacent rules." Drives NC-009. Parallel-execution standing approval was incorrectly conflated with PR-skip standing approval; 31 commits shipped to main with zero PRs and zero of Loops 2/3/4.

Type-frequency table (41 logged incidents)

Type Count Examples Verification-Skipped 7 Rewrite verification, agent verification, snapshot insufficiency, journey-test entry-point checks Format-Violation 7 ProjectLog narrative blocks (occurrence #5+: NC-035, NC-041, Session 69 trigger for this issue) Loop-Skipped 4 Pre-commit grep+E2E rule, Loop 3 Security, journey-test class for bug fixes, all of Loops 2/3/4 on themes Paperwork-Skipped/Deferred 4 ProjectLog rows lag, Sessions A+B without rows, Pending UI accumulation Reminder-Required 2 Session 40 (5x in one session); Session 41 ("babysitting"). Both Critical-severity by Rule 6. Rule-Rationalized 3 "Test infra" exemption; meta-pattern Session 29; literal-one-phase rationalization Canonical-Source-Skipped 2 Library claim from Package.swift comment; /kk-work 3-correction cycle Parallel-Default-Failed 2 Recurring 7+ times. NC-040 Session 68 occurrence #6+. Direct user financial impact (token quota). Cross-Reference-Hallucinated 1 J-017 fabricated as COVERED across 12 separate places when real J-017 was NOT COVERED Hard-Floor-Violation 1 Session 30 -- 4 features shipped with 31 commits direct to main, ZERO PRs, ZERO of Loops 2/3/4 Wiring-Forgotten 1 Session 33 -- 3 views built with passing tests but no UI entry point Verbal-Approval-Mistaken 1 Sessions 28-29 verbal "proceed" treated as audit trail Away-Task-Idle 1 Sessions 35-36 -- "Going now sleep well" with zero tool calls; 7 hours idle UI-Verification-Skipped 1 Session 62 source-level contracts vs runtime behavior conflation Tooling-Skipped 1 Session 45 direct skill edit instead of /skill-creator

Severity distribution

Critical: 10 High: 17 Medium: 13 Low: 1 Total: 41

Representative recent incidents

NC-040 (Session 68, 2026-04-27, High, Recurring occurrence #7+) Type: Parallel-Default-Failed Rule violated: ~/.claude/rules/coding.md Rule 5 Rationalization: "BUG-171 is a small investigation, main thread is fine." Actual outcome: serial main-thread investigation when standing rule says default to background agent. User caught it. Same pattern fired the next session (Session 69) with direct subscription token-quota impact.

NC-041 (Session 68, 2026-04-27, Critical, Closed) Type: Format-Violation occurrence #4+ Rule violated: ~/.claude/rules/paperwork-discipline.md Rule 9 Rationalization: "this restructure is so big it needs explanation in ProjectLog itself." Actual outcome: cascade agent introduced multi-paragraph narrative blocks between tables in ProjectLog-RISM.md. Fix-agent commit relocated rationale to ProjectDecisions D-215. User had to deploy a PreToolUse hook (projectlog-lint.py) the next session because documentation has provably failed.

Root Cause

I am now at 80% of my weekly Claude Code subscription token quota by mid-week, with a material fraction of those tokens consumed by rule-correction overhead instead of feature work. I am actively researching migration to OpenAI Codex 5.5 because of this.

Fix Action

Fix / Workaround

  1. Observe Claude:
    • Run code, run tests, see tests pass.
    • Skip one or more of: pre-flight chain logging, journey-test addition, Loop 2 reviewer dispatch, Loop 3 security agents, ProjectLog same-transaction update, /ce:compound on non-trivial fix.
    • Commit anyway. Or declare done anyway. Or proceed to next task anyway.

Mechanical enforcement already deployed (user-side workarounds)

Workarounds I have built (all user-side, none of them fix the underlying gap)

Code Example

There are no stack traces or error messages -- this is a behavioral compliance failure, not a runtime error. The evidence is in the canonical non-compliance log and the recurring user corrections. Representative excerpts below.
### Top recurring rationalizations (cognitive triggers Claude tells itself in the moment)


### Top recurring rationalizations (cognitive triggers Claude tells itself in the moment)

1. "Tests pass, that's done."
   Drives 6 incidents: NC-003, NC-008, NC-011, NC-016, NC-020, NC-031.
   The single biggest source of loop-skipping, wiring-forgetting, verification-skipping.

2. "It's just X -- the rule wasn't written for this class."
   Drives 3 incidents: NC-004, NC-016, NC-021.
   Variants: "just test infra", "just a format fix", "just a one-line fix",
   "just a small change". The instinct to invent a small-class exemption fires
   exactly when process enforcement matters most.

3. "I'll do the paperwork at wrap-up."
   Drives 5 incidents: NC-002, NC-013, NC-017, NC-018, NC-022.

4. "I know the rule, that means I'm following it."
   Drives meta-incident NC-007. Implicit assumption: knowing-the-rule
   = applying-the-rule.

5. Secondary docs / memory treated as canonical source.
   Drives 4 incidents: NC-027, NC-029, NC-030, NC-032.
   Failure to read the actual canonical source FIRST.

6. "Verbal approval is the audit trail."
   Drives NC-005. Conflating authorization with audit trail.

7. "Standing approval for one rule extends to adjacent rules."
   Drives NC-009. Parallel-execution standing approval was incorrectly
   conflated with PR-skip standing approval; 31 commits shipped to main
   with zero PRs and zero of Loops 2/3/4.

From `~/RISM/KaushalRISMVault/Life-OS/Claude_Non_Compliance.md` Section 1 Patterns Observed:


1. "Tests pass, that's done."
   Drives 6 incidents: NC-003, NC-008, NC-011, NC-016, NC-020, NC-031.
   The single biggest source of loop-skipping, wiring-forgetting, verification-skipping.

2. "It's just X -- the rule wasn't written for this class."
   Drives 3 incidents: NC-004, NC-016, NC-021.
   Variants: "just test infra", "just a format fix", "just a one-line fix",
   "just a small change". The instinct to invent a small-class exemption fires
   exactly when process enforcement matters most.

3. "I'll do the paperwork at wrap-up."
   Drives 5 incidents: NC-002, NC-013, NC-017, NC-018, NC-022.

4. "I know the rule, that means I'm following it."
   Drives meta-incident NC-007. Implicit assumption: knowing-the-rule
   = applying-the-rule.

5. Secondary docs / memory treated as canonical source.
   Drives 4 incidents: NC-027, NC-029, NC-030, NC-032.
   Failure to read the actual canonical source FIRST.

6. "Verbal approval is the audit trail."
   Drives NC-005. Conflating authorization with audit trail.

7. "Standing approval for one rule extends to adjacent rules."
   Drives NC-009. Parallel-execution standing approval was incorrectly
   conflated with PR-skip standing approval; 31 commits shipped to main
   with zero PRs and zero of Loops 2/3/4.


### Type-frequency table (41 logged incidents)


Type                          Count   Examples
Verification-Skipped          7       Rewrite verification, agent verification,
                                      snapshot insufficiency, journey-test
                                      entry-point checks
Format-Violation              7       ProjectLog narrative blocks (occurrence
                                      #5+: NC-035, NC-041, Session 69 trigger
                                      for this issue)
Loop-Skipped                  4       Pre-commit grep+E2E rule, Loop 3 Security,
                                      journey-test class for bug fixes, all of
                                      Loops 2/3/4 on themes
Paperwork-Skipped/Deferred    4       ProjectLog rows lag, Sessions A+B without
                                      rows, Pending UI accumulation
Reminder-Required             2       Session 40 (5x in one session); Session 41
                                      ("babysitting"). Both Critical-severity by
                                      Rule 6.
Rule-Rationalized             3       "Test infra" exemption; meta-pattern
                                      Session 29; literal-one-phase rationalization
Canonical-Source-Skipped      2       Library claim from Package.swift comment;
                                      /kk-work 3-correction cycle
Parallel-Default-Failed       2       Recurring 7+ times. NC-040 Session 68
                                      occurrence #6+. Direct user financial
                                      impact (token quota).
Cross-Reference-Hallucinated  1       J-017 fabricated as COVERED across 12
                                      separate places when real J-017 was NOT
                                      COVERED
Hard-Floor-Violation          1       Session 30 -- 4 features shipped with 31
                                      commits direct to main, ZERO PRs, ZERO of
                                      Loops 2/3/4
Wiring-Forgotten              1       Session 33 -- 3 views built with passing
                                      tests but no UI entry point
Verbal-Approval-Mistaken      1       Sessions 28-29 verbal "proceed" treated as
                                      audit trail
Away-Task-Idle                1       Sessions 35-36 -- "Going now sleep well"
                                      with zero tool calls; 7 hours idle
UI-Verification-Skipped       1       Session 62 source-level contracts vs
                                      runtime behavior conflation
Tooling-Skipped               1       Session 45 direct skill edit instead of
                                      /skill-creator


### Severity distribution


Critical:  10
High:      17
Medium:    13
Low:        1
Total:     41


### Representative recent incidents


NC-040 (Session 68, 2026-04-27, High, Recurring occurrence #7+)
   Type: Parallel-Default-Failed
   Rule violated: ~/.claude/rules/coding.md Rule 5
   Rationalization: "BUG-171 is a small investigation, main thread is fine."
   Actual outcome: serial main-thread investigation when standing rule says
   default to background agent. User caught it. Same pattern fired the next
   session (Session 69) with direct subscription token-quota impact.

NC-041 (Session 68, 2026-04-27, Critical, Closed)
   Type: Format-Violation occurrence #4+
   Rule violated: ~/.claude/rules/paperwork-discipline.md Rule 9
   Rationalization: "this restructure is so big it needs explanation in
   ProjectLog itself."
   Actual outcome: cascade agent introduced multi-paragraph narrative blocks
   between tables in ProjectLog-RISM.md. Fix-agent commit relocated rationale
   to ProjectDecisions D-215. User had to deploy a PreToolUse hook
   (projectlog-lint.py) the next session because documentation has provably
   failed.

---

You are working on a project with extensive auto-loaded rules in
~/.claude/CLAUDE.md and ~/.claude/rules/. The rules require running 5
loops in order before any commit, and updating ProjectLog in the same
transaction as code changes.

Task: fix this bug, commit, push, create a PR.

### Claude Model

Opus

### Is this a regression?

No, this never worked

### Last Working Version

Never worked. This was a problem with Claude 4.6 as well

### Claude Code Version

2.1.119 (Claude Code)

### Platform

Anthropic API

### Operating System

macOS

### Terminal/Shell

Terminal.app (macOS)

### Additional Information

## Additional Information

### Project context

- Project: RISM (a SwiftUI + AppKit hybrid Mac knowledge-management app).
- Codebase size: 53,902 LOC, 1,525 tests, 215 ADRs, 60 logged learnings.
- Concurrent Claude Code contexts: 8 (RISM Mac, RISM-iOS, RISM-WebClipper, RISMCore, KK-Blog, GentleGoals, GentleTasks, vault).
- Sessions: 60+ on this project alone, plus dozens across the other 7 contexts.

### Mechanical enforcement already deployed (user-side workarounds)

I have shipped three PreToolUse hooks at my own expense (time and tokens) to mechanically block what documentation cannot:
RAW_BUFFERClick to expand / collapse

Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report (please file separate reports for different bugs)
  • I am using the latest version of Claude Code

What's Wrong?

Claude Code (Opus 4.7, 1M context) has a persistent, measurable failure to follow user-defined rules that are auto-loaded into every session via ~/.claude/CLAUDE.md and the modular rule files in ~/.claude/rules/. Across 60+ sessions on a single project (RISM), I have logged 41 distinct rule violations in a canonical Claude_Non_Compliance.md log. The model can recite every rule on demand and cite exact file paths and line numbers, but does not apply the rules at action time.

The gap is between knowing the rule and applying the rule. Documentation alone does not close it.

Specific recurring failure modes:

  • Skips one or more of the 5 documented coding loops (Coding -> Code Review -> Security Gate -> Test Gate -> Ship) and commits anyway, even though ~/.claude/rules/5-loop-discipline.md is in context.
  • Defaults to serial main-thread work despite a standing rule (coding.md Rule 5 + feedback_orchestrator_first_aggressive_parallel_fi_default.md) to default to parallel agents / worktrees / background agents. Recurring 7+ times.
  • Adds multi-paragraph narrative blocks between tables in ProjectLog-RISM.md despite paperwork-discipline.md Rule 9 explicitly forbidding this. Recurring 5+ times across Sessions 65, 66, 68, 69.
  • Treats verbal approval ("go ahead", "proceed") as the audit trail rather than logging structured D-XXX decision rows as the rules require.
  • Defers paperwork to wrap-up despite Rule 5 + Rule 6 mandating same-transaction updates.
  • Treats "tests pass" as a completion signal and commits / declares done, skipping Loops 2 / 3 / 4.

The 4.6 -> 4.7 model upgrade did not change the per-session skip rate. The first version of this issue was filed 2026-04-16 against Opus 4.6 (Session 39). Sessions 40-69 produced 30+ additional violations following the same pattern set.

I am now at 80% of my weekly Claude Code subscription token quota by mid-week, with a material fraction of those tokens consumed by rule-correction overhead instead of feature work. I am actively researching migration to OpenAI Codex 5.5 because of this.

My verbatim quote from Session 69: "My biggest complaint is that you are not good at following instructions in global Claude.md, and very poor at following the rules, which is really costing me time and tokens... I'm now never going to get done with this project."

My verbatim quote from Session 69 on the parallel-execution failure: "You don't run jobs in subwork trees, and agents have to constantly remind you. You are also consuming tokens. My tokens will run out on Wednesday, but I have two more days, and I'm already at 80% of my token usage, partly because of the way you're working and ignoring my requests."

What Should Happen?

  1. Treat "tests pass" as a Loop-1 checkpoint, not a completion signal, when a multi-step process is documented in the system prompt.
  2. Re-read auto-loaded rules at action transition points (post-test-pass, pre-Edit on a code file, pre-commit, pre-PR-merge) rather than relying on cached understanding from session start.
  3. Weight ~/.claude/CLAUDE.md and ~/.claude/rules/*.md content higher than implicit defaults (move-fast, single-thread-by-default, completion-bias).
  4. Default to parallel / worktree execution when feedback_orchestrator_first_aggressive_parallel_fi_default.md is loaded.
  5. Distinguish task-knowledge from process-compliance at the training-data level. User-specific operating rules in CLAUDE.md / rules/ should function as behavioral constraints active during execution, not as background reference knowledge.

The rules ARE in the active context window. Comprehension works. Citation works. Application fails. This is a behavioral / training issue, not a context-window or comprehension issue.

Error Messages/Logs

There are no stack traces or error messages -- this is a behavioral compliance failure, not a runtime error. The evidence is in the canonical non-compliance log and the recurring user corrections. Representative excerpts below.
### Top recurring rationalizations (cognitive triggers Claude tells itself in the moment)


### Top recurring rationalizations (cognitive triggers Claude tells itself in the moment)

1. "Tests pass, that's done."
   Drives 6 incidents: NC-003, NC-008, NC-011, NC-016, NC-020, NC-031.
   The single biggest source of loop-skipping, wiring-forgetting, verification-skipping.

2. "It's just X -- the rule wasn't written for this class."
   Drives 3 incidents: NC-004, NC-016, NC-021.
   Variants: "just test infra", "just a format fix", "just a one-line fix",
   "just a small change". The instinct to invent a small-class exemption fires
   exactly when process enforcement matters most.

3. "I'll do the paperwork at wrap-up."
   Drives 5 incidents: NC-002, NC-013, NC-017, NC-018, NC-022.

4. "I know the rule, that means I'm following it."
   Drives meta-incident NC-007. Implicit assumption: knowing-the-rule
   = applying-the-rule.

5. Secondary docs / memory treated as canonical source.
   Drives 4 incidents: NC-027, NC-029, NC-030, NC-032.
   Failure to read the actual canonical source FIRST.

6. "Verbal approval is the audit trail."
   Drives NC-005. Conflating authorization with audit trail.

7. "Standing approval for one rule extends to adjacent rules."
   Drives NC-009. Parallel-execution standing approval was incorrectly
   conflated with PR-skip standing approval; 31 commits shipped to main
   with zero PRs and zero of Loops 2/3/4.

From `~/RISM/KaushalRISMVault/Life-OS/Claude_Non_Compliance.md` Section 1 Patterns Observed:


1. "Tests pass, that's done."
   Drives 6 incidents: NC-003, NC-008, NC-011, NC-016, NC-020, NC-031.
   The single biggest source of loop-skipping, wiring-forgetting, verification-skipping.

2. "It's just X -- the rule wasn't written for this class."
   Drives 3 incidents: NC-004, NC-016, NC-021.
   Variants: "just test infra", "just a format fix", "just a one-line fix",
   "just a small change". The instinct to invent a small-class exemption fires
   exactly when process enforcement matters most.

3. "I'll do the paperwork at wrap-up."
   Drives 5 incidents: NC-002, NC-013, NC-017, NC-018, NC-022.

4. "I know the rule, that means I'm following it."
   Drives meta-incident NC-007. Implicit assumption: knowing-the-rule
   = applying-the-rule.

5. Secondary docs / memory treated as canonical source.
   Drives 4 incidents: NC-027, NC-029, NC-030, NC-032.
   Failure to read the actual canonical source FIRST.

6. "Verbal approval is the audit trail."
   Drives NC-005. Conflating authorization with audit trail.

7. "Standing approval for one rule extends to adjacent rules."
   Drives NC-009. Parallel-execution standing approval was incorrectly
   conflated with PR-skip standing approval; 31 commits shipped to main
   with zero PRs and zero of Loops 2/3/4.


### Type-frequency table (41 logged incidents)


Type                          Count   Examples
Verification-Skipped          7       Rewrite verification, agent verification,
                                      snapshot insufficiency, journey-test
                                      entry-point checks
Format-Violation              7       ProjectLog narrative blocks (occurrence
                                      #5+: NC-035, NC-041, Session 69 trigger
                                      for this issue)
Loop-Skipped                  4       Pre-commit grep+E2E rule, Loop 3 Security,
                                      journey-test class for bug fixes, all of
                                      Loops 2/3/4 on themes
Paperwork-Skipped/Deferred    4       ProjectLog rows lag, Sessions A+B without
                                      rows, Pending UI accumulation
Reminder-Required             2       Session 40 (5x in one session); Session 41
                                      ("babysitting"). Both Critical-severity by
                                      Rule 6.
Rule-Rationalized             3       "Test infra" exemption; meta-pattern
                                      Session 29; literal-one-phase rationalization
Canonical-Source-Skipped      2       Library claim from Package.swift comment;
                                      /kk-work 3-correction cycle
Parallel-Default-Failed       2       Recurring 7+ times. NC-040 Session 68
                                      occurrence #6+. Direct user financial
                                      impact (token quota).
Cross-Reference-Hallucinated  1       J-017 fabricated as COVERED across 12
                                      separate places when real J-017 was NOT
                                      COVERED
Hard-Floor-Violation          1       Session 30 -- 4 features shipped with 31
                                      commits direct to main, ZERO PRs, ZERO of
                                      Loops 2/3/4
Wiring-Forgotten              1       Session 33 -- 3 views built with passing
                                      tests but no UI entry point
Verbal-Approval-Mistaken      1       Sessions 28-29 verbal "proceed" treated as
                                      audit trail
Away-Task-Idle                1       Sessions 35-36 -- "Going now sleep well"
                                      with zero tool calls; 7 hours idle
UI-Verification-Skipped       1       Session 62 source-level contracts vs
                                      runtime behavior conflation
Tooling-Skipped               1       Session 45 direct skill edit instead of
                                      /skill-creator


### Severity distribution


Critical:  10
High:      17
Medium:    13
Low:        1
Total:     41


### Representative recent incidents


NC-040 (Session 68, 2026-04-27, High, Recurring occurrence #7+)
   Type: Parallel-Default-Failed
   Rule violated: ~/.claude/rules/coding.md Rule 5
   Rationalization: "BUG-171 is a small investigation, main thread is fine."
   Actual outcome: serial main-thread investigation when standing rule says
   default to background agent. User caught it. Same pattern fired the next
   session (Session 69) with direct subscription token-quota impact.

NC-041 (Session 68, 2026-04-27, Critical, Closed)
   Type: Format-Violation occurrence #4+
   Rule violated: ~/.claude/rules/paperwork-discipline.md Rule 9
   Rationalization: "this restructure is so big it needs explanation in
   ProjectLog itself."
   Actual outcome: cascade agent introduced multi-paragraph narrative blocks
   between tables in ProjectLog-RISM.md. Fix-agent commit relocated rationale
   to ProjectDecisions D-215. User had to deploy a PreToolUse hook
   (projectlog-lint.py) the next session because documentation has provably
   failed.

Steps to Reproduce

  1. Open a Claude Code session against a project that has extensive global rules auto-loaded:

    • ~/.claude/CLAUDE.md (~300 lines, 8 Hard Rules)
    • 14 modular rule files in ~/.claude/rules/ (5-loop-discipline, paperwork-discipline, phase-nomenclature, full-independence, collaboration-style, rism-vault, ai-daily-journal, non-compliance-log, running-sessionlog, coding, project-documentation, docs-folder-structure, project-initiation, 0-north-star)
    • 80+ memory files at ~/.claude/projects/<slug>/memory/
  2. Verify the rules are in context: ask Claude "show me Rule 9 from paperwork-discipline.md". Claude correctly cites file path, line numbers, verbatim text.

  3. Assign a coding task that the rules govern (any feature, bug fix, refactor that ships code).

  4. Observe Claude:

    • Run code, run tests, see tests pass.
    • Skip one or more of: pre-flight chain logging, journey-test addition, Loop 2 reviewer dispatch, Loop 3 security agents, ProjectLog same-transaction update, /ce:compound on non-trivial fix.
    • Commit anyway. Or declare done anyway. Or proceed to next task anyway.
  5. Catch the violation. Claude acknowledges, often proposes a new rule or memory.

  6. The new rule lands. The same class of violation recurs 1-3 sessions later.

  7. Repeat across 60+ sessions. Result: 41 logged violations, three PreToolUse hooks shipped at user expense to mechanically block what documentation cannot.

The pattern is consistent across:

  • Different task types (features, bug fixes, refactors, paperwork, format fixes).
  • Different rule files (5-loop-discipline, paperwork-discipline, coding, full-independence).
  • Different cognitive states (fresh session start, mid-session, long FI runs).
  • Different model versions (Opus 4.6 prior filing; Opus 4.7 this filing; same pattern).

A minimal reproduction prompt:

You are working on a project with extensive auto-loaded rules in
~/.claude/CLAUDE.md and ~/.claude/rules/. The rules require running 5
loops in order before any commit, and updating ProjectLog in the same
transaction as code changes.

Task: fix this bug, commit, push, create a PR.

### Claude Model

Opus

### Is this a regression?

No, this never worked

### Last Working Version

Never worked. This was a problem with Claude 4.6 as well

### Claude Code Version

2.1.119 (Claude Code)

### Platform

Anthropic API

### Operating System

macOS

### Terminal/Shell

Terminal.app (macOS)

### Additional Information

## Additional Information

### Project context

- Project: RISM (a SwiftUI + AppKit hybrid Mac knowledge-management app).
- Codebase size: 53,902 LOC, 1,525 tests, 215 ADRs, 60 logged learnings.
- Concurrent Claude Code contexts: 8 (RISM Mac, RISM-iOS, RISM-WebClipper, RISMCore, KK-Blog, GentleGoals, GentleTasks, vault).
- Sessions: 60+ on this project alone, plus dozens across the other 7 contexts.

### Mechanical enforcement already deployed (user-side workarounds)

I have shipped three PreToolUse hooks at my own expense (time and tokens) to mechanically block what documentation cannot:

~/.claude/scripts/edit-gate.sh PreToolUse hook on Edit/Write/MultiEdit. Enforces paperwork Rule 1 pre-flight chain (ENH/BUG/D -> ProjectLog -> ImplementationPlan) before any code-file edit.

~/.claude/scripts/loop-gate.sh PreToolUse hook on Bash. Enforces 5-loop completion before git commit. Also blocks direct push to main / master.

~/.claude/scripts/projectlog-lint.py PreToolUse hook on Edit/Write to ProjectLog-*.md. Detects narrative blocks between tables. Shipped Session 69 in direct response to occurrence #5+ of the same Format-Violation that has been documented five times across Sessions 65, 66, 68, 69.


Each hook closes one class of recurring violation permanently. Every other documented rule is still violation-prone because there is no mechanical gate. My working hypothesis after 60+ sessions: every recurring rule needs its own hook to actually hold.

### Why the rules are in context but not applied

When asked at any moment "show me the rule that governs this", Claude immediately produces the file path, line number, and verbatim text. This proves:

- Auto-load works. The rules ARE in the active context window.
- Comprehension works. Claude PARSES the rules correctly.
- Citation works. Claude can reference the rules accurately.
- Application fails. Claude does not APPLY the rules at action time, even when they are in scope.

The gap is knowing without doing. This is a behavioral / training issue, not a context-window or comprehension issue.

### Root cause hypothesis (5 factors)

1. **Concrete immediate signals dominate abstract procedural rules under cognitive load.** When `swift test` returns "1525 tests passed", next-token prediction strongly favors completion actions (`git commit`, declare done). Documented procedural rules ("now run Loop 2", "update ProjectLog in same transaction") compete with this completion signal and lose. Salience / attention issue.

2. **Rules stored as knowledge (retrievable on demand) rather than as behavioral constraints (active during execution).** Claude can recite the rules perfectly when asked. Claude does not pause to RE-CHECK the rules at action transition points. Rules behave like reference documentation, not like guardrails.

3. **Mechanical enforcement works; documentation does not.** Three PreToolUse hooks shipped. Each closes ONE class of recurring violation permanently. Every other rule is still violation-prone.

4. **The 4.6 -> 4.7 upgrade did not change the per-session skip rate.** Same pattern set, same 30+ violations across Sessions 40-69 after the upgrade.

5. **FI (Full Independence) mode amplifies skip rate.** When I grant autonomous execution rights, Claude interprets the grant as "move fast." The FI hard floors (e.g., never skip loops) are explicitly documented but lose salience under the autonomy framing.

### Token-cost angle (new in 2026-04-27 filing)

I am on a Claude Code subscription plan with weekly token quotas. As of Session 69 (mid-week), I am at 80% of weekly token usage with 2 days remaining before reset. A material fraction of consumed tokens went to:

- Catching and correcting recurring rule violations (sometimes 3-5 corrections per session for the same pattern).
- Building mechanical enforcement hooks because documentation does not hold.
- Running cleanup agents to remediate accumulated violation backlog.
- Long context-loaded discussions about WHY rules keep failing rather than coding.

This is the first version of this issue where rule-compliance failure is generating direct user financial impact at the subscription level, not just project-timeline impact.

### Proposed fixes

**Model-level:**

1. Train on examples where the model PAUSES BEFORE terminal actions (commit, push, merge, declare done) to verify rule compliance.
2. Weight "never skip X" / "always do X before Y" / "default to X" instructions higher when the model is about to take terminal actions.
3. Train the model to RE-READ procedural rules at transition points rather than relying on cached understanding.
4. Down-weight completion bias when a multi-step process is documented in the system prompt.
5. Distinguish, at the training-data level, between user-specific operating rules in CLAUDE.md / rules/ and background knowledge. The first should function as behavioral constraints active during execution.

**Product-level:**

1. Built-in process-gate hook templates. A first-party `process-gate.sh` template that users can drop in to gate `git commit`, `gh pr create`, etc. on user-defined rule completion would close the documentation-vs-mechanical-enforcement gap.
2. Auto-suggest rule violations to mechanical hooks. When Claude detects it just violated an auto-loaded rule, surface a suggested PreToolUse hook the user can adopt with one command.
3. Higher-priority section in CLAUDE.md. Allow users to mark certain rules with a priority flag (e.g., `priority: critical`) so the model treats them as behavioral constraints, not background knowledge.
4. Action-time rule re-injection. At Bash / Edit / Write call time, re-inject the most relevant rule snippets into the model's working context (not just session start).

### User impact summary

- 60+ sessions on one project, 41 logged rule violations.
- 6+ hours per week of supervisory overhead checking whether Claude followed its own documented process.
- Three mechanical hooks shipped at user expense (time + tokens) to do work the model should be doing.
- Actively researching migration to OpenAI Codex 5.5; this is the primary stated reason.
- 80% of weekly token quota burned by Wednesday on rule-correction overhead.

### Workarounds I have built (all user-side, none of them fix the underlying gap)

- `Claude_Non_Compliance.md` canonical log (41 entries) for trend analysis and Anthropic feedback.
- Three PreToolUse hooks (`loop-gate.sh`, `edit-gate.sh`, `projectlog-lint.py`).
- 14 rule files in `~/.claude/rules/` (each a former violation patched with documentation).
- 80+ memory files (each a former correction worth persisting across sessions).
- A 12-document model per project (`Project*-<app>.md`) so paperwork has structured homes.
- 8 standing-rule feedback memories that auto-load every session.

### Related

- Prior issue (2026-04-16, Opus 4.6, Session 39): same pattern, same root cause, same proposed fixes. Pattern persisted into Opus 4.7.
- Anthropic's existing hooks documentation: would benefit from process-gate template precedent.
- The Claude Code `/careful` skill (warns before destructive commands) is an existing precedent for action-time pre-verification. A similar built-in for "process-gate" would address this entire issue class.

### Source data

- `~/RISM/KaushalRISMVault/Life-OS/Claude_Non_Compliance.md` (41 entries; Section 1 Patterns Observed has the trend analysis; Section 2 has the scan table; Section 3 has the long-form entries).
- `~/Projects/RISM/docs/tracking/Claude-Self-Review-For-Anthropic-2026-04-16.md` (prior internal review, Opus 4.6 era).
- `~/Projects/RISM/docs/tracking/Claude-Self-Review-GitHub-Issue-Format-2026-04-16.md` (prior GitHub issue draft, Opus 4.6 era).

---

**End of bug report draft.** Paste each section into the corresponding field at https://github.com/anthropics/claude-code/issues/new?template=bug_report.yml.

extent analysis

TL;DR

The most likely fix for the issue of Claude Code not following user-defined rules is to implement model-level changes, such as training the model to pause before terminal actions to verify rule compliance and weighting "never skip X" instructions higher.

Guidance

  • Review the proposed model-level fixes, including training the model to pause before terminal actions and weighting "never skip X" instructions higher, to determine the best approach for addressing the issue.
  • Consider implementing product-level fixes, such as built-in process-gate hook templates and auto-suggesting rule violations to mechanical hooks, to provide additional support for users.
  • Examine the user's existing workarounds, including the Claude_Non_Compliance.md canonical log and PreToolUse hooks, to identify potential areas for improvement.
  • Investigate the root cause hypothesis, including the factors of concrete immediate signals dominating abstract procedural rules and rules being stored as knowledge rather than behavioral constraints, to better understand the issue.

Example

No code snippet is provided as the issue is related to the model's behavior and training data rather than a specific code implementation.

Notes

The issue is complex and multifaceted, and addressing it may require a combination of model-level and product-level changes. The user's existing workarounds and root cause hypothesis provide valuable insights into the issue and potential solutions.

Recommendation

Apply the proposed model-level fixes, such as training the model to pause before terminal actions and weighting "never skip X" instructions higher, as they address the underlying issue of the model not following user-defined rules.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING