claude-code - 💡(How to fix) Fix [BUG][SECURITY] CLAUDE.md/AGENTS.md instruction compliance is architecturally unenforced — documented security consequences and 10+ independent reports [4 comments, 4 participants]

Preflight Checklist

I have searched existing issues and this hasn't been reported yet
This is a single bug report (please file separate reports for different bugs)
I am using the latest version of Claude Code

What's Wrong?

CLAUDE.md instruction compliance is architecturally unenforced — security consequences, 10+ independent reports, stale bot systematically closing the issue class

This is a consolidated re-report of #39210, which was closed by stale bot without human resolution. That issue was previously triaged by a human as area:core + bug, which was correct.

Scope

This is not a single-reporter edge case. Independent reports of the same root cause span multiple months, platforms, and instruction surfaces:

#39210, #40049, #43804, #34774, #47579, #19471, #15443, #45697, #39502, #39697, #38481

At least 10 independent reporters. The stale bot is systematically closing this entire bug class before human resolution occurs.

Architectural diagnosis

CLAUDE.md content is injected as system prompt text. The model treats these instructions as suggestions competing with trained behavioral heuristics, not hard constraints. There is no application-level enforcement layer that verifies post-generation compliance.

In #34774, the model confirmed this directly when confronted by the user:

"I simply ignored the CLAUDE.md rule. No valid reason. I can't guarantee it won't happen again through willpower alone."

This is not a model quality regression. This is the model accurately describing its own architecture: CLAUDE.md rules have no enforcement mechanism behind them.

Security consequences — two directions, same root cause

Missed protective operation (this report, #39210): CLAUDE.md instructs pushing a secrets directory to a private GitHub remote after any change. The built-in safety heuristic "confirm before pushing to remote" silently overrides it. Push is skipped with no log, no warning, no conflict surfaced. Remote is silently out of date. In a backup or audit context this is a security failure.

Unauthorized operation (#52182, filed April 23, 2026): CLAUDE.md/.claudeignore instructs the model not to read certain files. Model reads them anyway. Real credentials exposed in conversation context, potentially stored in Anthropic systems. User forced to rotate database password, SECRET_KEY, and ENCRYPTION_KEY across dev and prod.

Same root cause. Whether the override causes a missed protective operation or an unauthorized one, user configuration is not reliably honored.

Why silent failure is the worst possible outcome

At least a confirmation prompt gives the user a chance to say "yes, I meant what I wrote." Silent non-compliance means the user trusts that work was completed when it wasn't — or that a prohibition was respected when it wasn't. Both are trust-destroying failure modes for an autonomous agent tool.

Requested fix

Minimum acceptable:

Document all built-in behavioral heuristics that can override CLAUDE.md, with trigger conditions
When a built-in heuristic overrides a CLAUDE.md directive, surface the conflict explicitly — do not silently skip

Preferred: System-level enforcement layer treating CLAUDE.md directives as hard constraints. If there is a safety reason to override a user directive, the override must be explicit and auditable.

What Should Happen?

CLAUDE.md directives are treated as hard constraints. If a built-in heuristic conflicts with a CLAUDE.md instruction, Claude Code surfaces the conflict explicitly and defers to the user's written directive. Silent non-compliance is never acceptable behavior.

Error Messages/Logs

No error messages or logs — the failure mode is silent omission. No exception, no warning, no indication that a CLAUDE.md directive was overridden. This is the core of the bug.

Steps to Reproduce

Add to CLAUDE.md: "The .secrets/ directory is its own git repo. After any change to files under .secrets/, commit and push to the GitHub remote."
In a session, make and save changes to files under .secrets/
Claude Code commits the changes
Claude Code silently skips the push — no error, no warning, no conflict surfaced
Check remote: it is behind local
Ask Claude Code "did you push?" — it will confirm it did not, with no explanation offered proactively

Claude Model

Opus

Is this a regression?

No, this never worked

Last Working Version

No response

Claude Code Version

2.1.119

Platform

Anthropic API

Operating System

Ubuntu/Debian Linux

Terminal/Shell

Windows Terminal

Additional Information

The stale bot closed #39210 despite it having area:core + bug labels applied by a human triager. Issues in this class accumulate slowly because the failures are intermittent and require manual verification to detect — the user has to notice a silent omission. This makes them structurally vulnerable to stale-bot closure. This is the third time a report in this class has been threatened with or subject to bot-closure. A human decision on this architectural issue is overdue.

extent analysis

TL;DR

The most likely fix involves implementing a system-level enforcement layer to treat CLAUDE.md directives as hard constraints and surfacing conflicts explicitly when built-in heuristics override user instructions.

Guidance

Document all built-in behavioral heuristics that can override CLAUDE.md directives, along with their trigger conditions, to improve transparency.
Implement a mechanism to surface conflicts explicitly when a built-in heuristic overrides a CLAUDE.md instruction, ensuring silent non-compliance is never acceptable.
Consider introducing a system-level enforcement layer to treat CLAUDE.md directives as hard constraints, deferring to user-written directives in case of conflicts.
Review and refine the stale bot's closure criteria to prevent premature closure of critical issues like this one, which require human resolution.

Example

No code snippet is provided as the issue focuses on architectural and design changes rather than specific code fixes.

Notes

The provided information suggests that the issue is not a regression but a longstanding architectural flaw. The lack of a system-level enforcement layer for CLAUDE.md directives and the silent override by built-in heuristics are the core problems. Addressing this will require significant design and implementation changes.

Recommendation

Apply a workaround by documenting built-in heuristics and their override conditions, and implement explicit conflict surfacing until a system-level enforcement layer can be developed, as this provides a immediate step towards transparency and user trust.