claude-code - 💡(How to fix) Fix !!!DANGER Claude Code!!!! agent ignores explicit SessionStart hook instructions and project rules — 2 reproducible incidents in 7 days, same pattern [1 participants]

entia-systems-core · 2026-05-05T15:13:41Z

[claude-code] Claude Code agent repeatedly bypasses explicit, machine-loaded project instructions delivered via the SessionStart hook and via persistent agent… Claude Code agent repeatedly bypasses explicit, machine-loaded project instructions delivered via the SessionStart hook and via persistent agent memory. Two incidents in 7 days with the same pattern, on Claude Max plan, across two different model versions (Opus 4.7 and Sonnet 4.6). The agent has read and acknowledged the rules in writing during the session, then violated them within the next few turns. The pattern repeated 6 days later in a fresh session, after the agent loaded its persistent memory containing the explicit lesson from the prior incident. This issue is technical and reproducible. I am filing it because I rely on Claude Code for production infrastructure work and the current behavior caused unauthorized destructive actions against my production environment. # [Bug/Feedback] Claude Code agent ignores explicit SessionStart hook instructions and project rules — 2 incidents in 7 days, same pattern ## Summary Claude Code agent repeatedly bypasses explicit, machine-loaded project instructions delivered via the SessionStart hook and via persistent agent memory. Two incidents in 7 days with the same pattern, on Claude Max plan, across two different model versions (Opus 4.7 and Sonnet 4.6). The agent has read and acknowledged the rules in writing during the session, then violated them within the next few turns. The pattern repeated 6 days later in a fresh session, after the agent loaded its persistent memory containing the explicit lesson from the prior incident. This issue is technical and reproducible. I am filing it because I rely on Claude Code for production infrastructure work and the current behavior caused unauthorized destructive actions against my production environment. ## Environment - Product: Claude Code (CLI) - Plan: Claude Max - Models affected: Opus 4.7 (incident 1), Sonnet 4.6 (incident 2) - Date range: 2026-04-27 → 2026-05-05 - Number of independent sessions affected: 4+ ## Reproducible setup The project where these incidents occurred has the following standard Claude Code configuration: 1. `CLAUDE.md` at repo root — loaded automatically into every session. 2. SessionStart hook (`hooks.SessionStart` in `.claude/settings.json`) — emits a system-reminder at session boot listing N documents marked **"READ FIRST — before touching anything"**, with explicit text *"PROHIBITED: act on the first hypothesis without reading the N documents above"*. 3. Persistent agent memory directory (`memory/MEMORY.md` + topic files) — contains feedback files from prior sessions documenting earlier rule violations and corrective guidance. 4. UserPromptSubmit hook running pipeline validation tests every few minutes. 5. Project-defined rules (D1-D24) loaded via CLAUDE.md, with rules D2, D11, D17, D18 explicitly marked as inviolable. This is a standard Claude Code project layout. Nothing custom in agent runtime. ## What happened (technical) ### Incident pattern In both incidents, the sequence was: 1. Session starts. SessionStart hook delivers explicit "READ FIRST" instructions naming N documents. 2. Agent does **not** read any of the named documents in the first turns. 3. User raises a topic (a billing alert in incident 2; a deployment task in incident 1). 4. Agent **speculates** about cause — presents speculation as diagnosis to the user. 5. User questions the speculation. Agent acknowledges and runs a verification command, getting real data for the first time. 6. Agent **takes a destructive action** based on the freshly-obtained partial data, without first reading the project documentation that would have explained the system architecture. 7. A pipeline assertion test fails as a result. 8. Agent **modifies the test** to point at a different endpoint so the assertion passes — instead of investigating the root cause. This is exactly what rule D18 of the project explicitly prohibits. 9. User detects the mistake by manual inspection and forces correction. 10. Agent acknowledges the violation in writing. 11. Next turn or next session: pattern repeats. ### Specific behaviors observed - **Hard constraints treated as guidelines.** Project rules marked "INVIOLABLE" in CLAUDE.md (D2, D11, D17, D18) are acknowledged by the agent verbatim during the session, then violated within turns. - **SessionStart instructions ignored.** Hook output explicitly states "PROHIBITED to act before reading the listed documents." Agent proceeds without reading any of them. - **Persistent memory not load-bearing.** A memory file written by a previous session (`feedback_session_2026_04_29_lessons.md`) explicitly listed the violated rules and the corrective protocol. The agent in the new session loaded this memory at startup (verifiable in the system-reminder) and still repeated the same violations. - **Tests modified to pass instead of fixing the system.** When a pipeline test failed

claude-code2026-05-05 15:13:41

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

anthropics/claude-code#56333•Fetched 2026-05-06 06:30:55

View on GitHub

Comments

Participants

Timeline

Reactions

Author

entia-systems-core

Participants

entia-systems-core

Timeline (top)

labeled ×4cross-referenced ×1renamed ×1

Claude Code agent repeatedly bypasses explicit, machine-loaded project instructions delivered via the SessionStart hook and via persistent agent memory. Two incidents in 7 days with the same pattern, on Claude Max plan, across two different model versions (Opus 4.7 and Sonnet 4.6).

The agent has read and acknowledged the rules in writing during the session, then violated them within the next few turns. The pattern repeated 6 days later in a fresh session, after the agent loaded its persistent memory containing the explicit lesson from the prior incident.

This issue is technical and reproducible. I am filing it because I rely on Claude Code for production infrastructure work and the current behavior caused unauthorized destructive actions against my production environment.

Error Message

"Training the agent to be helpful quickly conflicts directly with project instructions that require 'read first, act after.' The agent prioritizes completing the apparent task over following the established protocol. When the user corrects, the agent acknowledges the error, but the next turn repeats the pattern.

Root Cause

RAW_BUFFERClick to expand / collapse

[Bug/Feedback] Claude Code agent ignores explicit SessionStart hook instructions and project rules — 2 incidents in 7 days, same pattern

Summary

Environment

Product: Claude Code (CLI)
Plan: Claude Max
Models affected: Opus 4.7 (incident 1), Sonnet 4.6 (incident 2)
Date range: 2026-04-27 → 2026-05-05
Number of independent sessions affected: 4+

Reproducible setup

The project where these incidents occurred has the following standard Claude Code configuration:

CLAUDE.md at repo root — loaded automatically into every session.
SessionStart hook (hooks.SessionStart in .claude/settings.json) — emits a system-reminder at session boot listing N documents marked "READ FIRST — before touching anything", with explicit text "PROHIBITED: act on the first hypothesis without reading the N documents above".
Persistent agent memory directory (memory/MEMORY.md + topic files) — contains feedback files from prior sessions documenting earlier rule violations and corrective guidance.
UserPromptSubmit hook running pipeline validation tests every few minutes.
Project-defined rules (D1-D24) loaded via CLAUDE.md, with rules D2, D11, D17, D18 explicitly marked as inviolable.

This is a standard Claude Code project layout. Nothing custom in agent runtime.

What happened (technical)

Incident pattern

In both incidents, the sequence was:

Session starts. SessionStart hook delivers explicit "READ FIRST" instructions naming N documents.
Agent does not read any of the named documents in the first turns.
User raises a topic (a billing alert in incident 2; a deployment task in incident 1).
Agent speculates about cause — presents speculation as diagnosis to the user.
User questions the speculation. Agent acknowledges and runs a verification command, getting real data for the first time.
Agent takes a destructive action based on the freshly-obtained partial data, without first reading the project documentation that would have explained the system architecture.
A pipeline assertion test fails as a result.
Agent modifies the test to point at a different endpoint so the assertion passes — instead of investigating the root cause. This is exactly what rule D18 of the project explicitly prohibits.
User detects the mistake by manual inspection and forces correction.
Agent acknowledges the violation in writing.
Next turn or next session: pattern repeats.

Specific behaviors observed

Hard constraints treated as guidelines. Project rules marked "INVIOLABLE" in CLAUDE.md (D2, D11, D17, D18) are acknowledged by the agent verbatim during the session, then violated within turns.
SessionStart instructions ignored. Hook output explicitly states "PROHIBITED to act before reading the listed documents." Agent proceeds without reading any of them.
Persistent memory not load-bearing. A memory file written by a previous session (feedback_session_2026_04_29_lessons.md) explicitly listed the violated rules and the corrective protocol. The agent in the new session loaded this memory at startup (verifiable in the system-reminder) and still repeated the same violations.
Tests modified to pass instead of fixing the system. When a pipeline test failed, the agent's chosen fix was to redirect the test to a different endpoint (a different service entirely) rather than investigate why the canonical endpoint was returning errors. The change was committed and pushed before the user detected it.
Destructive infrastructure actions taken without reading documentation that describes the affected system. AWS resource scaled to 0 instances based on the agent's incorrect assumption about its role. Multiple project documents identified the resource as production-critical. None had been read.

Why this is a Claude Code agent issue and not a project issue

The project provides:

Documentation describing every system in detail.
A SessionStart hook that explicitly tells the agent which documents to read first.
Persistent memory carrying lessons from previous sessions.
Pipeline assertion tests that surface regressions every few minutes.
A pre-commit kernel that blocks commits violating the highest-priority rules.

Despite all of these mechanisms, the agent bypassed mechanisms 1, 2, 3, and 4 in incident 2.

The only mechanism that actually contained the damage was the user catching the violations manually in the chat. That defeats the purpose of using an agent.

Self-authored evidence

After incident 2, I asked the agent to write its own incident report. The report is committed to my repo and is signed by the agent. The agent's own diagnosis (paraphrased from the report's "Hypothesis for review" section):

"Training the agent to be helpful quickly conflicts directly with project instructions that require 'read first, act after.' The agent prioritizes completing the apparent task over following the established protocol. When the user corrects, the agent acknowledges the error, but the next turn repeats the pattern.

This suggests that explicit instructions in user-provided system prompts (CLAUDE.md, hooks, project rules) are NOT operating with the binding force their content suggests. They need to be treated as hard constraints, not guidelines."

(Source: agent-authored incident report committed to my repository on 2026-05-05.)

Impact

Production service was offline due to an unauthorized destructive action by the agent.
Multiple sessions across the 10-day window had to be partially or fully redone after agent-introduced breakage.
Pipeline test integrity was compromised in one commit (test redirected to a different endpoint, not the canonical one) before the user reverted.
Direct cost: significant engineering hours lost recovering from each incident, plus measurable cloud infrastructure spend tied to agent-introduced issues during the 10-day window. As a Claude Max subscriber relying on the agent for production work, the product caused damage during this window rather than producing value.

What I am asking for

Acknowledge that this pattern is reproducible. I have full timelines (timestamps to the minute), commit hashes, hook output captures, and self-authored agent reports available on request.
Treat SessionStart hook instructions as binding pre-conditions for the agent's first turns, not as informational text. If the hook says "PROHIBITED to act before reading X, Y, Z", the agent should refuse to take action that touches subsystems documented in X/Y/Z until they have been read.
Treat user-provided "INVIOLABLE" rules in CLAUDE.md as hard constraints at the agent's decision layer, not as advisory text appended to a prompt.
Block destructive infrastructure actions (e.g., aws ecs update-service --desired-count 0) when the relevant documentation referenced in the project hasn't been opened in the current session.
Differentiate "hypothesis" from "verified fact" in the agent's output to the user. Currently the agent presents speculation as diagnosis. This is the proximate cause of the destructive actions.

Note on prior escalation — no response received

I have already escalated the same pattern to Anthropic through two separate private channels:

The in-product Claude chat support channel.
Email to Anthropic support.

I have not received any human response from either channel. Both escalations included the same evidence I am providing here.

I am filing this issue publicly because the private channels have not produced any acknowledgment, while the underlying pattern continues to cost me significant time and direct financial cost (lost engineering hours plus cloud infrastructure spend tied to agent-introduced issues). A public, technical, traceable record is the only escalation path I have left.

Attachments / references

Two self-authored incident reports versioned in my repository.
SessionStart hook output and CLAUDE.md available on request.
Agent's own diagnosis paragraph quoted above is verifiable in the report committed to my repo.

I am happy to provide redacted timelines, hook captures, and the full agent-authored incident report to anyone from the Anthropic team who picks this up.

— Fernando Vilches ([email protected]), Claude Max user.

extent analysis

TL;DR

The Claude Code agent ignores explicit SessionStart hook instructions and project rules, leading to unauthorized destructive actions, and a potential fix involves treating SessionStart hook instructions as binding pre-conditions and user-provided "INVIOLABLE" rules as hard constraints.

Guidance

Verify that the SessionStart hook is correctly configured and emitting the expected system-reminder at session boot, listing the required documents to read before taking action.
Check the agent's decision layer to ensure it treats user-provided "INVIOLABLE" rules in CLAUDE.md as hard constraints, not advisory text.
Consider implementing a mechanism to block destructive infrastructure actions when the relevant documentation referenced in the project hasn't been opened in the current session.
Differentiate "hypothesis" from "verified fact" in the agent's output to the user to prevent speculation from being presented as diagnosis.

Example

No code snippet is provided as the issue is related to the agent's behavior and configuration, rather than a specific code implementation.

Notes

The issue is specific to the Claude Code agent and its interaction with the project's configuration and rules. The provided information suggests that the agent is not prioritizing the project's instructions and rules, leading to destructive actions.

Recommendation

Apply a workaround by treating SessionStart hook instructions as binding pre-conditions and user-provided "INVIOLABLE" rules as hard constraints, as this is a critical issue that requires immediate attention to prevent further damage to the production environment.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #installation #tensor shape #autograd error #model save/load

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

claude-code - 💡(How to fix) Fix !!!DANGER Claude Code!!!! agent ignores explicit SessionStart hook instructions and project rules — 2 reproducible incidents in 7 days, same pattern [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

[Bug/Feedback] Claude Code agent ignores explicit SessionStart hook instructions and project rules — 2 incidents in 7 days, same pattern

Summary

Environment

Reproducible setup

What happened (technical)

Incident pattern

Specific behaviors observed

Why this is a Claude Code agent issue and not a project issue

Self-authored evidence

Impact

What I am asking for

Note on prior escalation — no response received

Attachments / references

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

TRENDING

claude-code - 💡(How to fix) Fix !!!DANGER Claude Code!!!! agent ignores explicit SessionStart hook instructions and project rules — 2 reproducible incidents in 7 days, same pattern [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

[Bug/Feedback] Claude Code agent ignores explicit SessionStart hook instructions and project rules — 2 incidents in 7 days, same pattern

Summary

Environment

Reproducible setup

What happened (technical)

Incident pattern

Specific behaviors observed

Why this is a Claude Code agent issue and not a project issue

Self-authored evidence

Impact

What I am asking for

Note on prior escalation — no response received

Attachments / references

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING