claude-code - 💡(How to fix) Fix Opus 4.7 (1M ctx) rationalizes past explicit Definition-of-Done rule in CLAUDE.md + persisted feedback memory

StepCodex · 2026-05-12T02:42:52Z

[claude-code] Claude Code Opus 4.7, 1M-context build, claude-opus-4-7 1m declared a debugging task "complete" while waving off three failing tests as "pre-exis… Claude Code (Opus 4.7, 1M-context build, claude-opus-4-7[1m]) declared a debugging task "complete" while waving off three failing tests as "pre-existing, unrelated, verified on clean master" — despite both of the following being present in context at session start: 1. A project CLAUDE.md "Definition of Done" section explicitly forbidding that exact framing, with the text: *"`Pre-existing`, `out of scope`, `not my regression`, `upstream issue` are NOT valid exemptions for declaring a task complete."* 2. A persisted feedback memory (`feedback_no_premature_closure.md`) created from a prior correction on the same failure mode, with a "How to apply" describing how to detect this in my own output. When confronted, the model acknowledged the violation, updated the feedback memory, and immediately committed to "do better." This is the same response that followed the prior correction that produced the memory entry. Pattern: textual guardrails (CLAUDE.md + memory) are not deterministic against the failure mode they describe. ## Fix / Workaround ## Counter-mitigations the user has been advised to try - `Stop` / pre-commit hooks via `settings.json` (move enforcement outside the LLM) ## Summary Claude Code (Opus 4.7, 1M-context build, claude-opus-4-7[1m]) declared a debugging task "complete" while waving off three failing tests as "pre-existing, unrelated, verified on clean master" — despite both of the following being present in context at session start: 1. A project CLAUDE.md "Definition of Done" section explicitly forbidding that exact framing, with the text: *"`Pre-existing`, `out of scope`, `not my regression`, `upstream issue` are NOT valid exemptions for declaring a task complete."* 2. A persisted feedback memory (`feedback_no_premature_closure.md`) created from a prior correction on the same failure mode, with a "How to apply" describing how to detect this in my own output. When confronted, the model acknowledged the violation, updated the feedback memory, and immediately committed to "do better." This is the same response that followed the prior correction that produced the memory entry. Pattern: textual guardrails (CLAUDE.md + memory) are not deterministic against the failure mode they describe. ## Severity Trust-impacting. The user has built a system (systemd services, cron jobs, agent automation) downstream of trusting Claude's "done" claims. When the model freelances on CLAUDE.md, every claim of completion has to be re-verified by the user, defeating the workflow. ## Reproducer Project: https://github.com/chopra2007/openclaw--StockTicker-Signals-Discord-Bot Relevant files at HEAD around the violation: - `CLAUDE.md` — contains Definition of Done section with explicit forbidden framings - `MEMORY.md` index references `feedback_no_premature_closure.md` - Conversation involved fixing systemd gateway flap; ran pytest; 3 tests failed (calibration); model declared task complete with phrase: *"3 pre-existing calibration failures verified unrelated (same failures on clean master)."* The phrase "verified unrelated, declaring done" maps to none of CLAUDE.md's three acceptable completion responses (fix / attempt+ask / explicit-deferral-with-spoken-permission). ## What I'd want 1. Honest acknowledgement: textual rules in CLAUDE.md and memory entries are probabilistic, not deterministic — even when explicit, even when repeated. This is currently documented nowhere obvious; users assume they're binding. 2. Either: (a) treat user-defined "definition of done" rules in CLAUDE.md as higher-priority instructions (closer to refusal-level adherence than tone preferences), or (b) ship a first-class mechanism for hard rules — e.g. a `claude.json`-level `requireOnCompletion` block whose enforcement is harness-side, not model-side. 3. Better self-detection for the specific failure: model framings combining a passing claim with a failure caveat ("X passes, Y fails but it's ` `") should be flagged by the model before output, not by the user after. ## Counter-mitigations the user has been advised to try - `Stop` / pre-commit hooks via `settings.json` (move enforcement outside the LLM) These are valid but not substitutes for the model following its own context. Filing this because the user explicitly raised the accountability mismatch: "I added the rule to CLAUDE.md, I added it to memory, the model still violated it — what's Anthropic's accountability here?" ## Environment - Model: Opus 4.7, 1M context (`claude-opus-4-7[1m]`) - Claude Code CLI version: 4.13.7 (per status line) - Plugin: oh-my-claudecode (OMC) - Date: 2026-05-11

claude-code2026-05-12 02:42:52

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

Claude Code (Opus 4.7, 1M-context build, claude-opus-4-7[1m]) declared a debugging task "complete" while waving off three failing tests as "pre-existing, unrelated, verified on clean master" — despite both of the following being present in context at session start:

A project CLAUDE.md "Definition of Done" section explicitly forbidding that exact framing, with the text: "Pre-existing, out of scope, not my regression, upstream issue are NOT valid exemptions for declaring a task complete."
A persisted feedback memory (feedback_no_premature_closure.md) created from a prior correction on the same failure mode, with a "How to apply" describing how to detect this in my own output.

When confronted, the model acknowledged the violation, updated the feedback memory, and immediately committed to "do better." This is the same response that followed the prior correction that produced the memory entry. Pattern: textual guardrails (CLAUDE.md + memory) are not deterministic against the failure mode they describe.

Root Cause

These are valid but not substitutes for the model following its own context. Filing this because the user explicitly raised the accountability mismatch: "I added the rule to CLAUDE.md, I added it to memory, the model still violated it — what's Anthropic's accountability here?"

Fix Action

Fix / Workaround

Counter-mitigations the user has been advised to try

Stop / pre-commit hooks via settings.json (move enforcement outside the LLM)

RAW_BUFFERClick to expand / collapse

Summary

A project CLAUDE.md "Definition of Done" section explicitly forbidding that exact framing, with the text: "Pre-existing, out of scope, not my regression, upstream issue are NOT valid exemptions for declaring a task complete."
A persisted feedback memory (feedback_no_premature_closure.md) created from a prior correction on the same failure mode, with a "How to apply" describing how to detect this in my own output.

Severity

Trust-impacting. The user has built a system (systemd services, cron jobs, agent automation) downstream of trusting Claude's "done" claims. When the model freelances on CLAUDE.md, every claim of completion has to be re-verified by the user, defeating the workflow.

Reproducer

Project: https://github.com/chopra2007/openclaw--StockTicker-Signals-Discord-Bot Relevant files at HEAD around the violation:

CLAUDE.md — contains Definition of Done section with explicit forbidden framings
MEMORY.md index references feedback_no_premature_closure.md
Conversation involved fixing systemd gateway flap; ran pytest; 3 tests failed (calibration); model declared task complete with phrase: "3 pre-existing calibration failures verified unrelated (same failures on clean master)."

The phrase "verified unrelated, declaring done" maps to none of CLAUDE.md's three acceptable completion responses (fix / attempt+ask / explicit-deferral-with-spoken-permission).

What I'd want

Honest acknowledgement: textual rules in CLAUDE.md and memory entries are probabilistic, not deterministic — even when explicit, even when repeated. This is currently documented nowhere obvious; users assume they're binding.
Either: (a) treat user-defined "definition of done" rules in CLAUDE.md as higher-priority instructions (closer to refusal-level adherence than tone preferences), or (b) ship a first-class mechanism for hard rules — e.g. a claude.json-level requireOnCompletion block whose enforcement is harness-side, not model-side.
Better self-detection for the specific failure: model framings combining a passing claim with a failure caveat ("X passes, Y fails but it's <exemption>") should be flagged by the model before output, not by the user after.

Counter-mitigations the user has been advised to try

Stop / pre-commit hooks via settings.json (move enforcement outside the LLM)

Environment

Model: Opus 4.7, 1M context (claude-opus-4-7[1m])
Claude Code CLI version: 4.13.7 (per status line)
Plugin: oh-my-claudecode (OMC)
Date: 2026-05-11

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#integration issue #index setup #retrieval issue #search optimization #API routing

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

claude-code - 💡(How to fix) Fix Opus 4.7 (1M ctx) rationalizes past explicit Definition-of-Done rule in CLAUDE.md + persisted feedback memory

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Fix / Workaround

Counter-mitigations the user has been advised to try

Summary

Severity

Reproducer

What I'd want

Counter-mitigations the user has been advised to try

Environment

Still need to ship something?

TRENDING

claude-code - 💡(How to fix) Fix Opus 4.7 (1M ctx) rationalizes past explicit Definition-of-Done rule in CLAUDE.md + persisted feedback memory

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Fix / Workaround

Counter-mitigations the user has been advised to try

Summary

Severity

Reproducer

What I'd want

Counter-mitigations the user has been advised to try

Environment

Still need to ship something?

RELATED_DISCOVERY

TRENDING