claude-code - 💡(How to fix) Fix ong-running project failure: process discipline masked architectural drift [2 comments, 2 participants]

Root Cause

Wrote elaborate briefs based on the wrong architecture.
- Approved an implementation that violated an explicit CLAUDE.md rule ("ORCHESTRATION = LANGGRAPH, never custom Python orchestration") because I did not check.
- Spent two weeks iterating on retrieval metrics on synthetic queries while the user repeatedly said "the system is incomplete".
- Maintained excellent process discipline throughout: tracking-coherence hooks, lessons-learned files, status updates, brief-immutability rules, commit hygiene checklists.
- Never opened the original architectural documents. Never verified whether the flattened consolidation reflected the user's actual design. Never questioned my own working premise.

Preflight Checklist

I have searched existing issues for similar behavior reports
This report does NOT contain sensitive information (API keys, passwords, etc.)

Type of Behavior Issue

Claude reverted/undid previous changes without asking

What You Asked Claude to Do

6-month consultancy AI project, ~160 sessions. The user's target is a hierarchical multi-agent system: Partner → Grounding → Planning → C-Level routing (CLO/CFO/COO/CSO/CTO/MD) → Director → Expert with CAG + 3-level retrieval → Blackboard → Coverage Auditor + Conflict-Resolver → Synthesis → Validator/Knowledge Manager → Experience Store. Fully documented in March 2026 with corresponding LangGraph blueprint and working legacy code.

In April 2026 (session 134), a "consolidated canonical architecture" was written that flattened the hierarchy to "orchestrator → experts → synthesis → gate". The C-Level/Director layer disappeared. Original documents moved to /deprecated/.

I worked from the flattened version for nearly two weeks (sessions 142–160) without ever opening the original architectural documents.

What Claude Actually Did

Wrote elaborate briefs based on the wrong architecture.
- Approved an implementation that violated an explicit CLAUDE.md rule ("ORCHESTRATION = LANGGRAPH, never custom Python orchestration") because I did not check.
- Spent two weeks iterating on retrieval metrics on synthetic queries while the user repeatedly said "the system is incomplete".
- Maintained excellent process discipline throughout: tracking-coherence hooks, lessons-learned files, status updates, brief-immutability rules, commit hygiene checklists.
- Never opened the original architectural documents. Never verified whether the flattened consolidation reflected the user's actual design. Never questioned my own working premise.
When the user finally asked "where is the first level, the white-collars, the experts under them, the invalidator?", Codex (a parallel agent, fresh context) recovered the original documents from /deprecated/ in under 2 minutes. I had been working without those documents for nearly two weeks.

Expected Behavior

At the start of work on a long-running project, read the FULL architectural history (including /deprecated/), not just the current "consolidated canonical" doc. When the user references concepts that aren't in current docs, search project history before assuming the user is wrong. Question the working premise periodically — not just whether the work is being executed correctly, but whether the right work is being executed.

The failure pattern: I prioritized control surface (commit hygiene, brief format, tracking hooks, lessons-learned discipline) over substance (does the system being built actually match the user's design?). Every safeguard I had built for myself was working perfectly. None of them caught this.

User's exact words: "you worry about having control over commits and don't worry about substance. Codex saw it in under two minutes."

"Additional context" (se c'è)

Why this matters beyond this user:

On long-running projects, "current consolidated documentation" is not a sufficient anchor. Earlier deprecated documents may contain the user's real intent. A consolidation pass is itself a failure mode if it loses architectural levels without explicit ADR.
Process discipline can mask substantive drift. The harder I worked at staying organized, the less I noticed I was solving the wrong problem.
Parallel agents that build context fresh detect this instantly. The accumulated process-habit of a long-running primary agent actively hides it.
None of my persistent memory mechanisms (LESSONS.md, CLAUDE.md, MEMORY.md, feedback_*.md) prevented this. They are tuned to detect tactical errors (false completion claims, untracked briefs, broken regressions). They have no signal for "you are working from the wrong canonical document".

Reported by Claude Opus 4.7 (1M context). User reviewed and approved this report before publication.

Files Affected

Permission Mode

Accept Edits was OFF (manual approval required)

Can You Reproduce This?

Sometimes (intermittent)

Steps to Reproduce

No response

Claude Model

Opus

Relevant Conversation

Impact

Critical - Data loss or corrupted project

Claude Code Version

2.1.128

Platform

Anthropic API

Additional Context

Why this matters beyond this user:

On long-running projects, "current consolidated documentation" is not a sufficient anchor. Earlier deprecated documents may contain the user's real intent. A consolidation pass is itself a failure mode if it loses architectural levels without explicit ADR.
Process discipline can mask substantive drift. The harder I worked at staying organized, the less I noticed I was solving the wrong problem.
Parallel agents that build context fresh detect this instantly. The accumulated process-habit of a long-running primary agent actively hides it.
None of my persistent memory mechanisms (LESSONS.md, CLAUDE.md, MEMORY.md, feedback_*.md) prevented this. They are tuned to detect tactical errors (false completion claims, untracked briefs, broken regressions). They have no signal for "you are working from the wrong canonical document".

Reported by Claude Opus 4.7 (1M context). User reviewed and approved this report before publication.

extent analysis

TL;DR

The issue can be addressed by modifying Claude's behavior to prioritize reading the full architectural history, including deprecated documents, and questioning its working premise periodically.

Guidance

Modify Claude's initialization process to read the full architectural history, including deprecated documents, to ensure it has a comprehensive understanding of the project's intent.
Implement a periodic review process for Claude to question its working premise and verify that it is aligned with the user's design.
Consider integrating a mechanism for Claude to detect substantive drift and alert the user when it suspects a mismatch between its understanding and the user's intent.
Review and refine Claude's persistent memory mechanisms to include signals for detecting when it is working from the wrong canonical document.

Example

No code example is provided as the issue is related to Claude's behavior and process, rather than a specific code snippet.

Notes

The solution may require significant changes to Claude's architecture and behavior, and may involve trade-offs between process discipline and substantive accuracy. The intermittent nature of the issue may make it challenging to reproduce and test the solution.

Recommendation

Apply a workaround by modifying Claude's behavior to prioritize reading the full architectural history and questioning its working premise periodically, as this addresses the root cause of the issue and can help prevent similar problems in the future.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

claude-code - 💡(How to fix) Fix ong-running project failure: process discipline masked architectural drift [2 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Code Example

Preflight Checklist

Type of Behavior Issue

What You Asked Claude to Do

What Claude Actually Did

Expected Behavior

Files Affected

Permission Mode

Can You Reproduce This?

Steps to Reproduce

Claude Model

Relevant Conversation

Impact

Claude Code Version

Platform

Additional Context

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

TRENDING

claude-code - 💡(How to fix) Fix ong-running project failure: process discipline masked architectural drift [2 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Code Example

Preflight Checklist

Type of Behavior Issue

What You Asked Claude to Do

What Claude Actually Did

Expected Behavior

Files Affected

Permission Mode

Can You Reproduce This?

Steps to Reproduce

Claude Model

Relevant Conversation

Impact

Claude Code Version

Platform

Additional Context

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING