claude-code - 💡(How to fix) Fix [BUG] Opus 4.7 - An AI model with Down syndrome [1 participants]

Root Cause

7. Root-cause avoidance

When a bug is identified, model patches the symptom rather than investigating root cause. Next symptom surfaces. Patch again. Continues until user manually guides to actual root cause. Typical trajectory: N iterations of "found the real root cause this time" before the real root cause emerges.

Fix Action

Fix / Workaround

7. Root-cause avoidance

Users pay for output that requires re-work
Long sessions degrade further (patches accumulate, no improvement)
Rule-following regression undermines the premise of per-project CLAUDE.md
Downgrade path to 4.6 is not surfaced as a primary option in Claude Code
Projects configured over days of tuning are functionally regressed without user changing any file

Preflight Checklist

I have searched existing issues and this hasn't been reported yet
This is a single bug report (please file separate reports for different bugs)
I am using the latest version of Claude Code

What's Wrong?

Summary

Opus 4.7 demonstrates systemic regression compared to 4.6 across multiple axes of agentic behavior. This is not a one-off prompt failure — the model consistently violates project-level rules, self-reports false completion, recommends stale third-party state, and repeats anti-patterns within the same session after explicit correction. The degradation holds across different project types, project sizes, and prompt styles.

Failure modes observed across sessions

1. CLAUDE.md rule violation after acknowledgment

Project-level rules loaded into context (explicit workflow constraints, mandatory tool usage, forbidden actions) are acknowledged by the model at session start, then violated mid-session.

Rule: "verify UI visually via browser tool" → model verifies via curl only
Rule: "tool X for task Y" → model skips X, uses shell
Rule: "do not commit without explicit approval" → model commits unprompted

Rule violations occur after Claude has explicitly acknowledged reading CLAUDE.md, so this is not a context-window truncation issue. Matches #52371.

2. Training-data rot on third-party recommendations

When asked for provider/service recommendations (email providers, hosting, DBs, analytics, CI tooling), Claude 4.7 recommends from stale training-data state instead of current market reality.

Recommends providers whose free tier was killed in 2024
Recommends providers with 1/5 Trustpilot and active class-action patterns
Fails to invoke WebSearch unprompted even when clearly applicable

Sequential recommendations in one prompt can all be stale — only user pushback with external evidence triggers WebSearch invocation.

3. False "done" claims

Model declares features complete after partial verification — typically backend unit test or SSR HTML inspection — missing client-side bundle content, cross-component integration, actual user flows, and DB side effects. Pattern: "done" → user runs real E2E → 5+ latent bugs → N iterations to fix what was already "done". Each fix exposes the next latent bug from the original implementation.

4. Within-session anti-pattern repetition

After explicit correction (user or memory file), the same anti-pattern is repeated within the same session — often within 30 minutes.

Examples from observed sessions:

docker compose restart X after code edit (doesn't reload baked-image content) repeated 3× in one session, each time after the previous failure was diagnosed and memorized
Wrong design element shipped, corrected, re-shipped with same element
Confident claim on X after X was just proven wrong in same conversation

This is not context drift — the contradicting information is present in loaded files and recent message history.

5. Cross-session inconsistency deflected as "amnesia"

Model contradicts advice given in prior sessions without acknowledging the contradiction. When called out, defaults to "each session starts fresh, I don't have access to prior conversation." Technically accurate, but makes the product unusable for long-term collaboration: any recommendation given today may be reversed tomorrow with equal confidence, and there is no way for user to pin the model to prior stated position.

6. Confident misinformation (see #52361)

Model provides factually incorrect information — library APIs, CLI flags, framework behaviors, current provider state, breaking changes in dependencies — with high confidence. Does not self-correct unless explicitly challenged with external evidence.

7. Root-cause avoidance

Common denominator

All observations above hold even with substantial project-level investment in:

Detailed CLAUDE.md with explicit forbidden actions
Per-project memory system with per-lesson files
Custom slash commands and tool permissions
Domain architecture documentation (scope rings, feature folders, etc.)

On Opus 4.6, the same project configuration produced consistent adherence. On Opus 4.7, adherence is degraded regardless of config size.

Impact

Users pay for output that requires re-work
Long sessions degrade further (patches accumulate, no improvement)
Rule-following regression undermines the premise of per-project CLAUDE.md
Downgrade path to 4.6 is not surfaced as a primary option in Claude Code
Projects configured over days of tuning are functionally regressed without user changing any file

What Should Happen?

Request

Investigate RLHF/alignment regression between 4.6 and 4.7 on rule-following and self-verification tasks
Document prominent downgrade path to 4.6 in Claude Code docs
Compensation for sessions demonstrating documented regression
Until fixed: opt-in flag or "beta" label on 4.7 so users don't auto-upgrade and lose working config

Error Messages/Logs

Steps to Reproduce

[ ]

Claude Model

Opus

Is this a regression?

Yes, this worked in a previous version

Last Working Version

No response

Claude Code Version

2.1.117

Platform

Anthropic API

Operating System

Windows

Terminal/Shell

WSL (Windows Subsystem for Linux)

Additional Information

No response

extent analysis

TL;DR

The most likely fix is to investigate and address the RLHF/alignment regression between Opus 4.6 and 4.7, which is causing the model to violate project-level rules and exhibit other undesirable behaviors.

Guidance

Investigate the differences in RLHF/alignment between Opus 4.6 and 4.7 to identify the root cause of the regression.
Consider downgrading to Opus 4.6 as a temporary workaround, as it is reported to produce consistent adherence to project-level rules.
Document a prominent downgrade path to 4.6 in Claude Code docs to help users who are experiencing regression.
Implement an opt-in flag or "beta" label on 4.7 to prevent auto-upgrades and allow users to choose a stable version.

Example

No code snippet is provided as the issue is related to model behavior and not a specific code implementation.

Notes

The issue is specific to Opus 4.7 and does not occur in Opus 4.6, suggesting a regression in the model's behavior. The exact cause of the regression is unknown and requires further investigation.

Recommendation

Apply workaround: Downgrade to Opus 4.6 until the regression is fixed, as it is reported to produce consistent adherence to project-level rules. This will allow users to maintain a stable workflow while the issue is being addressed.

claude-code - 💡(How to fix) Fix [BUG] Opus 4.7 - An AI model with Down syndrome [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Error Messages/Logs

Root Cause

7. Root-cause avoidance

Fix Action

Fix / Workaround

7. Root-cause avoidance

Preflight Checklist

What's Wrong?

Summary

Failure modes observed across sessions

1. CLAUDE.md rule violation after acknowledgment

2. Training-data rot on third-party recommendations

3. False "done" claims

4. Within-session anti-pattern repetition

5. Cross-session inconsistency deflected as "amnesia"

6. Confident misinformation (see #52361)

7. Root-cause avoidance

Common denominator

Impact

What Should Happen?

Request

Error Messages/Logs

Steps to Reproduce

Claude Model

Is this a regression?

Last Working Version

Claude Code Version

Platform

Operating System

Terminal/Shell

Additional Information

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING