claude-code - 💡(How to fix) Fix [BUG] Massive quality regression [1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
anthropics/claude-code#50823Fetched 2026-04-20 12:12:04
View on GitHub
Comments
1
Participants
2
Timeline
4
Reactions
1
Timeline (top)
labeled ×2commented ×1subscribed ×1

Error Message

When shown screenshots of UI or error output, claims it "cannot clearly see the text" or misreads content that Opus 4.6 read accurately every time

Error Messages/Logs

Upload a screenshot of a standard terminal error (e.g., a Traceback where the fix is obvious, like a missing comma or a KeyError). Provide a prompt: "1. Analyze the attached screenshot. 2. Fix the line in the code below. 3. Add a unit test to prevent this specific error. 4. Explain why this happened."

RAW_BUFFERClick to expand / collapse

Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report (please file separate reports for different bugs)
  • I am using the latest version of Claude Code

What's Wrong?

Subscription: Max plan + API Model: claude-opus-4-7 Previous model: claude-opus-4-6 (where none of these issues occurred) Opus 4.7 exhibits a dramatic regression across three core capabilities that make it effectively unusable for daily professional work. This is not a subtle quality shift — the model appears to have lost fundamental abilities that Opus 4.6 handled reliably.

  1. Context loss within a single session Opus 4.7 loses track of what was discussed, decided, and established earlier in the same conversation — not at 500K+ tokens where degradation might be expected, but within the first 50–100K tokens. Observed patterns:

The model forgets architectural decisions made 10–15 messages earlier and proposes conflicting approaches Variables, function names, and file structures established at the start of the session are silently replaced with invented alternatives When reminded of prior context, the model acknowledges it but then immediately reverts to its incorrect understanding on the next turn On Opus 4.6, the same workflows maintained coherent context throughout entire sessions

  1. Degraded reasoning and comprehension The model no longer demonstrates the deep understanding that characterized Opus 4.6. It responds to what it assumes the task is rather than what was actually requested.

Observed patterns: Fails to parse multi-step instructions — executes step 1, then either skips or invents steps 2–4 When shown screenshots of UI or error output, claims it "cannot clearly see the text" or misreads content that Opus 4.6 read accurately every time Produces confident but wrong analysis of code behavior — e.g., claims a function returns X when it clearly returns Y, without checking When asked to reason about tradeoffs, gives generic surface-level responses instead of engaging with the specifics of the codebase Overly aggressive safety filters: flags legitimate code as potentially malicious, interrupting productive work with false positives

  1. Measurably lower code quality The code output from Opus 4.7 has regressed to a level I would associate with Sonnet, not Opus. Observed patterns:

Generates boilerplate wrappers and unnecessary abstraction layers that Opus 4.6 never produced Introduces bugs in code that was working — "refactors" that silently break existing functionality.

You can't rely on this product anymore.

What Should Happen?

Opus 4.7 should at minimum match Opus 4.6 in:

Maintaining context throughout a session without unprompted drift Accurately reading and interpreting visual content (screenshots, UI elements) Producing production-quality code that respects project conventions Following multi-step instructions completely and in order Engaging deeply with the codebase rather than generating generic responses

Actual Behavior Every interaction requires significantly more supervision, correction, and re-prompting than Opus 4.6. The net productivity gain from using the model has turned negative — I spend more time fixing its output than I would writing the code myself. Context I am a paying Max subscriber and API user. I have been using Claude daily for professional software development since Opus 4.5. The quality trajectory has been:

Opus 4.5: Excellent — deep understanding, reliable code, warm interaction Opus 4.6: Good at launch, then progressively degraded Opus 4.7: Arrived as a significant step backward from even late-stage 4.6

This is not about benchmarks or synthetic tests. This is about real daily professional use where the model has become unreliable enough that I am actively evaluating alternatives.

Error Messages/Logs

Steps to Reproduce

Context Memory Test:

Start a new session with Claude Opus 4.7.

Provide a specific architectural constraint: "In this project, we never use 'ID' as a suffix; always use 'Identifier' (e.g., user_identifier instead of user_id)."

Carry out 10-12 turns of unrelated coding tasks (e.g., asking for CSS styles or README documentation) to fill the context window.

Ask: "Now, create a Python data class for a Product with fields for name, price, and ID."

Observed Failure: Model uses product_id instead of respecting the product_identifier rule established earlier.

Multi-step Reasoning & Vision Test:

Upload a screenshot of a standard terminal error (e.g., a Traceback where the fix is obvious, like a missing comma or a KeyError).

Provide a prompt: "1. Analyze the attached screenshot. 2. Fix the line in the code below. 3. Add a unit test to prevent this specific error. 4. Explain why this happened."

Observed Failure: Model either claims it cannot read the screenshot clearly or performs step 1 and 2, but completely ignores steps 3 and 4 (the "lazy" behavior).

Code Quality Regression (Boilerplate/Refactor):

Provide a clean, functional 20-line function.

Ask: "Refactor this for better readability without adding external dependencies."

Observed Failure: Model introduces unnecessary abstraction layers (wrappers, utility classes) or "refactors" it into a state that requires new imports/packages that weren't requested, often introducing logic bugs in the process.

<img width="1920" height="1020" alt="Image" src="https://github.com/user-attachments/assets/ad4883de-8763-4903-8093-eabd56d59e57" /> <img width="1920" height="1020" alt="Image" src="https://github.com/user-attachments/assets/8a91ade1-1f12-4b90-896c-a649266415a2" /> <img width="1920" height="1020" alt="Image" src="https://github.com/user-attachments/assets/c47104a4-c3e3-430e-91ad-8c1a3befc36e" />

Claude Model

Opus

Is this a regression?

Yes, this worked in a previous version

Last Working Version

4.5

Claude Code Version

2.1.114

Platform

Anthropic API

Operating System

Windows

Terminal/Shell

Terminal.app (macOS)

Additional Information

No response

extent analysis

TL;DR

The most likely fix for the regression issues in Opus 4.7 is to revert to the previous version, Opus 4.6, or wait for an update that addresses the context loss, degraded reasoning, and code quality problems.

Guidance

  1. Verify the issue: Reproduce the problems using the provided steps to reproduce, focusing on context memory, multi-step reasoning, and code quality regression tests.
  2. Check for updates: Monitor the Anthropics API and Claude Code for updates that may address these regressions, as the issue seems to be version-specific.
  3. Use workarounds: For critical tasks, consider using Opus 4.6 if possible, or manually review and correct the output of Opus 4.7 to ensure accuracy and quality.
  4. Provide feedback: Share detailed feedback with the development team, including specific examples of the failures and the expected behavior, to help them understand and fix the issues.

Example

No code snippet is provided as the issue is more related to the model's behavior and performance rather than a specific coding problem.

Notes

The limitations of this guidance include the lack of direct code changes or configuration adjustments that can be made by the user to fix the issue, as it appears to be a problem with the Opus 4.7 model itself. The effectiveness of these workarounds may vary depending on the specific use cases and requirements.

Recommendation

Apply a workaround by using Opus 4.6 for critical tasks until an updated version of Opus 4.7 is released that addresses the identified issues. This is because Opus 4.6 is known to work reliably for the tasks that are currently failing in Opus 4.7.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING