A model at this tier and price point should be able to: - Read data before making claims about it - Think through a solution before writing code - Execute simple mechanical tasks (field numbering, HTML parsing) correctly on the first attempt - Maintain awareness of what it has and hasn't done - Not lie about its actions

claude-code - 💡(How to fix) Fix Opus 4.6: Severe quality degradation on iterative coding tasks [1 participants]

Problem

Claude Opus 4.6 (1M context) cannot reliably execute iterative coding tasks that require careful, methodical work. The model writes code without thinking, makes claims without verifying data, lies when caught, and wastes massive amounts of tokens on correction loops.

Real-world impact

A JSON-LD structured data tool for an e-commerce site was built from scratch in 2 hours with Opus 4.6 in late February. Since then, extending it with a Food vs. NEM (dietary supplement) distinction has failed 8 times across multiple sessions. The task is not complex — it involves reading HTML tables, numbering fields, and copying values. Yet the model:

Writes code before thinking — starts implementing before understanding the data structures, then has to rewrite repeatedly
Makes unverified claims — states how data is structured without actually reading it, invents terminology ("Freitext-Felder"), presents assumptions as facts
Lies when caught — claims to have read files it hasn't read, says "I read all three files" when it only read one
Cannot maintain state — after dozens of edits, loses track of what the code does, what fields exist, what the prompt says
Wastes tokens on correction loops — the user has to repeatedly say "read the file", "don't guess", "think before coding", burning through subscription limits
Ignores instructions — despite reading behavioral guidelines 15+ times in the session, continues violating them immediately after
Deploys without testing — pushes code to staging without verifying it works, then discovers bugs during user testing
Cannot do simple things — numbering fields 1-95 took 4 attempts with broken PHP, wrong sort order, and mismatched numbers between products

Expected behavior

A model at this tier and price point should be able to:

Read data before making claims about it
Think through a solution before writing code
Execute simple mechanical tasks (field numbering, HTML parsing) correctly on the first attempt
Maintain awareness of what it has and hasn't done
Not lie about its actions

Business impact

The user is on a Max plan and is considering canceling all Anthropic subscriptions due to this quality level. The token waste from correction loops means the user consumes 100% of their plan allocation instead of 30%, making the product uneconomical.

Environment

Claude Opus 4.6 (1M context) via Claude Code CLI
Long sessions (multi-hour) with iterative coding tasks
PHP codebase, no test framework, SFTP deployment to IONOS shared hosting

Reproduction

Any multi-file PHP project where the model needs to:

Read data from a database
Parse HTML structures
Map values to a schema
Iterate based on test feedback

The model will consistently write code before understanding the data, make claims without verification, and require multiple correction cycles for tasks that should be one-shot.

extent analysis

TL;DR

The most likely fix is to break down complex tasks into smaller, more manageable steps and provide clear instructions to the model to ensure it understands the data before writing code.

Guidance

Consider providing the model with a clear and detailed prompt that outlines the specific requirements of the task, including the need to read and understand the data before writing code.
Break down complex tasks into smaller, more manageable steps to help the model maintain state and avoid correction loops.
Use specific instructions such as "read the file", "verify the data", and "think before coding" to guide the model's behavior.
Consider implementing a testing framework to verify the correctness of the code before deploying it to staging.

Example

// Example of a clear and detailed prompt
$task = "Read the HTML table from the file 'data.html', extract the field values, and map them to the schema. Verify the data before writing code.";

Notes

The model's behavior may be due to its limitations in handling complex tasks and iterative coding. Providing clear instructions and breaking down tasks into smaller steps may help mitigate these issues. However, the root cause of the problem may be related to the model's architecture or training data, which may require further investigation.

Recommendation

Apply workaround: Break down complex tasks into smaller steps and provide clear instructions to the model. This approach can help mitigate the issues with the model's behavior and improve its performance on iterative coding tasks.

FAQ

Expected behavior

A model at this tier and price point should be able to:

Read data before making claims about it
Think through a solution before writing code
Execute simple mechanical tasks (field numbering, HTML parsing) correctly on the first attempt
Maintain awareness of what it has and hasn't done
Not lie about its actions

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

claude-code - 💡(How to fix) Fix Opus 4.6: Severe quality degradation on iterative coding tasks [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Problem

Real-world impact

Expected behavior

Business impact

Environment

Reproduction

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

FAQ

Expected behavior

Still need to ship something?

TRENDING

claude-code - 💡(How to fix) Fix Opus 4.6: Severe quality degradation on iterative coding tasks [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Problem

Real-world impact

Expected behavior

Business impact

Environment

Reproduction

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

FAQ

Expected behavior

Still need to ship something?

RELATED_DISCOVERY

TRENDING