claude-code: [BUG] claude-sonnet-4-6 regression: patch-on-patch reasoning on cascading state, requires explicit prompts to design properly



Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report (please file separate reports for different bugs)
  • I am using the latest version of Claude Code

What's Wrong?

Title: Sonnet 4.6 regression: patch-on-patch reasoning on cascading state, requires explicit prompts to design properly

Model: claude-sonnet-4-6 (most of session); switched to claude-opus-4-7 mid-session for refactor pass

Context: ~2-day session on a React + FastAPI multi-chain crypto wallet tracker. Net output strong (5 chains live, NFTs shipped, full-codebase refactor, 322 tests green), but Sonnet showed a specific failure pattern worth flagging.

The pattern — concrete example: NFT filter pills

Feature: a gallery with "wallet" and "chain" filter pills. Independent axes, click to toggle. Should have been ~30 lines.

Took 3 distinct user corrections to land:

Round 1 — first implementation. I built it without thinking through cascade. User: "Deselecting a network should also inactivate a wallet if the wallet doesn't have any of the remaining active networks." I added a cascade in toggleChain.

Round 2 — different cascade missed. User: "Select a wallet should also activate its networks." I added the reverse cascade. The user then asked: "Let's keep the deactivated networks showing." I patched availableChains to ignore active state.

Round 3 — user calls it out. Verbatim: "Please design the solution properly as opposed to the fastest code fix." Only at that prompt did I step back and articulate the actual model: two fully independent axes, no cascades, single reset button. The clean solution was 30 lines and removed code from each of my prior patches.
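For illustration, here is a minimal TypeScript sketch of what that final model could look like. It is not the session's actual code: the `Nft` shape, `FilterState`, `toggleWallet`, `resetFilters`, and `visibleNfts` are hypothetical names, the signatures of `toggleChain` and `availableChains` are assumed, and storing deselected pills as "inactive" sets is one possible representation among several.

```typescript
// Hypothetical sketch: names, shapes, and signatures are assumptions.
interface Nft {
  id: string;
  wallet: string;
  chain: string;
}

// Two independent axes. Each set holds the pills the user has deselected;
// an empty set means every pill on that axis is active.
interface FilterState {
  inactiveWallets: ReadonlySet<string>;
  inactiveChains: ReadonlySet<string>;
}

const emptyFilters: FilterState = {
  inactiveWallets: new Set<string>(),
  inactiveChains: new Set<string>(),
};

// Return a copy of the set with `key` flipped in or out.
function flip(set: ReadonlySet<string>, key: string): ReadonlySet<string> {
  const next = new Set(set);
  if (!next.delete(key)) {
    next.add(key);
  }
  return next;
}

// Each toggle touches only its own axis: no cascades in either direction.
function toggleWallet(state: FilterState, wallet: string): FilterState {
  return { ...state, inactiveWallets: flip(state.inactiveWallets, wallet) };
}

function toggleChain(state: FilterState, chain: string): FilterState {
  return { ...state, inactiveChains: flip(state.inactiveChains, chain) };
}

// The single reset button: everything active again on both axes.
function resetFilters(): FilterState {
  return emptyFilters;
}

// An NFT is shown only if both its wallet and its chain are active.
function visibleNfts(nfts: readonly Nft[], state: FilterState): Nft[] {
  return nfts.filter(
    (n) =>
      !state.inactiveWallets.has(n.wallet) &&
      !state.inactiveChains.has(n.chain),
  );
}

// Pills are derived from the data, not from the active state,
// so deselected chains stay on screen and can be re-enabled.
function availableChains(nfts: readonly Nft[]): string[] {
  return Array.from(new Set(nfts.map((n) => n.chain))).sort();
}
```

Because neither toggle reads or writes the other axis, the invariant "wallet state and chain state never constrain each other" holds by construction, and each of the three correction rounds reduces to a property of this model instead of a separate patch.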

Root cause of my failure: I jumped to mechanical code edits each round instead of reasoning through state-consistency invariants up front. Each "fix" introduced a new edge case that the next round had to patch. The user had to escalate to "design properly" before I would actually think.

Other observations from the same session:

  • After the Round 2 cascade, the user said "I expect you to catch and handle such edge cases", i.e. the user shouldn't need to enumerate them.
  • When asked to add icons to filter rows, I added them to both wallet AND chain rows. The user then said "just do it at the wallet level". I had over-applied without checking the simpler scoping question first.
  • Refactor sweep (when switched to Opus 4.7): one-shot a clean plan, three parallel review agents, batched 8 changes safely, correctly flagged a 9th (shared CoinGecko singleton) as net-negative after tests revealed cache-isolation cost. The contrast in deliberation quality was noticeable.

User feedback verbatim (this is what they asked me to file):

"I'm seeing significant regression in model capabilities for Sonnet 4.6 compared to previously, feels like I reluctantly need to use Opus 4.7 more for efficiency"

The "reluctantly" matters: the user wants to use Sonnet (cost/value) but is finding the correction-round overhead makes Opus cheaper end-to-end on non-trivial tasks.

Suggested signal for the team: Sonnet 4.6 appears to default to fastest-path code edits and skip the "reason through the design space" step that Opus reliably does. Users compensate by adding prompt overhead ("design properly", "think about edge cases"); the regression is that this overhead used to not be needed.

What Should Happen?

Sonnet 4.6 should reason through the design space (state-consistency invariants, edge cases) before editing code, as Opus reliably does, without needing prompt overhead such as "design properly" or "think about edge cases". The regression is that this overhead used to not be needed.

Error Messages/Logs

Steps to Reproduce

Repro/context available: The full transcript of this session is at /Users/jasonliu/.claude/projects/-Users-jasonliu-Claude-crate-3g/ (most recent .jsonl). Happy to share.

Claude Model

Sonnet (default)

Is this a regression?

Yes, this worked in a previous version

Last Working Version

No response

Claude Code Version

claude-sonnet-4-6

Platform

Anthropic API

Operating System

macOS

Terminal/Shell

VS Code integrated terminal

Additional Information

No response
